Bitcoin price prediction using Deep Neural Networks

Bitcoin price prediction using Deep Neural Networks Author Michelle Appel 10170359 [email protected] Supervisor Tom Runia PhD student Deep Learning Science Park 904, Room C3.250A A thesis submitted in partial fulfillment for the degree of Bachelor of Science in Beta-Gamma major: Artificial Intelligence University of Amsterdam June 2016




Abstract

Bitcoin is a thriving open-source community and payment network, currently used by approximately 10 million people. Because the value of Bitcoin in US Dollar fluctuates every day, forecasting it would be very interesting for investors, but the same volatility makes the price difficult to predict. This work focuses on predicting Bitcoin prices using a Long Short-Term Memory (LSTM) network. The effect of scaling methods on prediction effectiveness has been investigated, and it has been found that scaling both the features and the target improves prediction effectiveness. In addition, an experiment on the effect of different feature combinations on prediction effectiveness has been conducted, and an optimal combination of 9 features has been found. Finally, the effect of different sequence lengths and prediction delays on prediction effectiveness has been tested; the best absolute prediction is obtained with a prediction delay of 0 and a sequence length of 1, which is against expectations. Overall, the predictions are slightly better than the baseline, which takes the last known Bitcoin price as the prediction for the day to be predicted.


Contents

1 Introduction
  1.1 Bitcoin
  1.2 Time series analysis in financial context
  1.3 Outline of this thesis

2 Literature review
  2.1 Regression techniques for stock prediction
    2.1.1 Linear regression
    2.1.2 Multiple regression
  2.2 Machine learning
  2.3 Artificial Neural Networks
    2.3.1 Gradient descent
    2.3.2 Recurrent Neural Networks and Long Short-Term Memory
  2.4 Drivers of the Bitcoin price
    2.4.1 Popularity
    2.4.2 Economic drivers
    2.4.3 Technical drivers

3 Method
  3.1 Data retrieval
  3.2 Data pre-processing
    3.2.1 Match by date
    3.2.2 Bitcoin price equal to 0
    3.2.3 Split data in train set, validation set and test set
  3.3 Evaluation metrics
  3.4 Definitions
    3.4.1 Sequence length
    3.4.2 Prediction delay
    3.4.3 The baseline
  3.5 Proposed training architecture
    3.5.1 Tools
    3.5.2 Architecture
    3.5.3 Fit model
    3.5.4 Optimisation

4 Experiments
  4.1 Data scaling
  4.2 Feature selection
  4.3 Testing different prediction delays and sequence lengths

5 Results
  5.1 Data scaling
  5.2 Feature selection
  5.3 Testing different prediction delays and sequence lengths

6 Discussion and conclusion
  6.1 Data scaling
  6.2 Feature selection
  6.3 Testing different prediction delays and sequence lengths
  6.4 Future work


1 Introduction

1.1 Bitcoin

Satoshi Nakamoto, a pseudonym for the mysterious developer of Bitcoin, published a paper in 2008 called Bitcoin: A Peer-to-Peer Electronic Cash System [6], which describes the mechanism of the Bitcoin network, its transactions and how the double-spending problem can be solved. Nakamoto released the Bitcoin software in 2009. Bitcoin is currently a thriving open-source community and payment network, used by approximately 10 million people¹.

Bitcoin is a cryptocurrency: a digital currency that is entirely decentralised, meaning it is based on peer-to-peer transactions that do not pass through a financial institution. It has several advantages, such as a controlled and known algorithm for currency creation and transparency of all transactions: every transaction, including its size, time stamp, sender wallet address and receiver wallet address, is stored in the block chain. At the same time it offers a degree of anonymity, because only wallet addresses, and no names, are stored. Altogether this makes it a possibly desirable alternative to classic currencies such as the US Dollar, the Euro or the Chinese Yuan.

The anonymity of Bitcoin transactions is a desirable feature for people who want to commit illegal activities, as the transactions cannot be traced back to them. Also, no taxes are paid over Bitcoin transactions, which may possibly lead to a collapse of society if Bitcoin becomes the only payment method in use and no other solution to this problem is found.

1.2 Time series analysis in financial context

The value of Bitcoin in US Dollar fluctuates every day, which makes it difficult to predict future Bitcoin prices. Bitcoin prices are particularly difficult to predict, as the Bitcoin market is even more volatile than most stock markets [8]. Knowing future Bitcoin values would, of course, yield profit for investors, as they would know when to buy or sell Bitcoin in order to gain maximum profit: they can invest when the price is about to rise and sell when the price is about to drop.

Prediction is possible if there exists a relation between historic data and the future Bitcoin value. Machine learning can then be used to recognise patterns between historic data and future Bitcoin prices in order to predict them. If machine learning succeeds in predicting Bitcoin prices effectively, the method can be implemented in Bitcoin prediction software.

¹ Based on wallet addresses richer than 1 USD, retrieved from https://bitinfocharts.com/top-100-richest-bitcoin-addresses.html


Prediction can be achieved using classification or regression. The major difference is that classification predicts a label from a limited set of unordered classes, whereas regression predicts a continuous, ordered value. Regression therefore allows exact future Bitcoin prices to be predicted, whereas classification only allows a limited number of classes to be predicted. This work aims at predicting Bitcoin values in US Dollar using regression.

1.3 Outline of this thesis

In order to predict the Bitcoin price as effectively as possible, subsection 2.2 first investigates which regression-based machine learning method has proven to be effective on similar problems, such as stock prediction. This regression method will be tested on predicting future Bitcoin prices. Then, subsection 2.4 investigates which factors are related to future Bitcoin prices, so that these can be collected and used for the predictions.

Next, in subsection 4.1, the effect of feature scaling and target scaling on future Bitcoin price predictions is investigated. The hypothesis is that scaling the features allows regression to find a solution faster, but not necessarily a more accurate one. Predictions will be made using different scaling methods, and the errors with respect to the actual values will be calculated. The method with the lowest average error is considered the most effective method.

Then, in subsection 4.2, the effect of different features is tested by training a model and making predictions using different combinations of features. The combination of features whose predictions have the lowest error is considered the best combination of features.

Finally, in subsection 4.3, the delay between the known data and the predictions is tested. The first hypothesis is that the closer the predicted day lies to the known data, the better the prediction, because less uncertainty is involved. However, it is possible that some features have a relation with Bitcoin prices further in the future than 1 day. The effect of the prediction delay is tested together with the length of the input sequence. The expectation is that the longer the input sequence, the better the prediction, as there is more information available to base a prediction on. Different combinations of prediction delays and sequence lengths will be used to make predictions, and the combination with the lowest error with respect to the actual value is considered the best combination of prediction delay and sequence length.

These experiments will possibly help to achieve an effective method for predicting future Bitcoin prices.


2 Literature review

2.1 Regression techniques for stock prediction

Regression can be used to predict an output based on a given input. The simplest form of regression is linear regression, where a single variable is used to base a prediction on. A more advanced form is multiple regression, where several variables are used to base a prediction on [1].

2.1.1 Linear regression

Linear regression models the relationship between a dependent and an independent variable, which can be represented as

Y = C + WX

where Y is the dependent variable, X is the independent variable, C is a constant and W is the slope of the regression line.

2.1.2 Multiple regression

Multiple regression uses multiple variables to model the relationship between the dependent variable Y and the independent variables, which can be represented as

Y = w0 + w1x1 + ... + wnxn

where w0 to wn are the coefficients, or weights, and x1 to xn are the independent variables, or features.
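As an illustration of the multiple regression formula, the following minimal sketch fits the weights by ordinary least squares using NumPy. The function names and the toy data are assumptions made for this example and are not taken from this work.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Return weights [w0, w1, ..., wn] that minimise the squared error."""
    # Prepend a column of ones so that w0 acts as the constant term.
    X1 = np.column_stack([np.ones(len(X)), X])
    weights, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return weights

def predict(weights, X):
    """Evaluate Y = w0 + w1*x1 + ... + wn*xn for each row of X."""
    return weights[0] + X @ weights[1:]

# Toy data generated from Y = 1 + 2*x1 - 3*x2 (illustrative values only).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
w = fit_multiple_regression(X, y)
```

Because the toy targets follow the formula exactly, the fitted weights recover the coefficients used to generate them; on real data the fit only approximates the targets.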

2.2 Machine learning

Machine learning can be used to find the optimal weights of this regression formula, such that it describes the relation between features and target as well as possible. The algorithm learns from a set of examples by minimising the error of the regression line with respect to the true values of these examples, with the intention that it will describe the relation for future or unseen cases as well.

2.3 Artificial Neural Networks

Artificial Neural Networks (ANNs) are a form of machine learning based on a collection of connected units that each perform an operation. The weights that govern how information flows through the units and which operations are performed need to be optimised in order to fit a model to the data. Learning is done using gradient descent.


2.3.1 Gradient descent

Machine learning can find the best fitting values for the weights of the regression function using gradient descent. Gradient descent is used to find the minimum error of a regression function with respect to the true values it has to approach, by taking small steps towards a (local) optimum, which is where the error function is at its minimum. Such error functions are described in subsection 3.3.
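The idea of taking small steps towards a minimum can be sketched for a simple regression function as follows. This is a minimal illustration and not the training procedure of this work; the learning rate, the number of steps and the toy data are assumed values.

```python
import numpy as np

def gradient_descent(X, y, learning_rate=0.1, n_steps=2000):
    """Fit Y = w0 + w1*x1 + ... + wn*xn by minimising the mean squared error."""
    X1 = np.column_stack([np.ones(len(X)), X])  # column of ones for w0
    w = np.zeros(X1.shape[1])
    for _ in range(n_steps):
        residual = X1 @ w - y
        gradient = 2.0 * X1.T @ residual / len(y)  # derivative of the error w.r.t. w
        w -= learning_rate * gradient              # small step against the gradient
    return w

# Toy data generated from Y = 0.5 + 2*x (illustrative values only).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 0.5 + 2.0 * X[:, 0]
w = gradient_descent(X, y)
```

A learning rate that is too large makes the steps overshoot the minimum, while one that is too small makes convergence slow, which is why adaptive optimisers are often preferred in practice.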

2.3.2 Recurrent Neural Networks and Long Short-Term Memory

Recurrent Neural Networks (RNNs) are a class of machine learning methods that have become widely used for extracting patterns from temporal sequences [7], making them possibly effective for predicting time series such as the Bitcoin price trend. An RNN is an ANN equipped with temporal memory, as it takes a sequence as input. As can be seen in Figure 1, every element of the input sequence is fed to a separate RNN cell and processed, and the output of that cell is passed to the next RNN cell, until the last cell is reached and a final prediction is made.

Figure 1: Image courtesy of Chris Olah. The unrolled architecture of a Recurrent Neural Network, where x0 to xt are the input values of the sequence, A is a recurrent cell and h0 to ht are the output values.

The memory of a classic RNN fades quickly over time, because information is passed from one time step to the next through ordinary nodes. This makes time series analysis less effective, in particular for longer input and/or output sequences. This is called the vanishing gradient problem, which can be addressed by using a Long Short-Term Memory (LSTM) [9].

An LSTM is an RNN with a memory cell that can maintain its state over time by using non-linear gating units, which regulate the information flow into and out of the cell [2], as can be seen in Figure 2. This allows the temporal dimension of the data to be taken into account better than with a classic RNN, which may be the reason for the effective results on stock market prediction obtained in earlier studies [9].


Figure 2: Image courtesy of Chris Olah. The architecture of an LSTM cell with its forget gate layer, input gate layer and output gate layer.
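To make the role of the gates concrete, the following sketch implements one time step of a standard LSTM cell in NumPy and runs it over a short random sequence. It follows the common textbook formulation and is not the Keras implementation used in this work; the weight layout, the sizes and the random inputs are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4 * n_units, n_units + n_inputs) and b has shape (4 * n_units,).
    The four row blocks hold the forget, input, candidate and output parameters.
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0 * n:1 * n])   # forget gate: what to drop from the cell state
    i = sigmoid(z[1 * n:2 * n])   # input gate: what new information to store
    g = np.tanh(z[2 * n:3 * n])   # candidate values for the cell state
    o = sigmoid(z[3 * n:4 * n])   # output gate: what to expose as output
    c = f * c_prev + i * g        # updated cell state (the memory)
    h = o * np.tanh(c)            # updated hidden state, passed to the next step
    return h, c

rng = np.random.default_rng(0)
n_units, n_inputs, seq_len = 4, 3, 5
W = rng.normal(scale=0.1, size=(4 * n_units, n_units + n_inputs))
b = np.zeros(4 * n_units)
h, c = np.zeros(n_units), np.zeros(n_units)
for x_t in rng.normal(size=(seq_len, n_inputs)):
    h, c = lstm_step(x_t, h, c, W, b)
```

The cell state c is only changed by element-wise scaling and addition, which is what lets information persist over many time steps.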

2.4 Drivers of the Bitcoin price

In order to predict Bitcoin prices using machine learning, there has to exist data that holds a relation with the Bitcoin price, in such a way that when this data fluctuates the Bitcoin price fluctuates as well. To make effective predictions of future Bitcoin prices, the data must have a relation with future Bitcoin prices, that is, it must lead the Bitcoin price. Such factors may be effective features for predicting the Bitcoin price.

Multiple studies have shown that there exist certain factors that hold a relationship with the Bitcoin price [3] [4] [5] [10].

2.4.1 Popularity

The popularity of Bitcoin seems to hold such a leading relation [3] [5]. A measurement of popularity is query volume, which is the number of times a subject has been queried.

Google Trends' query volume of the query "Bitcoin"² represents the number of times "Bitcoin" is searched for with the Google search engine per day. This query volume holds a positive and leading relation with the Bitcoin price [3]. Wikipedia query volume seems to hold a positive and leading relation as well [3].

2.4.2 Economic drivers

Bitcoin is a currency, which indicates that there are economic drivers of the Bitcoin price. The demand for Bitcoin holds a positive relation with the price of Bitcoin, and the number of transactions being done with Bitcoin has a positive and leading relation with the price of Bitcoin [3].

² https://trends.google.nl/trends/explore?q=Bitcoin


2.4.3 Technical drivers

Technical factors are unique to the Bitcoin market, as stock markets do not have such technical factors, which makes them very interesting for predicting future Bitcoin prices.

Madan et al. [4] used a selection of 16 features related to the Bitcoin network, retrieved from Blockchain Info³, which can be seen in Table 1. These features were selected manually, based on their significance for solving the problem of predicting Bitcoin trends using classification.

Feature | Definition
Average Confirmation Time | Average time to accept transaction in block
Block Size | Average block size in MB
Cost per transaction percent | Miners revenue divided by the number of transactions
Difficulty | How difficult it is to find a new block
Estimated Transaction Volume | Total output volume without change from value
Hash Rate | Bitcoin network giga hashes per second
Market Capitalization | Number of Bitcoins in circulation * the market price
Miners Revenue | (number of BTC mined/day * market price) + transaction fees
Number of Orphaned Blocks | Number of blocks mined / day not off blockchain
Number of TXN per block | Average number of transactions per block
Number of TXN | Total number of unique Bitcoin transactions per day
Number of unique addresses | Number of unique Bitcoin addresses used per day
Total Bitcoins | Historical total number of Bitcoins mined
TXN Fees | Total BTC value of transaction fees miners earn/day
Trade Volume | USD trade volume from the top exchanges
Transaction to trade ratio | Relationship of BTC transaction volume and USD volume

Table 1: Names and descriptions of the 16 features chosen in the research of Madan et al. [4] that relate to the Bitcoin network.

The data used is a 24-hour time series, which seems to minimise noise concerns from higher granularity measurements and minute volatility [4].

The best result obtained by Madan et al. is a prediction accuracy on the test set of 0.9879 using a Binomial Generalised Linear Model, which indicates that technical features, and in particular this selection of features, are successful for predicting future Bitcoin prices.

³ https://blockchain.info/


3 Method

First, the data is retrieved and pre-processed. Next, the machine learning environment is set up. Then, the best feature scaling technique is determined, the best combination of features is selected and finally different combinations of sequence lengths and prediction delays are tested.

3.1 Data retrieval

The data, a 24-hour time series, is retrieved using APIs from Blockchain Info⁴. The BTC price in USD, which will be the target value for prediction, is visualised in Figure 3. The data starts at 01-03-2009 and ends one day before the day of retrieval, which is 24-06-2017 in this work.

Figure 3: The average Bitcoin price in USD per day, retrieved from Blockchain Info at 24-06-2017.

Other data retrieved from blockchain.info is described in Table 2, including the features described in subsection 2.4.3 and other available data.

⁴ https://blockchain.info/nl/charts


Feature | Definition
Average USD price | Average USD market price across major bitcoin exchanges.
Market capitalization | The total USD value of bitcoin supply in circulation, as calculated by the daily average market price across major exchanges.
BTC in circulation | The total number of bitcoins that have already been mined; in other words, the current supply of bitcoins on the network.
USD exchange trade volume | The total USD value of trading volume on major bitcoin exchanges.
Blockchain size | The total size of all block headers and transactions in MB. Not including database indexes.
Average block size | The average block size in MB.
No. orphaned blocks | The total number of blocks mined but ultimately not attached to the main Bitcoin blockchain.
TXN per block | The average number of transactions per block.
Median confirmation time | The median time for a transaction to be accepted into a mined block and added to the public ledger (note: only includes transactions with miner fees).
BTC unlimited support | Percentage of blocks signalling Bitcoin Unlimited support.
Hash rate | The estimated number of tera hashes per second (trillions of hashes per second) the Bitcoin network is performing.
Difficulty | A relative measure of how difficult it is to find a new block. The difficulty is adjusted periodically as a function of how much hashing power has been deployed by the network of miners.
Miners revenue | The total value of coinbase block rewards and transaction fees paid to miners.
Total TXN fees | The total value of all transaction fees paid to miners in BTC (not including the coinbase value of block rewards).
Total TXN fees USD | The total value of all transaction fees paid to miners in USD (not including the coinbase value of block rewards).
Cost per TXN percent | Miners revenue as percentage of the transaction volume.
Cost per TXN | Miners revenue divided by the number of transactions.
N unique addresses | The total number of unique addresses used on the Bitcoin blockchain.
N transactions per day | The number of confirmed Bitcoin transactions per day.
Total number of transactions | Total number of transactions.
N TXN | The total number of Bitcoin transactions, excluding those involving any of the network's 100 most popular addresses.
N TXN exc chains longer than 100 | The total number of Bitcoin transactions per day excluding those part of long transaction chains. There are many legitimate reasons to create long transaction chains; however, they may also be caused by coin mixing or possible attempts to manipulate transaction volume.
Output value | The total value of all transaction outputs per day (includes coins returned to the sender as change).
Estimated USD transaction value | The estimated transaction value in USD.

Table 2: Features and their definitions, retrieved from Blockchain Info, that will be used to predict the average USD price of Bitcoin. The average USD price is the target value that is to be predicted.


3.2 Data pre-processing

After the data has been retrieved, some modifications need to be made before machine learning can be applied. First, the data has to be matched by date. Then, the part where the value of Bitcoin is 0 USD is removed. Finally, the data is split into a train set, a validation set and a test set, after which the data is ready for training a model.

3.2.1 Match by date

The retrieved data needs to be matched by date to form the input vectors for machine learning. Part of the data has a daily resolution whereas other data has an hourly resolution, so the data cannot be matched by simply inserting all of it in a matrix. Therefore all dates of the data set are converted to Unix time stamps, by which the data is matched and inserted in a matrix.
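A possible way to match series by Unix time stamp is sketched below. The exact matching rule is not specified in this work, so flooring each time stamp to midnight UTC and keeping the last value of a day for sub-daily series are assumptions, as are the function names, the date format and the sample values.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 24 * 60 * 60

def to_unix(date_string, fmt="%d-%m-%Y %H:%M"):
    """Convert a date string (assumed to be in UTC) to a Unix time stamp."""
    dt = datetime.strptime(date_string, fmt).replace(tzinfo=timezone.utc)
    return int(dt.timestamp())

def match_by_date(series_list):
    """Join several {unix_time: value} series into one matrix row per day.

    Each time stamp is floored to midnight UTC; for sub-daily series the last
    value of the day is kept (an assumption). Only days that occur in every
    series are returned, each row being [day, value_1, value_2, ...].
    """
    daily = []
    for series in series_list:
        per_day = {}
        for ts in sorted(series):
            per_day[ts - ts % SECONDS_PER_DAY] = series[ts]
        daily.append(per_day)
    common_days = sorted(set.intersection(*(set(d) for d in daily)))
    return [[day] + [d[day] for d in daily] for day in common_days]

# Hypothetical sample values, for illustration only.
price = {to_unix("01-01-2017 00:00"): 997.7, to_unix("02-01-2017 00:00"): 1018.0}
hash_rate = {
    to_unix("01-01-2017 06:00"): 2.3e6,
    to_unix("01-01-2017 18:00"): 2.4e6,
    to_unix("02-01-2017 18:00"): 2.5e6,
}
matrix = match_by_date([price, hash_rate])
```

Other aggregation rules for the hourly data, such as taking the daily mean, would fit the same structure.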

3.2.2 Bitcoin price equal to 0

In the first 294 entries of the data set the BTC price in USD equals 0. This part of the data will not contribute to training a model and is therefore left out.

3.2.3 Split data in train set, validation set and test set

The data is split into a train set, validation set and test set according to a given ratio. The train set is the first part of the data, the validation set is the second part and the test set is the last part. Using a train-test ratio of 0.9, the data is split as shown in Figure 4. The last 0.1 part of the train set is used as validation set.

Figure 4: The train-validation target set (a) and the test target set (b) with a train-test ratio of 0.9.
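The chronological split can be sketched as follows. The function name is an assumption and the list of integers merely stands in for the date-matched rows.

```python
def split_chronologically(rows, train_test_ratio=0.9, validation_fraction=0.1):
    """Split time-ordered rows into train, validation and test parts.

    The first `train_test_ratio` of the rows forms the train part, of which the
    last `validation_fraction` is held out as validation set; the remainder is
    the test set. No shuffling is applied, so the order in time is preserved.
    """
    n_train_total = int(len(rows) * train_test_ratio)
    n_validation = int(n_train_total * validation_fraction)
    n_train = n_train_total - n_validation
    train = rows[:n_train]
    validation = rows[n_train:n_train_total]
    test = rows[n_train_total:]
    return train, validation, test

rows = list(range(1000))  # stand-in for 1000 daily rows
train, validation, test = split_chronologically(rows)
```

Keeping the test set at the end of the series means the model is always evaluated on days that lie after everything it was trained on.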


3.3 Evaluation metrics

The L1 and L2 errors are used to calculate the difference between the predicted values and the true values, as a measurement of how good the predictions are: the smaller the error, the better the predictions.

L1 error The L1 error of a prediction is the absolute average error, which is calculated by:

$$L_1 = \sum_{i=1}^{n} |y_i - f(x_i)|$$

where $y_i$ is the target value and $f(x_i)$ is the predicted value.

L2 error The L2 error of a prediction is the mean squared error, which is calculated by:

$$L_2 = \sum_{i=1}^{n} (y_i - f(x_i))^2$$

where $y_i$ is the target value and $f(x_i)$ is the predicted value.

The main difference is that the L2 error will be much larger than the L1 error in the case of outliers: when an error is already large, squaring it makes it even larger.
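The two error measures and their different sensitivity to outliers can be illustrated with a short sketch. The sums are implemented as written in the formulas; dividing each sum by n would give the corresponding mean. The sample values are made up for illustration.

```python
import numpy as np

def l1_error(y_true, y_pred):
    """Sum of absolute differences between targets and predictions."""
    return np.sum(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def l2_error(y_true, y_pred):
    """Sum of squared differences between targets and predictions."""
    return np.sum((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

y_true = [100.0, 102.0, 101.0, 150.0]
y_pred = [101.0, 101.0, 102.0, 110.0]  # the last prediction is off by 40

l1 = l1_error(y_true, y_pred)  # 1 + 1 + 1 + 40
l2 = l2_error(y_true, y_pred)  # 1 + 1 + 1 + 1600
```

The single large error contributes 40 of the 43 units of L1 error, but 1600 of the 1603 units of L2 error.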

3.4 Definitions

3.4.1 Sequence length

The sequence length is the length of the input sequence to the LSTM, which can vary between 1 and the length of the entire data set minus the prediction length. The sequence length can be seen as the number of inputs x0 to xn, as shown in Figure 6. In this work every element of the sequence holds the values of one day. When the sequence length is 1 day, the LSTM will act like a regular artificial neural network, as there is no time series to analyse but only a single day. With the sequence length at its maximum, the LSTM will have only one example to train with, which is not desirable. A balance between the number of training examples and the sequence length needs to be found.

3.4.2 Prediction delay

The prediction delay is the number of steps, in this case days, between the known data and the prediction, where a prediction delay of 0 days means that the day following the known data is predicted.
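The interplay of sequence length and prediction delay can be made concrete with a sketch that cuts a daily series into training examples. The function name and the toy series are assumptions; the indexing follows the definitions above, with a delay of 0 targeting the day directly after the last input day.

```python
import numpy as np

def make_windows(features, target, sequence_length, prediction_delay):
    """Build (inputs, targets) for a sequence model.

    inputs[k] holds days k .. k + sequence_length - 1 and targets[k] is the
    target value prediction_delay + 1 days after the last input day.
    """
    features = np.asarray(features)
    target = np.asarray(target)
    offset = sequence_length + prediction_delay
    n_examples = len(target) - offset
    inputs = np.stack([features[k:k + sequence_length] for k in range(n_examples)])
    targets = target[offset:offset + n_examples]
    return inputs, targets

prices = np.arange(10.0)          # stand-in for 10 daily prices
features = prices.reshape(-1, 1)  # one feature per day
X, y = make_windows(features, prices, sequence_length=3, prediction_delay=0)
```

With 10 days, a sequence length of 3 and a delay of 0 this yields 7 examples; longer sequences or larger delays leave fewer examples, which is the trade-off described in subsection 3.4.1.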


3.4.3 The baseline

The baseline is the simplest prediction possible. For this problem the baseline is defined by taking the Bitcoin price of the last day of the input sequence as the prediction. The L1 error and L2 error can be calculated accordingly. The algorithm is useful only if it predicts better than this baseline, so that the errors of its predictions are lower than those of the baseline.

A visualisation of the baseline prediction of the test set with a prediction delay equal to 0 can be seen in Figure 5.

Figure 5: The baseline prediction of the test set, where the baseline is the last day of the sequence, which in this figure is one day before the prediction.
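The baseline can be computed directly from the price series, as in the following sketch. The function name and the sample prices are assumptions for illustration.

```python
import numpy as np

def baseline_predictions(prices, sequence_length, prediction_delay):
    """Return (baseline, actual): the baseline repeats the last known price."""
    prices = np.asarray(prices, dtype=float)
    offset = sequence_length + prediction_delay
    # Price on the last day of each input sequence.
    last_known = prices[sequence_length - 1:len(prices) - prediction_delay - 1]
    # Price on the day that is to be predicted.
    actual = prices[offset:]
    return last_known, actual

prices = [100.0, 105.0, 103.0, 110.0, 108.0]  # made-up daily prices
baseline, actual = baseline_predictions(prices, sequence_length=1, prediction_delay=0)
baseline_l1 = np.sum(np.abs(actual - baseline))  # 5 + 2 + 7 + 2
baseline_l2 = np.sum((actual - baseline) ** 2)   # 25 + 4 + 49 + 4
```

A trained model is only worth using if its errors on the same days are lower than these baseline errors.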

3.5 Proposed training architecture

3.5.1 Tools

To implement an LSTM, Python 3.5 is used with Keras⁵, a Python library that uses TensorFlow as a back end. The Keras Sequential model is used to build the model, with LSTM layers and dense layers forming the architecture, and regression is used to predict future Bitcoin values.

⁵ Documentation at: https://keras.io/layers/recurrent


3.5.2 Architecture

The architecture consists of input LSTM layers, which contain 256 units, and an output dense layer, as visualised in Figure 6. The number of 256 units was chosen experimentally as a balance between having enough units to fit the model well and not so many that the network would just 'store' the train data, which would lead to overfitting.

Figure 6: The architecture of the LSTM used, where x0 to xn are the elements of the input sequence, A are the LSTM layers, which each contain 256 units, D is the dense layer and h is the predicted output value of the Bitcoin price in US Dollars.

3.5.3 Fit model

The model is fit using the train data, which is the first 0.9 part of the data set as described in subsection 3.2.3, of which the last 0.1 part is used as validation set. A batch size of 32 is used, which is the number of samples that is propagated through the network at once.

3.5.4 Optimisation

For gradient descent, as described in subsection 2.3.1, to find an optimum as effectively as possible, RMSprop is used as the optimiser in this work; it is usually a good choice for recurrent neural networks.
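The RMSprop update rule divides each gradient by a running average of its recent magnitude, so that every weight gets its own effective step size. The sketch below shows the commonly used form of the rule on a one-dimensional toy problem; it is not the Keras implementation, and the hyperparameter values are illustrative rather than settings reported in this work.

```python
import numpy as np

def rmsprop_step(w, gradient, cache, learning_rate=0.001, rho=0.9, epsilon=1e-7):
    """One RMSprop update; `cache` is the running average of squared gradients."""
    cache = rho * cache + (1.0 - rho) * gradient ** 2
    w = w - learning_rate * gradient / (np.sqrt(cache) + epsilon)
    return w, cache

# Minimise the toy error function f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, cache = np.array([0.0]), np.zeros(1)
for _ in range(5000):
    gradient = 2.0 * (w - 3.0)
    w, cache = rmsprop_step(w, gradient, cache, learning_rate=0.01)
```

Because the step is normalised by the gradient magnitude, the weight ends up hovering close to the minimum with a constant learning rate, which is one reason to combine the optimiser with a learning rate that is reduced during training.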


4 Experiments

To predict the Bitcoin price as effectively as possible, three experiments will be conducted. First, three methods of data scaling are compared on effectiveness. Next, the effect of using different feature combinations on prediction effectiveness will be tested. Finally, the effect of prediction delay and sequence length on prediction effectiveness is investigated.

4.1 Data scaling

In order to potentially optimise gradient descent, data scaling is applied. The formula by which data is scaled is given as:

$$x' = \frac{x - \min(x_{\mathrm{train}})}{\max(x_{\mathrm{train}}) - \min(x_{\mathrm{train}})}$$

where $x$ is the original data and $x'$ is the scaled data. Here $\min(x_{\mathrm{train}})$ is the minimum value of the training set and $\max(x_{\mathrm{train}})$ is the maximum value of the training set. The minimum and maximum values of the training set are used to scale all data (including the test set), because when predicting unknown data its maximum and minimum values are unknown, so the data cannot be scaled according to them. After scaling, the data of the train set has a range of [0, 1].

Usually feature scaling is done before applying machine learning, as it often leads to better gradient descent performance and quicker convergence to a solution. Target scaling is less usual, but seems to affect the effectiveness of the predictions. After prediction, the results are denormalised to obtain a useful prediction value using the following formula:

$$x = x' \times (\max(x_{\mathrm{train}}) - \min(x_{\mathrm{train}})) + \min(x_{\mathrm{train}})$$
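The scaling and denormalisation formulas can be sketched as follows, with the minimum and maximum taken from the train set only. The function names and sample values are assumptions for illustration.

```python
import numpy as np

def fit_min_max(train):
    """Return the minimum and maximum of the train set."""
    train = np.asarray(train, dtype=float)
    return train.min(axis=0), train.max(axis=0)

def scale(x, lo, hi):
    """Scale data with the train minimum and maximum."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)

def unscale(x_scaled, lo, hi):
    """Map scaled values back to the original range (denormalisation)."""
    return np.asarray(x_scaled, dtype=float) * (hi - lo) + lo

train = np.array([200.0, 400.0, 600.0])  # made-up train prices
test = np.array([500.0, 800.0])          # made-up test prices
lo, hi = fit_min_max(train)
train_scaled = scale(train, lo, hi)      # lies in [0, 1]
test_scaled = scale(test, lo, hi)        # can exceed 1 if test prices are above the train maximum
```

Since the test set is scaled with the train statistics, scaled test values outside [0, 1] are to be expected whenever the price moves beyond the range seen during training.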

The effect of both feature and target scaling on prediction effectiveness will be investigated by comparing predictions made by a trained model to the real values. For training the model, the only feature used is the Bitcoin price in USD, with a sequence length of 1 and no prediction delay.

Models will be trained using an initial learning rate of 0.005 with the ReduceLROnPlateau callback, which reduces the learning rate if no improvement is seen for 10 epochs. In addition, the EarlyStopping callback is used, which stops the training when no improvement is seen for 25 epochs. A maximum of 800 epochs is set.
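The combined effect of reducing the learning rate on a plateau and stopping early can be sketched in plain Python. This is a simplified imitation of the logic, not the Keras callbacks themselves; the function name is hypothetical and the reduction factor of 0.4 is an assumed value, as it is not stated for this experiment.

```python
def simulate_schedule(val_losses, initial_lr=0.005, factor=0.4,
                      lr_patience=10, stop_patience=25, max_epochs=800):
    """Return (epochs_run, final_learning_rate) for a list of validation losses."""
    lr = initial_lr
    best = float("inf")
    epochs_since_best = 0      # drives early stopping
    epochs_since_lr_event = 0  # drives the learning rate reduction
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            epochs_since_best = 0
            epochs_since_lr_event = 0
        else:
            epochs_since_best += 1
            epochs_since_lr_event += 1
        if epochs_since_lr_event >= lr_patience:
            lr *= factor               # plateau detected: take smaller steps
            epochs_since_lr_event = 0
        if epochs_since_best >= stop_patience:
            return epoch, lr           # no improvement for too long: stop
    return min(len(val_losses), max_epochs), lr

# Made-up loss curve: improves for two epochs, then stays flat.
losses = [1.0, 0.5] + [0.6] * 100
epochs_run, final_lr = simulate_schedule(losses)
```

In this toy run the learning rate is reduced twice before training stops, so the model gets two chances to improve with smaller steps before the run is abandoned.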

The results can be seen in subsection 5.1 of the result section.


4.2 Feature selection

In order to investigate which features contribute most to predicting Bitcoin prices, the results of different combinations of features can be compared. All possible combinations are given by the power set of these features. Having the Bitcoin price as a feature is desirable, as the price prediction can be based on the last price, so this will always be the first element of the feature matrix. The feature combinations are therefore given by the power set of all features excluding the Bitcoin price, with each subset appended to the Bitcoin price feature vector.

The power set of these 18 features, with the Bitcoin price pinned as the first feature vector of all combinations, has a size of 262144. Training the entire set would take too long, so a more efficient method must be used to test the effectiveness of feature combinations.

The method used instead is a greedy method: take the first feature vector, which is the average USD price, and in turn append every other feature vector to it. Using 19 features, 18 different combinations can be made this way. Each combination is used to train a model and predict the test set, for which the errors are calculated. The feature combination with the smallest error is 'pinned down' and the process is repeated by appending each of the remaining features to those 2 feature vectors in the next round. This is repeated until the length of 19 features is reached after 18 rounds. This reduces the problem to 18 + 17 + ... + 1 = 171 combinations. The appended features can be seen in Table 5 and the pinned down sets can be seen in Table 6, in subsection 5.2.

This method may fail to find some feature combinations that are possibly effective, because the process does not test every combination. Some features may only be effective in combination with certain other features, and not on their own.

Each feature combination is used to train a model, which is then tested on the test set. The prediction delay is set to 0 and the sequence length is set to 1, which is the most basic setting. As a consequence, other sequence lengths and delays may be more effective with other feature combinations.

The results can be viewed in subsection 5.2.

4.3 Testing different prediction delays and sequence lengths

To investigate which prediction delay and sequence length give the most effective results, all combinations must be used to train a model and predict the test set. The prediction delay is the number of days between the known data and the prediction, where a prediction delay of 0 days means that the day directly after the known data is predicted.
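To make the two parameters concrete, the following sketch builds input windows and targets from a daily series. It is an illustration under assumptions: the helper name `make_windows` and the toy data are hypothetical, and the exact windowing used in this research may differ.

```python
def make_windows(features, target, seq_len, delay):
    """Pair each window of `seq_len` known days with the target `delay` days
    after the first unknown day (delay 0 = the day right after the window)."""
    inputs, outputs = [], []
    last_start = len(features) - seq_len - delay - 1
    for start in range(last_start + 1):
        end = start + seq_len              # exclusive; last known day is end - 1
        inputs.append(features[start:end])
        outputs.append(target[end + delay])
    return inputs, outputs


# Toy data: five days, one feature per day.
toy_features = [[0], [1], [2], [3], [4]]
toy_target = [10, 11, 12, 13, 14]

x0, y0 = make_windows(toy_features, toy_target, seq_len=1, delay=0)
x1, y1 = make_windows(toy_features, toy_target, seq_len=2, delay=1)
```

A longer sequence length or a larger delay leaves fewer usable samples, because the first window needs `seq_len` known days and the last target must still lie inside the series.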


A prediction delay between 0 and 60 days is chosen, because leading factors seem to correlate up to 30 days [3]; the extra 30 days are there to include possible correlations outside the 30-day boundary. A sequence length between 1 and 20 is chosen.

The data is split using a train-test ratio of 0.9, with 0.1 of the train set used as validation set. Both the target and features are scaled as described in section Data scaling. The features found in section Feature selection, described in subsection 5.2, are used for prediction. The models are trained with a batch size of 32 and 256 units in the LSTM layer. The initial learning rate is 10^-3 and it decreases with a factor of 0.4 every time the validation loss does not decrease within 5 epochs. Early stopping is applied when the validation loss does not decrease within 25 epochs.
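The split described above can be sketched as follows. The helper name `split_dataset` is hypothetical, and the sketch assumes that the samples are kept in chronological order and that the validation part is taken from the end of the training part; neither detail is stated in this section.

```python
def split_dataset(samples, train_ratio=0.9, val_ratio=0.1):
    """Split into train, validation and test parts without shuffling."""
    n_train_full = round(len(samples) * train_ratio)
    n_val = round(n_train_full * val_ratio)   # fraction of the train part
    train = samples[: n_train_full - n_val]
    val = samples[n_train_full - n_val : n_train_full]
    test = samples[n_train_full:]
    return train, val, test


# Toy example with 100 samples: 81 for training, 9 for validation, 10 for testing.
train, val, test = split_dataset(list(range(100)))
```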

The results can be found in subsection 5.3.


5 Results

5.1 Data scaling

After convergence, the L1 and L2 errors of the predictions with respect to the true values of the test set are calculated for all three methods, as can be viewed in Table 3. The difference in performance can clearly be seen in Figure 7: the predictions made with both the features and the target scaled are closer to the real values than those of the other methods. As a result of this outcome, the method of scaling both the features and the target will be used in the rest of this research.

Feature scaling   Target scaling   L1 error   L2 error
False             False            382.65     344430.24
True              False            245.86     202890.29
True              True             64.99      13004.66

Table 3: Scaling methods and their errors with respect to true values of the test set.
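For reference, a minimal sketch of how such errors can be computed is given below. It assumes that the L1 error is the mean absolute error and the L2 error is the mean squared error over the test set; this reading is an assumption of the sketch, and the function names `l1_error` and `l2_error` and the toy values are hypothetical.

```python
def l1_error(predictions, true_values):
    """Mean absolute difference between predictions and true values (assumed definition)."""
    return sum(abs(p - t) for p, t in zip(predictions, true_values)) / len(true_values)


def l2_error(predictions, true_values):
    """Mean squared difference between predictions and true values (assumed definition)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, true_values)) / len(true_values)


# Toy values, not taken from the experiments.
toy_predictions = [1.0, 2.0, 4.0]
toy_true_values = [1.0, 3.0, 2.0]
toy_l1 = l1_error(toy_predictions, toy_true_values)
toy_l2 = l2_error(toy_predictions, toy_true_values)
```

Because the differences are squared, the L2 error penalises a few large misses much more heavily than the L1 error does.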

Figure 7: BTC price prediction of the test set in USD using different scaling methods.


5.2 Feature selection

After running all combinations, the best result is found in round 8, power set number 5, which is the combination of the features:

• average USD price

• BTC in circulation

• median confirmation time

• no orphaned blocks

• output value

• estimated USD transaction value

• transactions per block

• market capitalization

• cost per transaction

The L2 error of the predictions of the test set is 6753.22, as can be seen in Table 4, which is visually represented by Figure 8.

The exact combinations of features per round and set number can be found in Table 5 and Table 6, where Table 5 contains the features that are appended per round and set number and Table 6 contains the features that are pinned down per round.


round \ set no 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Baseline 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79 7316.79
1 10398.86 13220.72 13531.71 12985.02 14951.17 7351.05 9416.30 18583.32 14097.23 10344.81 10020.29 12812.12 16566.17 39393.31 14130.65 17691.68 14372.63 14904.22 18127.59
2 7309.82 16208.69 9385.91 11391.40 9662.67 7481.27 15092.97 10481.11 7318.79 7435.80 8510.33 11524.70 37036.92 10741.94 14684.43 10990.45 11133.68 16650.21
3 7321.70 16095.12 8773.30 10319.26 9952.08 7054.75 12453.11 8789.31 7262.08 8215.36 10728.69 35137.73 10118.61 12359.44 10197.67 10711.17 15634.18
4 7211.18 15944.04 8708.27 10574.39 9860.73 12017.57 9010.01 7357.12 7885.80 10005.67 26580.76 10109.07 11941.35 10560.06 10236.18 13522.06
5 7277.08 16599.60 8912.57 10666.75 9945.27 12413.04 9345.54 8068.77 10476.92 41841.29 9560.82 12132.32 10876.14 9738.67 15022.47
6 7861.70 12700.01 10223.26 14246.74 13425.93 16596.14 12321.77 12488.54 32514.82 13472.18 14747.38 12698.33 13479.34 20102.33
7 10334.04 8459.10 15067.49 16760.36 16595.91 12483.56 14443.62 103824.72 14096.80 15343.22 15440.76 15732.95 25634.74
8 8229.77 6969.40 7245.78 7054.94 7016.18 6753.22 16274.46 7922.35 7395.98 7055.87 7084.31 12724.94
9 7010.27 7026.70 8441.65 7053.65 7132.18 15913.75 7288.37 7663.94 7129.18 7693.55 12129.08
10 7025.67 8486.93 7203.43 7186.08 12554.88 7139.28 8968.21 7217.74 7559.51 12017.37
11 7179.25 8998.59 7324.81 7314.82 10742.13 8746.12 7351.84 7248.95 9425.90
12 7402.76 8707.04 7482.07 16790.67 8381.26 8406.26 8352.71 11561.54
13 9556.99 12160.97 37445.38 8984.66 10280.09 8610.59 14834.54
14 8232.15 9061.42 24618.21 11937.05 9772.44 10906.66
15 9823.88 41753.73 11442.57 9316.75 15555.22
16 11132.90 23020.09 10747.88 10783.80
17 9708.93 41596.43 11032.31
18 21421.63 42023.35

Table 4: L2 errors of the prediction of the test set of different feature combinations using a prediction delay of 0 and a sequence length of 1.

Figure 8: Visualisation of the L2 errors of the prediction of the test set of different feature combinations using a prediction delay of 0 and a sequence length of 1. The lowest L2 error is at round 8, power set number 5.


Round 1: 1 market capitalization; 2 transactions per block; 3 cost per transaction; 4 total transaction fees; 5 BTC in circulation; 6 no orphaned blocks; 7 blockchain size; 8 n transactions per day; 9 median confirmation time; 10 output value; 11 estimated USD transaction value; 12 average block size; 13 miners revenue; 14 n unique addresses; 15 total number of transactions; 16 n transactions; 17 n transactions exc chains longer than 100; 18 hash rate
Round 2: 1 market capitalization; 2 transactions per block; 3 cost per transaction; 4 total transaction fees; 5 no orphaned blocks; 6 blockchain size; 7 n transactions per day; 8 median confirmation time; 9 output value; 10 estimated USD transaction value; 11 average block size; 12 miners revenue; 13 n unique addresses; 14 total number of transactions; 15 n transactions; 16 n transactions exc chains longer than 100; 17 hash rate
Round 3: 1 market capitalization; 2 transactions per block; 3 cost per transaction; 4 total transaction fees; 5 no orphaned blocks; 6 blockchain size; 7 n transactions per day; 8 output value; 9 estimated USD transaction value; 10 average block size; 11 miners revenue; 12 n unique addresses; 13 total number of transactions; 14 n transactions; 15 n transactions exc chains longer than 100; 16 hash rate
Round 4: 1 market capitalization; 2 transactions per block; 3 cost per transaction; 4 total transaction fees; 5 blockchain size; 6 n transactions per day; 7 output value; 8 estimated USD transaction value; 9 average block size; 10 miners revenue; 11 n unique addresses; 12 total number of transactions; 13 n transactions; 14 n transactions exc chains longer than 100; 15 hash rate
Round 5: 1 market capitalization; 2 transactions per block; 3 cost per transaction; 4 total transaction fees; 5 blockchain size; 6 n transactions per day; 7 estimated USD transaction value; 8 average block size; 9 miners revenue; 10 n unique addresses; 11 total number of transactions; 12 n transactions; 13 n transactions exc chains longer than 100; 14 hash rate
Round 6: 1 market capitalization; 2 transactions per block; 3 cost per transaction; 4 total transaction fees; 5 blockchain size; 6 n transactions per day; 7 average block size; 8 miners revenue; 9 n unique addresses; 10 total number of transactions; 11 n transactions; 12 n transactions exc chains longer than 100; 13 hash rate
Round 7: 1 market capitalization; 2 cost per transaction; 3 total transaction fees; 4 blockchain size; 5 n transactions per day; 6 average block size; 7 miners revenue; 8 n unique addresses; 9 total number of transactions; 10 n transactions; 11 n transactions exc chains longer than 100; 12 hash rate
Round 8: 1 cost per transaction; 2 total transaction fees; 3 blockchain size; 4 n transactions per day; 5 average block size; 6 miners revenue; 7 n unique addresses; 8 total number of transactions; 9 n transactions; 10 n transactions exc chains longer than 100; 11 hash rate
Round 9: 1 total transaction fees; 2 blockchain size; 3 n transactions per day; 4 average block size; 5 miners revenue; 6 n unique addresses; 7 total number of transactions; 8 n transactions; 9 n transactions exc chains longer than 100; 10 hash rate
Round 10: 1 blockchain size; 2 n transactions per day; 3 average block size; 4 miners revenue; 5 n unique addresses; 6 total number of transactions; 7 n transactions; 8 n transactions exc chains longer than 100; 9 hash rate
Round 11: 1 blockchain size; 2 n transactions per day; 3 average block size; 4 miners revenue; 5 total number of transactions; 6 n transactions; 7 n transactions exc chains longer than 100; 8 hash rate
Round 12: 1 blockchain size; 2 n transactions per day; 3 miners revenue; 4 total number of transactions; 5 n transactions; 6 n transactions exc chains longer than 100; 7 hash rate
Round 13: 1 n transactions per day; 2 miners revenue; 3 total number of transactions; 4 n transactions; 5 n transactions exc chains longer than 100; 6 hash rate
Round 14: 1 n transactions per day; 2 miners revenue; 3 total number of transactions; 4 n transactions; 5 hash rate
Round 15: 1 miners revenue; 2 total number of transactions; 3 n transactions; 4 hash rate
Round 16: 1 miners revenue; 2 total number of transactions; 3 hash rate
Round 17: 1 miners revenue; 2 hash rate
Round 18: 1 miners revenue

Table 5: The elements that are appended per round, per powerset number.

Round   Feature added to the pinned down list in this round
1       average USD price
2       BTC in circulation
3       median confirmation time
4       no orphaned blocks
5       output value
6       estimated USD transaction value
7       transactions per block
8       market capitalization
9       cost per transaction
10      total transaction fees
11      n unique addresses
12      average block size
13      blockchain size
14      n transactions exc chains longer than 100
15      n transactions per day
16      n transactions
17      total number of transactions
18      hash rate

The pinned down list of round r consists of the features listed for rounds 1 to r.

Table 6: The pinned down list of features per round.


5.3 Testing different prediction delays and sequence lengths

Models are trained with all combinations of sequence length 1-20 and prediction delay 0-60 and tested on the test set. The L2 errors of the test set are calculated, as can be seen in Table 7, and are visualised in Figure 9 and in Figure 11. The difference between baselines and results can be seen in Table 8 and is visualised in Figure 10.

The absolute best prediction of the test set is done using a sequence length of 1 and a prediction delay of 0. The L2 error of these predictions is 7082.21. The prediction is shown in Figure 12.

The best prediction with respect to the baseline is done using a sequence length of 15 and a prediction delay of 60 days. The difference in L2 error with respect to the baseline is -439372.67. The prediction is shown in Figure 13.
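The comparison with the baseline can be sketched as follows. The baseline takes the last known Bitcoin price as the prediction for the day to be predicted; the helper name `baseline_pairs`, the toy prices and the use of a mean squared error are assumptions of this sketch. The last two lines only recompute the reported differences from the L2 errors in Table 7.

```python
def baseline_pairs(prices, delay):
    """Pair each last known price with the true price `delay` + 1 days later."""
    horizon = delay + 1
    return prices[:-horizon], prices[horizon:]


def mean_squared_error(predictions, true_values):
    # Assumed error definition for this sketch.
    return sum((p - t) ** 2 for p, t in zip(predictions, true_values)) / len(true_values)


# Toy prices, not taken from the data set.
toy_prices = [100.0, 102.0, 101.0, 105.0]
toy_predictions, toy_true_values = baseline_pairs(toy_prices, delay=0)
toy_baseline_error = mean_squared_error(toy_predictions, toy_true_values)

# Difference = model error - baseline error; a negative value means the
# model is better than the baseline (values taken from Table 7).
diff_seq1_delay0 = round(7082.21 - 7316.79, 2)
diff_seq15_delay60 = round(397840.62 - 837213.29, 2)
```

The second difference is large in absolute terms mainly because the baseline error itself grows strongly with the prediction delay.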


Seq len \ Pred delay 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Baseline 7316.79 14187.09 22112.63 28786.10 36446.29 42255.01 50550.14 62877.24 75758.81 91297.56 106514.81 121977.64 132526.88 148489.17 166783.35 183796.34
1 7082.21 16764.99 35436.30 61333.86 87540.75 99222.07 174220.09 197316.71 252978.68 282450.48 313013.44 385277.56 387115.48 418221.68 440574.17 447795.31
2 7746.27 22029.83 43453.71 91814.82 139324.84 179615.90 263481.54 308202.99 362905.18 464771.65 418967.69 496948.59 475109.70 523849.91 502634.47 490334.23
3 15907.95 15554.75 37388.22 64106.03 131380.21 180894.52 294366.21 291555.38 423145.33 468941.41 521630.46 539567.99 623381.80 576690.29 677197.28 669592.22
4 10416.10 21779.61 53728.15 87168.51 171597.50 249498.83 327613.06 309263.66 682354.81 388419.50 573407.29 420839.45 625915.18 469424.64 494077.97 442910.09
5 11533.71 18068.64 32531.55 91715.15 122218.72 173209.87 304442.77 564976.63 447408.50 429513.99 457991.00 474433.41 560258.24 526733.60 499683.78 492056.25
6 11463.75 25453.82 62779.98 89703.61 144938.15 164772.49 255775.80 309169.29 403376.12 419907.72 469071.63 457444.93 509364.93 481931.62 524424.89 484567.93
7 15389.06 34294.31 52199.73 78025.05 99660.97 158797.28 269207.60 255919.54 389520.90 439633.10 439645.53 447157.34 528647.93 486645.71 488087.58 485687.65
8 41134.64 69892.58 119076.20 203413.95 226213.63 271834.38 406684.93 350962.37 400823.09 453452.76 543819.57 570062.88 548799.73 541614.06 484526.99 526566.24
9 101711.33 333763.86 362756.06 517136.26 481560.91 470185.80 519683.37 552310.22 509157.22 669077.61 481513.84 557968.22 606988.59 577747.76 555828.75 583403.72
10 80621.00 115052.73 167612.95 212163.57 311660.93 372961.86 414191.45 438011.11 470405.24 543442.97 631089.01 639906.90 727313.55 566881.91 588197.34 613674.34
11 38357.07 80727.78 112293.83 152721.02 298129.11 342361.07 339026.07 470016.81 466184.24 483926.97 542308.18 693279.29 516644.32 525616.73 528618.38 540979.55
12 20996.14 63598.82 116481.45 147550.17 227071.43 256695.72 299481.80 354298.87 420253.35 452320.97 493795.43 503968.44 516481.32 545966.80 663302.91 578560.40
13 20390.67 52793.38 92982.21 207993.66 200754.81 292934.85 339396.39 370196.82 459533.74 386549.87 663508.43 472122.81 514982.07 716976.71 583191.00 702278.12
14 16356.18 48594.98 78914.99 108959.81 140518.98 216784.75 272832.84 353388.94 419662.78 437740.68 447498.36 509049.02 553987.64 560912.26 513661.95 776075.36
15 26592.24 47167.49 86687.90 107651.42 169500.70 200503.59 343580.64 338443.82 410694.05 428298.96 460765.63 512798.80 536120.89 718963.56 573689.72 626272.34
16 33856.08 76224.79 105270.62 144042.46 164343.77 198187.18 325545.91 358993.89 352297.14 470301.91 601540.85 622212.06 687088.01 531514.68 657415.55 689872.47
17 73304.78 148191.97 135069.16 241422.85 244467.61 288823.53 333888.96 434887.53 532943.68 570018.34 604516.63 694423.85 537056.55 732617.07 747032.61 755823.56
18 273154.35 257363.34 301179.00 342873.67 453934.89 417986.68 454920.31 547467.80 518362.53 754924.00 761977.05 711069.86 839643.12 832558.14 772717.85 856873.95
19 177803.68 336327.21 289262.57 386911.75 377473.19 435550.90 494947.27 529786.11 644599.08 677575.15 780766.46 801482.41 769412.53 824667.07 834039.02 766943.05
20 92431.60 127738.15 246482.02 328990.30 358445.67 463725.70 506841.50 563060.42 639569.24 701732.54 742159.18 735256.71 709869.04 727289.30 784353.87 818872.62

Seq len \ Pred delay 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Baseline 203862.51 220356.84 235782.32 252768.0 271499.43 289583.46 304155.39 318705.00 332807.95 345200.49 356135.51 369818.43 378936.00 389512.29 400302.97
1 465018.50 464018.74 498213.05 520992.74 518259.20 559413.49 560825.06 589616.98 600966.16 637464.81 650806.83 694900.57 716247.85 711559.98 733427.00
2 508735.03 516464.88 544618.53 535204.06 561790.77 587443.87 581202.89 613474.50 655846.76 676901.85 708429.21 720449.22 742855.87 771583.19 762453.62
3 702993.79 731473.38 643794.64 664007.95 722086.76 926664.14 694654.99 1008764.30 800790.94 968380.29 778741.97 921215.13 965336.73 1071960.90 1081181.34
4 503243.92 465439.76 484374.00 526499.41 526061.87 674247.79 709763.41 778140.17 671232.84 758226.86 791630.48 786896.07 760550.18 841316.65 815892.98
5 524968.81 499938.71 535337.56 516742.66 556024.64 662578.89 583794.41 676541.45 673022.32 696082.81 740338.03 682567.58 625559.21 812878.45 514020.65
6 493373.23 511553.66 554080.59 558922.83 554785.50 573241.77 585651.62 611655.16 653284.73 658257.78 712444.54 610123.32 628625.24 581677.97 650709.62
7 510258.99 486906.53 505059.24 534378.80 545777.51 610249.98 585939.58 561316.03 612864.12 637533.69 612082.57 592098.87 682665.53 665693.74 674784.56
8 595969.28 604085.54 623633.50 666930.85 626944.64 653382.16 672234.08 720847.94 695466.83 748473.14 719484.43 740046.00 757656.92 782583.52 828083.16
9 600842.44 653714.08 650798.35 662610.85 699298.56 817336.56 695489.96 770042.61 748371.95 929737.62 804405.19 797312.26 911902.58 1025095.04 999628.13
10 615730.23 657926.49 661954.43 656348.19 656776.41 731845.63 690111.72 752913.68 719077.46 753922.90 847615.22 793421.88 799075.66 810909.78 877902.25
11 573812.35 663554.68 735564.69 708954.11 694698.91 716828.06 753579.60 643582.36 758246.46 787057.59 794466.80 805783.02 805693.73 836930.57 831355.58
12 675487.13 609932.33 637335.37 664105.26 729095.32 719496.11 667497.79 720601.75 761690.64 769737.48 870555.69 821674.12 836087.26 797179.86 846977.19
13 684053.64 712470.83 564748.02 721630.18 657856.05 738567.73 791985.94 757747.11 892588.02 855396.09 765022.03 775623.40 870901.64 772078.72 911957.32
14 594119.67 612758.98 587899.70 625988.66 714258.99 700156.58 870937.20 832972.87 939987.46 840500.38 868857.27 784062.89 731374.07 656221.16 758703.82
15 586764.37 674179.01 623464.48 712339.99 722550.50 754066.89 822219.30 886447.01 804988.99 866735.82 897851.80 677509.96 858976.22 627199.86 671564.73
16 677400.93 713181.82 688821.41 732790.10 740193.65 780021.62 763165.56 779940.64 877623.12 881314.13 867867.76 772169.23 792980.36 724815.84 726459.10
17 761179.85 679171.16 791224.80 843581.19 841680.02 871536.16 869053.19 870345.09 1022355.42 894217.46 661445.58 944482.67 711995.25 755063.36 950209.67
18 860155.79 769733.63 927147.22 938084.23 800876.90 1017844.43 880343.21 1009178.40 1179220.89 1253130.47 754982.89 808261.23 786221.28 894466.47 685521.60
19 831866.10 865393.09 818193.80 912353.27 731456.26 950059.49 948294.16 698699.88 929961.04 802406.83 893976.85 775074.30 742080.66 1224464.87 711889.34
20 784020.48 852447.60 867154.95 831562.48 885873.66 899789.37 730618.95 1048732.00 693482.02 763481.00 812617.18 729273.55 626913.03 725414.63 653858.54

Seq len \ Pred delay 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Baseline 409453.83 419214.39 428314.20 435894.95 449262.58 464487.60 474937.63 481525.70 490660.37 497143.08 502482.92 509077.93 511429.60 513962.67 520778.32
1 760507.75 785675.82 787872.36 784783.19 825410.53 851829.01 857614.83 884186.68 913852.36 917328.98 924707.72 937355.95 962280.60 969311.58 974272.24
2 791088.25 858765.46 838750.09 871133.85 925333.66 898322.67 922330.20 917593.61 945255.89 939083.48 965970.87 1005919.14 1022575.19 1007066.24 1030185.93
3 959339.93 1359390.55 1269793.15 1251849.98 1374755.30 1478084.94 1473762.72 1331065.63 1312827.03 1434639.18 1233751.87 1544063.31 1487546.73 1317436.09 1545380.86
4 821383.37 851401.63 873233.40 899035.42 864308.96 849566.46 946864.90 936822.27 948942.85 863807.78 1009157.97 973136.63 1001111.97 963898.18 950551.48
5 835693.88 461709.88 517739.66 523830.21 506987.29 554376.07 609792.90 560113.25 597174.06 948335.84 689863.22 480158.72 559994.17 502392.24 961724.69
6 701083.30 599432.26 539857.35 559524.65 574633.39 538496.62 572174.00 558134.29 537519.87 526179.84 539691.54 525675.96 552239.42 561930.95 502461.19
7 641462.45 732494.33 754222.82 751275.28 761989.47 803166.01 774341.49 752510.97 794976.61 726267.89 696558.91 699992.77 705937.27 693618.31 702088.21
8 847906.60 849056.39 883112.54 856640.36 859562.71 935471.91 884899.30 921791.32 880561.53 873379.66 893231.08 879173.99 883532.10 923839.08 856348.51
9 978041.29 902049.03 990642.43 1002009.49 1002929.20 934367.45 1005391.50 1014807.48 1072382.74 1010742.51 1054349.56 972790.17 846491.42 883798.37 933458.54
10 787019.50 875106.48 801396.87 921471.32 889289.04 796200.98 934718.65 1005520.20 955981.18 916030.04 896321.78 978883.84 944765.80 928628.51 715666.81
11 915478.87 854044.21 847834.71 928215.56 844374.24 831976.84 868463.14 875201.18 886344.19 817118.57 861822.30 941138.80 911827.94 760446.13 862147.12
12 951115.15 826710.16 844872.58 881523.53 880432.35 833510.74 802127.68 922348.14 880611.11 833713.67 877624.77 836441.39 868988.92 896040.09 960588.43
13 713381.21 673903.22 732647.64 700593.62 674193.59 684313.97 993770.98 731617.22 629684.60 671518.43 752815.07 1178478.24 843104.44 857452.82 827953.11
14 750245.33 775296.21 663131.21 821504.56 682599.69 676747.81 834246.56 826255.06 701333.89 605846.14 684910.02 837300.87 841915.14 883977.18 822658.02
15 852270.82 749256.29 774268.53 841549.02 812423.30 859970.37 842311.72 739421.53 864746.97 923480.99 841967.28 758162.68 928735.26 861929.32 839735.53
16 783297.03 853094.75 733203.97 765906.57 1109133.48 651011.00 841294.25 794670.19 845188.90 797164.64 753877.06 892716.75 891717.34 771937.31 764169.94
17 843072.79 834920.76 682223.80 745638.36 755367.04 821988.89 664015.02 789193.82 732234.90 781032.21 851219.36 700430.83 713581.26 686718.65 784790.31
18 868499.73 1100052.62 735484.02 697370.03 614741.06 892650.59 709112.87 832627.05 636513.62 672066.03 726424.03 722068.77 1234088.60 1465394.59 635129.60
19 690543.61 1332536.09 753198.60 746482.70 826189.85 751982.88 694252.41 763648.90 878385.22 772588.51 681869.62 786660.22 730559.77 834668.62 931304.31
20 697481.29 651552.72 644181.46 687370.88 848269.23 626335.70 1167349.77 807595.10 961936.29 650700.56 1004900.05 987796.89 667679.28 705625.35 795956.28

Seq len \ Pred delay 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Baseline 529057.97 538527.13 554611.66 575083.60 596309.17 611686.70 629897.62 654966.80 681411.24 709437.02 743835.24 768599.14 793421.46 816462.56 837213.29
1 1001048.53 977113.45 995731.74 1017462.00 994743.42 1013997.62 1008002.52 1019169.40 1011015.00 1014860.33 1055054.14 1044167.74 1070891.32 1087562.03 1045589.31
2 1005197.12 1027349.53 1064231.92 1015901.31 1029975.96 1042544.85 1023461.29 1050658.38 1085112.20 1073864.39 1109756.19 1070984.97 1097591.54 1112534.12 1098227.22
3 1212807.83 1325989.05 1288512.67 1514403.05 1382778.75 1713175.99 1648632.78 1634307.64 1688130.14 1714404.91 1351469.98 1517994.08 1692157.55 1784349.21 1703928.63
4 497683.42 1153125.19 1101699.04 1065366.95 817104.07 1133860.22 1111110.66 1148402.90 1178297.16 1159329.01 1161347.30 1201886.10 1200173.93 1212308.78 1299476.64
5 1035664.06 427127.61 811420.16 1032481.87 720900.14 580875.06 485948.94 1103415.80 990419.27 615215.74 468073.24 650545.85 1107633.16 1248078.22 1302700.82
6 551390.06 428732.58 442335.40 457719.09 466281.31 502880.29 437112.43 494627.91 593946.88 547816.67 536625.53 478016.57 588263.75 577668.08 727087.88
7 630955.39 610354.12 590603.26 631333.75 786052.34 792107.71 557037.16 766961.22 616501.65 682448.44 694424.31 621032.15 619460.03 568190.64 618370.99
8 849749.76 802259.91 737969.39 814907.13 748374.02 823536.63 985450.84 829204.30 790699.71 831912.75 847605.97 786816.60 842208.51 774311.20 764341.38
9 823550.85 863420.17 817457.53 872826.01 895779.19 940512.79 834618.71 989910.34 978484.20 783615.00 1052215.10 1020913.75 913896.06 781045.17 865955.54
10 787976.84 864747.42 889377.80 841413.35 845991.68 794817.46 901290.42 964366.50 877689.25 780986.47 762486.40 824747.25 902407.47 783411.40 799647.95
11 892871.36 909895.23 873063.88 980280.51 857807.95 792949.32 859525.04 868594.05 796708.79 764526.26 728405.10 708347.52 627371.17 730107.69 513804.81
12 905268.50 867376.43 807043.43 831595.38 680315.06 848066.26 642019.60 772323.26 676936.53 665975.12 739664.81 668912.33 677497.76 652740.51 470979.43
13 641931.82 712086.72 762166.04 735535.60 646630.85 821092.53 1049264.49 603180.49 762232.53 545654.56 806877.85 743871.97 877410.95 866223.23 788464.16
14 880924.10 806225.82 673111.91 575472.92 928057.28 1031389.89 831417.38 696805.27 659841.11 925782.19 568920.75 696934.40 464486.72 689255.65 903153.50
15 865926.42 831658.59 580126.67 728736.37 969770.94 836355.96 807992.29 743941.14 834324.77 737492.52 684564.80 797835.53 660932.56 467023.53 397840.62
16 833202.25 810083.07 648769.67 544008.72 748848.20 802023.71 627068.03 795925.88 727246.71 594887.52 533105.44 746921.89 664459.35 421678.10 815168.02
17 662625.49 1035132.85 568225.67 831267.24 904826.61 913893.84 841991.25 732173.68 793970.97 812308.50 463760.97 539350.64 782221.70 620146.85 839382.95
18 863531.16 1179134.39 973754.74 860835.74 613231.34 720852.37 666622.18 862027.85 972764.41 998595.31 968152.21 939857.77 452806.18 882445.01 789274.62
19 729926.73 575648.76 833609.66 816397.84 874686.02 791443.80 843429.20 377782.41 652381.73 767797.03 406688.32 826826.55 678787.23 681171.56 849442.33
20 866793.74 848905.53 598495.94 676503.12 895548.10 613288.93 716007.16 873194.23 791368.97 605052.39 872703.37 717007.06 792284.81 681038.90 673726.48

Table 7: The L2 errors of the predictions of the test set using different sequence lengths and different prediction delays.


Seq length \ Pred delay 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Baseline 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 -234.58 2577.89 13323.67 32547.76 51094.46 56967.06 123669.95 134439.47 177219.86 191152.92 206498.63 263299.92 254588.61 269732.52 273790.82 263998.98
2 429.49 7842.73 21341.08 63028.72 102878.55 137360.89 212931.39 245325.74 287146.37 373474.09 312452.88 374970.95 342582.82 375360.74 335851.11 306537.89
3 8591.16 1367.66 15275.59 35319.93 94933.92 138639.51 243816.07 228678.14 347386.52 377643.85 415115.65 417590.35 490854.92 428201.12 510413.93 485795.88
4 3099.31 7592.52 31615.53 58382.42 135151.21 207243.83 277062.92 246386.42 606596.0 297121.94 466892.48 298861.81 493388.30 320935.47 327294.62 259113.75
5 4216.92 3881.55 10418.93 62929.05 85772.43 130954.86 253892.63 502099.39 371649.68 338216.43 351476.19 352455.76 427731.36 378244.43 332900.42 308259.91
6 4146.96 11266.73 40667.36 60917.51 108491.87 122517.49 205225.66 246292.05 327617.31 328610.16 362556.82 335467.29 376838.05 333442.45 357641.53 300771.60
7 8072.27 20107.22 30087.10 49238.95 63214.68 116542.27 218657.46 193042.30 313762.09 348335.54 333130.72 325179.70 396121.05 338156.54 321304.23 301891.31
8 33817.85 55705.49 96963.58 174627.85 189767.35 229579.37 356134.79 288085.13 325064.28 362155.20 437304.76 448085.24 416272.85 393124.90 317743.63 342769.90
9 94394.54 319576.77 340643.43 488350.16 445114.62 427930.79 469133.22 489432.98 433398.41 577780.05 374999.03 435990.58 474461.72 429258.59 389045.40 399607.38
10 73304.21 100865.64 145500.32 183377.47 275214.64 330706.85 363641.31 375133.87 394646.43 452145.41 524574.21 517929.26 594786.67 418392.74 421413.99 429878.00
11 31040.28 66540.68 90181.20 123934.92 261682.82 300106.07 288475.93 407139.56 390425.42 392629.41 435793.37 571301.65 384117.45 377127.57 361835.02 357183.21
12 13679.36 49411.73 94368.83 118764.07 190625.14 214440.71 248931.66 291421.63 344494.54 361023.41 387280.63 381990.80 383954.44 397477.63 496519.56 394764.06
13 13073.88 38606.28 70869.58 179207.56 164308.53 250679.84 288846.25 307319.58 383774.93 295252.31 556993.62 350145.17 382455.20 568487.55 416407.64 518481.78
14 9039.39 34407.89 56802.36 80173.71 104072.69 174529.74 222282.70 290511.70 343903.97 346443.12 340983.55 387071.38 421460.76 412423.10 346878.59 592279.03
15 19275.45 32980.40 64575.27 78865.32 133054.41 158248.58 293030.50 275566.58 334935.24 337001.40 354250.82 390821.16 403594.01 570474.39 406906.37 442476.01
16 26539.29 62037.70 83157.99 115256.36 127897.48 155932.17 274995.77 296116.65 276538.33 379004.35 495026.04 500234.42 554561.13 383025.51 490632.20 506076.13
17 65987.99 134004.88 112956.53 212636.75 208021.33 246568.53 283338.82 372010.28 457184.87 478720.78 498001.82 572446.20 404529.67 584127.91 580249.26 572027.23
18 265837.56 243176.25 279066.37 314087.57 417488.61 375731.67 404370.17 484590.56 442603.72 663626.44 655462.25 589092.22 707116.24 684068.97 605934.49 673077.61
19 170486.89 322140.11 267149.94 358125.65 341026.90 393295.90 444397.12 466908.87 568840.27 586277.59 674251.65 679504.77 636885.65 676177.90 667255.67 583146.71
20 85114.81 113551.06 224369.39 300204.20 321999.38 421470.70 456291.36 500183.18 563810.43 610434.98 635644.37 613279.07 577342.16 578800.13 617570.51 635076.28

Seq length \ Pred delay 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Baseline 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 261155.99 243661.90 262430.73 268224.74 246759.77 269830.03 256669.67 270911.98 268158.21 292264.32 294671.32 325082.13 337311.84 322047.69 333124.04
2 304872.52 296108.04 308836.21 282436.07 290291.34 297860.41 277047.50 294769.50 323038.81 331701.36 352293.70 350630.79 363919.86 382070.90 362150.65
3 499131.27 511116.55 408012.32 411239.96 450587.33 637080.68 390499.60 690059.30 467982.98 623179.81 422606.46 551396.70 586400.72 682448.62 680878.37
4 299381.40 245082.92 248591.68 273731.41 254562.44 384664.33 405608.02 459435.17 338424.88 413026.37 435494.97 417077.64 381614.18 451804.36 415590.01
5 321106.30 279581.87 299555.24 263974.67 284525.21 372995.44 279639.02 357836.45 340214.37 350882.32 384202.52 312749.15 246623.21 423366.16 113717.68
6 289510.72 291196.82 318298.28 306154.83 283286.07 283658.31 281496.23 292950.16 320476.77 313057.29 356309.03 240304.89 249689.24 192165.68 250406.65
7 306396.47 266549.69 269276.92 281610.80 274278.08 320666.52 281784.19 242611.03 280056.17 292333.20 255947.06 222280.43 303729.52 276181.46 274481.59
8 392106.77 383728.70 387851.18 414162.86 355445.21 363798.71 368078.69 402142.94 362658.87 403272.65 363348.92 370227.56 378720.92 393071.23 427780.19
9 396979.93 433357.24 415016.04 409842.86 427799.13 527753.11 391334.57 451337.61 415564.0 584537.13 448269.68 427493.82 532966.58 635582.75 599325.17
10 411867.72 437569.65 426172.12 403580.20 385276.98 442262.17 385956.33 434208.68 386269.51 408722.41 491479.71 423603.44 420139.65 421397.50 477599.28
11 369949.84 443197.84 499782.37 456186.12 423199.48 427244.60 449424.21 324877.36 425438.50 441857.10 438331.29 435964.59 426757.73 447418.28 431052.62
12 471624.62 389575.49 401553.05 411337.27 457595.89 429912.65 363342.40 401896.75 428882.69 424536.99 514420.18 451855.69 457151.26 407667.58 446674.22
13 480191.13 492113.99 328965.70 468862.19 386356.62 448984.27 487830.55 439042.11 559780.07 510195.61 408886.52 405804.96 491965.63 382566.43 511654.35
14 390257.16 392402.14 352117.38 373220.67 442759.56 410573.12 566781.81 514267.87 607179.50 495299.89 512721.76 414244.46 352438.06 266708.87 358400.85
15 382901.85 453822.17 387682.16 459572.0 451051.07 464483.43 518063.91 567742.01 472181.03 521535.33 541716.28 307691.53 480040.22 237687.57 271261.76
16 473538.41 492824.98 453039.09 480022.11 468694.22 490438.16 459010.17 461235.64 544815.16 536113.64 511732.25 402350.80 414044.35 335303.55 326156.13
17 557317.34 458814.32 555442.48 590813.19 570180.59 581952.70 564897.80 551640.09 689547.46 549016.97 305310.07 574664.24 333059.25 365551.07 549906.70
18 656293.28 549376.79 691364.90 685316.23 529377.47 728260.98 576187.82 690473.40 846412.94 907929.98 398847.38 438442.80 407285.28 504954.18 285218.63
19 628003.59 645036.25 582411.48 659585.27 459956.83 660476.04 644138.77 379994.88 597153.09 457206.34 537841.33 405255.86 363144.66 834952.58 311586.38
20 580157.97 632090.76 631372.63 578794.48 614374.23 610205.91 426463.56 730027.0 360674.07 418280.51 456481.67 359455.11 247977.03 335902.34 253555.57

Seq length \ Pred delay 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Baseline 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 351053.92 366461.43 359558.16 348888.23 376147.95 387341.41 382677.20 402660.98 423191.98 420185.90 422224.80 428278.02 450851.0 455348.91 453493.91
2 381634.43 439551.08 410435.88 435238.90 476071.09 433835.06 447392.57 436067.91 454595.51 441940.40 463487.96 496841.21 511145.59 493103.57 509407.61
3 549886.11 940176.17 841478.95 815955.03 925492.72 1013597.34 998825.09 849539.92 822166.65 937496.10 731268.95 1034985.38 976117.12 803473.42 1024602.54
4 411929.55 432187.25 444919.20 463140.47 415046.38 385078.86 471927.27 455296.57 458282.48 366664.70 506675.05 464058.71 489682.37 449935.51 429773.16
5 426240.05 42495.49 89425.45 87935.25 57724.72 89888.47 134855.27 78587.55 106513.69 451192.76 187380.30 -28919.20 48564.57 -11570.44 440946.37
6 291629.48 180217.88 111543.15 123629.69 125370.81 74009.02 97236.37 76608.58 46859.49 29036.76 37208.62 16598.03 40809.82 47968.28 -18317.13
7 232008.63 313279.94 325908.62 315380.32 312726.90 338678.41 299403.86 270985.27 304316.24 229124.81 194076.0 190914.85 194507.67 179655.64 181309.89
8 438452.78 429842.01 454798.33 420745.41 410300.14 470984.30 409961.67 440265.62 389901.16 376236.58 390748.16 370096.06 372102.50 409876.41 335570.19
9 568587.46 482834.64 562328.23 566114.54 553666.62 469879.85 530453.87 533281.77 581722.37 513599.43 551866.64 463712.24 335061.82 369835.69 412680.22
10 377565.67 455892.09 373082.67 485576.36 440026.46 331713.38 459781.02 523994.50 465320.80 418886.96 393838.86 469805.91 433336.20 414665.84 194888.49
11 506025.05 434829.82 419520.51 492320.61 395111.67 367489.23 393525.51 393675.48 395683.82 319975.49 359339.38 432060.88 400398.34 246483.45 341368.80
12 541661.33 407495.77 416558.38 445628.57 431169.78 369023.14 327190.05 440822.44 389950.73 336570.59 375141.85 327363.46 357559.32 382077.42 439810.11
13 303927.39 254688.83 304333.44 264698.66 224931.02 219826.36 518833.34 250091.52 139024.23 174375.35 250332.15 669400.31 331674.84 343490.15 307174.79
14 340791.50 356081.82 234817.01 385609.60 233337.11 212260.21 359308.93 344729.36 210673.52 108703.06 182427.10 328222.94 330485.54 370014.51 301879.70
15 442816.99 330041.90 345954.32 405654.07 363160.72 395482.76 367374.09 257895.83 374086.60 426337.91 339484.36 249084.76 417305.66 347966.65 318957.21
16 373843.20 433880.37 304889.77 330011.62 659870.90 186523.40 366356.62 313144.49 354528.52 300021.56 251394.14 383638.82 380287.74 257974.63 243391.61
17 433618.97 415706.37 253909.60 309743.41 306104.46 357501.28 189077.39 307668.12 241574.53 283889.13 348736.44 191352.90 202151.65 172755.98 264011.99
18 459045.91 680838.24 307169.82 261475.08 165478.48 428162.99 234175.24 351101.34 145853.25 174922.95 223941.11 212990.85 722659.0 951431.92 114351.27
19 281089.78 913321.70 324884.40 310587.75 376927.27 287495.28 219314.78 282123.20 387724.85 275445.43 179386.70 277582.29 219130.16 320705.95 410525.98
20 288027.47 232338.34 215867.26 251475.93 399006.65 161848.09 692412.14 326069.40 471275.92 153557.48 502417.14 478718.96 156249.68 191662.68 275177.95

Seq length \ Pred delay 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Baseline 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 471990.56 438586.32 441120.08 442378.40 398434.24 402310.92 378104.90 364202.60 329603.76 305423.31 311218.90 275568.60 277469.86 271099.47 208376.02
2 476139.15 488822.39 509620.25 440817.72 433666.79 430858.15 393563.66 395691.59 403700.96 364427.37 365920.95 302385.83 304170.08 296071.57 261013.93
3 683749.86 787461.91 733901.01 939319.45 786469.58 1101489.29 1018735.15 979340.84 1006718.90 1004967.89 607634.74 749394.95 898736.09 967886.66 866715.34
4 -31374.55 614598.06 547087.38 490283.36 220794.89 522173.52 481213.04 493436.10 496885.93 449891.98 417512.06 433286.96 406752.48 395846.23 462263.35
5 506606.09 -111399.52 256808.49 457398.28 124590.97 -30811.64 -143948.68 448449.0 309008.03 -94221.29 -275762.0 -118053.29 314211.71 431615.67 465487.53
6 22332.10 -109794.55 -112276.26 -117364.50 -130027.86 -108806.41 -192785.19 -160338.89 -87464.36 -161620.35 -207209.71 -290582.57 -205157.71 -238794.48 -110125.41
7 101897.42 71826.98 35991.60 56250.15 189743.17 180421.01 -72860.47 111994.42 -64909.58 -26988.58 -49410.93 -147566.99 -173961.43 -248271.91 -218842.30
8 320691.80 263732.78 183357.73 239823.53 152064.85 211849.93 355553.21 174237.50 109288.47 122475.73 103770.73 18217.47 48787.06 -42151.35 -72871.91
9 294492.89 324893.03 262845.87 297742.42 299470.01 328826.09 204721.09 334943.54 297072.96 74177.98 308379.86 252314.61 120474.60 -35417.38 28742.25
10 258918.87 326220.28 334766.14 266329.76 249682.51 183130.76 271392.80 309399.71 196278.01 71549.45 18651.15 56148.12 108986.01 -33051.16 -37565.34
11 363813.40 371368.09 318452.22 405196.91 261498.78 181262.62 229627.42 213627.26 115297.55 55089.23 -15430.14 -60251.61 -166050.28 -86354.87 -323408.48
12 376210.53 328849.30 252431.77 256511.78 84005.89 236379.56 12121.98 117356.46 -4474.71 -43461.91 -4170.43 -99686.81 -115923.7 -163722.05 -366233.86
13 112873.85 173559.59 207554.38 160452.0 50321.67 209405.83 419366.86 -51786.30 80821.29 -163782.46 63042.61 -24727.17 83989.49 49760.67 -48749.13
14 351866.13 267698.68 118500.25 389.32 331748.11 419703.18 201519.75 41838.47 -21570.13 216345.17 -174914.50 -71664.73 -328934.73 -127206.90 65940.21
15 336868.45 293131.46 25515.01 153652.78 373461.77 224669.26 178094.67 88974.34 152913.53 28055.50 -59270.44 29236.40 -132488.90 -349439.03 -439372.67
16 304144.28 271555.93 94158.01 -31074.88 152539.03 190337.0 -2829.59 140959.09 45835.47 -114549.51 -210729.81 -21677.24 -128962.11 -394784.45 -22045.27
17 133567.52 496605.72 13614.01 256183.64 308517.44 302207.13 212093.62 77206.88 112559.73 102871.48 -280074.27 -229248.50 -11199.75 -196315.71 2169.66
18 334473.19 640607.25 419143.08 285752.14 16922.17 109165.67 36724.56 207061.05 291353.17 289158.29 224316.97 171258.64 -340615.28 65982.46 -47938.67
19 200868.76 37121.63 278997.99 241314.24 278376.85 179757.10 213531.58 -277184.39 -29029.51 58360.01 -337146.92 58227.42 -114634.23 -135290.99 12229.04
20 337735.78 310378.40 43884.28 101419.52 299238.93 1602.23 86109.53 218227.44 109957.73 -104384.63 128868.13 -51592.07 -1136.65 -135423.65 -163486.81

Table 8: The L2 errors of the baselines minus the L2 errors of the predictions with different sequence lengths and different prediction delays.


Figure 9: The L2 losses of the test set with different prediction delays and sequence lengths.

Figure 10: The L2 losses of the test set with different prediction delays and sequence lengths, relative to the baseline.


Figure 11: The L2 losses of the test set for the different sequence lengths and for the baseline, with the prediction delays on the x-axis.


Figure 12: Prediction of the test set using sequence length 1 and prediction delay 0, which is the prediction with the lowest L2 error.

Figure 13: Prediction of the test set using sequence length 15 and prediction delay 60, which is the prediction with the lowest L2 error relative to the baseline.


6 Discussion and conclusion

6.1 Data scaling

Table 3 and Figure 7 show that scaling both the features and the target outperforms the other two methods. This was not the expected result: scaling was expected to improve the convergence speed, but not necessarily the final performance. The effect cannot be attributed to the learning rate, as all three methods used the same schedule: an initial learning rate of 10^-3, multiplied by a factor of 0.4 every time the validation loss does not decrease within 5 epochs. Gradient descent therefore converged for all three methods, but not equally effectively. A possible explanation is that it is more difficult to fit a model to very large values than to scaled values.
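The learning-rate schedule described above can be sketched as follows. This is a minimal illustration and not the training code of this thesis: the class name `PlateauDecay` and its interface are hypothetical, and only the initial rate of 10^-3, the factor 0.4 and the patience of 5 epochs are taken from the text. Restarting the counter after each decay is an assumption.

```python
class PlateauDecay:
    """Multiply the learning rate by `factor` when the validation loss has
    not decreased for `patience` consecutive epochs (hypothetical helper)."""

    def __init__(self, lr=1e-3, factor=0.4, patience=5):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")   # lowest validation loss seen so far
        self.stale = 0             # epochs since the last improvement

    def step(self, val_loss):
        """Record one epoch's validation loss; return the rate to use next."""
        if val_loss < self.best:
            self.best = val_loss
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.lr *= self.factor
                self.stale = 0     # assumption: restart the count after a decay
        return self.lr


schedule = PlateauDecay()
losses = [1.0, 0.8, 0.9, 0.9, 0.9, 0.9, 0.9, 0.7]  # made-up validation losses
rates = [schedule.step(loss) for loss in losses]
```

In this made-up run the loss stalls for five epochs after reaching 0.8, so the rate drops from 10^-3 to 4 * 10^-4 at the seventh epoch and stays there afterwards.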

6.2 Feature selection

It was expected that using more features would lead to more effective predictions, as the model has more data to base a prediction on, and when a feature does not contribute to the prediction, its weights could be set to 0 during fitting so that the feature is ignored. However, the best combination found consists of the following 9 features: average USD price, BTC in circulation, median confirmation time, number of orphaned blocks, output value, estimated USD transaction value, transactions per block, market capitalization and cost per transaction. This combination performs better than using all 19 features. It may be the case that some features relate differently to the target values in the train set than in the test set, which causes the predictions to deteriorate.

The greedy method of finding the best combination of features is incomplete, as many combinations are never tested. Future studies may use other methods for finding the optimal combination of features.
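A greedy forward search of the kind discussed here can be sketched as below. The exact procedure used in the experiments is defined in an earlier section, so the details are assumptions: `greedy_select` and `toy_error` are hypothetical names, the feature names are made up, and in the real experiment evaluating a subset would mean training the LSTM and measuring its L2 error. The sketch also counts how many subsets are evaluated, which makes the incompleteness concrete.

```python
def greedy_select(features, evaluate):
    """Greedy forward selection (sketch): repeatedly add the single feature
    that lowers the error most, and stop when no addition helps.

    `evaluate` maps a tuple of feature names to an error (lower is better).
    Returns (selected features, best error, number of subsets evaluated).
    """
    selected = []
    best_error = float("inf")
    evaluated = 0
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current selection.
        scored = []
        for f in remaining:
            scored.append((evaluate(tuple(selected + [f])), f))
            evaluated += 1
        error, feature = min(scored)
        if error >= best_error:
            break  # no extension improves on the current selection
        selected.append(feature)
        remaining.remove(feature)
        best_error = error
    return selected, best_error, evaluated


# Toy stand-in for "train the network and return its test error".
def toy_error(subset):
    useful = {"price": 5.0, "market_cap": 2.0, "output_value": 1.0}
    # Useful features reduce the error; every other feature adds 0.5 to it.
    gain = sum(useful.get(f, -0.5) for f in subset)
    return 10.0 - gain


names = ["price", "market_cap", "output_value", "noise_a", "noise_b"]
chosen, err, n_evaluated = greedy_select(names, toy_error)
```

In this toy run the search evaluates 14 of the 31 non-empty subsets of five features. With 19 features, a forward search of this form evaluates at most 19 + 18 + ... + 1 = 190 subsets, whereas an exhaustive search would cover 2^19 - 1 = 524287.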

6.3 Testing different prediction delays and sequence lengths

The best absolute result is obtained with a sequence length of 1 and a prediction delay of 0. This result was not expected, as using a sequence length of 1 is essentially similar to making predictions with an ordinary artificial neural network. The expectation was that a longer sequence length would lead to better prediction results, but the experiment shows that a longer sequence length does not necessarily lead to better results. This may be caused by the selection of features in subsection 5.2, which was carried out using a sequence length of 1 and a prediction delay of 0, such that the selected features work optimally at those settings; another selection of features may have led to different results.
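To make the two hyperparameters concrete, the sketch below builds input sequences and targets from a daily series. The indexing convention is an assumption, since the definition is given in an earlier section: a window of `seq_length` consecutive days ending at day t is paired with the value at day t + 1 + `pred_delay`, so that a delay of 0 means predicting the next day. `make_windows` is a hypothetical helper and the prices are made up.

```python
def make_windows(series, seq_length, pred_delay):
    """Pair each window of `seq_length` consecutive values with the value
    `pred_delay + 1` steps after the window's last element (assumed convention)."""
    inputs, targets = [], []
    last_start = len(series) - seq_length - pred_delay - 1
    for start in range(last_start + 1):
        end = start + seq_length            # index just past the window
        inputs.append(series[start:end])
        targets.append(series[end + pred_delay])
    return inputs, targets


prices = [10, 11, 12, 13, 14, 15]           # made-up daily prices

x1, y1 = make_windows(prices, seq_length=1, pred_delay=0)
x3, y3 = make_windows(prices, seq_length=3, pred_delay=1)
```

With `seq_length=1` each input holds a single time step, so the recurrent network never sees more than one day at once, which is why this setting resembles an ordinary artificial neural network. Longer windows and larger delays also leave fewer input-target pairs from the same series: five pairs in the first call, two in the second.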


The result is only slightly better than the baseline: the prediction has an L2 error of 7082.21 and the baseline has an L2 error of 7316.79, a reduction of 234.58 (about 3.2%).

The best result relative to the baseline is achieved using sequence length 15 and prediction delay 60, where the L2 error differs from that of the baseline by 439372.67 in favour of the prediction. This may be because the baseline prediction already has a large error at a large prediction delay, such that even a 'random' prediction may be better than the baseline.
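The growth of the baseline error with the prediction delay can be illustrated with a small sketch. It assumes that the baseline uses the last known price as the prediction for the target day, as stated in the abstract, and it uses a sum of squared differences as a stand-in for the L2 error, whose exact definition is given earlier in the thesis. The function `persistence_l2` and the price series are made up for illustration.

```python
def persistence_l2(series, pred_delay):
    """Sum of squared errors when the value at day t is used as the
    prediction for day t + 1 + pred_delay (assumed convention)."""
    horizon = pred_delay + 1
    return sum((series[t + horizon] - series[t]) ** 2
               for t in range(len(series) - horizon))


# Made-up, steadily drifting series: the further ahead the target,
# the further the price has moved away from the last known value.
prices = [100 + 2 * day for day in range(30)]

errors = {delay: persistence_l2(prices, delay) for delay in (0, 5, 10)}
```

On a trending series such as this one the baseline error rises quickly with the delay, even though fewer pairs contribute to the sum. A large absolute gap to the baseline at delay 60 therefore need not indicate an accurate prediction.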

6.4 Future work

Future studies may focus on the effects of varying the combination of features, the sequence length and the prediction delay simultaneously, as such combinations may lead to more effective prediction results.

Also, future work may investigate prediction of the Bitcoin price using a different timescale than the 24-hour time series, for example using data with an hourly resolution, which may lead to more effective prediction results.


References

[1] B.S. Bini and Tessy Mathew. “Clustering and Regression Techniques forStock Prediction”. In: Procedia Technology 24 (2016), pp. 1248–1255.issn: 22120173. doi: 10.1016/j.protcy.2016.05.104. url: http:

//linkinghub.elsevier.com/retrieve/pii/S2212017316301931.

[2] Klaus Greff et al. “LSTM: A Search Space Odyssey”. In: IEEE Transac-tions on Neural Networks and Learning Systems (2016). issn: 21622388.doi: 10.1109/TNNLS.2016.2582924. arXiv: 1503.04069.

[3] Ladislav Kristoufek. “What are the main drivers of the bitcoin price?Evidence from wavelet coherence analysis”. In: PLoS ONE 10.4 (2015),pp. 1–15. issn: 19326203. doi: 10.1371/journal.pone.0123923. arXiv:1406.0268.

[4] Isaac Madan, Shaurya Saluja and Aojia Zhao. “Automated Bitcoin Trad-ing via Machine Learning Algorithms”. In: (2014), pp. 1–5.

[5] Martina Matta, Ilaria Lunesu and Michele Marchesi. “Bitcoin Spread Pre-diction Using Social And Web Search Media”. In: (2015).

[6] Satoshi Nakamoto. “Bitcoin: A Peer-to-Peer Electronic Cash System”. In:Www.Bitcoin.Org (2008), p. 9. issn: 09254560. doi: 10.1007/s10838-008-9062-0. arXiv: 43543534534v343453. url: https://bitcoin.org/bitcoin.pdf.

[7] Daniel Neil, Michael Pfeiffer and Shih-Chii Liu. “Phased LSTM: Acceler-ating Recurrent Network Training for Long or Event-based Sequences”.In: Nips Nips (2016), pp. 3882–3890. arXiv: 1610.09513. url: https://papers.nips.cc/paper/6310-phased-lstm-accelerating-recurrent-

network-training-for-long-or-event-based-sequences.

[8] Nathalie Tjernstr and Malin Janne. “The Price Volatility of Bitcoin”. In:(2014).

[9] Gaowei Zhang, Lingyu Xu and Yunlan Xue. “Model and forecast stockmarket behavior integrating investor sentiment analysis and transactiondata”. In: Cluster Computing 20.1 (2017), pp. 789–803. issn: 1386-7857.doi: 10.1007/s10586-017-0803-x. url: http://link.springer.com/10.1007/s10586-017-0803-x.

[10] Yechen Zhu, David Dickinson and Jianjun Li. “Analysis on the influencefactors of Bitcoin’s price based on VEC model”. In: Financial Innovation3.1 (2017), p. 3. issn: 2199-4730. doi: 10.1186/s40854-017-0054-0. url:http://jfin-swufe.springeropen.com/articles/10.1186/s40854-

017-0054-0.

32