Learning Financial Market Data with Recurrent Autoencoders and TensorFlow


Steven Hutt, steven.c.hutt@gmail.com, July 28, 2016

TensorFlow London Meetup

The Limit Order Book

A financial exchange provides a central location for electronically trading contracts. Trading is implemented on a Matching Engine, which matches buy and sell orders according to certain priority rules and queues orders which do not immediately match.

A. Booth©

A Marketable limit order is immediately matched and results in a trade

A Passive limit order cannot be matched and joins the queue

Data In: Client ID tagged order flow

Data Out: LOB state (quantity at price)

In either case we need to learn data sequences
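The matching behaviour described on this slide can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not an exchange implementation: the class `LimitOrderBook`, its method names and the price-then-time priority rule are hypothetical choices made for the example.

```python
from collections import deque

class LimitOrderBook:
    """Toy book: price priority, then time priority within a price level."""

    def __init__(self):
        self.bids = {}  # price -> deque of [order_id, quantity]
        self.asks = {}

    def submit(self, order_id, side, price, qty):
        """Match the marketable part, queue the passive remainder, return trades."""
        trades = []
        if side == "buy":
            book, opposite = self.bids, self.asks
        else:
            book, opposite = self.asks, self.bids
        while qty > 0 and opposite:
            best = min(opposite) if side == "buy" else max(opposite)
            crosses = price >= best if side == "buy" else price <= best
            if not crosses:
                break
            queue = opposite[best]
            resting = queue[0]                 # oldest order at the best price
            fill = min(qty, resting[1])
            trades.append((resting[0], order_id, best, fill))
            qty -= fill
            resting[1] -= fill
            if resting[1] == 0:
                queue.popleft()
            if not queue:
                del opposite[best]
        if qty > 0:                            # passive remainder joins the queue
            book.setdefault(price, deque()).append([order_id, qty])
        return trades

    @staticmethod
    def _totals(side):
        return {price: sum(q for _, q in orders) for price, orders in side.items()}

    def state(self):
        """Data out: LOB state as total quantity at each price, per side."""
        return {"bids": self._totals(self.bids), "asks": self._totals(self.asks)}

lob = LimitOrderBook()
lob.submit("a", "sell", 101, 5)           # passive: nothing to match against
lob.submit("b", "sell", 101, 3)           # queues behind "a" at the same price
lob.submit("c", "buy", 100, 4)            # passive: below the best ask
trades = lob.submit("d", "buy", 101, 6)   # marketable: crosses the best ask
```

The sequence of `submit` calls is the "data in" (ID-tagged order flow) and the successive values of `lob.state()` are the "data out".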

Learning Sequences with RNNs

In a Recurrent Neural Network there are nodes which feed back on themselves. Below, $x_t \in \mathbb{R}^{n_x}$, $h_t \in \mathbb{R}^{n_h}$ and $z_t \in \mathbb{R}^{n_z}$, for $t = 1, 2, \ldots, T$. Each node is a vector or 1-dim array.

[Figure: unrolled RNN, showing the state nodes h_1, h_2, h_3, ..., the recurrent weights W_hh and the output.]

The RNN is defined by update equations:

$$h_{t+1} = f(W_{hh} h_t + W_{hx} x_t + b_h), \qquad z_t = g(W_{zh} h_t + b_z),$$

where $W_{uv} \in \mathbb{R}^{n_u \times n_v}$ and $b_u \in \mathbb{R}^{n_u}$ are the RNN weights and biases respectively. The functions $f$ and $g$ are generally nonlinear.
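As a minimal NumPy sketch of these update equations: the sizes, the random weights and the choices f = tanh and g = softmax are assumptions for illustration, and the output weights are written `W_zh` so that their shape follows the dimension convention above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_h, n_z, T = 3, 5, 2, 4          # illustrative sizes

W_hh = rng.normal(scale=0.1, size=(n_h, n_h))
W_hx = rng.normal(scale=0.1, size=(n_h, n_x))
W_zh = rng.normal(scale=0.1, size=(n_z, n_h))
b_h = np.zeros(n_h)
b_z = np.zeros(n_z)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def rnn_forward(xs, h):
    """z_t = g(W_zh h_t + b_z), then h_{t+1} = f(W_hh h_t + W_hx x_t + b_h)."""
    zs = []
    for x in xs:
        zs.append(softmax(W_zh @ h + b_z))
        h = np.tanh(W_hh @ h + W_hx @ x + b_h)
    return np.array(zs), h

xs = rng.normal(size=(T, n_x))
zs, h_final = rnn_forward(xs, np.zeros(n_h))
```

The same weights are applied at every time step, which is what lets the network process sequences of any length T.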

Long Short Term Memory Units

An LSTM [3] learns long term dependencies by adding a memory cell and controlling what gets modified and what gets forgotten:

[Figure: LSTM cell. The memory c_t is updated to c_{t+1} by a forget step (∗) and a modify step (+), controlled by the state h_t. Legend: copy; join then copy.]

An LSTM can remember state dependencies further back than a standard RNN. It is capable of learning real grammars.
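The forget and modify steps can be made concrete with a single LSTM step in NumPy. The gate equations below are the commonly used formulation and are an assumption here, since the slide only shows the cell diagram; `lstm_step` and the packed weight layout are hypothetical names chosen for the sketch.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x, h, c, W, b):
    """One LSTM step. W has shape (4 * n_h, n_h + n_x), b has shape (4 * n_h,)."""
    n_h = h.shape[0]
    gates = W @ np.concatenate([h, x]) + b
    f = sigmoid(gates[0 * n_h:1 * n_h])    # forget gate: how much of c to keep
    i = sigmoid(gates[1 * n_h:2 * n_h])    # input gate: how much to modify c
    o = sigmoid(gates[2 * n_h:3 * n_h])    # output gate
    g = np.tanh(gates[3 * n_h:4 * n_h])    # candidate update
    c_next = f * c + i * g                 # forget (*) then modify (+)
    h_next = o * np.tanh(c_next)
    return h_next, c_next

# With the forget gate held near 1 and the input gate near 0,
# the memory cell is carried across many steps almost unchanged.
n_x, n_h = 2, 3
W = np.zeros((4 * n_h, n_h + n_x))
b = np.zeros(4 * n_h)
b[0:n_h] = 20.0
b[n_h:2 * n_h] = -20.0
h, c = np.zeros(n_h), np.array([1.0, -2.0, 0.5])
c0 = c.copy()
for _ in range(50):
    h, c = lstm_step(np.ones(n_x), h, c, W, b)
```

Because the memory update is additive rather than a repeated matrix multiplication, information in `c` is not forced to decay at every step.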

Unitary RNNs

The behaviour of the LSTM can be difficult to analyse.

An alternative approach [1] is to ensure there are no contractive directions, that is, require $W_{hh} \in U(n)$, the space of unitary matrices.

This preserves the simple form of an RNN but ensures the state dependencies go further back in the sequence.

A key problem is that gradient descent does not preserve the unitary property:

$$x \in U(n) \mapsto x + \lambda \upsilon \notin U(n)$$

The solution in [1] is to choose $W_{hh}$ from a subset generated by parametrized unitary matrices:

$$W_{hh} = D_3 R_2 F^{-1} D_2 \Pi R_1 F D_1.$$

Here $D$ is diagonal, $R$ is a reflection, $F$ is the Fourier transform and $\Pi$ is a permutation.
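A small NumPy check of this construction, as a sketch only: the factors are drawn at random rather than learned, and the helper names `diagonal`, `reflection` and `is_unitary` are hypothetical. It builds the product of unitary factors, confirms the result is unitary, and shows that an unconstrained additive step leaves U(n).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

def diagonal():
    """Diagonal matrix with unit-modulus entries exp(i * theta)."""
    return np.diag(np.exp(1j * rng.uniform(-np.pi, np.pi, size=n)))

def reflection():
    """Complex reflection I - 2 v v^* / |v|^2."""
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return np.eye(n) - 2.0 * np.outer(v, v.conj()) / np.vdot(v, v).real

def is_unitary(M):
    return np.allclose(M.conj().T @ M, np.eye(n), atol=1e-10)

F = np.fft.fft(np.eye(n), norm="ortho")    # unitary discrete Fourier transform
Pi = np.eye(n)[rng.permutation(n)]         # permutation matrix
D1, D2, D3 = diagonal(), diagonal(), diagonal()
R1, R2 = reflection(), reflection()

W_hh = D3 @ R2 @ F.conj().T @ D2 @ Pi @ R1 @ F @ D1

# An unconstrained additive update, like a plain gradient step on the entries.
step = W_hh + 0.1 * rng.normal(size=(n, n))
```

Here `is_unitary(W_hh)` holds while `is_unitary(step)` does not, which is why the parameters of the factors are updated instead of the entries of the matrix.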

RNNs as Programs addressing External Memory

RNNs are Turing complete but the theoretical capabilities are not matched in practice due to training inefficiencies.

The Neural Turing Machine emulates a differentiable Turing Machine by adding external addressable memory and using the RNN state as a memory controller [2].

[Figure: Vanilla NTM architecture. An RNN controller receives input x_t and emits output y_t, and accesses Memory through read heads and write heads.]

NTM equivalent 'code':

    initialise: move head to start location
    while input delimiter not seen do
        receive input vector
        write input to head location
        increment head location by 1
    end while
    return head to start location
    while true do
        read output vector from head location
        emit output
        increment head location by 1
    end while
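The copy procedure in this pseudocode can be written as ordinary Python, with a list standing in for the external memory. This is only an illustration of the program the NTM is trained to emulate, not of the differentiable addressing itself; `DELIMITER`, `copy_task` and `memory_size` are hypothetical names, and the sketch stops reading at the last written cell rather than looping forever.

```python
DELIMITER = object()   # hypothetical end-of-input marker

def copy_task(stream, memory_size=16):
    memory = [None] * memory_size
    start = 0
    head = start                      # initialise: head at start location

    for vector in stream:
        if vector is DELIMITER:       # stop writing once the delimiter is seen
            break
        memory[head] = vector         # write input to head location
        head += 1                     # increment head location by 1

    end = head
    head = start                      # return head to start location
    outputs = []
    while head < end:                 # assumption: stop at the last written cell
        outputs.append(memory[head])  # read and emit
        head += 1
    return outputs

copied = copy_task([[1, 0], [0, 1], [1, 1], DELIMITER, [9, 9]])
```

Anything arriving after the delimiter is ignored, and the output repeats the stored vectors in their original order.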

Deep RNNs

[Figure: deep RNN unrolled in time. The first hidden layer is shown with states h^1_0, h^1_1, h^1_2, h^1_3, ..., h^1_T and recurrent weights W_{h^1 h^1}.]

The input to hidden layer $L_i$ is the state $h^{i-1}$ of hidden layer $L_{i-1}$. Deep RNNs can learn at different time scales.
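The stacking rule can be sketched in NumPy as follows, assuming tanh units, random weights and three layers of equal width; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_h, n_layers, T = 3, 4, 3, 6

# Layer 0 reads the input x_t; layer i > 0 reads the state of layer i - 1.
in_sizes = [n_x] + [n_h] * (n_layers - 1)
W_hh = [rng.normal(scale=0.1, size=(n_h, n_h)) for _ in range(n_layers)]
W_hx = [rng.normal(scale=0.1, size=(n_h, m)) for m in in_sizes]
b_h = [np.zeros(n_h) for _ in range(n_layers)]

def deep_rnn_forward(xs):
    hs = [np.zeros(n_h) for _ in range(n_layers)]
    top_states = []
    for x in xs:
        layer_input = x
        for i in range(n_layers):
            hs[i] = np.tanh(W_hh[i] @ hs[i] + W_hx[i] @ layer_input + b_h[i])
            layer_input = hs[i]        # becomes the input of the next layer up
        top_states.append(hs[-1])
    return np.array(top_states)

top = deep_rnn_forward(rng.normal(size=(T, n_x)))
```

Each layer keeps its own recurrent state, so the layers are free to track the sequence at different rates.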

Autoencoders

The more regularities there are in data the more it can be compressed. The more we are able to compress the data, the more we have learned about the data.

(Grünwald, 1998)

[Figure: Autoencoder. Layers joined by weights W_{h_2 h_1} and W_{h_3 h_2}, with the stages labelled encode, decode and reconstruct.]

Recurrent Autoencoders

[Figure: Recurrent autoencoder. Encoder states h^e_0, h^e_1, h^e_2, h^e_3, ..., h^e_T are followed by decoder states h^d_0, h^d_1, h^d_2, h^d_3, ..., h^d_T, with z_{T-1} labelled on the decoder side.]

A Recurrent Autoencoder is trained to output a reconstruction $z$ of the input sequence $x$. All data must pass through the narrow $h^e_T$. Compression between input and output forces pattern learning.

Recurrent Autoencoder Updates

[Figure: recurrent autoencoder with encoder states h^e_0, ..., h^e_T and decoder states h^d_0, ..., h^d_T, annotated with the encoder and decoder updates.]

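Encoder updates:

$$\begin{aligned}
h^e_1 &= f(W^e_{hh} h^e_0 + W^e_{hx} x_1 + b^e_h) \\
h^e_2 &= f(W^e_{hh} h^e_1 + W^e_{hx} x_2 + b^e_h) \\
h^e_3 &= f(W^e_{hh} h^e_2 + W^e_{hx} x_3 + b^e_h) \\
&\cdots
\end{aligned}$$

Decoder updates:

$$\begin{aligned}
h^d_0 &= h^e_T \\
h^d_1 &= f(W^d_{hh} h^d_0 + W^d_{hx} z_0 + b^d_h) \\
h^d_2 &= f(W^d_{hh} h^d_1 + W^d_{hx} z_1 + b^d_h) \\
h^d_3 &= f(W^d_{hh} h^d_2 + W^d_{hx} z_2 + b^d_h) \\
&\cdots
\end{aligned}$$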
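A NumPy sketch of one forward pass through these encoder and decoder updates follows. Several details are assumptions because the slides do not specify them: f = tanh, a zero vector for the first decoder input z_0, and a linear readout of each reconstruction z_t from the decoder state. The weights are random and untrained.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_h, T = 3, 2, 5                  # n_h < n_x gives a narrow bottleneck

def init(shape):
    return rng.normal(scale=0.1, size=shape)

enc = {"W_hh": init((n_h, n_h)), "W_hx": init((n_h, n_x)), "b_h": np.zeros(n_h)}
dec = {"W_hh": init((n_h, n_h)), "W_hx": init((n_h, n_x)), "b_h": np.zeros(n_h)}
W_zh, b_z = init((n_x, n_h)), np.zeros(n_x)    # assumed linear readout

def step(p, h, v):
    return np.tanh(p["W_hh"] @ h + p["W_hx"] @ v + p["b_h"])

def reconstruct(xs):
    h = np.zeros(n_h)                  # h^e_0
    for x in xs:                       # h^e_t from h^e_{t-1} and x_t
        h = step(enc, h, x)
    code = h                           # h^e_T, also used as h^d_0

    z = np.zeros(n_x)                  # assumed z_0
    zs = []
    for _ in xs:
        h = step(dec, h, z)            # h^d_t from h^d_{t-1} and z_{t-1}
        z = W_zh @ h + b_z             # assumed readout z_t
        zs.append(z)
    return code, np.array(zs)

xs = rng.normal(size=(T, n_x))
code, zs = reconstruct(xs)
loss = np.mean((zs - xs) ** 2)         # reconstruction error to be minimised
```

The decoder sees the input sequence only through `code`, so lowering `loss` during training requires the encoder to compress the whole sequence into that vector.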

Easy RNNs with TensorFlow VariableScope

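    # TensorFlow 1.x graph-mode API
    import tensorflow as tf
    from tensorflow.python.ops import math_ops
    from tensorflow.python.ops import variable_scope as vs

    def linear(in_array, out_size):
        # second dimension of a [batch, in_size] tensor
        in_size = in_array.get_shape().as_list()[1]
        # creates or returns a variable
        W = vs.get_variable(shape=[in_size, out_size], name='W')
        # out_array = in_array * W
        out_array = math_ops.matmul(in_array, W)
        return out_array

    with tf.variable_scope('encoder_0') as scope:
        # creates variable encoder_0/W
        out_array_1 = linear(in_array_0, out_size_1)

        # error: encoder_0/W already exists
        # out_array_2 = linear(out_array_1, out_size_2)

        # reuse encoder_0/W (shapes must match the existing variable)
        scope.reuse_variables()
        out_array_2 = linear(out_array_1, out_size_2)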

TensorFlow encoder

    # xs_0_enc, hs_0_enc and lstm_cell_l0_enc are assumed to be defined earlier
    with tf.variable_scope('encoder_0') as scope:
        for t in range(1, seq_len):
            # placeholder for input data at time t
            xs_0_enc[t] = tf.placeholder(tf.float32, shape=[None, nx_enc])
            # encoder update at time t
            hs_0_enc[t] = lstm_cell_l0_enc(xs_0_enc[t], hs_0_enc[t-1])
            scope.reuse_variables()

Deep Recurrent Autoencoder

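[Figure: deep recurrent autoencoder. Encoder layers encoder_0, encoder_1, encoder_2 (one layer shown with states h^{e1}_0, h^{e1}_1, h^{e1}_2, h^{e1}_3, ..., h^{e1}_T) and decoder layers decoder_0, decoder_1, decoder_2, with z_{T-1} labelled on the decoder side.]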

TensorFlow Deep Recurrent AutoEncoder

    # placeholder for input data
    xs_0_enc[t] = tf.placeholder(tf.float32, shape=[None, nx_enc])
    hs_0_enc[t] = lstm_cell_l0_enc(xs_0_enc[t], hs_0_enc[t-1])
    scope.reuse_variables()

    # encoder_1 input is encoder_0 hidden state
    xs_1_enc[t] = hs_0_enc[t]
    hs_1_enc[t] = lstm_cell_l1_enc(xs_1_enc[t], hs_1_enc[t-1])
    scope.reuse_variables()

    # encoder_2 input is encoder_1 hidden state
    xs_2_enc[t] = hs_1_enc[t]
    hs_2_enc[t] = lstm_cell_l2_enc(xs_2_enc[t], hs_2_enc[t-1])
    scope.reuse_variables()

Variational Autoencoder

We assume the market data $\{x_k\}$ is sampled from a probability distribution with a 'small' number of latent variables $h$.

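Assume $p_\theta(x, h) = p_\theta(x \mid h)\, p_\theta(h)$, where $\theta$ are the weights of a NN/RNN. Then

$$p_\theta(x) = \sum_h p_\theta(x \mid h)\, p_\theta(h)$$

Train the network to minimize

$$L_\theta(\{x_k\}) = \min_\theta\, -\log p_\theta(x_k).$$

Compute $p_\theta(h \mid x)$ and cluster.

But $p_\theta(h \mid x)$ is intractable!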
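With only a handful of discrete latent states the marginal and the posterior can be computed by direct summation, which makes the roles of the terms concrete. The numbers below are made up for illustration. The difficulty on the slide arises when h is continuous or high-dimensional and the likelihood is a neural network, because the sum over h can then no longer be enumerated.

```python
# Toy model: 3 latent states h, binary observation x (all numbers are made up).
prior = [0.5, 0.3, 0.2]            # p(h)
likelihood = [[0.9, 0.1],          # p(x | h): one row per h, one column per x
              [0.4, 0.6],
              [0.1, 0.9]]

def marginal(x):
    """p(x) = sum_h p(x | h) p(h)."""
    return sum(likelihood[h][x] * prior[h] for h in range(len(prior)))

def posterior(x):
    """p(h | x) = p(x | h) p(h) / p(x), by Bayes' rule."""
    px = marginal(x)
    return [likelihood[h][x] * prior[h] / px for h in range(len(prior))]

loss = -sum(__import__("math").log(marginal(x)) for x in [0, 1, 1])
post = posterior(1)
```

Clustering then amounts to assigning each observation to the latent state with the largest posterior probability.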

[Figure: prior $p_\theta(h)$, likelihood $p(x \mid h)$, marginal $p_\theta(x)$ and posterior $p_\theta(h \mid x)$.]

Variational Inference

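Variational Inference learns an NN/RNN approximation $q_\phi(h \mid x)$ to $p_\theta(h \mid x)$ during training.

Think of $q_\phi$ as the encoder network and $p_\theta$ as the decoder network: an autoencoder.

Then $\log p_\theta(x_k) \ge L_{\theta,\phi}(x_k)$, where

$$L_{\theta,\phi}(x_k) = \underbrace{-D_{KL}\big(q_\phi(h \mid x_k)\,\|\,p_\theta(h)\big)}_{\text{regularization term}} + \underbrace{\mathbb{E}_{q_\phi}\big[\log p_\theta(x_k \mid h)\big]}_{\text{reconstruction term}}$$

is the variational lower bound.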

Reconstruction term = log-likelihood wrt qϕ

Regularization term = target prior on encoder
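The bound can be checked numerically on a toy model with three discrete latent states, where everything is computable exactly. The probabilities are made up for illustration. The lower bound never exceeds the log-marginal, and it is tight when q equals the true posterior.

```python
import math

prior = [0.5, 0.3, 0.2]        # p(h)
lik_x = [0.1, 0.6, 0.9]        # p(x | h) for one fixed observation x

def elbo(q):
    """-KL(q || prior) + E_q[log p(x | h)] for a distribution q over h."""
    kl = sum(qh * math.log(qh / ph) for qh, ph in zip(q, prior) if qh > 0)
    reconstruction = sum(qh * math.log(l) for qh, l in zip(q, lik_x))
    return -kl + reconstruction

p_x = sum(l * p for l, p in zip(lik_x, prior))        # exact marginal
log_px = math.log(p_x)
true_posterior = [l * p / p_x for l, p in zip(lik_x, prior)]

uniform_bound = elbo([1 / 3, 1 / 3, 1 / 3])
tight_bound = elbo(true_posterior)
```

Maximising the bound over q therefore pushes the encoder towards the true posterior, while maximising it over the model raises the likelihood of the data.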

[Figure: prior $p_\theta(h)$, likelihood $p_\theta(x \mid h)$, marginal $p_\theta(x)$ and approximate posterior $q_\phi(h \mid x)$.]

Search

Train a Deep Recurrent Autoencoder on public market data (book state over time)

Use the trained encoder as hash function

Search based on Euclidean or Hamming distance (after binary projection)
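A sketch of the search step in NumPy. The encoder outputs are replaced by random vectors, and the binary projection is taken to be the sign of a random linear projection; both are assumptions, since the slide does not say which projection was used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_code, n_bits = 100, 16, 32

codes = rng.normal(size=(n_items, n_code))       # stand-in for encoder outputs
projection = rng.normal(size=(n_code, n_bits))   # assumed random projection

def to_bits(v):
    """Binary hash: which side of each random hyperplane the code falls on."""
    return (v @ projection) > 0

database = to_bits(codes)

def hamming_search(query_code, k=3):
    """Indices of the k stored hashes closest to the query in Hamming distance."""
    distances = np.count_nonzero(database != to_bits(query_code), axis=1)
    return np.argsort(distances, kind="stable")[:k]

def euclidean_search(query_code, k=3):
    """Indices of the k stored codes closest to the query in Euclidean distance."""
    distances = np.linalg.norm(codes - query_code, axis=1)
    return np.argsort(distances, kind="stable")[:k]

nearest_bits = hamming_search(codes[42])
nearest_codes = euclidean_search(codes[42])
```

Hamming search compares short bit strings and is cheap to run over a large history, at the cost of some precision relative to Euclidean distance on the full codes.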

Classify

EUR vs AUD

Train a Variational Recurrent Autoencoder with Gaussian prior on public market data

Include two contracts: EUR and AUD in training data

Use trained encoder to infer Gaussian means from training data

Predict

Train an LSTM on sequences of states of the LOB to predict changes in best bid.

Target data: change in best bid over time $t \to t+1$.

LSTM output: $z_t$ = softmax distribution for best bid moves $(-1, 0, +1)$ over time $t \to t+1$.
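The target construction and the loss for one prediction can be sketched in plain Python. Taking the label to be the sign of the change in best bid is an assumption, as are the function names and the sample prices; the LSTM itself is left out and its output is represented by a vector of three logits.

```python
import math

def best_bid_moves(best_bids):
    """Label each step t -> t+1 with the sign of the change in best bid."""
    return [(b1 > b0) - (b1 < b0) for b0, b1 in zip(best_bids, best_bids[1:])]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, move):
    """Negative log-probability assigned to the observed move in (-1, 0, +1)."""
    return -math.log(softmax(logits)[move + 1])

best_bids = [100.0, 100.0, 100.5, 100.25, 100.25]
targets = best_bid_moves(best_bids)
z_t = softmax([0.2, 1.0, -0.3])          # example network output at one step
uninformed_loss = cross_entropy([0.0, 0.0, 0.0], targets[1])
```

A model that spreads its probability evenly over the three moves scores log 3 per step, which gives a baseline for judging a trained network.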

It's tough to make predictions, especially about the future.

[Figure: target and predicted best bid moves.]

Questions?

References

[1] M. Arjovsky, A. Shah, and Y. Bengio. Unitary evolution recurrent neural networks. arXiv:1511.06464v4, 2016.

[2] A. Graves, G. Wayne, and I. Danihelka. Neural Turing machines. arXiv:1410.5401v2, 2014.

[3] S. Hochreiter and J. Schmidhuber. Long short term memory. Neural Computation, 9:1735-1780, 1997.
