Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series...

43
Dive Deeper in Finance GTC 2017 San José California Daniel Egloff Dr. sc. math. Managing Director QuantAlea May 7, 2017

Transcript of Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series...

Page 1: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Dive Deeper in Finance

GTC 2017 – San José – California

Daniel EgloffDr. sc. math.Managing Director QuantAleaMay 7, 2017

Page 2: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Today

▪ Generative models for financial time series

– Sequential latent Gaussian Variational Autoencoder

▪ Implementation in TensorFlow

– Recurrent variational inference using TF control flow operations

▪ Applications to FX data

– 1s to 10s OHLC aggregated data

– Event based models for tick data is work in progress

Page 3: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Generative Models and GPUs

▪ What I cannot create, I do not understand (Richard Feynman)

▪ Generative models are recent innovation in Deep Learning

– GANs – Generative adversarial networks

– VAE – Variational autoencoders

▪ Training is computationally demanding

– Explorative modelling not possible without GPUs

Page 4: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Deep Learning

▪ Deep Learning in finance is complementary to existing models and not a replacement

▪ Deep Learning benefits

– Richer functional relationship between explanatory and response variables

– Model complicated interactions

– Automatic feature discovery

– Capable to handle large amounts of data

– Standard training procedures with back propagation and SGD

– Frameworks and tooling

Page 5: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Latent Variable – Encoding/Decoding

▪ Latent variable can be thought of a encoded representation of x

Encoder

𝑝 𝑧 𝑥𝑥

Decoder

𝑝 𝑥 𝑧𝑥𝑧

𝑝 𝑧

▪ Likelihood serves as decoder

▪ Posterior provides encoder

Page 6: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Intractable Maximum Likelihood

▪ Maximum likelihood standard model fitting approach

𝑝 𝑥 = 𝑝 𝑥 𝑧 𝑝 𝑧 𝑑𝑧 → max

▪ Problem: marginal 𝑝 𝑥 and posterior

𝑝 𝑧 𝑥 =𝑝 𝑥 𝑧 𝑝 𝑧

𝑝 𝑥

are intractable and their calculation suffers from exponential complexity

▪ Solutions

– Markov Chain MC, Hamiltonian MC

– Approximation and variational inference

Page 7: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Variational Autoencoders

▪ Assume latent space with prior 𝑝 𝑧

𝑥 𝑥𝑧

𝑝 𝑧

𝑝 𝑧 𝑥 𝑝 𝑥 𝑧

Page 8: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Variational Autoencoders

▪ Parameterize likelihood 𝑝 𝑥 𝑧 with a deep neural network

𝑥 𝑥𝑧

𝑝 𝑧

𝜇

𝜎

𝑝𝜑 𝑥 𝑧

𝑝 𝑧 𝑥

Page 9: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Variational Autoencoders

▪ Parameterize likelihood 𝑝 𝑥 𝑧 with a deep neural network

▪ Approximate intractable posterior 𝑝 𝑧 𝑥 with a deep neural network

𝑥 𝑥𝑧

𝑝 𝑧

𝜇

𝜎

𝑝𝜑 𝑥 𝑧

𝑝 𝑧 𝑥

𝑞𝜃 𝑧 𝑥

𝜇

𝜎

Page 10: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Variational Autoencoders

▪ Parameterize likelihood 𝑝 𝑥 𝑧 with a deep neural network

▪ Approximate intractable posterior 𝑝 𝑧 𝑥 with a deep neural network

▪ Learn the parameters 𝜃 and 𝜑 with backpropagation

𝑥 𝑥𝑧

𝑞𝜃 𝑧 𝑥

𝜇

𝜎

𝜇

𝜎

𝑝𝜑 𝑥 𝑧

𝑝 𝑧

Page 11: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

▪ Which loss to optimize?

▪ Can we choose posterior from a flexible family of distributions Q byminimizing a distance to real posterior?

Variational Inference

𝑞∗ 𝑧 𝑥 = argmin𝜃∈𝑄

𝐾𝐿 𝑞𝜃 𝑧 𝑥 ฮ𝑝𝜑 𝑧 𝑥

𝐾𝐿 𝑞𝜃 𝑧 𝑥 ฮ𝑝𝜑 𝑧 𝑥 = 𝐸𝑞𝜃 𝑧 𝑥 log 𝑞𝜃 𝑧 𝑥 − 𝐸𝑞𝜃 𝑧 𝑥 log 𝑝𝜑 𝑥, 𝑧 +

log 𝑝𝜑 𝑥≥ 0

Can be made small if Q is flexible enough

▪ Problem: not computable because it involves marginal 𝑝𝜑 𝑥

Page 12: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

▪ Which loss to optimize?

▪ Can we choose posterior from a flexible family of distributions Q byminimizing a distance to real posterior?

Variational Inference

𝑞∗ 𝑧 𝑥 = argmin𝜃∈𝑄

𝐾𝐿 𝑞𝜃 𝑧 𝑥 ฮ𝑝𝜑 𝑧 𝑥

0 ≤ 𝐸𝑞𝜃 𝑧 𝑥 log 𝑞𝜃 𝑧 𝑥 − 𝐸𝑞𝜃 𝑧 𝑥 log 𝑝𝜑 𝑥, 𝑧 + log 𝑝𝜑 𝑥

−𝐸𝐿𝐵𝑂(𝜃, 𝜑)

▪ Drop left hand side because positive

Page 13: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

▪ Which loss to optimize?

▪ Can we choose posterior from a flexible family of distributions Q byminimizing a distance to real posterior?

Variational Inference

𝑞∗ 𝑧 𝑥 = argmin𝜃∈𝑄

𝐾𝐿 𝑞𝜃 𝑧 𝑥 ฮ𝑝𝜑 𝑧 𝑥

𝐸𝐿𝐵𝑂(𝜃, 𝜑) ≤ log 𝑝𝜑 𝑥

▪ Obtain tractable lower bound for marginal

▪ Training criterion: maximize evidence lower bound

Page 14: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

▪ To interpret lower bound, write it as

Variational Inference

= 𝐸𝑞𝜃(𝑧|𝑥) log 𝑝𝜑 𝑥 𝑧 − 𝐾𝐿 𝑞𝜃 𝑧 𝑥 ԡ𝑝 𝑧

Reconstruction score

𝑧~𝑞𝜃 𝑧 𝑥𝑥

𝑝𝜑 𝑥 𝑧

Penalty of deviation from prior

log 𝑝𝜑 𝑥 ≥ 𝐸𝐿𝑂𝐵 𝜃, 𝜑

▪ The smaller the tighter the lower bound𝐾𝐿 𝑞𝜃 𝑧 𝑥 ฮ𝑝𝜑 𝑧 𝑥

Page 15: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Applications to Time Series

▪ Sequence structure for observable and latent factor

▪ Model setup

– Gaussian distributions with parameters calculated from deep recurrent neural network

– Prior standard Gaussian

– Model training with variational inference

Page 16: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Inference and Training

𝜇𝑡𝜎𝑡

𝜇𝑡−1𝜎𝑡−1

ℎ𝑡+1

ℎ𝑡+1

𝑧𝑡+1

𝑥𝑡+1

𝜇𝑡+1𝜎𝑡+1

𝜇𝑡𝜎𝑡

𝑥𝑡−1

𝑧𝑡−1

ℎ𝑡−1

ℎ𝑡−1

𝜇𝑡−1𝜎𝑡−1

𝜇𝑡+1𝜎𝑡+1

𝑥𝑡

ℎ𝑡

ℎ𝑡

𝑧𝑡

𝑞𝜃 𝑧 𝑥

Page 17: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

▪ Probability distributions factorize

Implied Factorization

𝑝𝜑 𝑥≤𝑇 𝑧≤𝑇 =ෑ

𝑡=1

𝑇

𝑝𝜑 𝑥𝑡 𝑥<𝑡 , 𝑧≤𝑡 =ෑ

𝑡=1

𝑇

𝑁 𝑥𝑡 𝜇𝜑 𝑥<𝑡, 𝑧≤𝑡 , 𝜎𝜑 𝑥<𝑡, 𝑧≤𝑡

▪ Loss calculation

– Distributions can be easily simulated to calculate expectation term

– Kullback Leibler term can be calculated analytically

𝑞𝜃 𝑧≤𝑇 𝑥≤𝑇 =ෑ

𝑡=1

𝑇

𝑞𝜃 𝑧𝑡 𝑥<𝑡, 𝑧<𝑡 =ෑ

𝑡=1

𝑇

𝑁 𝑧𝑡 𝜇𝜃 𝑥<𝑡, 𝑧<𝑡 , 𝜎𝜃 𝑥<𝑡, 𝑧<𝑡

Page 18: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

▪ Loss calculation

– Kullback Leibler term can be calculated analytically

– For fixed 𝑡 the quantities 𝜇𝜑, 𝜇𝜃, 𝜎𝜑, 𝜎𝜃 depend on

𝑧𝑡~𝑁 𝑧𝑡 𝜇𝜃 𝑥<𝑡, 𝑧<𝑡 , 𝜎𝜃 𝑥<𝑡, 𝑧<𝑡

– Simulate from this distribution to estimate expectation with a sample mean

Calculating ELBO

𝐸𝐿𝐵𝑂 𝜃, 𝜑 = −𝐸𝑞 ቂ

σ𝑡 ቄ

𝑥𝑡 − 𝜇𝜑𝑇𝜎𝜑

−1 𝑥𝑡 − 𝜇𝜑 + logdet 𝜎𝜑 +

𝜇𝜃𝑇𝜇𝜃 + 𝑡𝑟𝜎𝜃 − log det 𝜎𝜃

Approximate with Monte Carlo sampling from 𝑞𝜃 𝑧≤𝑇 𝑥≤𝑇

Page 19: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Generation

𝜇𝑡𝜎𝑡

ℎ𝑡+1

𝑧𝑡+1

𝑥𝑡+1

𝜇𝑡+1𝜎𝑡+1

𝑥𝑡−1

𝑧𝑡−1

ℎ𝑡−1

𝜇𝑡−1𝜎𝑡−1

𝑧𝑡 𝑝(𝑧)

ℎ𝑡

𝑥𝑡

𝑝𝜑 𝑥 𝑧

Page 20: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Time Series Embedding

▪ Single historical value not predictive enough

▪ Embedding

– Use lag of ~20 historical observations at every time step

Time steps

Batcht

t +1t +2

Page 21: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Implementation

▪ Implementation in TensorFlow

▪ Running on P100 GPUs for model training

▪ Long time series and large batch sizes require substantial GPU memory

Page 22: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

TensorFlow Dynamic RNN

▪ Unrolling rnn with tf.nn.dynamic_rnn

– Simple to use

– Can handle variable sequence length

▪ Not flexible enough for generative networks

Page 23: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

TensorFlow Control Structures

▪ Using tf.while_loop

– More to program, need to understand control structures in more detail

– Much more flexible

Page 24: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Implementation

▪ Notations

Page 25: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Implementation

▪ Variable and Weight Setup

Recurrent neuralnetwork definition

Page 26: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Implementation

▪ Allocate TensorArray objects

▪ Fill input TensorArray objects with data

Page 27: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Implementation

▪ While loop body inference part

Update inferencernn state

Page 28: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Implementation

▪ While loop body inference part

Update generatorrnn state

Page 29: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Implementation

▪ Call while loop

▪ Stacking TensorArray objects

Page 30: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Implementation

▪ Loss Calculation

Page 31: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

FX Market

▪ FX market is largest and most liquid market in the world

▪ Decentralized over the counter market

– Not necessary to go through a centralized exchange

– No single price for a currency at a given point in time

▪ Fierce competition between market participants

▪ 24 hours, 5 ½ days per week

– As one major forex market closes, another one opens

Page 32: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

FX Data

▪ Collect tick data from major liquidity provider e.g. LMAX

▪ Aggregation to OHLC bars (1s, 10s, …)

▪ Focus on US trading session

8am – 5pm EST

3am – 12am EST

5pm – 2am EST (Sidney)

London sessionUS session Asian session

7pm – 4am EST (Tokyo)

5 4 3 2 1 12 11 10 9 8 7 6 5 4 3 2 1 12 11 10 9 8 7 6

Page 33: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

EURUSD 2016

Page 34: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Single Day

Page 35: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

One Hour

Page 36: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

10 Min Sampled at 1s

5 pips

1/10 pips = 1 deci-pip

At high frequency FX prices fluctuate in rangeof deci-pips

Larger jumps in the order ofmultiple pipsand more

Page 37: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Setup

▪ Normalize data with std deviation ො𝜎 over training interval

▪ 260 trading days in 2016, one model per day

▪ 60 dim embedding, 2 dim latent space

ො𝜎Training

Out of sample test

Page 38: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Results

Training

Page 39: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Out of Sample

Page 40: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Volatility of Prediction

Page 41: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Latent Variables

Page 42: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Pricing in E-Commerce

▪ Attend our talk on our latest work on AI and GPU accelerated genetic algorithms with Jet.com

Page 43: Dive Deeper in Finance · 2017. 5. 18. · Today Generative models for financial time series –Sequential latent Gaussian Variational Autoencoder Implementation in TensorFlow –Recurrent

Daniel EgloffDr. sc. math.Phone: +41 79 430 03 [email protected]

Contact details