Post on 30-May-2020
Topological Data Analysis of
Financial Time Series
Koundinya Vajjha
Background and Theory
Persistent Homology
Persistence Landscapes
Algorithm
Analysis
US Stock Market Indices
Cryptocurrencies
High-Frequency Data
Summary
Topological Data Analysis of Financial Time Series
TDA Learning Seminar
Koundinya Vajjha
June 1, 2018
References
[1] M. Gidea, Y. Katz. Topological data analysis of financial time series: Landscapes of crashes. Physica A: Statistical Mechanics and its Applications, 491:820-834, 2018.
[2] J. Kim et al. Introduction to the R package TDA. http://arxiv.org/abs/1411.1830
[3] V. Kovacev-Nikolic, P. Bubenik, D. Nikolić, G. Heo. Using persistent homology and dynamical distances to analyze protein binding. arXiv:1412.1394v2 [stat.ME], 2015.
Outline
- Background and Theory
  - Persistent Homology
  - Persistence Landscapes
- Algorithm
- Analysis
  - US Stock Market Indices
  - Cryptocurrencies
  - High-Frequency Data
Persistent Homology: Vietoris-Rips Filtration
Given point cloud data $X = \{x_1, \ldots, x_n\} \subset \mathbb{R}^d$, associate to it the Vietoris-Rips complex at resolution $\varepsilon$, denoted $VR(X, \varepsilon)$.
- For each $k = 0, 1, \ldots$, a $k$-simplex with vertices $\{x_{i_0}, \ldots, x_{i_k}\}$ is in $VR(X, \varepsilon)$ if and only if the mutual distance between any pair of its vertices is less than $\varepsilon$:
  $d(x_{i_j}, x_{i_l}) < \varepsilon$ for all $j, l$.
- A $k$-simplex is included in $VR(X, \varepsilon)$ for every set of $k + 1$ data points that are indistinguishable from one another at resolution level $\varepsilon$.
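The membership rule above can be sketched in a few lines of Python. This is a brute-force illustration only, not the method used in the talk (which relies on the R package TDA); the function name `vietoris_rips` and the `max_dim` cutoff are assumptions made for the example.

```python
from itertools import combinations
import math


def vietoris_rips(points, eps, max_dim=2):
    """Return the simplices of VR(points, eps) up to dimension max_dim.

    Each simplex is a tuple of point indices. A k-simplex (k + 1 vertices)
    is included when every pair of its vertices is closer than eps.
    Brute force: only suitable for very small point clouds.
    """
    n = len(points)
    close = [[math.dist(points[i], points[j]) < eps for j in range(n)]
             for i in range(n)]
    simplices = []
    for k in range(max_dim + 1):
        for verts in combinations(range(n), k + 1):
            if all(close[i][j] for i, j in combinations(verts, 2)):
                simplices.append(verts)
    return simplices


# Four corners of the unit square: sides have length 1, diagonals sqrt(2).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
small = vietoris_rips(square, eps=1.1)   # vertices and the four sides
large = vietoris_rips(square, eps=1.5)   # diagonals and triangles appear
```

At `eps=1.1` the complex is a hollow square (a 1-dimensional hole); at `eps=1.5` the diagonals enter and the triangles fill the hole.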
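Persistent Homology: Birth and Death
Given $X = \{x_1, \ldots, x_n\} \subset \mathbb{R}^d$, if $\varepsilon < \varepsilon'$ then
$VR(X, \varepsilon) \subseteq VR(X, \varepsilon')$,
and this inclusion induces a homomorphism (not necessarily injective) $H_k(VR(X, \varepsilon)) \to H_k(VR(X, \varepsilon'))$
for every $k$. Due to this, for every non-zero homology class $\alpha$ there is a pair $b_\alpha = \varepsilon_1 < \varepsilon_2 = d_\alpha$ such that $\alpha$
- is not in the image of any $H_k(VR(X, \varepsilon'))$ for $\varepsilon' < \varepsilon_1$,
- has non-zero image in $H_k(VR(X, \varepsilon'))$ for $\varepsilon_1 < \varepsilon' < \varepsilon_2$ ("birth"),
- has zero image in $H_k(VR(X, \varepsilon'))$ for $\varepsilon' > \varepsilon_2$ ("death").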
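Birth and death are easiest to see in dimension 0, where the classes are connected components: every point is born at scale 0, and a component dies at the scale where it merges into another one. The sketch below computes these pairs with a union-find structure. It covers $H_0$ only, which is simpler than the 1-dimensional homology used later in the talk, and the function name `h0_persistence` is an assumption made for the example.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return (birth, death) pairs for connected components (H_0).

    Edges are processed in order of increasing length; each edge that
    joins two different components kills one of them at that length.
    One component never dies and gets death = infinity.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component representative,
        # shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            pairs.append((0.0, length))
    pairs.append((0.0, math.inf))
    return pairs


# Three points on a line at 0, 1 and 3: merges happen at scales 1 and 2.
pairs = h0_persistence([(0.0,), (1.0,), (3.0,)])
```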
Persistence Diagrams
The information on the $k$-dimensional homology generators at all scales can be encoded in a "persistence diagram" $P_k$, which consists of:
- For each $k$-dimensional homology class $\alpha$, a point $z_\alpha = (b_\alpha, d_\alpha) \in \mathbb{R}^2$, together with its multiplicity $\mu(b_\alpha, d_\alpha)$ (the number of classes that are born at $b_\alpha$ and die at $d_\alpha$).
- All points on the positive diagonal in $\mathbb{R}^2$. These represent trivial homology generators that are born and instantly die at every level. Each point on the diagonal has infinite multiplicity.
Persistence Diagrams: Barcode and Diagram
Persistence Diagrams: Space of All Diagrams
- The space $\mathcal{P}$ of all such persistence diagrams (each a multiset of points) can be endowed with a metric $W_p$, called the degree-$p$ Wasserstein distance ($p \geq 1$) or the bottleneck distance ($p = \infty$).
- But these metric spaces $(\mathcal{P}, W_p)$ are not complete, which is inconvenient for statistical purposes (for SLLN- and CLT-type results).
- A workaround is to embed the space $\mathcal{P}$ into the Banach space $L^p(\mathbb{N} \times \mathbb{R})$ via persistence landscapes.
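For reference, a standard way to write these distances is given below. The symbols $P$, $Q$ (two diagrams) and $\varphi$ (a bijection between them, which always exists because the diagonal points have infinite multiplicity) are notation introduced here, not taken from the slides.

```latex
W_p(P, Q) = \inf_{\varphi : P \to Q}
  \Bigl[ \sum_{z \in P} \lVert z - \varphi(z) \rVert_\infty^{\,p} \Bigr]^{1/p},
\qquad
W_\infty(P, Q) = \inf_{\varphi : P \to Q} \, \sup_{z \in P}
  \lVert z - \varphi(z) \rVert_\infty .
```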
Persistence Landscapes
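For each birth-death point $(b_\alpha, d_\alpha) \in P_k$, first define

$$
f_{(b_\alpha, d_\alpha)}(x) =
\begin{cases}
x - b_\alpha & \text{if } x \in \left(b_\alpha, \frac{b_\alpha + d_\alpha}{2}\right] \\
-x + d_\alpha & \text{if } x \in \left(\frac{b_\alpha + d_\alpha}{2}, d_\alpha\right) \\
0 & \text{if } x \notin (b_\alpha, d_\alpha)
\end{cases}
$$

To a persistence diagram $P_k$ we associate a sequence of functions $\lambda = (\lambda_j)_{j \in \mathbb{N}}$, where $\lambda_j : \mathbb{R} \to [0, 1]$ is given by

$$
\lambda_j(x) = j\text{-max}\,\{ f_{(b_\alpha, d_\alpha)}(x) \mid (b_\alpha, d_\alpha) \in P_k \}
$$

where $j$-max denotes the $j$-th largest value of a function. We set $\lambda_j(x) = 0$ if the $j$-th largest value does not exist.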
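The two definitions above translate almost directly into code. The following is a minimal sketch, assuming a diagram is given as a list of finite `(birth, death)` pairs; the names `tent` and `landscape` are illustrative, and no rescaling of the values to $[0, 1]$ is applied.

```python
def tent(b, d, x):
    """Piecewise-linear function f_(b,d) from the definition above."""
    mid = (b + d) / 2
    if b < x <= mid:
        return x - b
    if mid < x < d:
        return d - x
    return 0.0


def landscape(diagram, j, x):
    """Value of the j-th landscape function (j = 1, 2, ...) at x.

    Takes the j-th largest tent value over all points of the diagram,
    or 0 if the diagram has fewer than j points.
    """
    values = sorted((tent(b, d, x) for b, d in diagram), reverse=True)
    return values[j - 1] if j <= len(values) else 0.0


# Two bars: (1, 7) and a shorter one (2, 4) nested inside it.
diagram = [(1, 7), (2, 4)]
first_at_3 = landscape(diagram, 1, 3)    # outer tent: 3 - 1
second_at_3 = landscape(diagram, 2, 3)   # inner tent: 3 - 2
third_at_3 = landscape(diagram, 3, 3)    # no third value exists
```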
Persistence Diagrams
This is a picture of the function $f_{(1,7)}$ associated to a barcode. (Images taken from [3].)
Persistence Landscapes
This is a picture of the persistence landscape associated to a barcode. (Images taken from [3].)
Persistence Landscapes
- We have associated to a persistence diagram $P_k$ a sequence of functions $\lambda = (\lambda_n)_{n \in \mathbb{N}} \in L^p(\mathbb{N} \times \mathbb{R})$, which is a Banach space.
- In general it is not possible to go back and forth between diagrams and landscapes.
- However, this whole exercise makes persistence landscapes suitable for treatment via statistical methods!
Henceforth, we shall only consider the $L^1$ and $L^2$ norms and only 1-dimensional homology.
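As a rough illustration of what these norms measure, the sketch below approximates $\lVert \lambda \rVert_p = \bigl( \sum_j \int |\lambda_j(x)|^p \, dx \bigr)^{1/p}$ with the trapezoid rule. It uses the fact that, at each $x$, the values $\lambda_1(x), \lambda_2(x), \ldots$ are just the tent values sorted, so the sum over $j$ equals the sum over the points of the diagram. The function names and the grid size are assumptions made for the example, and the diagram is assumed to contain only finite bars.

```python
def tent(b, d, x):
    """Tent function of the bar (b, d): rises from b, peaks at the midpoint."""
    return max(0.0, min(x - b, d - x))


def landscape_norm(diagram, p, n_grid=2001):
    """Approximate the L^p norm of the persistence landscape of `diagram`.

    diagram: list of finite (birth, death) pairs.
    Integrates sum_j lambda_j(x)^p over [min birth, max death] with the
    trapezoid rule on n_grid equally spaced points.
    """
    if not diagram:
        return 0.0
    lo = min(b for b, _ in diagram)
    hi = max(d for _, d in diagram)
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * sum(tent(b, d, x) ** p for b, d in diagram) * step
    return total ** (1.0 / p)


# Single bar (1, 7): the tent has base 6 and height 3.
l1 = landscape_norm([(1, 7)], p=1)   # area of the triangle, about 9
l2 = landscape_norm([(1, 7)], p=2)   # about sqrt(18)
```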
TDA on Time Series
A time series is a series of data points indexed (or listed or graphed) in time order. Here are the general steps of the algorithm in [1].
- Consider $d$ time series $\{x^k_n\}_n$, $k = 1, \ldots, d$. So for each time instance $t_n$ we have a point $x(t_n) = (x^1_n, \ldots, x^d_n) \in \mathbb{R}^d$.
- Pick a sliding window size $w$. Each time window of size $w$ gives a point cloud data set consisting of $w$ points in $\mathbb{R}^d$, namely $X_n = (x(t_n), x(t_{n+1}), \ldots, x(t_{n+w-1}))$.
- TDA is then applied on top of the time-ordered sequence of point clouds to study the time-varying topological properties of the multidimensional time series, from window to window.
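The windowing step can be sketched as follows. The function name `sliding_windows` and the input layout (a list of $d$ equal-length lists, one per series) are assumptions made for the example.

```python
def sliding_windows(series, w):
    """Turn d aligned time series into a sequence of point clouds.

    series: list of d lists of equal length N (one list per time series).
    Returns N - w + 1 point clouds; the n-th one holds the w points
    x(t_n), ..., x(t_{n+w-1}), each a d-tuple.
    """
    points = list(zip(*series))          # points[n] = (x^1_n, ..., x^d_n)
    return [points[start:start + w]
            for start in range(len(points) - w + 1)]


# Two toy series (d = 2) of length 5, window size w = 3.
clouds = sliding_windows([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]], w=3)
```

Consecutive point clouds overlap in all but one point, so the resulting topological summaries change gradually from window to window.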
TDA on Time Series
- For each point cloud, we compute the Vietoris-Rips filtration, the corresponding persistence landscape, and its $L^p$-norms for $p = 1, 2$.
- We plot the $L^p$-norms and observe how they behave around market crashes. The general observation is that the norms are sensitive to transitions in the state of a system from regular to "heated".
- Using the R package "TDA", all this can be done in a few lines of code!
Empirical Analysis of Stock Market Indices
I set out to replicate the results in the paper [1].
- Downloaded adjusted closing prices for four time series: S&P 500, NASDAQ, DJIA, Russell 2000. Calculated the log-returns.
- Sliding window length $w = 100$ days.
- Applied TDA and plotted the $L^1$ and $L^2$ norms.
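The log-return of a price series is $r_t = \ln(P_t / P_{t-1})$. A minimal sketch of that preprocessing step, using made-up prices rather than the actual index data:

```python
import math


def log_returns(prices):
    """Return the log-returns ln(P_t / P_{t-1}) of a price series."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Hypothetical adjusted closing prices, for illustration only.
returns = log_returns([100.0, 110.0, 99.0])
```

A series of $N$ prices yields $N - 1$ returns, and log-returns over consecutive periods add up to the log-return over the whole span.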
TDA on Cryptocurrencies
- The cryptocurrency market is extremely volatile, with frequent crashes. Most cryptocurrencies seem to be highly correlated. Perfect candidate for TDA!
- Bitcoin lost nearly 70% between December 2017 and February 2018!
- What do the $L^p$ norms show during this period?
TDA on Cryptocurrencies
The point cloud now consists of four cryptocurrencies: Bitcoin, Ethereum, Ripple and Bitcoin Cash.
High-Frequency TDA
- High-frequency data are time series of stock prices sampled at intervals of a few minutes.
- Time series analysis of such data is difficult, and the data usually bear little resemblance to lower-frequency data.
- Does TDA tell us anything for high-frequency data?
High-Frequency TDA
The point cloud data consists of 10-minute stock prices of five companies listed on the Bombay Stock Exchange: CIPLA, TATA STEEL, RELIANCE, INDIGO, SPICEJET.
High-Frequency TDA: Results
The sliding window was taken to be $w = 5$ days. The chart shows results for SPICEJET and INDIGO.
No conclusive findings!
Summary
- TDA for time series shows promise; however, robust justification for the findings is needed to rule out correlation-causation fallacies.
- Further work:
  - Does volatility in the markets cause topological patterns in the returns data? Do known models show this?
  - Can these empirical findings be explained by theory?