Allison Gilmore, Data Scientist, Ayasdi at MLconf ATL - 9/18/15

Topological Learning September 18, 2015 Allison Gilmore, Ph.D.

Transcript of Allison Gilmore, Data Scientist, Ayasdi at MLconf ATL - 9/18/15



Company Confidential & Proprietary

Why Topology?

Data has shape.
Shape has meaning.
You already know this.


Shape Encodes Structure and Meaning

Line, Clusters, Flares, Loop

If you know the shape, you don't need to query: everything significant is represented in the shape.

General ML algorithms provide access to only a handful of shapes.

Often, data exhibits multiple shapes simultaneously.

Complex data requires methods that can represent all shapes and combinations of shapes: topological data analysis (TDA).

Complex Data Has ALL These Shapes

Data Comes With the Wrong Coordinates

Topological Networks Capture Shape

Color by a real-valued function

Bin in the image of that function

Cluster within bins, in the original space

Connect clusters that share data points
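The four steps above resemble what the TDA literature calls the Mapper construction. Below is a minimal pure-Python sketch of those steps, not Ayasdi's implementation. The function names, the overlapping-interval cover, the distance threshold `eps`, and the single-linkage clustering are all assumptions chosen for illustration; the slides do not say which clustering method or bin layout is used.

```python
from itertools import combinations
from math import cos, dist, pi, sin


def cover_intervals(lo, hi, n_bins, overlap):
    """Split [lo, hi] into n_bins equal intervals, each widened on both
    sides by `overlap` times the bin width so neighbouring bins overlap."""
    width = (hi - lo) / n_bins
    pad = width * overlap
    return [(lo + i * width - pad, lo + (i + 1) * width + pad)
            for i in range(n_bins)]


def cluster_indices(points, indices, eps):
    """Single-linkage clustering in the original space: connected
    components of the graph joining points at distance <= eps."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            near = {j for j in remaining
                    if dist(points[current], points[j]) <= eps}
            remaining -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper(points, lens, n_bins, overlap, eps):
    """Return (nodes, edges): nodes are sets of point indices, edges are
    pairs of node positions whose sets share at least one point."""
    values = [lens(p) for p in points]            # real-valued function
    nodes = []
    for a, b in cover_intervals(min(values), max(values), n_bins, overlap):
        in_bin = [i for i, v in enumerate(values) if a <= v <= b]  # bin
        nodes.extend(cluster_indices(points, in_bin, eps))         # cluster
    edges = {(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]}                               # connect
    return nodes, edges


# Toy data: 40 points on a circle, with the x coordinate as the lens.
points = [(cos(2 * pi * k / 40), sin(2 * pi * k / 40)) for k in range(40)]
nodes, edges = mapper(points, lens=lambda p: p[0],
                      n_bins=4, overlap=0.25, eps=0.3)
print(len(nodes), "nodes,", len(edges), "edges")  # 6 nodes, 6 edges
```

On this toy circle the resulting network has six nodes joined in a single cycle, so the loop in the data survives as a loop in the network. The middle bins each split into a top and a bottom cluster because clustering happens in the original space rather than in the image of the lens.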

Capturing Complex Shapes

Shape shows underlying properties of data that yield insights and meaning.

Nodes are groups of similar objects
Edges connect similar nodes
Colors let you see values of interest
Node position on the screen does not matter
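One plausible reading of "colors let you see values of interest" is that each node is colored by a summary of some value over the objects it contains. The sketch below assumes the summary is the mean; `node_colors`, `nodes`, and `values` are hypothetical names, and the slides do not state which summary is actually used.

```python
from statistics import mean


def node_colors(nodes, values):
    """Average a per-object value over the members of each node."""
    return [mean(values[i] for i in members) for members in nodes]


# Three overlapping nodes over five objects; object 1 and object 3 are shared.
nodes = [[0, 1], [1, 2, 3], [3, 4]]
values = [0.0, 0.0, 1.0, 1.0, 1.0]   # e.g. a 0/1 outcome per object
colors = node_colors(nodes, values)
print(colors)
```

Because neighbouring nodes share objects, the averaged value changes gradually along the network rather than jumping between unrelated groups.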

Network Orientation


So, objects close together in the network (in the same or nearby nodes) are similar (on whatever features you used to build the network).

Disease State & Model Choice


1982-2015

30 years of market and economic data and 150+ variables

The Shape of a Market

Regimes
Progressions
Cycles

In reality, all these shapes exist in any complex data set. Here's a data set that is being used by a major hedge fund to guide its investment strategy. Applying Ayasdi Machine Intelligence to 30 years of market and macroeconomic data results in this topological summary. Each node in this diagram represents a collection of time periods where equity prices and other market indicators behaved similarly.

From this data, you can discern specific market regimes, progressions, and cycles. Imagine being able to reliably identify the regime we are in today and the most likely next progressions in market behavior, based on analysis of historical trends. This technique reduces and simplifies thousands of man-hours of systematic and quantitative market research.

Recessions Are Similar

All Recessions 1980-Present

If we query for all the recessions included in the set, the months that were in a recession cluster quite nicely (lower right corner of the network). So what other information can we extract from this shape?

But Recoveries Are Different

1982 Recovery
2002 Recovery

The paths to recovery can be different. Let's look at the path out of the 1982 recession (left panel) and out of the 2002 recession (right panel). They took very different paths, which should affirm our understanding that, in 30 years of market cycles, the path taken out of each recession has been quite different. The next step is to see whether we can gain a better understanding of the current state and its implications for the future.

Build Better Models

Identify market regimes that are analogous to the current state.

Profit from precise asset allocation and liquidity forecasting strategies.

This type of advanced discovery enables traders to build precise trading models that fit current market regimes. They can interrogate time periods of interest, see exactly which indicators drive the similarity, and determine which indicators typically followed next. From this they can build models to drive trading strategies.

These same types of analyses can also be used by the CRO within the bank to assess risk and to anticipate periods of increased uncertainty or likely volatility.

Understanding Shape Improves Models

Ground Truth Fraud (color scale: High, Low)

Model Predicted Fraud (color scale: High, Low)

Topology Guides Model Creation

Topology Yields Principled Local Models

Questions?

[email protected]


Corporate Headquarters
4400 Bohannon Drive, Suite #200
Menlo Park, CA 94025
ayasdi.com

PCA Finds 3 Clusters

PCA captures 98.4% of variance.
TDA with PCA lenses shows 4 clusters.
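A "PCA lens" can be read as using each point's coordinate along a principal component as the real-valued function that colors and bins the data. The sketch below computes the first principal component, its share of total variance, and the per-point lens values using power iteration in plain Python. The function name, the toy data, and the choice of power iteration are assumptions for illustration; the 98.4% figure above comes from the presenter's data set and is not reproduced here.

```python
from math import sqrt


def first_principal_component(rows, iterations=200):
    """Return (direction, explained_variance_ratio, scores) for the first
    principal component. Assumes the data has non-zero variance and that
    the all-ones start vector is not orthogonal to the top component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector of the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the top eigenvalue.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
    total_variance = sum(cov[k][k] for k in range(d))
    # Lens values: projection of each centered point onto the component.
    scores = [sum(c[k] * v[k] for k in range(d)) for c in centered]
    return v, eigenvalue / total_variance, scores


# Toy data lying close to the line y = 2x.
rows = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.1), (3.0, 5.9), (4.0, 8.1)]
direction, ratio, scores = first_principal_component(rows)
print(direction, ratio, scores)
```

A high explained-variance ratio says the component summarizes the spread of the data well. It does not by itself say how many groups the data contains, which is why the lens values are then binned and clustered in the original space rather than read off directly.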

TDA Finds 4 Clusters
