Applications of Persistence
August 13, 2018
Lori Ziegelmeier
Tutorial on Multiparameter Persistence, Computation, and Applications
Institute for Mathematics and its Applications
INTRODUCTION
MOTIVATION:
Overview:
Topological data analysis has been applied to a wide variety of applications. This talk will sample a few:
# Sensor networks
# Signal analysis
# Natural imagery
# Country development statistics
# Simulations of partial differential equations
# Biological aggregations
DATA COMES IN MANY FORMS
Data Type | Topological Methods
Point Cloud (PC) | Vietoris-Rips complex, α-complex, Čech complex, witness complex
Function | Sub- or super-level set persistence
Image | Cubical complex; treat as surface (sub- or super-level set persistence); treat pixels as points in a PC; embed as points in a Grassmannian
Time Series | Time-delay embedding into a PC
Family of Time Series | Filtration tied to pairwise correlations
Time-Varying Data | Zig-zag persistence, CROCKER plot
WHAT CAN WE DO ONCE WE’VE RUN PERSISTENCE?
# Verify intuition of topological space was correct
# Use union find algorithm to find elements of clusters
# Look at generators of homology classes
# Compute confidence intervals to determine noise/signal
# Compute distances between persistence diagrams
# Transform persistence diagram into new space for machine learning tasks
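The union-find item above can be sketched as follows. This is a minimal illustration, assuming points in Euclidean space and a fixed scale `eps`; the function name `clusters_at_scale` and the toy points are hypothetical and do not come from the talk.

```python
import math

def clusters_at_scale(points, eps):
    """Group point indices into connected components of the graph that
    links two points whenever their Euclidean distance is at most eps."""
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)  # merge the two components

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy data (made up for illustration): two tight pairs and one isolated point.
toy_points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0)]
clusters = clusters_at_scale(toy_points, eps=0.2)
```

The components returned at scale `eps` are the classes counted by the zeroth Betti number at that scale, so the same routine recovers which elements belong to each cluster.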
CLASSIC APPLICATIONS OF PERSISTENT HOMOLOGY
SENSOR NETWORKS
(de Silva, Ghrist 2007) Coverage in Sensor Networks via Persistent Homology
Determine when static nodes with minimal and localized sensing capabilities completely cover a bounded domain of unknown topological type.
SIGNAL ANALYSIS
(Perea, Harer 2013) Sliding Windows and Persistence: An Application of Topological Methods to Signal Analysis
A sliding window point cloud for signal f:
$$SW_{M,\tau} f(t) = \begin{bmatrix} f(t) \\ f(t + \tau) \\ \vdots \\ f(t + M\tau) \end{bmatrix}$$
creates time-delayed embedding $\{SW_{M,\tau} f(t_1), \ldots, SW_{M,\tau} f(t_S)\}$, a finite point cloud in $\mathbb{R}^{M+1}$
1-dimensional persistence provides measure of periodicity in signal
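A minimal sketch of the sliding window embedding defined above, assuming the signal is available as a Python callable. The function names, the sample times, and the choice of M and τ are illustrative assumptions; the persistence computation on the resulting cloud is not shown.

```python
import math

def sliding_window(f, t, M, tau):
    """Return the vector (f(t), f(t + tau), ..., f(t + M*tau)) in R^(M+1)."""
    return [f(t + k * tau) for k in range(M + 1)]

def sliding_window_cloud(f, times, M, tau):
    """Apply the sliding window at each sample time to get a finite point cloud."""
    return [sliding_window(f, t, M, tau) for t in times]

# Illustrative choice: a cosine sampled at 50 times, embedded in R^3.
times = [0.1 * s for s in range(50)]
cloud = sliding_window_cloud(math.cos, times, M=2, tau=math.pi / 2)
```

For a periodic signal such as this one the cloud traces out a closed loop, which is the feature that 1-dimensional persistence is meant to detect.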
NATURAL IMAGERY
(Carlsson et al. 2008) On the Local Behavior of Spaces of Natural Images
Images in next few slides from presentation by Henry Adams
NATURAL IMAGERY PERSISTENCE
Densest patches according to a global estimate
Interpretation: nature prefers linearity
NATURAL IMAGERY PERSISTENCE
Densest patches according to an intermediate estimate
Interpretation: nature prefers horizontal and vertical directions
NATURAL IMAGERY PERSISTENCE
Densest patches according to a local estimate
ANALYZING GLOBAL DEVELOPMENT
GAPMINDER PROJECT
The Gapminder project set out to use statistics to dispel simplistic notions about global development.
Goal:
Use TDA to understand global development by looking at the topological structure of health and wealth statistics in an unbiased approach.
(Banman, Z. 2018) Mind the Gap: A Study in Global Development through Persistent Homology
HEALTH AND WEALTH DATA
Development indicators:
# 194 countries with most recently available data (usually from 2015, 2016, with others as early as 2005)
Indicator | Max | Min | Median | Mean | Stand Dev | Scaled Mean
GDP | 148374 | 599 | 11903 | 18972 | 21523 | -0.476
Life Exp. | 84.8 | 48.86 | 74.5 | 72.56 | 7.74 | 0.296
Infant Mor. | 96 | 1.5 | 23.89 | 15 | 21.9 | 0.528
GNI | 87030 | 350 | 8360 | 13596 | 15399 | -0.431
Preprocessing:
# Modulate value of GDP outliers to two standard deviations from mean
# Re-scale each indicator to [−1, 1]
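The two preprocessing steps above might look like the following sketch. It assumes that "modulate" means clipping to the mean plus or minus two standard deviations, that the population standard deviation is used, and that rescaling is a linear min-max map; none of these details are confirmed by the slides, and the sample values are made up.

```python
import statistics

def clip_outliers(values, n_std=2.0):
    """Clip values to [mean - n_std*sd, mean + n_std*sd] (assumed reading of 'modulate')."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumption)
    lo, hi = mu - n_std * sd, mu + n_std * sd
    return [min(max(v, lo), hi) for v in values]

def rescale(values, new_min=-1.0, new_max=1.0):
    """Linearly map the observed range of values onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

# Made-up indicator values with one extreme outlier.
raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1000.0]
clipped = clip_outliers(raw)
scaled = rescale(clipped)
```

Clipping before rescaling keeps a single extreme value from compressing every other country into a narrow band near −1.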
TWO ANALYSIS METHODS
1. Analyze clusters of ‘health’ and ‘wealth’
◦ Define distance between countries with indicators I as
$$d_I : \mathbb{R}^{|I|} \times \mathbb{R}^{|I|} \to \mathbb{R}, \qquad d_I(x, y) = \sqrt{\sum_{i \in I} (x_i - y_i)^2}$$
2. Add geographic structure to the data by constructing a weighted graph over the countries and their borders.
◦ Define adjacency matrix A
$$A_{i,j} = \begin{cases} 1 & \text{if countries } i, j \text{ share a border,} \\ 0 & \text{if countries } i, j \text{ do not share a border} \end{cases}$$
◦ Define distance between countries as
$$D_{i,j} = \begin{cases} d(i, j) & \text{if } A_{i,j} = 1, \\ \infty & \text{if } A_{i,j} = 0 \end{cases}$$
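Both distances can be sketched in a few lines. The scaled (GDP, LE) values below are the ones quoted in the South America cycle table of this talk; the function names and the decision to set the diagonal to zero are assumptions made for illustration.

```python
import math

def indicator_distance(x, y):
    """Euclidean distance d_I between two vectors of scaled indicators."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def border_distance(indicators, borders):
    """D[a, b] = d_I(a, b) if a and b share a border, infinity otherwise.
    The diagonal is set to 0 (an assumption; the slide does not specify it)."""
    D = {}
    for a in indicators:
        for b in indicators:
            if a == b:
                D[a, b] = 0.0
            elif frozenset((a, b)) in borders:
                D[a, b] = indicator_distance(indicators[a], indicators[b])
            else:
                D[a, b] = math.inf
    return D

# Scaled (GDP, LE) pairs from the South America cycle table.
indicators = {
    "Chile": (-0.29, 0.71), "Peru": (-0.63, 0.72), "Bolivia": (-0.81, 0.37),
    "Brazil": (-0.52, 0.43), "Argentina": (-0.45, 0.55),
}
borders = {frozenset(p) for p in [
    ("Chile", "Peru"), ("Chile", "Bolivia"), ("Chile", "Argentina"),
    ("Peru", "Bolivia"), ("Peru", "Brazil"), ("Bolivia", "Brazil"),
    ("Bolivia", "Argentina"), ("Brazil", "Argentina"),
]}
D = border_distance(indicators, borders)
```

Setting non-adjacent pairs to infinity means an edge can only enter the filtration between neighboring countries, so cycles in the resulting complex follow geographic borders.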
CLUSTERING OF DEVELOPMENT GROUPS
(a) ε = 0.08 with 54, 52, 14, 10, 8 countries among 41 total clusters
(b) ε = 0.10 with 132, 18, 10, 6, 2 countries among 25 total clusters
(c) ε = 0.12 with 164, 6, 2, 2, 2 countries among 19 total clusters
(d) ε = 0.14 with 170, 8, 3, 2, 2 countries among 13 total clusters
Figure: Five largest clusters at varying scale parameters
CLUSTERING OF DEVELOPMENT GROUPS
Table: Countries (138) comprising the 6 largest clusters at ε = 0.08 and means of scaled indicators, GDP/capita (GDP) and life expectancy (LE).
Countries (ISO2) | GDP | LE
Bangladesh, Kyrgyzstan, Cambodia, Mauritania, Micronesia Fed. Sts., Nepal, Syria, Gambia, Comoros, Myanmar, Sudan, Sao Tome and Principe, India, Laos, Marshall Islands, Guyana, Pakistan, Ghana, Nigeria, Yemen Rep., Djibouti, Kenya, Senegal, Tanzania, Vanuatu, Haiti, Liberia, Madagascar, Solomon Islands, Ethiopia, Rwanda, Benin, Kiribati, Burkina Faso, Burundi, Congo Dem. Rep., Niger, Papua New Guinea, Togo, Uganda, Zimbabwe, Eritrea, Mali, Malawi, Guinea, Cote d’Ivoire, Cameroon, Sierra Leone, Mozambique, Chad, Zambia, South Sudan, Guinea-Bissau, Fiji | -0.93 | -0.15
Albania, Bosnia and Herzegovina, Colombia, Jordan, Sri Lanka, Tunisia, Peru, Macedonia FYR, Barbados, China, Dominican Rep., Algeria, Ecuador, Montenegro, Serbia, Thailand, Bulgaria, Brazil, Iran, Venezuela, Mauritius, Mexico, Romania, Argentina, Saint Lucia, Armenia, Jamaica, Paraguay, El Salvador, Morocco, Vietnam, Bolivia, Bhutan, Cape Verde, Georgia, Guatemala, Honduras, Moldova, Samoa, Belize, Ukraine, Indonesia, Philippines, Saint Vincent and the Grenadines, Egypt, Grenada, Tonga, Uzbekistan, Tajikistan, Korea Dem. Rep., Timor-Leste, Palestine | -0.69 | 0.44
Antigua and Barbuda, Croatia, Uruguay, Cuba, Panama, Turkey, Lebanon | -0.37 | 0.63
Estonia, Poland, Slovak Republic, Hungary, Latvia, Malaysia, Lithuania, Seychelles | -0.19 | 0.53
Cyprus, Malta, Slovenia, Israel, Spain, Italy, Korea Rep., New Zealand, Portugal, Greece | -0.02 | 0.83
Austria, Australia, Canada, Germany, Denmark, Netherlands, Sweden, Belgium, Taiwan, Finland, France, United Kingdom, Bahrain, Ireland | 0.38 | 0.80
GEOGRAPHIC DEVELOPMENT PATTERNS (WEIGHTED NETWORK)
GEOGRAPHIC DEVELOPMENT PATTERNS
Table: South America cycle with interval [0.34, 0.62)
Country | GDP | LE
Chile | -0.29 | 0.71
Peru | -0.63 | 0.72
Bolivia | -0.81 | 0.37
Brazil | -0.52 | 0.43
Argentina | -0.45 | 0.55
Table: North Africa cycle with interval [0.85, 0.97)
Country | GDP | LE
Libya | -0.46 | 0.36
Niger | -0.99 | -0.31
Mali | -0.96 | -0.36
Mauritania | -0.89 | 0.17
Algeria | -0.58 | 0.54
ENCODING A PERSISTENCE DIAGRAM IN A NEW SPACE
PERSISTENCE DIAGRAMS AS A METRIC SPACE
Persistence diagrams (PDs) live in a metric space.
Figure: persistence diagram with axes birth and death.
Definition
The p-Wasserstein distance between two PDs B and B′ is given by
$$W_p(B, B') = \inf_{\gamma : B \to B'} \Big( \sum_{u \in B} \| u - \gamma(u) \|_\infty^p \Big)^{1/p},$$
where 1 ≤ p < ∞ and γ ranges over bijections between B and B′.
PERSISTENCE DIAGRAMS AS A METRIC SPACE
The space of persistence diagrams (PDs) can be endowed with a metric.
Figure: persistence diagram with axes birth and death.
Definition
The bottleneck distance between two PDs B and B′ is given by
$$W_\infty(B, B') = \inf_{\gamma : B \to B'} \sup_{u \in B} \| u - \gamma(u) \|_\infty,$$
where γ ranges over bijections between B and B′.
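The two definitions can be checked on tiny diagrams with a brute-force sketch. It assumes the usual convention, not stated on the slides, that each diagram is padded with points on the diagonal so that bijections always exist; the function names are hypothetical, and the search over all permutations is only practical for a handful of points.

```python
import math
from itertools import permutations

def linf(u, v):
    """L-infinity distance between two (birth, death) points."""
    return max(abs(u[0] - v[0]), abs(u[1] - v[1]))

def diag_dist(u):
    """L-infinity distance from (birth, death) to the diagonal birth = death."""
    return abs(u[1] - u[0]) / 2.0

def wasserstein(B1, B2, p=2.0):
    """Brute-force W_p between two small diagrams; p=math.inf gives bottleneck."""
    n, m = len(B1), len(B2)
    size = n + m  # each diagram is padded with diagonal slots for the other's points

    def cost(i, j):
        if i < n and j < m:
            return linf(B1[i], B2[j])   # point matched to point
        if i < n:
            return diag_dist(B1[i])     # point of B1 matched to the diagonal
        if j < m:
            return diag_dist(B2[j])     # point of B2 matched to the diagonal
        return 0.0                      # diagonal matched to diagonal

    best = math.inf
    for perm in permutations(range(size)):
        costs = [cost(i, perm[i]) for i in range(size)]
        if math.isinf(p):
            total = max(costs, default=0.0)
        else:
            total = sum(c ** p for c in costs) ** (1.0 / p)
        best = min(best, total)
    return best

d2 = wasserstein([(0.0, 2.0)], [(0.0, 3.0)], p=2.0)
d_inf = wasserstein([(0.0, 2.0)], [(0.0, 3.0)], p=math.inf)
```

For diagrams of realistic size the infimum would be computed with an assignment solver rather than by enumerating permutations.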
TRANSFORMATIONS OF PERSISTENCE DIAGRAMS
# Rouse et al. (2015) create a vector representation by superimposing a grid over a PD and counting number of points in each bin.
# Carriere et al. (2015) develop a stable vector representation by rearranging the entries of the distance matrix between points in a PD.
# Reininghaus et al. (2015) produce a stable surface from a PD by taking sum of a positive Gaussian centered on each PD point together with a negative Gaussian centered on its reflection below the diagonal.
# Bubenik (2015) develops persistence landscape (PL), a stable functional representation of a PD that lies in a Banach space.
PERSISTENCE IMAGE
Persistence image, a transformation of a persistence diagram that lies in $\mathbb{R}^n$
Figure: pipeline of panels: data, diagram B (birth vs. death), diagram T(B) (birth vs. persistence), surface, image.
(Adams et al. 2017) Persistence Images: A Stable Vector Representation of Persistent Homology
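A rough sketch of the pipeline in the figure, under stated assumptions: points are mapped to (birth, persistence), each is weighted linearly by its persistence, a Gaussian is centered on it, and the surface is sampled at pixel centers rather than integrated over each pixel. The weighting, the sampling shortcut, the default bounds, and the function name are all illustrative choices, not the authors' implementation.

```python
import math

def persistence_image(diagram, resolution=10, sigma2=0.01,
                      bounds=(0.0, 1.0, 0.0, 1.0)):
    """Return a flattened resolution x resolution image (row-major, rows = persistence)."""
    bmin, bmax, pmin, pmax = bounds
    # T(B): map each (birth, death) point to (birth, persistence).
    points = [(b, d - b) for b, d in diagram]
    db = (bmax - bmin) / resolution
    dp = (pmax - pmin) / resolution
    image = []
    for r in range(resolution):
        y = pmin + (r + 0.5) * dp          # pixel-center persistence
        for c in range(resolution):
            x = bmin + (c + 0.5) * db      # pixel-center birth
            value = 0.0
            for b, pers in points:
                gauss = math.exp(-((x - b) ** 2 + (y - pers) ** 2) / (2 * sigma2))
                gauss /= 2 * math.pi * sigma2
                value += pers * gauss      # linear weight: zero on the diagonal
            image.append(value * db * dp)  # center value times pixel area
    return image

img = persistence_image([(0.25, 0.8)])
```

Because the weight vanishes at zero persistence, points near the diagonal contribute almost nothing, which is what makes the resulting vector insensitive to small noisy features.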
ANISOTROPIC KURAMOTO-SIVASHINSKY (AKS) EQUATION
# Partial differential equation for a function u(x, y, t) of spatial variables x, y, and time t.
# Kuramoto-Sivashinsky equation derived in problems involving pattern formation such as surface nanopatterning by ion-beam erosion, epitaxial growth, and solidification from a melt.
# Anisotropic Kuramoto-Sivashinsky (aKS) Equation is given by
$$\frac{\partial}{\partial t} u = -\nabla^2 u - \nabla^2 \nabla^2 u + r \left( \frac{\partial}{\partial x} u \right)^2 + \left( \frac{\partial}{\partial y} u \right)^2,$$
where $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$, and the real parameter r controls the degree of anisotropy.
# For a fixed time t∗, u(x, y, t∗) is a patterned surface (periodic in both x and y) defined over the (x, y)-plane.
PARAMETER OF THE AKS EQUATION
Figure: Plots of u(x, y, ·) from simulations of the aKS equation. Columns represent parameters r = 1, 1.25, 1.5, 1.75 and 2. Rows represent time: t = 3 (top) and t = 5 (bottom).
Figure: Surfaces u(x, y, 3) for r = 1.75 or r = 2. Can you group the images by eye?
Answer: (from left) r = 1.75, 2, 1.75, 2, 2.
AKS CLASSIFICATION METHODS
Goal:
Classify 150 trials of the aKS equation (30 for each parameter) by parameter (5 values) using snapshots of the surfaces u(x, y, ·) as they evolve in time (5 values).
Methods of classification:
1. Surfaces viewed as points in $\mathbb{R}^{266144}$. Reduce resolution to 10 × 10 by coarsening the discretization of the spatial domain. Classify with Subspace Discriminant Ensemble.
2. Parameter r influences mean and amplitude of pattern. Use a normal distribution-based classifier built on the variances of the surface heights.
3. Sublevel set filtration PD. Generate PIs with resolution 10 × 10 and variance 0.01. Classify H0, H1 and concatenated PIs using Subspace Discriminant Ensemble.
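Method 2 above could be sketched as a one-feature maximum-likelihood classifier: fit a normal distribution to the surface-height variances of each parameter class, then assign a new surface to the class with the highest density. The exact classifier used in the study is not specified on the slides, so this is one plausible reading; the function names and the training numbers are made up.

```python
import math
import statistics

def surface_variance(surface):
    """Variance of all heights in a surface given as a list of rows."""
    return statistics.pvariance([h for row in surface for h in row])

def fit_normals(train):
    """train maps a class label r to a list of variances; return (mean, sd) per class."""
    return {r: (statistics.mean(v), statistics.stdev(v)) for r, v in train.items()}

def log_density(x, mu, sd):
    """Log of the normal density with mean mu and standard deviation sd at x."""
    return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd) - 0.5 * math.log(2 * math.pi)

def classify(x, params):
    """Return the class label whose fitted normal gives x the highest density."""
    return max(params, key=lambda r: log_density(x, *params[r]))

# Made-up training variances for two parameter values.
train = {1.0: [20.0, 21.0, 22.0], 2.0: [80.0, 85.0, 90.0]}
params = fit_normals(train)
label = classify(23.0, params)
```

A classifier of this kind only works when the class variances separate, which depends on how far the pattern has evolved in time.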
Figure: panels (a) and (b) plot density against variance for r = 1.00, 1.25, 1.50, 1.75 and 2.00.
AKS EQUATION CLASSIFICATION ACCURACY
Classification Approach | Time t=3 | Time t=5 | Time t=10
Subspace Discriminant Ensemble, Resized Surfaces | 26.0% | 19.3% | 19.3%
Variance Normal Distribution Classifier | 20.74% | 75.2% | 77.62%
Subspace Discriminant Ensemble, H0 PIs | 58.3% | 96.0% | 94.7%
Subspace Discriminant Ensemble, H1 PIs | 67.7% | 87.3% | 93.3%
Subspace Discriminant Ensemble, H0 and H1 PIs | 72.7% | 95.3% | 97.3%
BIOLOGICAL AGGREGATIONS: TOPOLOGY EVOLVING IN TIME
BIOLOGICAL AGGREGATIONS
In many natural systems, particles, organisms, or agents interact locally according to rules that produce aggregate behavior.
CLASSIC WAY TO ANALYZE BIOLOGICAL AGGREGATIONS
Alignment Order Parameter:
$$\varphi(t) = \frac{1}{N v_0} \left| \sum_{i=1}^{N} \vec{v}_i(t) \right|$$
Figure: panel (B), with both axes running from 0 to 7.
Goal:
Use topology to analyze the collective behavior of interacting agents’ positions and velocities as time evolves.
COMPUTE PERSISTENT HOMOLOGY
EVOLVE IN TIME
Compute the kth Betti number b_k(ε, t) at each proximity parameter ε and time t.
CROCKER plot
Contour Realization Of Computed K-dimensional hole Evolution in the Rips complex (CROCKER)
(Topaz, Z., Halverson 2015) Topological Data Analysis of Biological Aggregation Models
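For k = 0 the matrix behind a CROCKER plot can be sketched without a homology library, since b_0 of a Rips complex is the number of connected components of its underlying graph. The sketch assumes plain Euclidean distance and the convention that two points are joined when their distance is at most ε; the function names and the toy frames are illustrative, and higher Betti numbers would need a persistent homology package.

```python
import math

def betti0(points, eps):
    """Count connected components of the graph joining points within distance eps."""
    n = len(points)
    seen = [False] * n
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        components += 1
        seen[start] = True
        stack = [start]
        while stack:  # depth-first flood fill of one component
            i = stack.pop()
            for j in range(n):
                if not seen[j] and math.dist(points[i], points[j]) <= eps:
                    seen[j] = True
                    stack.append(j)
    return components

def crocker_b0(frames, eps_values):
    """Rows index the proximity parameter, columns index time frames."""
    return [[betti0(points, eps) for points in frames] for eps in eps_values]

# Two made-up time frames of three points each.
frames = [
    [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)],
    [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)],
]
grid = crocker_b0(frames, [0.1, 1.0, 10.0])
```

Drawing contours of this matrix over the (t, ε) plane gives the plot; flattening it gives a feature vector for each simulation.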
VICSEK MODEL
# Highly cited dynamical system in discrete time and continuous space.
# Describes motion of interacting point particles in a square with periodic boundary conditions.
# Model written as:
$$\theta_i(t + \Delta t) = \frac{1}{N} \Big( \sum_{|\vec{x}_i - \vec{x}_j| \le R} \theta_j(t) \Big) + U(-\eta/2, \eta/2)$$
$$\vec{v}_i(t + \Delta t) = v_0 \big( \cos \theta_i(t + \Delta t), \sin \theta_i(t + \Delta t) \big)$$
$$\vec{x}_i(t + \Delta t) = \vec{x}_i(t) + \vec{v}_i(t + \Delta t) \, \Delta t$$
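One update step of the model above can be sketched as follows. The sketch assumes that N is the number of particles within radius R of particle i (including i itself), that distances use the shortest periodic separation in a square of side `box`, and that headings are averaged arithmetically exactly as written; the function and parameter names are hypothetical.

```python
import math
import random

def vicsek_step(pos, theta, box, R, v0, eta, dt, rng):
    """Advance positions and headings by one time step dt."""
    n = len(pos)
    new_theta = []
    for i in range(n):
        neighbors = []
        for j in range(n):
            dx = abs(pos[i][0] - pos[j][0])
            dy = abs(pos[i][1] - pos[j][1])
            dx = min(dx, box - dx)  # shortest separation across the periodic boundary
            dy = min(dy, box - dy)
            if math.hypot(dx, dy) <= R:
                neighbors.append(theta[j])
        mean_heading = sum(neighbors) / len(neighbors)  # i is always its own neighbor
        new_theta.append(mean_heading + rng.uniform(-eta / 2, eta / 2))
    new_pos = [
        ((x + v0 * math.cos(th) * dt) % box, (y + v0 * math.sin(th) * dt) % box)
        for (x, y), th in zip(pos, new_theta)
    ]
    return new_pos, new_theta

rng = random.Random(0)  # seeded for reproducibility
pos, theta = vicsek_step([(1.0, 1.0), (5.0, 5.0)], [0.0, math.pi / 2],
                         box=10.0, R=1.0, v0=1.0, eta=0.0, dt=0.5, rng=rng)
```

With η = 0 the update is deterministic, which makes it easy to check a step by hand before adding noise.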
TOPOLOGY OF INITIAL CONDITION
Three-torus $T^3$: b = (1, 3, 3, 1, 0, …)
Figure: (A) panel with both axes running from 0 to 25; (B), (C) plots against proximity parameter ε; (D) starting ε versus ending ε (0 to 2), with legend Betti number 0, 1.
VICSEK SIMULATION A ANALYSIS
Figure: (A) order parameter ϕ (0 to 1) versus simulation time t (0 to 3000); (B), (C) contour plots of proximity parameter ε (0 to 2) versus simulation time t, with levels 1 to 5. Annotated regions: b0 = 1, b0 = 2, b0 ≥ 5; b1 = 0, b1 = 1. Inset (A): panel with both axes running from 0 to 25.
VICSEK SIMULATION B ANALYSIS
Figure: (A) order parameter ϕ (0 to 1) versus simulation time t (0 to 600); (B), (C) contour plots of proximity parameter ε (0 to 2) versus simulation time t, with levels 1 to 5. Annotated regions: b0 = 1, b0 ≥ 5; b1 = 0, b1 = 2, b1 = 3, b1 ≥ 5. Inset (B): panel with both axes running from 0 to 7.
VICSEK SIMULATION C ANALYSIS
Figure: (A) order parameter ϕ (0 to 1) versus simulation time t (0 to 300); (B), (C) contour plots of proximity parameter ε (0 to 2) versus simulation time t, with levels 1 to 5. Annotated regions: b0 = 1, b0 = 2, b0 ≥ 5; b1 = 2, b1 ≥ 5. Inset (C): panel with both axes running from 0 to 5.
IDENTIFYING PARAMETERS WITH TDA AND MACHINE LEARNING
Goal:
Given simulated data from the Vicsek model, can we use machine learning algorithms to recover the unknown underlying noise parameter?
# Generate 100 simulations of different parameter choices
# Compute alignment order parameter and H0 and H1 CROCKER plots
# Compare pairwise (Euclidean) distances between simulations
# Cluster with K-medoids
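The last two steps above can be sketched with a small K-medoids routine on a pairwise distance matrix. This is a simple alternating variant with a naive initialization (the first k items), chosen for brevity; the clustering implementation used in the study is not described on the slides, and the feature vectors below are made up stand-ins for flattened CROCKER plots.

```python
import math

def pairwise_distances(vectors):
    """Euclidean distance matrix between feature vectors."""
    return [[math.dist(u, v) for v in vectors] for u in vectors]

def k_medoids(D, k, max_iter=100):
    """Alternate between assigning items to the nearest medoid and
    re-picking each cluster's medoid until the medoids stop changing."""
    n = len(D)
    medoids = list(range(k))  # naive initialization (assumption)
    labels = [0] * n
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: D[i][medoids[c]]) for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members, key=lambda m: sum(D[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids

# Made-up feature vectors forming two well-separated groups.
vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, medoids = k_medoids(pairwise_distances(vectors), k=2)
```

Because K-medoids needs only the distance matrix, the same routine applies whether the rows come from order-parameter time series or from CROCKER plots.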
EXPERIMENT 1
Noise parameters: η = 0.01, 0.1, 1
Pairwise Distance Matrices:
Figure: panels Alignment Order Parameter, H0 CROCKER, H1 CROCKER, H0&1 CROCKER.
EXPERIMENT 2
Noise parameters: η = 0.01, 0.02, 0.03, 0.05, 0.1, 0.19, 0.2, 0.21, 0.3, 0.5, 1, 1.5, 1.9, 1.99, 2
Pairwise Distance Matrices:
Figure: panels Alignment Order Parameter, H0 CROCKER, H1 CROCKER, H0&1 CROCKER.
EXPERIMENT 3
Noise parameters: η = 0.01, 0.5, 1, 1.5, 2
Pairwise Distance Matrices:
Figure: panels Alignment Order Parameter, H0 CROCKER, H1 CROCKER, H0&1 CROCKER.
K -MEDOIDS CLUSTERING RESULTS
Measure | Exp 1 H0&1 | Exp 1 Align | Exp 2 H0&1 | Exp 2 Align | Exp 3 H0&1 | Exp 3 Align
Accuracy | 77.0% | 63.3% | 23.3% | 14.3% | 99.6% | 63.0%
Silhouette Width | 0.61 | 0.45 | 0.43 | 0.3 | 0.77 | 0.46
CONCLUSION
A SAMPLING OF OTHER APPLICATIONS
# (Bendich et al. 2016) Persistent Homology of Brain Artery Trees
# (Giusti et al. 2015) Clique Topology Reveals Intrinsic Geometric Structure in Neural Correlations
# (Zhu 2013) Persistent Homology: An Introduction and a New Text Representation for Natural Language Processing
# (Freedman and Chen 2009) Algebraic Topology for Computer Vision
# (Singh et al. 2008) Topological Analysis of Population Activity in Visual Cortex
# (Chepushtanova et al. 2016) Persistent Homology on Grassmann Manifolds for Analysis of Hyperspectral Movies
# (Zeppelzauer et al. 2016) Topological Descriptors for 3d Surface Analysis
# (Dabaghian et al. 2012) A Topological Paradigm for Hippocampal Spatial Map Formation Using Persistent Homology
# (Stolz et al. 2016) The Topological “Shape” of Brexit
# (Betancourt et al. 2018) Pseudo-multidimensional Persistence and Its Applications
# (Lee et al. 2017) Integrated Multimodal Network Approach to PET and MRI Based on Multidimensional Persistent Homology
CONCLUSION
# Topology is a useful way to analyze data.
# Topology can reveal structural information about data that cannot be seen by other measures.
# Combining persistent homology with machine learning can aid in classification results.
THANK YOU!
Questions?
Country Development
Andrew Banman
Persistence Images
Henry Adams, Sofya Chepushtanova, Tegan Emerson, Eric Hanson, Michael Kirby, Francis Motta, Rachel Neville, Chris Peterson, and Patrick Shipman
CROCKER Plots and Vicsek Model
Henry Adams, Tom Halverson, Chad Topaz, and Lu Xian
Lori Ziegelmeier
Department of Mathematics, Statistics, and Computer Science