Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a...
Studying the Shape of Data Using Topology
Michael Lesnick, Institute for Mathematics and its Applications, USA
INFOTEC, June 17, 2014
Topological Data Analysis (TDA)
TDA is a branch of statistics.
Goal: Apply topology to develop tools for studying qualitative features of data.
Two Data Types
Data type 1: A finite set of points in R^n
[We call such data point cloud data.]
Data type 2: A function f : X → R, where X is any space.
(We also study functions f : X → R^m, m > 1.)
Informally, qualitative features = “coarse-scale, global geometric features.”
Examples of qualitative features of point cloud data (PCD) in 2-D:
• Clusters
• Cycles
• Tendrils/Flares
• “Graph Structure”
Qualitative Features of Functions
• Modes
• “Craters”
In TDA, we seek to develop:
• Formal definitions of such features
• Computational tools for detecting and visualizing such features
• (When data is random) methodology for quantifying the statistical significance of such features.
We focus on tools suitable for high-dimensional PCD.
Why Study Qualitative Features of Data?
Key Premise: Insight into the shape of scientific data has a good chance of giving insight into the science itself.
An example:
• Statistics of natural images (persistent homology)
Statistics of Natural Images
Carlsson et al. studied a set of 5000 3×3-pixel patches sampled from natural images.
• After normalization of intensity + contrast, each patch lies on a 7-D sphere.
• Discovery: Densest regions of the data set concentrate around a Klein bottle.
Klein Bottle in Space of 3×3 Patches
[Source: Carlsson, Perea 2014]
Application: Texture classification [Perea, Carlsson 2013].
Other Applications of TDA
• biophysics of proteins
• genomics + evolutionary biology
• astronomy
• coverage detection in wireless sensor networks
• shape segmentation
• shape comparison/shape matching
• basketball analytics
Introduction to Algebraic Topology
What is Algebraic Topology?
Informally, the branch of math concerned with properties of geometric objects that are invariant under “continuous deformations.”
Continuous deformations:
• bending
• twisting
• stretching
• (but not tearing)
Classic Example
Algebraic Topology + Holes
Primary example of a property invariant under continuous deformations: Presence of holes.
Algebraic topology is largely concerned with:
1 formalizing the notion of a “hole” in a geometric object,
2 calculating numbers of holes of different types,
3 understanding mathematical implications of the presence of holes.
Types of holes
In algebraic topology, we define i-dimensional holes for each i ≥ 0.
0-D holes are connected components
The pair of ovals has two 0-D holes.
1-D holes in 3-D objects are “holes you can see through.”
The donut has one 1-D hole.
2-D holes in 3-D objects are hollow spaces.
A balloon has one 2-D hole.
Counting Holes: Betti numbers
For a geometric object X, we define B_i(X), the ith Betti number of X, to be the number of i-dimensional holes in X.
Examples
B_0(X) = 2; B_1(X) = 0; B_2(X) = 0.
Examples
B_0(X) = 1; B_1(X) = 2; B_2(X) = 0.
Computing Betti Numbers
For discretely represented geometric objects, B_i(X) is easily computable via linear algebra.
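As a rough illustration of the linear algebra involved, the sketch below assumes X is given as a finite simplicial complex (a list of vertex tuples closed under taking faces) and works with real coefficients. Under those assumptions, B_i(X) = (number of i-simplices) − rank ∂_i − rank ∂_{i+1}, where ∂_i is the boundary matrix from i-simplices to (i−1)-simplices. The names `boundary_matrix` and `betti_numbers` are illustrative and do not come from the talk.

```python
import numpy as np

def boundary_matrix(simplices_k, simplices_km1):
    """Matrix of the boundary map from k-simplices to (k-1)-simplices."""
    row = {s: i for i, s in enumerate(simplices_km1)}
    D = np.zeros((len(simplices_km1), len(simplices_k)))
    for col, s in enumerate(simplices_k):
        for i in range(len(s)):
            face = s[:i] + s[i + 1:]      # drop the i-th vertex
            D[row[face], col] = (-1) ** i  # alternating sign convention
    return D

def betti_numbers(simplices):
    """Betti numbers [B_0, B_1, ...] of a simplicial complex.

    Assumption: `simplices` lists every simplex, including all faces,
    as tuples of vertex labels.
    """
    by_dim = {}
    for s in simplices:
        s = tuple(sorted(s))
        by_dim.setdefault(len(s) - 1, []).append(s)
    top = max(by_dim)

    # rank of each boundary map; the map out of dimension 0 has rank 0
    rank = {0: 0, top + 1: 0}
    for k in range(1, top + 1):
        D = boundary_matrix(by_dim.get(k, []), by_dim.get(k - 1, []))
        rank[k] = int(np.linalg.matrix_rank(D)) if D.size else 0

    return [len(by_dim.get(k, [])) - rank[k] - rank[k + 1]
            for k in range(top + 1)]

# Hollow triangle: topologically a circle, so one component and one 1-D hole.
hollow_triangle = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
print(betti_numbers(hollow_triangle))  # [1, 1]
```

Real coefficients are used here only because numerical rank is convenient; other coefficient fields can give different Betti numbers for some spaces.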
Persistent Homology
Topology of PCD?
How can we use the hole-detection formalism of topology to develop robust computational methods for studying qualitative features of data?
One approach: Persistent Homology.
• Introduced in 2000
• Widely studied and applied
Persistent Homology
Produces simple descriptors of qualitative features of data called barcodes.
A barcode is a set of closed intervals in R.
Model Example
X
How can we detect the cycle in X?
Naive Idea
Choose r > 0. Let U(X, r) be the union of balls of radius r centered at the points of X.
Idea: Consider B_1(U(X, r)) for some choice of r.
Example
X U(X,r)
B_0(U(X, r)) = 1; B_1(U(X, r)) = 1; B_2(U(X, r)) = 0.
When X is nice enough, for a good choice of r, B_1(U(X, r)) detects the cycle in X.
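The 0-th Betti number of U(X, r) is simple enough to compute directly, and doing so already shows how strongly the answer depends on r. The sketch below assumes closed Euclidean balls, so two balls of radius r meet exactly when their centers are at distance at most 2r; connected components are then counted with a union-find structure. The function name `b0_union_of_balls` is hypothetical.

```python
import math

def b0_union_of_balls(points, r):
    """Number of connected components of the union of closed balls of
    radius r centered at `points` (assumption: Euclidean distance)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # follow parent pointers to the root, halving paths as we go
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = n
    for i in range(n):
        for j in range(i + 1, n):
            # two balls of radius r overlap iff centers are within 2r
            if math.dist(points[i], points[j]) <= 2 * r:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    components -= 1
    return components

# Eight points evenly spaced on the unit circle.
circle = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8))
          for k in range(8)]
for r in (0.1, 0.4):
    print(r, b0_union_of_balls(circle, r))
```

For the sample circle, neighboring points are about 0.77 apart, so the balls are still disjoint at r = 0.1 and have merged into a single component at r = 0.4. Computing B_1(U(X, r)) needs more machinery (for example a simplicial model of the union) and is not attempted here.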
Problems with this Descriptor
1 No clear way to choose r.
2 Invariant is unstable with respect to perturbation of data or small changes in r.
3 Doesn’t distinguish small holes from big ones.
4 Invariant is very sensitive to outliers.
Example: No Good Choice of r
Example: Sensitivity to Outliers
B_1(U(X, r)) = 7.
Problems with this Descriptor
1 No canonical choice of r.
2 Invariant is unstable with respect to perturbation of data or small changes in r.
3 Doesn’t distinguish small holes from big ones.
4 Invariant is very sensitive to outliers.
Let’s deal with problems 1-3 first.
A Solution
Consider not a single choice of radius r, but all choices of r at once.
This gives us a filtration, that is, a 1-parameter family of geometric objects:
F(X) = {U(X, r)}_{r ∈ [0, ∞)}
Example
Key Mathematical Observation
Not only can we count holes in each space in a filtration, we can track holes in a consistent way across the whole filtration at once.
The formalization of this idea is persistent homology.
![Page 77: Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a branch of statistics. Goal: Apply topology to develop tools for studying qualitative](https://reader031.fdocuments.in/reader031/viewer/2022022511/5adfd70f7f8b9afd1a8d3c06/html5/thumbnails/77.jpg)
Barcodes
For each i ≥ 0, we can define a barcode B_i(X), a set of closed intervals in R.
Each interval represents an i-dimensional cycle in the filtration.
It also records the radii at which that cycle forms and closes up.
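As a minimal sketch of what a barcode looks like in the simplest case, the code below computes B_0 (connected components) for a small point cloud. It assumes the filtration is the union of balls of radius r around the points, so two components merge when r reaches half the distance between them; the function name `h0_barcode` and the sample points are illustrative only.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return the 0-dimensional barcode as a list of (birth, death) radii.

    Assumption: the filtration is the union of balls of radius r centred
    at the points, so two balls touch when r equals half their distance.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Radius at which each pair of balls first intersects, in increasing order.
    edges = sorted(
        (math.dist(points[a], points[b]) / 2, a, b)
        for a, b in combinations(range(n), 2)
    )

    bars = []
    for r, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Two components merge at radius r: one of them dies here.
            parent[root_a] = root_b
            bars.append((0.0, r))
    # One component survives for all radii.
    bars.append((0.0, math.inf))
    return bars


cloud = [(0, 0), (1, 0), (10, 0), (11, 0)]
bars = h0_barcode(cloud)
```

For this cloud there are two short bars ending at radius 0.5, one bar ending at radius 4.5, and one infinite bar. The two bars that outlive the short ones reflect the two well-separated pairs of points.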
![Page 79: Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a branch of statistics. Goal: Apply topology to develop tools for studying qualitative](https://reader031.fdocuments.in/reader031/viewer/2022022511/5adfd70f7f8b9afd1a8d3c06/html5/thumbnails/79.jpg)
Properties of a Barcode
• Allows us to distinguish significant features from insignificant features
• Records the size/scale of the feature
• Is stable w.r.t. perturbations of the data.
• Is computable in practice (using a variant of Gaussian elimination).
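The "variant of Gaussian elimination" can be sketched as a left-to-right column reduction of the boundary matrix over Z/2. This is the standard textbook reduction, not necessarily the exact variant the speaker has in mind. The function names, the sparse column representation, and the toy filtration of a filled triangle are all assumptions made for illustration.

```python
import math


def reduce_boundary_matrix(columns):
    """Pair simplices by reducing a Z/2 boundary matrix.

    columns[j] is the set of row indices holding a 1 in column j, with
    simplices indexed in filtration order. Returns (birth, death) index pairs.
    """
    reduced = [set(c) for c in columns]
    low_to_col = {}  # lowest nonzero row -> column that owns it
    pairs = []
    for j, col in enumerate(reduced):
        # Add earlier columns (mod 2) until the lowest 1 is unique or col is zero.
        while col and max(col) in low_to_col:
            col ^= reduced[low_to_col[max(col)]]
        if col:
            low_to_col[max(col)] = j
            pairs.append((max(col), j))
    return pairs


def barcode(columns, dims, values):
    """Turn index pairs into intervals, grouped by homological dimension."""
    pairs = reduce_boundary_matrix(columns)
    paired = {i for pair in pairs for i in pair}
    bars = {}
    for birth, death in pairs:
        if values[birth] < values[death]:  # skip zero-length intervals
            bars.setdefault(dims[birth], []).append((values[birth], values[death]))
    for i in range(len(columns)):
        if i not in paired:  # a cycle that never closes up
            bars.setdefault(dims[i], []).append((values[i], math.inf))
    return {d: sorted(b) for d, b in bars.items()}


# Toy filtration: vertices a, b, c, then edges ab, bc, ac, then triangle abc.
columns = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
dims = [0, 0, 0, 1, 1, 1, 2]
values = [0, 0, 0, 1, 2, 3, 4]  # hypothetical scale at which each simplex appears
result = barcode(columns, dims, values)
```

Here the loop formed by the three edges appears at scale 3 and is filled in by the triangle at scale 4, giving one 1-dimensional interval from 3 to 4.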
![Page 83: Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a branch of statistics. Goal: Apply topology to develop tools for studying qualitative](https://reader031.fdocuments.in/reader031/viewer/2022022511/5adfd70f7f8b9afd1a8d3c06/html5/thumbnails/83.jpg)
Stability
![Page 84: Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a branch of statistics. Goal: Apply topology to develop tools for studying qualitative](https://reader031.fdocuments.in/reader031/viewer/2022022511/5adfd70f7f8b9afd1a8d3c06/html5/thumbnails/84.jpg)
Stability
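One standard way to make stability precise is given below. It assumes X and Y are finite point clouds in R^n and that the filtration is the union of balls of radius r around the points; the transcript of these slides contains only pictures, so this formulation is an assumption about what they illustrate. Here d_B denotes the bottleneck distance between barcodes and d_H the Hausdorff distance between point clouds.

```latex
d_B\bigl(B_i(X),\, B_i(Y)\bigr) \;\le\; d_H(X, Y),
\qquad
d_H(X, Y) = \max\Bigl\{
  \max_{x \in X} \min_{y \in Y} \lVert x - y \rVert,\;
  \max_{y \in Y} \min_{x \in X} \lVert x - y \rVert
\Bigr\}.
```

In words: if every point of each cloud lies within distance ε of the other cloud, the endpoints of matched intervals move by at most ε, and any unmatched interval has length at most 2ε.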
![Page 85: Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a branch of statistics. Goal: Apply topology to develop tools for studying qualitative](https://reader031.fdocuments.in/reader031/viewer/2022022511/5adfd70f7f8b9afd1a8d3c06/html5/thumbnails/85.jpg)
Once we have barcodes, we can do further processing to find geometric representations of the significant holes.
![Page 86: Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a branch of statistics. Goal: Apply topology to develop tools for studying qualitative](https://reader031.fdocuments.in/reader031/viewer/2022022511/5adfd70f7f8b9afd1a8d3c06/html5/thumbnails/86.jpg)
This framework for building descriptors of data via barcodes is very flexible.
Example: We can build filtrations from point cloud data whose barcodes detect flares or clusters.
Can also be adapted to detect qualitative features of functions.
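As one hedged illustration of the function case, the sketch below computes the 0-dimensional barcode of the sublevel sets of a function sampled along a line. Each interval starts at a local minimum and ends at the value where its valley merges into a deeper one. The choice of sublevel sets, the name `sublevel_barcode_0`, and the sample values are assumptions for illustration.

```python
import math


def sublevel_barcode_0(values):
    """0-dimensional sublevel-set barcode of samples f[0], ..., f[n-1].

    Samples are added in increasing order of value; neighbours on the
    line are joined as soon as both are present.
    """
    n = len(values)
    parent = [None] * n  # None means "not yet in the sublevel set"
    birth = {}           # component root -> value of its minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    bars = []
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # The component with the higher minimum dies at the merge.
                old, young = (a, b) if birth[a] <= birth[b] else (b, a)
                if birth[young] < values[i]:
                    bars.append((birth[young], values[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    bars.append((min(values), math.inf))
    return sorted(bars)


samples = [1.0, 3.0, 0.0, 2.0, 0.5, 4.0]
function_bars = sublevel_barcode_0(samples)
```

The three local minima at 0.0, 0.5 and 1.0 each produce one interval; the lengths of the finite intervals measure how prominent the corresponding valleys are.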
![Page 89: Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a branch of statistics. Goal: Apply topology to develop tools for studying qualitative](https://reader031.fdocuments.in/reader031/viewer/2022022511/5adfd70f7f8b9afd1a8d3c06/html5/thumbnails/89.jpg)
Advertisement
Do you have data that might have interesting shape?
Come talk to us!
![Page 90: Studying the Shape of Data Using Topology - INEGI · Topological Data Analysis (TDA) TDA is a branch of statistics. Goal: Apply topology to develop tools for studying qualitative](https://reader031.fdocuments.in/reader031/viewer/2022022511/5adfd70f7f8b9afd1a8d3c06/html5/thumbnails/90.jpg)
Thanks!