
Lecture 10: Discussion and Challenges

NSF/CBMS Conference

Sayan Mukherjee

Departments of Statistical Science, Computer Science, Mathematics

Duke University

www.stat.duke.edu/~sayan

May 31, 2016

Software and computing

A review paper

Topological Data Analysis: A software survey

Mikael Vejdemo-Johansson, AI Laboratory, Jozef Stefan Institute, Slovenia

Wednesday, March 19, 14

Cubical homology: pixels and voxels

• Cellular homology theory whose building blocks are n-cubes

• Admits very efficient matrix processing methods

• Homotopy reduction techniques reduce to matrix traversals

• Well adapted for 2d and 3d images or pixel/voxel clouds


ChomP

• Cubical homology — with or without persistence

• GUI, command line interface, and C++ library

• Encodes a wide range of both space and mapping analyses

• Includes a wide range of homotopy-based optimizations

http://chomp.rutgers.edu/Software.html


HAP

• Module for the GAP computer algebra system

• Primarily focused on research programming in group cohomology

• Includes support for cubical persistent homology

http://www.gap-system.org/Packages/hap.html


Plex / jPlex / javaPlex

• Family of software packages developed at Stanford, adapted for use from Matlab

• Implements a range of algorithms — both for constructing complexes and computing their persistent (co)homology

• Current recommended incarnation: javaPlex

http://javaplex.googlecode.com


Dionysus

• Library for computational homology

• Contains example applications implementing persistent homology and cohomology, as well as time-varying persistence (vineyards) & low-dimensional optimizations

• Relies on Boost, and optionally on CGAL for low-dimensional optimizations

• Includes a Python interface through Boost::Python

http://www.mrzv.org/software/dionysus

pHat

• Recently released software package and C++ library

• Implements several optimizations to the persistence algorithm

• Does not (currently) construct the complex for you

• (currently) restricted to Z/2 coefficients

• Some support for SMP parallelization using OpenMP

http://phat.googlecode.com


Perseus

• Cubical and simplicial complex representation and several different construction methods

• Uses discrete Morse theory to speed up computation

http://www.math.rutgers.edu/~vidit/perseus


ToMATo

• C++ library for topological analysis

• Relies on libANN for approximate nearest neighbors

http://geometrica.saclay.inria.fr/data/ToMATo/


GAP Persistence

• Persistent homology and complex construction in the GAP computer algebra system

http://www-circa.mcs.st-and.ac.uk/~mik/persistence/


Python Mapper

• Open source solution

• Developed by Müllner & Babu at Stanford University

• Focused on being a research tool

• Exports graph structure in several formats: GraphViz .dot, d3.js JSON graph representation

http://math.stanford.com/~muellner/mapper


Packages

Computing persistent homology

Given boundary matrix D, find R = DV, where V is upper triangular and R is reduced: no two columns have their lowest nonzero entries in the same row.

The reduction is via Gaussian elimination, reducing to Smith normal form. The rank of R is the number of off-diagonal ones.
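The column reduction can be sketched over Z/2, storing each column of D as the set of row indices holding a 1. This is a minimal sketch, not the interface of any of the packages above:

```python
def low(col):
    """Row index of the lowest nonzero entry of a Z/2 column (a set of rows)."""
    return max(col)

def reduce_boundary(D):
    """Standard persistence reduction over Z/2.
    D: list of columns, each the set of row indices holding a 1.
    Returns R = DV, reduced so no two nonzero columns share a lowest row."""
    R = [set(c) for c in D]
    pivot = {}                      # lowest row -> index of the column owning it
    for j, col in enumerate(R):
        while col:
            l = low(col)
            if l not in pivot:
                pivot[l] = j        # col claims row l; (l, j) is a persistence pair
                break
            col ^= R[pivot[l]]      # add the earlier column (mod 2)
    return R
```

For the filtration of a triangle (vertices as columns 0 to 2, edges as columns 3 to 5, the 2-simplex as column 6), the lowest ones of the nonzero columns of R give the persistence pairs, and the edge whose column reduces to zero creates the loop that the triangle later kills.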


Computing persistent homology

Reduction (worked matrix-reduction example, shown over several animation frames)

Adding Geometry

Building a complex: single-linkage graph

Čech complex: for balls of radius ε centered at the points, a k-simplex is in the complex iff the corresponding k + 1 balls have a common intersection.

Adding Geometry

If we only have pairwise information

Vietoris-Rips complex: for balls of radius ε centered at the points, a k-simplex is in the complex iff its vertices form a clique in the Čech graph.
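The clique condition gives a direct brute-force sketch. Conventions vary (some software thresholds pairwise distances at ε rather than 2ε); the code below is illustrative, not any package's API:

```python
from itertools import combinations

def rips_complex(dist, eps, max_dim=2):
    """Vietoris-Rips complex at scale eps from a pairwise distance matrix:
    a simplex is included iff every pair of its vertices lies within 2*eps,
    i.e. balls of radius eps around the vertices pairwise intersect."""
    n = len(dist)
    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):            # k vertices give a (k-1)-simplex
        for s in combinations(range(n), k):
            if all(dist[i][j] <= 2 * eps for i, j in combinations(s, 2)):
                simplices.append(s)
    return simplices
```

For three points at 0, 1, 2 on a line, eps = 0.5 yields only the two short edges, while eps = 1 also yields the long edge and the triangle.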

Adding geometry

A simplex σ = {q0, ..., qk} is weakly witnessed by a point x if d(qi, x) < d(q, x) for all i ∈ {0, ..., k} and all q ∈ Q \ {q0, ..., qk}.

σ is strongly witnessed if, in addition, d(qi, x) = d(qj, x) for all i, j ∈ {0, ..., k}.

Given a set of points P = {p1, p2, ..., pn} ⊂ R^d and a subset Q ⊆ P:

• The witness complex W(P, Q) is the collection of simplices with vertices from Q, all of whose subsimplices are weakly witnessed by a point in P.

• It can be defined for a general metric space.
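For edges, the weak-witness condition says some x in P has the edge's two landmarks as its two nearest landmarks, which gives a short sketch of the 1-skeleton (ties between landmarks are ignored here; names are mine, not from the lecture):

```python
def witness_edges(P, landmarks, d):
    """1-skeleton of the witness complex W(P, landmarks):
    the edge {a, b} is included iff some x in P weakly witnesses it,
    i.e. a and b are the two landmarks nearest to x."""
    edges = set()
    for x in P:
        a, b = sorted(landmarks, key=lambda q: d(x, q))[:2]
        edges.add(tuple(sorted((a, b))))
    return edges
```

With points 0 through 4 on a line and landmarks {0, 2, 4}, the witnessed edges are exactly the two short ones, {0, 2} and {2, 4}.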


Adding geometry

Comparison

Multiparameter persistence


Challenges

• Certificates for various approximate filtrations

• Distributed computing

• Discrete Morse approaches

• Randomized algorithms

• Multiscale persistence

• Multidimensional persistence

• Sampling and distribution properties of persistence


Inference

Paradigms

Data → Filtration → Barcodes → Interpretation

Paradigm 1: EDA

Data → Filtration → Barcodes → Modeling

Paradigm 2: Modeling

Paradigms

The classical pipeline pairs probability theory, statistical theory, and applications; for TDA, each column is at a different stage of maturity:

• Probability theory: Normal distribution, Central Limit Theorem, Gaussian processes, Markov chains, Bayes theorem. TDA: developing ideas.

• Statistical theory: Hypothesis testing, Bootstrapping, Bayesian estimation, Kalman filtering, E-M. TDA: some idea.

• Applications: Testing drug effects, Noise filtering, Tracking, Pattern recognition, Classification. TDA: many ideas.

Paradigm 1

Study the shape of the data: what is its (multiscale) topology?

Why, and what summaries?

(1) Extracting βk can provide intuition

(2) Projecting onto RP2 can provide intuition

(3) Persistence landscapes and diagrams can provide information

(4) Statistical guarantees on these summaries:

(i) minimax results
(ii) confidence/credible intervals
(iii) consistency
(iv) central limit theorems
(v) extreme value theory
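Persistence landscapes have a simple pointwise definition: the k-th landscape at t is the k-th largest of the tent values min(t - b, d - t), floored at zero, over the bars (b, d). A minimal sketch (function and variable names are mine):

```python
def landscape(diagram, k, t):
    """k-th persistence landscape function (k is 1-indexed) evaluated at t:
    the k-th largest tent value min(t - b, d - t), floored at 0,
    over bars (b, d) of the persistence diagram."""
    tents = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram),
                   reverse=True)
    return tents[k - 1] if k <= len(tents) else 0.0
```

For the diagram {(0, 4), (1, 3)} at t = 2, the first landscape is 2, the second is 1, and all higher landscapes vanish.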


Paradigm 2

Summaries as features for downstream analysis.

(1) Machine learning perspective:

(i) features for classification and regression
(ii) features for dimension reduction
(iii) kernel models/kernel engineering
(iv) bias-variance tradeoff
(v) function approximation questions

(2) Sampling distribution perspective:

(i) Sufficiency
(ii) Pseudolikelihoods and empirical likelihoods
(iii) Jeffrey's conditioning
(iv) Distributions of summaries under null models and hypothesis testing
(v) Understanding topological noise
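As a toy illustration of (1)(i), a barcode can be flattened into a fixed-length feature vector for a classifier or regressor. The particular features below are one naive, illustrative choice, not a recommendation from the lecture:

```python
def barcode_features(diagram):
    """One naive vectorization of a persistence diagram into fixed-length
    features: bar count, total persistence, longest bar, mean persistence."""
    pers = [d - b for b, d in diagram]
    n = len(pers)
    total = sum(pers)
    return [n, total, max(pers, default=0.0), total / n if n else 0.0]
```

These vectors can then be fed to any standard learner; richer choices (landscapes, kernels on diagrams) fit the same pipeline.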


Where we are

(The same diagram as in the Paradigms section: probability theory, statistical theory, and applications, with TDA's ideas in each column at varying stages of development.)

Open questions

(1) Principled approaches to filtration selection.

(2) Quantification of ε-sufficiency for different models/modulo invariants.

(3) Summaries for graphs

(4) Information geometry for spaces with singularities and stratified spaces

(5) MCMC for models of different dimensions and algebraic structures

(6) Signal processing and dictionary learning for shapes

(7) Summaries of complex objects as vector spaces?

(8) Distribution theory for topological and geometric summaries.


Mathematics

Spectral simplicial theory

(1) Cheeger inequalities for middle dimensions

(2) Higher-dimensional versions of PageRank

(3) Limits of random walks as Brownian motion of forms

(4) Graph sparsification with L1

(5) Synchronization and learning maps, and multicommodity flows

(6) SLE on simplicial complexes, loop-erased random surfaces
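Several of these items (Cheeger inequalities in middle dimensions, higher-dimensional PageRank, random walks on forms) revolve around the combinatorial Hodge Laplacian L_k = B_{k+1} B_{k+1}^T + B_k^T B_k built from boundary matrices. A minimal sketch for a triangle, with orientations chosen here for illustration:

```python
import numpy as np

# Oriented boundary matrices of a triangle on vertices {a, b, c}.
B1 = np.array([[-1,  0, -1],    # rows: a, b, c; columns: edges ab, bc, ac
               [ 1, -1,  0],
               [ 0,  1,  1]], dtype=float)
B2 = np.array([[ 1],            # boundary of the 2-simplex abc:
               [ 1],            # ab + bc - ac
               [-1]], dtype=float)

def hodge_laplacian_1(B1, B2):
    """Degree-1 combinatorial Hodge Laplacian L1 = B2 B2^T + B1^T B1;
    dim ker L1 equals the first Betti number of the complex."""
    return B2 @ B2.T + B1.T @ B1

L1_hollow = B1.T @ B1                  # no 2-simplex: the loop survives
L1_filled = hodge_laplacian_1(B1, B2)  # filled triangle: the loop dies
betti = lambda L: L.shape[0] - np.linalg.matrix_rank(L)
```

Here `betti(L1_hollow)` is 1 (the hollow triangle has one loop) while `betti(L1_filled)` is 0, matching the Hodge-theoretic reading of the kernel.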



Stochastic topology


Acknowledgements

Many people.

Funding:

• Center for Systems Biology at Duke

• NSF DMS, CCF, IIS

• AFOSR, DARPA

• NIH