
Lecture 10: Discussion and Challenges

NSF/CBMS Conference

Sayan Mukherjee

Departments of Statistical Science, Computer Science, Mathematics

Duke University

www.stat.duke.edu/~sayan

May 31, 2016

Software and computing

A review paper

Topological Data Analysis: A software survey

Mikael Vejdemo-Johansson, AI Laboratory, Jozef Stefan Institute, Slovenia

Wednesday, March 19, 14

Cubical homology: pixels and voxels

• Cellular homology theory whose building blocks are n-cubes

• Admits very efficient matrix processing methods

• Homotopy reduction techniques reduce to matrix traversals

• Well adapted for 2d and 3d images or pixel/voxel clouds


ChomP

• Cubical homology — with or without persistence

• GUI, command line interface, and C++ library

• Encodes a wide range of both space and mapping analyses

• Includes a wide range of homotopy-based optimizations

http://chomp.rutgers.edu/Software.html


HAP

• Module for the GAP computer algebra system

• Primarily focused on research programming in group cohomology

• Includes support for cubical persistent homology

http://www.gap-system.org/Packages/hap.html


Plex / jPlex / javaPlex

• Family of software packages developed at Stanford, adapted for use from Matlab

• Implements a range of algorithms — both for constructing complexes and computing their persistent (co)homology

• Current recommended incarnation: javaPlex

http://javaplex.googlecode.com


Dionysus

• Library for computational homology

• Contains example applications implementing persistent homology and cohomology, as well as time-varying persistence (vineyards) & low-dimensional optimizations

• Relies on Boost, and optionally on CGAL for low-dimensional optimizations

• Includes a Python interface through Boost::Python

http://www.mrzv.org/software/dionysus

pHat

• Recently released software package and C++ library

• Implements several optimizations to the persistence algorithm

• Does not (currently) construct the complex for you

• (currently) restricted to Z/2 coefficients

• Some support for SMP parallelization using OpenMP

http://phat.googlecode.com


Perseus

• Cubical and simplicial complex representation and several different construction methods

• Uses discrete Morse theory to speed up computation

http://www.math.rutgers.edu/~vidit/perseus


ToMATo

• C++ library for topological analysis

• Relies on libANN for approximate nearest neighbors

http://geometrica.saclay.inria.fr/data/ToMATo/


GAP Persistence

• Persistent homology and complex construction in the GAP computer algebra system

http://www-circa.mcs.st-and.ac.uk/~mik/persistence/


Python Mapper

• Open source solution

• Developed by Müllner & Babu at Stanford University

• Focused on being a research tool

• Exports graph structure in several formats: GraphViz .dot, d3.js JSON graph representation

http://math.stanford.com/~muellner/mapper


Packages

Computing persistent homology

Given boundary matrix D, find R = DV, where V is upper triangular and R is reduced: no two columns have their lowest nonzero entries in the same row.

The reduction is via Gaussian elimination, reducing to Smith normal form. The rank of R is the number of off-diagonal ones.
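The column reduction can be sketched over Z/2, storing each column of D as the set of row indices holding a 1. This is a minimal sketch, not the interface of any of the packages above:

```python
def low(col):
    """Row index of the lowest nonzero entry of a Z/2 column (a set of rows)."""
    return max(col)

def reduce_boundary(D):
    """Standard persistence reduction over Z/2.
    D: list of columns, each the set of row indices holding a 1.
    Returns R = DV, reduced so no two nonzero columns share a lowest row."""
    R = [set(c) for c in D]
    pivot = {}                      # lowest row -> index of the column owning it
    for j, col in enumerate(R):
        while col:
            l = low(col)
            if l not in pivot:
                pivot[l] = j        # col claims row l; (l, j) is a persistence pair
                break
            col ^= R[pivot[l]]      # add the earlier column (mod 2)
    return R
```

For the filtration of a triangle (vertices as columns 0 to 2, edges as columns 3 to 5, the 2-simplex as column 6), the lowest ones of the nonzero columns of R give the persistence pairs, and the edge whose column reduces to zero creates the loop that the triangle later kills.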


Computing persistent homology

Reduction (worked matrix-reduction example, shown over several animation frames)

Adding Geometry

Building a complex: single-linkage graph

Čech complex: for balls of radius ε centered at the points, a k-simplex is in the complex iff the corresponding k + 1 balls have a common intersection.

Adding Geometry

If we only have pairwise information

Vietoris-Rips complex: for balls of radius ε centered at the points, a k-simplex is in the complex iff its vertices form a clique in the Čech graph.
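The clique condition gives a direct brute-force sketch. Conventions vary (some software thresholds pairwise distances at ε rather than 2ε); the code below is illustrative, not any package's API:

```python
from itertools import combinations

def rips_complex(dist, eps, max_dim=2):
    """Vietoris-Rips complex at scale eps from a pairwise distance matrix:
    a simplex is included iff every pair of its vertices lies within 2*eps,
    i.e. balls of radius eps around the vertices pairwise intersect."""
    n = len(dist)
    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):            # k vertices give a (k-1)-simplex
        for s in combinations(range(n), k):
            if all(dist[i][j] <= 2 * eps for i, j in combinations(s, 2)):
                simplices.append(s)
    return simplices
```

For three points at 0, 1, 2 on a line, eps = 0.5 yields only the two short edges, while eps = 1 also yields the long edge and the triangle.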

Adding geometry

A simplex σ = {q0, ..., qk} is weakly witnessed by a point x if d(qi, x) < d(q, x) for all i ∈ {0, ..., k} and all q ∈ Q \ {q0, ..., qk}.

σ is strongly witnessed if, in addition, d(qi, x) = d(qj, x) for all i, j ∈ {0, ..., k}.

Given a set of points P = {p1, p2, ..., pn} ⊂ R^d and a subset Q ⊆ P:

• The witness complex W(P, Q) is the collection of simplices with vertices from Q, all of whose subsimplices are weakly witnessed by a point in P.

• It can be defined for a general metric space.
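For edges, the weak-witness condition says some x in P has the edge's two landmarks as its two nearest landmarks, which gives a short sketch of the 1-skeleton (ties between landmarks are ignored here; names are mine, not from the lecture):

```python
def witness_edges(P, landmarks, d):
    """1-skeleton of the witness complex W(P, landmarks):
    the edge {a, b} is included iff some x in P weakly witnesses it,
    i.e. a and b are the two landmarks nearest to x."""
    edges = set()
    for x in P:
        a, b = sorted(landmarks, key=lambda q: d(x, q))[:2]
        edges.add(tuple(sorted((a, b))))
    return edges
```

With points 0 through 4 on a line and landmarks {0, 2, 4}, the witnessed edges are exactly the two short ones, {0, 2} and {2, 4}.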


Adding geometry

Comparison

Multiparameter persistence


Challenges

• Certificates for various approximate filtrations

• Distributed computing

• Discrete Morse approaches

• Randomized algorithms

• Multiscale persistence

• Multidimensional persistence

• Sampling and distribution properties of persistence


Inference

Paradigms

Data → Filtration → Barcodes → Interpretation

Paradigm 1: EDA

Data → Filtration → Barcodes → Modeling

Paradigm 2: Modeling

Paradigms

The classical pipeline pairs probability theory, statistical theory, and applications; for TDA, each column is at a different stage of maturity:

• Probability theory: Normal distribution, Central Limit Theorem, Gaussian processes, Markov chains, Bayes theorem. TDA: developing ideas.

• Statistical theory: Hypothesis testing, Bootstrapping, Bayesian estimation, Kalman filtering, E-M. TDA: some idea.

• Applications: Testing drug effects, Noise filtering, Tracking, Pattern recognition, Classification. TDA: many ideas.

Paradigm 1

Study the shape of the data: what is its (multiscale) topology?

Why, and what summaries?

(1) Extracting βk can provide intuition

(2) Projecting onto RP2 can provide intuition

(3) Persistence landscapes and diagrams can provide information

(4) Statistical guarantees on these summaries:

(i) minimax results
(ii) confidence/credible intervals
(iii) consistency
(iv) central limit theorems
(v) extreme value theory
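Persistence landscapes have a simple pointwise definition: the k-th landscape at t is the k-th largest of the tent values min(t - b, d - t), floored at zero, over the bars (b, d). A minimal sketch (function and variable names are mine):

```python
def landscape(diagram, k, t):
    """k-th persistence landscape function (k is 1-indexed) evaluated at t:
    the k-th largest tent value min(t - b, d - t), floored at 0,
    over bars (b, d) of the persistence diagram."""
    tents = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram),
                   reverse=True)
    return tents[k - 1] if k <= len(tents) else 0.0
```

For the diagram {(0, 4), (1, 3)} at t = 2, the first landscape is 2, the second is 1, and all higher landscapes vanish.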


Paradigm 2

Summaries as features for downstream analysis.

(1) Machine learning perspective:

(i) features for classification and regression
(ii) features for dimension reduction
(iii) kernel models/kernel engineering
(iv) bias-variance tradeoff
(v) function approximation questions

(2) Sampling distribution perspective:

(i) Sufficiency
(ii) Pseudolikelihoods and empirical likelihoods
(iii) Jeffrey's conditioning
(iv) Distributions of summaries under null models and hypothesis testing
(v) Understanding topological noise
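As a toy illustration of (1)(i), a barcode can be flattened into a fixed-length feature vector for a classifier or regressor. The particular features below are one naive, illustrative choice, not a recommendation from the lecture:

```python
def barcode_features(diagram):
    """One naive vectorization of a persistence diagram into fixed-length
    features: bar count, total persistence, longest bar, mean persistence."""
    pers = [d - b for b, d in diagram]
    n = len(pers)
    total = sum(pers)
    return [n, total, max(pers, default=0.0), total / n if n else 0.0]
```

These vectors can then be fed to any standard learner; richer choices (landscapes, kernels on diagrams) fit the same pipeline.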


Where we are

(The same diagram as in the Paradigms section: probability theory, statistical theory, and applications, with TDA's ideas in each column at varying stages of development.)

Open questions

(1) Principled approaches to filtration selection.

(2) Quantification of ε-sufficiency for different models/modulo invariants.

(3) Summaries for graphs

(4) Information geometry for spaces with singularities and stratified spaces

(5) MCMC for models of different dimensions and algebraic structures

(6) Signal processing and dictionary learning for shapes

(7) Summaries of complex objects as vector spaces?

(8) Distribution theory for topological and geometric summaries.


Mathematics

Spectral simplicial theory

(1) Cheeger inequalities for middle dimensions

(2) Higher-dimensional versions of PageRank

(3) Limits of random walks as Brownian motion of forms

(4) Graph sparsification with L1

(5) Synchronization and learning maps, and multicommodity flows

(6) SLE on simplicial complexes, loop-erased random surfaces
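Several of these items (Cheeger inequalities in middle dimensions, higher-dimensional PageRank, random walks on forms) revolve around the combinatorial Hodge Laplacian L_k = B_{k+1} B_{k+1}^T + B_k^T B_k built from boundary matrices. A minimal sketch for a triangle, with orientations chosen here for illustration:

```python
import numpy as np

# Oriented boundary matrices of a triangle on vertices {a, b, c}.
B1 = np.array([[-1,  0, -1],    # rows: a, b, c; columns: edges ab, bc, ac
               [ 1, -1,  0],
               [ 0,  1,  1]], dtype=float)
B2 = np.array([[ 1],            # boundary of the 2-simplex abc:
               [ 1],            # ab + bc - ac
               [-1]], dtype=float)

def hodge_laplacian_1(B1, B2):
    """Degree-1 combinatorial Hodge Laplacian L1 = B2 B2^T + B1^T B1;
    dim ker L1 equals the first Betti number of the complex."""
    return B2 @ B2.T + B1.T @ B1

L1_hollow = B1.T @ B1                  # no 2-simplex: the loop survives
L1_filled = hodge_laplacian_1(B1, B2)  # filled triangle: the loop dies
betti = lambda L: L.shape[0] - np.linalg.matrix_rank(L)
```

Here `betti(L1_hollow)` is 1 (the hollow triangle has one loop) while `betti(L1_filled)` is 0, matching the Hodge-theoretic reading of the kernel.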



Stochastic topology


Acknowledgements

Many people.

Funding:

• Center for Systems Biology at Duke

• NSF DMS, CCF, IIS

• AFOSR, DARPA

• NIH