
Sketched Learning from Random Features Moments

Nicolas Keriven

École Normale Supérieure (Paris)

CFM-ENS chair in Data Science

(thesis with Rémi Gribonval at Inria Rennes)

ISMP, July 6th 2018


Compressive learning

(Diagram: data → compression into a linear sketch → learning)

• Sketched learning: first compress the data into a linear sketch [Cormode 2011], then learn from the sketch alone
• Examples of sketches: hash tables, count sketches, histograms…
• Advantages: one-pass, streaming, distributed compression, data privacy…
• In this talk: unsupervised learning


How-to: build a sketch

What is a sketch? Any linear sketch = a vector of empirical moments: $z = \frac{1}{n}\sum_{i=1}^n \Phi(x_i)$ for some feature function $\Phi$.

What is contained in a sketch?

• $\Phi(x) = x$: mean
• $\Phi(x) = $ monomials: moments
• $\Phi(x) = $ bin indicators: histogram
• Proposed: kernel random features [Rahimi 2007] (random projections + a non-linearity)

Questions:

• What information is preserved by the sketching?
• How can this information be retrieved?
• What is a sufficient number of features?

Intuition: sketching as a linear embedding

- Assumption: the data $x_1,\dots,x_n$ are drawn i.i.d. from a distribution $\pi$
- Linear operator: $\mathcal{A}(\pi) = \mathbb{E}_{x\sim\pi}\,\Phi(x)$
- "Noisy" linear measurement: $z = \mathcal{A}(\pi) + e$, where the noise $e$ is small by concentration of the empirical mean

A dimensionality-reducing, random, linear embedding: Compressive Sensing?
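To make the construction concrete, here is a minimal Python sketch of the random-Fourier-feature case, under the common assumption (not specified on this slide) that the frequencies are drawn from a Gaussian distribution; the function and variable names are illustrative only.

import numpy as np

def compute_sketch(X, Omega):
    """Random Fourier feature sketch: z = (1/n) sum_i exp(i * Omega @ x_i).

    X: (n, d) data matrix; Omega: (m, d) random frequency matrix.
    Returns an m-dimensional complex vector of empirical moments.
    """
    projections = X @ Omega.T            # random projections, shape (n, m)
    features = np.exp(1j * projections)  # non-linearity
    return features.mean(axis=0)         # average over the dataset

rng = np.random.default_rng(0)
n, d, m = 10_000, 2, 50
X = rng.normal(size=(n, d))                 # toy dataset
Omega = rng.normal(scale=1.0, size=(m, d))  # assumption: Gaussian frequencies (a common choice)
z = compute_sketch(X, Omega)                # size m, independent of n

Because the sketch is an average, sketches of disjoint data chunks can be merged by a weighted average, which is what makes the streaming and distributed settings immediate.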


Example of applications [Keriven 2016, 2017]

Retrieving a mixture of Diracs from a sketch = k-means

Application: spectral clustering for MNIST classification
- Twice as fast as k-means
- 4 orders of magnitude more memory-efficient

Retrieving GMMs from a sketch

Application: speaker verification [Reynolds 2000]. Error rates:
• EM on 300 000 samples: 29.53
• 20 kB sketch computed on a 50 GB database: 28.96

In this talk

Q: theoretical guarantees?

Inspired by Compressive Sensing:
• 1: with the Restricted Isometry Property (RIP)
• 2: with dual certificates

Outline

Information-preservation guarantees: a RIP analysis
(joint work with R. Gribonval, G. Blanchard, Y. Traonmilin)

Total variation regularization: a dual certificate analysis

Conclusion, outlooks

Recall: Linear inverse problem

True distribution: $\pi$
Sketch: $z = \mathcal{A}(\pi) + e$

• The estimation problem is a linear inverse problem on measures
• Extremely ill-posed!
• Feasibility? (information preservation: could the best possible algorithm recover the distribution?)


Information preservation guarantees

Model set of "simple" distributions (e.g. GMMs)

Goal: prove the existence of a decoder robust to noise and stable to modeling error (an "instance-optimal" decoder, here a Generalized Method of Moments).

Key property: the Lower Restricted Isometry Property (LRIP).

New goal: find/construct models and operators that satisfy the LRIP (w.h.p.)
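For reference, the LRIP and the instance-optimality it yields can be written as follows; this is a sketch of the statements in [Gribonval et al. 2017], with $\mathfrak{S}$ the model set, $d$ a task-driven metric, and $C, C'$ constants as in that paper.

\forall \pi, \pi' \in \mathfrak{S}: \quad d(\pi, \pi') \le C \, \| \mathcal{A}(\pi) - \mathcal{A}(\pi') \|_2 \qquad \text{(LRIP on the model set)}

d(\pi, \Delta(z)) \le C' \big( d(\pi, \mathfrak{S}) + \| e \|_2 \big) \qquad \text{(instance-optimal decoder } \Delta\text{)}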


Proving the LRIP

Goal: LRIP on the whole model set.

Construction of the sketching operator:
- Kernel mean embedding [Gretton 2006, Borgwardt 2006]
- Random features [Rahimi 2007]

Pointwise LRIP, by concentration of the random features.

Extension to the full LRIP: via covering numbers (compactness) of the normalized secant set, a subset of a unit ball (in infinite dimension) that depends only on the model; see the definition below.
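For concreteness, the normalized secant set of the model can be written as below, following the standard definition in the compressive learning literature (the notation is an assumption, not a quote from the slide):

\mathcal{S} = \left\{ \frac{\pi - \pi'}{d(\pi, \pi')} \;:\; \pi, \pi' \in \mathfrak{S}, \ \pi \neq \pi' \right\}

Proving the LRIP then amounts to lower-bounding $\|\mathcal{A}\mu\|_2$ uniformly over $\mu \in \mathcal{S}$.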


Main result [Keriven 2016]

Main hypothesis: the normalized secant set has finite covering numbers.
- In classic, finite-dimensional Compressive Sensing: known
- Here, in infinite dimension: technical

Result: for a sketch size m ≳ (pointwise concentration) × (dimensionality of the model), w.h.p. the decoding error is bounded by a modeling-error term plus an empirical-noise term, neither of which depends on the sketch size m.
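The shape of the guarantee, in the notation introduced above (a paraphrase of the bound in [Gribonval et al. 2017]):

d(\pi, \Delta(z)) \;\lesssim\; \underbrace{d(\pi, \mathfrak{S})}_{\text{modeling error}} \;+\; \underbrace{\Big\| \tfrac{1}{n}\textstyle\sum_{i} \Phi(x_i) - \mathcal{A}(\pi) \Big\|_2}_{\text{empirical noise, } O(1/\sqrt{n}) \text{ w.h.p.}}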


Application

(No assumption is made on the data in either case.)

k-means with mixtures of Diracs

Hypotheses:
- separated centroids
- bounded domain for the centroids

Sketch:
- adjusted random Fourier features (for technical reasons)

Result:
- guarantee w.r.t. the usual k-means cost (SSE)
- sketch size: of the order of $k^2 d$ up to logarithmic factors, as stated in [Gribonval et al. 2017]

GMM with known covariance

Hypotheses:
- sufficiently separated means
- bounded domain for the means

Sketch:
- Fourier features

Result:
- guarantee with respect to the log-likelihood
- sketch size: of the order of $k^2 d$ up to logarithmic factors, as stated in [Gribonval et al. 2017]

Compared to the Generalized Method of Moments: different guarantees.
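To illustrate what "retrieving a mixture of Diracs from a sketch" means in practice, here is a minimal Python experiment that matches the sketch of a k-Dirac mixture to the empirical sketch by non-convex least squares with random restarts. This is an illustrative stand-in, not the algorithm used in the cited papers; the uniform-weight simplification and all names are assumptions.

import numpy as np
from scipy.optimize import minimize

def sketch_of_diracs(weights, centers, Omega):
    """Sketch of a weighted Dirac mixture: sum_k w_k * exp(i * Omega @ c_k)."""
    return np.exp(1j * centers @ Omega.T).T @ weights  # shape (m,)

def fit_centroids(z, Omega, k, d, rng, restarts=5):
    """Moment matching on the sketch: non-convex least squares over the centroids."""
    weights = np.full(k, 1.0 / k)  # simplification: uniform, fixed weights
    def loss(theta):
        centers = theta.reshape(k, d)
        residual = sketch_of_diracs(weights, centers, Omega) - z
        return float(np.sum(np.abs(residual) ** 2))
    best = min((minimize(loss, rng.normal(size=k * d)) for _ in range(restarts)),
               key=lambda res: res.fun)  # keep the best random restart
    return best.x.reshape(k, d)

rng = np.random.default_rng(0)
d, k, m, n = 2, 3, 60, 5000
true_centers = rng.uniform(-3, 3, size=(k, d))
X = true_centers[rng.integers(k, size=n)] + 0.1 * rng.normal(size=(n, d))
Omega = rng.normal(size=(m, d))             # random Fourier frequencies
z = np.exp(1j * X @ Omega.T).mean(axis=0)   # empirical sketch (size m << n)
centers_hat = fit_centroids(z, Omega, k, d, rng)
print(np.sort(centers_hat, axis=0))         # compare to np.sort(true_centers, axis=0)

Note that only the m-dimensional sketch z enters the recovery: the dataset X can be discarded (or streamed) once the sketch is computed.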

Outline

Information-preservation guarantees: a RIP analysis

Total variation regularization: a dual certificate analysis
(joint work with C. Poon, G. Peyré)

Conclusion, outlooks


Total Variation regularization

Previously (RIP analysis), the minimization was moment matching over the model:
• must know the number of components
• non-convex!

Convex relaxation ("super-resolution"): the Beurling-LASSO (BLASSO) [De Castro 2015], an optimization over Radon measures regularized by the total variation norm (the "L1 norm" of measures); see the formulation below.

Questions:
• Is the recovered measure sparse?
• Does it have the right number of components?
• Does it recover the true measure?
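The BLASSO in its standard form, as in the cited works ($\mathcal{M}(\mathcal{X})$ denotes the Radon measures on the parameter space $\mathcal{X}$, and $\lambda > 0$ is a regularization parameter):

\min_{\mu \in \mathcal{M}(\mathcal{X})} \ \frac{1}{2}\, \| \mathcal{A}\mu - z \|_2^2 \;+\; \lambda \, |\mu|(\mathcal{X})

Here $|\mu|(\mathcal{X})$ is the total variation norm, which promotes sparse (discrete) measures just as the L1 norm promotes sparse vectors.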


Dual certificates

Dual certificate analysis: exhibit a function $\eta = \mathcal{A}^* p$ ($p$ = Lagrange multiplier) such that:
• $\eta$ interpolates the signs of the weights on the true support
• $|\eta| < 1$ otherwise

Step 1: study the full (limit) kernel [Candès 2013], assuming the components are sufficiently separated.
Step 2: bound the deviations of the random-feature approximation from the full kernel.

(Figure: certificates for increasing numbers of features, m = 10, 20, 50.)


Results for separated GMM

Assumption: the data are actually drawn from a GMM.

1: Ideal scaling in sparsity (in progress…)
• Not necessarily the right number of components, but:
• the mass of the recovered measure is concentrated around the true components
• (weak) robustness to modeling error
• Proof: an infinite-dimensional golfing scheme (new)

2: Minimal-norm certificate [Duval, Peyré 2015] (in progress…)
• When n is high enough: sparse solution, with the right number of components
• Proof: adaptation of [Tang, Recht 2013]

Outline

Information-preservation guarantees: a RIP analysis

Total variation regularization: a dual certificate analysis

Conclusion, outlooks

Sketch learning

• Sketching: streaming, distributed learning
• An original view on data compression and generalized moments
• Combines random features and kernel mean embeddings with infinite-dimensional Compressive Sensing


Summary, outlooks

• RIP analysis
  • Information-preservation guarantees
  • Fine control on noise, modeling error (instance-optimal decoder) and recovery metrics
  • Necessary and sufficient conditions

• Dual certificate analysis
  • Convex minimization
  • In some cases, automatically recovers the right number of components

• Outlooks
  • Algorithms for TV minimization
  • Other features (not necessarily random…)
  • Other "sketched" learning tasks
  • Multilayer sketches?

Thank you!

References:

• Gribonval, Blanchard, Keriven, Traonmilin. Compressive Statistical Learning with Random Feature Moments. 2017. arXiv:1706.07180
• Keriven. Sketching for Large-Scale Learning of Mixture Models. PhD thesis. tel-01620815
• Poon, Keriven, Peyré. A Dual Certificates Analysis of Compressive Off-the-Grid Recovery. 2018. arXiv:1802.08464
• Code, applications: nkeriven.github.io