Matrix Concentration


Page 1: Matrix Concentration

Matrix Concentration

Nick Harvey, University of British Columbia

Page 2: Matrix Concentration

The Problem

Given any random n×n symmetric matrices Y_1, …, Y_k, show that Σ_i Y_i is probably "close" to E[Σ_i Y_i].

Why?
• A matrix generalization of the Chernoff bound.
• There is much research on eigenvalues of a random matrix with independent entries; this is more general.

Page 3: Matrix Concentration

Chernoff/Hoeffding Bound

• Theorem: Let Y_1, …, Y_k be independent random scalars in [0, R]. Let Y = Σ_i Y_i, and suppose μ_L ≤ E[Y] ≤ μ_U. Then

    Pr[ Y ≤ (1−ε)·μ_L ] ≤ exp( −ε²·μ_L / 2R )    and    Pr[ Y ≥ (1+ε)·μ_U ] ≤ ( e^ε / (1+ε)^{1+ε} )^{μ_U / R}.
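A quick simulation (my own illustration, not from the slides; the parameters R, k, ε are arbitrary choices) compares the empirical lower tail of such a sum against the standard lower-tail bound exp(−ε²·μ_L/2R):

```python
import math
import random

random.seed(0)
R, k, eps = 2.0, 400, 0.3
mu = k * R / 2          # E[Y] for uniform [0, R] summands, so mu_L = mu_U = mu

def trial():
    # One sample of Y = sum_i Y_i with Y_i uniform in [0, R]
    return sum(random.uniform(0, R) for _ in range(k))

trials = 2000
lower_tail = sum(trial() <= (1 - eps) * mu for _ in range(trials)) / trials

# Standard scalar Chernoff lower-tail bound for [0, R]-bounded summands
chernoff = math.exp(-eps**2 * mu / (2 * R))
assert lower_tail <= chernoff
```

The empirical tail is far below the bound here; Chernoff is loose but exponentially small.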

Page 4: Matrix Concentration

Rudelson's Sampling Lemma

• Theorem: [Rudelson '99] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices of size n×n s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I. Then Y is within a 1±ε factor of k·I with high probability once k = Ω(R log n / ε²).

• Example: Balls and bins
  – Throw k balls uniformly into n bins
  – Y_i = uniform over { n·e_b e_bᵀ : b = 1, …, n }  (so E[Y_i] = I and ‖Y_i‖ = n = R)
  – If k = O(n log n / ε²), all bins have the same load up to a factor 1±ε
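A small simulation of this example (my own sketch; the constant 10 inside k is an arbitrary safe choice for the hidden O(·) constant) checks that with k = O(n log n / ε²) balls every bin's load is within a 1±ε factor of the average:

```python
import math
import random

random.seed(1)
n, eps = 50, 0.6
k = int(10 * n * math.log(n) / eps**2)   # k = O(n log n / eps^2) balls

counts = [0] * n
for _ in range(k):
    counts[random.randrange(n)] += 1     # throw one ball into a uniform bin

avg = k / n
assert (1 - eps) * avg <= min(counts) and max(counts) <= (1 + eps) * avg
```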

Page 5: Matrix Concentration

Rudelson's Sampling Lemma

• Theorem: [Rudelson '99] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices of size n×n s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I.

• Pros: We've generalized to PSD matrices.
• Mild issue: We assume E[Y_i] = I.
• Cons:
  – Y_i's must be identically distributed
  – rank-1 matrices only

Page 6: Matrix Concentration

Rudelson's Sampling Lemma

• Theorem: [Rudelson-Vershynin '07] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I.

• Pros: We've generalized to PSD matrices.
• Mild issue: We assume E[Y_i] = I.
• Cons:
  – Y_i's must be identically distributed
  – rank-1 matrices only

Page 7: Matrix Concentration

Rudelson's Sampling Lemma

• Theorem: [Rudelson-Vershynin '07] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices s.t. E[Y_i] = I. Let Y = Σ_i Y_i, so E[Y] = k·I. Assume Y_i ⪯ R·I.

• Notation:
  – A ⪯ B  ⟺  B − A is PSD
  – α·I ⪯ A ⪯ β·I  ⟺  all eigenvalues of A lie in [α, β]
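This notation is easy to check numerically; the helper below (my own illustration, using numpy) tests A ⪯ B by verifying that B − A has no negative eigenvalue:

```python
import numpy as np

def loewner_leq(A, B, tol=1e-10):
    """A <= B in the Loewner (PSD) order iff B - A has no negative eigenvalue."""
    return np.linalg.eigvalsh(B - A).min() >= -tol

I = np.eye(2)
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])          # eigenvalues 0.5 and 1.5
assert loewner_leq(0.5 * I, A)      # 0.5*I <= A, since lambda_min(A) = 0.5
assert loewner_leq(A, 1.5 * I)      # A <= 1.5*I, since lambda_max(A) = 1.5
assert not loewner_leq(2.0 * I, A)  # 2*I <= A fails
```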

• Mild issue: We assume E[Y_i] = I.

Page 8: Matrix Concentration

Rudelson's Sampling Lemma

• Theorem: [Rudelson-Vershynin '07] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices. Let Z = E[Y_i] and Y = Σ_i Y_i, so E[Y] = k·Z. Assume Y_i ⪯ R·Z. Then (1−ε)·k·Z ⪯ Y ⪯ (1+ε)·k·Z with high probability.

• Proof: Apply the previous theorem to { Z^{-1/2} Y_i Z^{-1/2} : i = 1, …, k }.
• Use the fact that A ⪯ B ⟺ Z^{-1/2} A Z^{-1/2} ⪯ Z^{-1/2} B Z^{-1/2}.
• So (1−ε)·k·Z ⪯ Σ_i Y_i ⪯ (1+ε)·k·Z  ⟺  (1−ε)·k·I ⪯ Σ_i Z^{-1/2} Y_i Z^{-1/2} ⪯ (1+ε)·k·I.
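The conjugation fact used in this reduction can be sanity-checked numerically; a sketch (my own, with an arbitrary random positive definite Z, and A ⪯ B constructed by adding a PSD gap to A):

```python
import numpy as np

rng = np.random.default_rng(0)

def psd_leq(A, B, tol=1e-8):
    # A <= B in the Loewner order iff B - A is PSD
    return np.linalg.eigvalsh(B - A).min() >= -tol

n = 4
M = rng.standard_normal((n, n)); Z = M @ M.T + n * np.eye(n)  # positive definite Z
G = rng.standard_normal((n, n)); A = G @ G.T
H = rng.standard_normal((n, n)); B = A + H @ H.T              # B - A is PSD, so A <= B

# Z^{-1/2} via the eigendecomposition of Z
w, V = np.linalg.eigh(Z)
Zinvhalf = V @ np.diag(w ** -0.5) @ V.T

assert psd_leq(A, B)
assert psd_leq(Zinvhalf @ A @ Zinvhalf, Zinvhalf @ B @ Zinvhalf)
```

The second assertion holds because Z^{-1/2}(B − A)Z^{-1/2} = (Z^{-1/2}H)(Z^{-1/2}H)ᵀ is again PSD.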

Page 9: Matrix Concentration

Ahlswede-Winter Inequality

• Theorem: [Ahlswede-Winter '02] Let Y_1, …, Y_k be i.i.d. PSD matrices of size n×n. Let Z = E[Y_i] and Y = Σ_i Y_i, so E[Y] = k·Z. Assume Y_i ⪯ R·Z. Then (1−ε)·k·Z ⪯ Y ⪯ (1+ε)·k·Z with high probability.

• Pros:
  – We've removed the rank-1 assumption.
  – The proof is much easier than Rudelson's proof.
• Cons:
  – Still need the Y_i's to be identically distributed.
    (More precisely, poor results unless E[Y_a] = E[Y_b] for all a, b.)

Page 10: Matrix Concentration

Tropp's User-Friendly Tail Bound

• Theorem: [Tropp '12] Let Y_1, …, Y_k be independent PSD matrices of size n×n s.t. ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i and suppose μ_L·I ⪯ E[Y] ⪯ μ_U·I. Then

    Pr[ λ_min(Y) ≤ (1−ε)·μ_L ] ≤ n·( e^{−ε} / (1−ε)^{1−ε} )^{μ_L / R}    and    Pr[ λ_max(Y) ≥ (1+ε)·μ_U ] ≤ n·( e^ε / (1+ε)^{1+ε} )^{μ_U / R}.

• Pros:
  – Y_i's do not need to be identically distributed
  – Poisson-like bound for the right tail
  – The proof is not difficult (but uses Lieb's inequality)
• Mild issue: Poor results unless λ_min(E[Y]) ≈ λ_max(E[Y]).
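A quick numeric illustration (my own sketch, using i.i.d. rank-1 summands, so only a special case of the theorem; all parameters are arbitrary): for rank-one matrices v_i v_iᵀ with v_i uniform on the sphere of radius √R, E[Y] = (kR/n)·I, and the extreme eigenvalues of Y land within a 1±ε factor of μ = kR/n:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, R, eps = 5, 2000, 1.0, 0.3

# k rank-one PSD summands Y_i = v_i v_i^T with ||Y_i|| = R:
# v_i uniform on the sphere of radius sqrt(R), so E[Y] = (k R / n) * I.
V = rng.standard_normal((k, n))
V *= np.sqrt(R) / np.linalg.norm(V, axis=1, keepdims=True)
Y = V.T @ V                      # sum_i v_i v_i^T

mu = k * R / n                   # here mu_L = mu_U = mu
evals = np.linalg.eigvalsh(Y)
assert (1 - eps) * mu <= evals.min() and evals.max() <= (1 + eps) * mu
```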

Page 11: Matrix Concentration

Tropp's User-Friendly Tail Bound

• Theorem: [Tropp '12] Let Y_1, …, Y_k be independent PSD matrices of size n×n. Let Y = Σ_i Y_i and Z = E[Y]. Suppose Y_i ⪯ R·Z. Then (1−ε)·Z ⪯ Y ⪯ (1+ε)·Z with high probability.

Page 12: Matrix Concentration

Tropp's User-Friendly Tail Bound

• Theorem: [Tropp '12] (as above) Let Y_1, …, Y_k be independent PSD matrices of size n×n s.t. ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i and suppose μ_L·I ⪯ E[Y] ⪯ μ_U·I.

• Example: Balls and bins
  – For b = 1, …, n:
    – For t = 1, …, 8 log(n)/ε²:
      – With prob ½, throw a ball into bin b
  – I.e., let Y_{b,t} = e_b e_bᵀ with prob ½, otherwise 0.
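Simulating this construction (my own sketch; since each Y_{b,t} is diagonal, the sum is diagonal too, so it suffices to track each bin's load):

```python
import math
import random

random.seed(0)
n, eps = 40, 0.5
T = int(8 * math.log(n) / eps**2)   # repetitions per bin, as on the slide

# Y_{b,t} = e_b e_b^T with prob 1/2, else 0: the sum's diagonal entry b
# is exactly the load of bin b.
load = [sum(random.random() < 0.5 for _ in range(T)) for _ in range(n)]

mu = T / 2   # expected load of each bin
assert (1 - eps) * mu <= min(load) and max(load) <= (1 + eps) * mu
```

These Y_{b,t} are independent but not identically distributed (different bins b), which is exactly the case Tropp's bound handles.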

Page 13: Matrix Concentration

Additive Error

• Previous theorems give multiplicative error:
    (1−ε)·E[Σ_i Y_i] ⪯ Σ_i Y_i ⪯ (1+ε)·E[Σ_i Y_i]

• Additive error is also useful: ‖Σ_i Y_i − E[Σ_i Y_i]‖ ≤ ε
• Theorem: [Rudelson-Vershynin '07] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices. Let Z = E[Y_i]. Suppose ‖Z‖ ≤ 1 and ‖Y_i‖ ≤ R. Then E‖(1/k)·Σ_i Y_i − Z‖ ≤ O(√(R·log k / k)).
• Theorem: [Magen-Zouzias '11] If instead rank(Y_i) ≤ k := Θ(R·log(R/ε²)/ε²), then the same additive error bound holds.
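An empirical look at the additive error in the Rudelson-Vershynin setting (my own sketch; taking v_i uniform on the unit sphere gives Z = I/n with ‖Z‖ = 1/n ≤ 1 and R = 1, and the 0.02 threshold is an arbitrary slack above the ~√(R log k / k) scale for these parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 20000

# Y_i = v_i v_i^T with v_i uniform on the unit sphere:
# Z = E[Y_i] = I/n, so ||Z|| = 1/n <= 1 and ||Y_i|| = 1 =: R.
V = rng.standard_normal((k, n))
V /= np.linalg.norm(V, axis=1, keepdims=True)
M = V.T @ V / k                           # (1/k) * sum_i Y_i

additive_err = np.linalg.norm(M - np.eye(n) / n, 2)   # spectral norm
assert additive_err <= 0.02
```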

Page 14: Matrix Concentration

Proof of Ahlswede-Winter

• Key idea: bound the matrix moment generating function.
• Let S_k = Σ_{i=1}^k Y_i. By the Golden-Thompson inequality, tr e^{A+B} ≤ tr(e^A·e^B), so

    E[ tr e^{t·S_k} ] ≤ E[ tr( e^{t·S_{k−1}} · e^{t·Y_k} ) ] ≤ ‖E[e^{t·Y_k}]‖ · E[ tr e^{t·S_{k−1}} ].

• By induction, E[ tr e^{t·S_k} ] ≤ n · Π_i ‖E[e^{t·Y_i}]‖.
• Weakness: pulling out the norm ‖E[e^{t·Y_k}]‖ is brutal.
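Golden-Thompson itself can be checked numerically; a sketch (my own illustration; the symmetric matrix exponential is computed via an eigendecomposition rather than a library expm):

```python
import numpy as np

rng = np.random.default_rng(0)

def sym_expm(A):
    """exp(A) for symmetric A, via the eigendecomposition A = V diag(w) V^T."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T      # V diag(e^w) V^T

def rand_sym(n):
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

A, B = rand_sym(4), rand_sym(4)
lhs = np.trace(sym_expm(A + B))            # tr e^{A+B}
rhs = np.trace(sym_expm(A) @ sym_expm(B))  # tr e^A e^B
assert lhs <= rhs + 1e-9                   # Golden-Thompson
```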

Page 15: Matrix Concentration

How to improve Ahlswede-Winter?

• Golden-Thompson Inequality: tr e^{A+B} ≤ tr(e^A·e^B) for all symmetric matrices A, B.
• It does not extend to three matrices! tr e^{A+B+C} ≤ tr(e^A·e^B·e^C) is FALSE.
• Lieb's Inequality: For any symmetric matrix L, the map f : PSD cone → ℝ defined by f(A) = tr exp(L + log(A)) is concave.
  – So f interacts nicely with expectation and Jensen's inequality.
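Lieb's concavity can likewise be probed numerically; a midpoint-concavity check on random positive definite inputs (my own sketch; the 0.1·I shift just keeps log(A) well-defined):

```python
import numpy as np

rng = np.random.default_rng(0)

def sym_expm(A):
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def sym_logm(A):
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.T

def f(L, A):
    """Lieb: A -> tr exp(L + log A), concave over positive definite A."""
    return np.trace(sym_expm(L + sym_logm(A)))

n = 3
M = rng.standard_normal((n, n)); L = (M + M.T) / 2
GA = rng.standard_normal((n, n)); A = GA @ GA.T + 0.1 * np.eye(n)
GB = rng.standard_normal((n, n)); B = GB @ GB.T + 0.1 * np.eye(n)

# Midpoint concavity: f((A+B)/2) >= (f(A) + f(B)) / 2
gap = f(L, (A + B) / 2) - (f(L, A) + f(L, B)) / 2
assert gap >= -1e-6
```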

Page 16: Matrix Concentration

Beyond the basics

• Hoeffding (non-uniform bounds on the Y_i's) [Tropp '12]
• Bernstein (uses a bound on Var[Y_i]) [Tropp '12]
• Freedman (martingale version of Bernstein) [Tropp '12]
• Stein's method (slightly sharper results) [Mackey et al. '12]
• Pessimistic estimators for the Ahlswede-Winter inequality [Wigderson-Xiao '08]

Page 17: Matrix Concentration

Summary

• We now have beautiful, powerful, flexible extensions of the Chernoff bound to matrices.

• Ahlswede-Winter has a simple proof; Tropp's inequality is very easy to use.

• Several important uses to date; hopefully more uses in the future.