Metric embeddings and geometric inequalities
Lectures by Professor Assaf Naor
Scribes: Holden Lee and Kiran Vodrahalli
April 27, 2016
MAT529 Metric embeddings and geometric inequalities
Contents

1 The Ribe Program
  1 The Ribe Program
  2 Bourgain's Theorem implies Ribe's Theorem
    2.1 Bourgain's discretization theorem
    2.2 Bourgain's Theorem implies Ribe's Theorem
2 Restricted invertibility principle
  1 Restricted invertibility principle
    1.1 The first restricted invertibility principles
    1.2 A general restricted invertibility principle
  2 Ky Fan maximum principle
  3 Finding big subsets
    3.1 Little Grothendieck inequality
      3.1.1 Tightness of Grothendieck's inequality
    3.2 Pietsch Domination Theorem
    3.3 A projection bound
    3.4 Sauer-Shelah Lemma
  4 Proof of RIP
    4.1 Step 1
    4.2 Step 2
3 Bourgain's Discretization Theorem
  1 Outline
    1.1 Reduction to smooth norm
  2 Bourgain's almost extension theorem
    2.1 Lipschitz extension problem
    2.2 John ellipsoid
    2.3 Proof
  3 Proof of Bourgain's discretization theorem
    3.1 The Poisson semigroup
  4 Upper bound on $\delta_{X \hookrightarrow Y}$
    4.1 Fourier analysis on the Boolean cube and Enflo's inequality
  5 Improvement for $L_p$
  6 Kirszbraun's Extension Theorem
4 Grothendieck's Inequality
  1 Grothendieck's Inequality
  2 Grothendieck's Inequality for Graphs (by Arka and Yuval)
  3 Noncommutative Version of Grothendieck's Inequality - Thomas and Fred
  4 Improving the Grothendieck constant
  5 Application to Fourier series
  6 ? presentation
5 Lipschitz extension
  1 Introduction and Problem Setup
    1.1 Why do we care?
    1.2 Outline of the Approach
  2 $\lambda$-Magnification
  3 Wasserstein-1 norm
  4 Graph theoretic lemmas
    4.1 Expanders
    4.2 An Application of Menger's Theorem
  5 Main Proof
Introduction
The topic of this course is geometric inequalities with applications to metric embeddings; and we are actually going to do things more general than metric embeddings: metric geometry. Today I will roughly explain what I want to cover and hopefully start proving a first major theorem. The strategy for this course is to teach novel work. Sometimes topics will be covered in textbooks, but a lot of these things will be a few weeks old. There are also some things which may not have even been written yet. I want to give you a taste for what's going on in the field. Notes from the course I taught last spring are also available.
One of my main guidelines in choosing topics will be topics that have many accessible open questions. I will mention open questions as we go along. I'm going to choose topics whose proofs are entirely self-contained. I'm trying to assume nothing. My intention is to make everything completely clear; there should be no steps you don't understand.
Now, this is a huge area. I will explain some of the background today. I'm going to use proofs of some major theorems as excuses to see the lemmas that go into the proofs. Some of the theorems are very famous and major, and you're going to see some improvements, but along the way we will see lemmas which are immensely powerful. So we will always be proving a concrete theorem, but somewhere along the way there are lemmas which have wide applicability to many areas. The theorems are important, but they are also excuses to discuss methods.
The course can go in many directions: if some of you have particular interests, we can always change the direction of the course, so express your interests as we go along.
Chapter 1
The Ribe Program
2/1/16
1 The Ribe Program
The main motivation for most of what we will discuss is called the Ribe program, a research program many hundreds of papers large. We will see some snapshots of it, and it all comes from a theorem from 1975, Ribe's rigidity theorem (Theorem 1.1.9), which we will state now and prove later in a modern way. This theorem was Martin Ribe's dissertation, which started a whole direction of mathematics; after he wrote his dissertation he left mathematics. He's apparently a government official in Sweden. The theorem is in the context of Banach spaces: a relation between their linear structure and their structure as metric spaces. Now for some terminology.
Definition 1.1.1 (Banach space): A Banach space is a complete normed vector space. The norm defines vector lengths and distances between vectors, and completeness means that every Cauchy sequence converges to a limit inside the space.
Definition 1.1.2: Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be Banach spaces. We say that $X$ is (crudely) finitely representable in $Y$ if there exists some constant $K > 0$ such that for every finite-dimensional linear subspace $F \subseteq X$, there is a linear operator $T: F \to Y$ such that for every $x \in F$,

$$\|x\|_X \le \|Tx\|_Y \le K\|x\|_X.$$

Note $K$ is decided once and for all, before the subspace $F$ is chosen. (Some authors use "finitely representable" to mean that this is true for every $K = 1 + \varepsilon$. We will not follow this terminology.)

Finite representability is important because it allows us to conclude that $X$ has the same finite-dimensional linear properties (local properties) as $Y$. That is, it preserves any invariant involving finitely many vectors, their lengths, etc.
Let's introduce some local properties, like type. To motivate the definition, consider the triangle inequality, which says

$$\|y_1 + \cdots + y_n\|_X \le \|y_1\|_X + \cdots + \|y_n\|_X.$$

In what sense can we improve the triangle inequality? In $L_1$ this is the best you can say. In many spaces there are ways to improve it if you think about it correctly. For any choice of signs $\varepsilon_1, \dots, \varepsilon_n \in \{\pm 1\}$,

$$\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_X \le \sum_{i=1}^n \|y_i\|_X.$$
Definition 1.1.3 (type): Say that $X$ has type $p$ if there exists a constant $C > 0$ such that for every $n$ and every $y_1, \dots, y_n \in X$,

$$\mathbb{E}_{\varepsilon \in \{\pm 1\}^n}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_X \le C\Big(\sum_{i=1}^n \|y_i\|_X^p\Big)^{1/p}.$$

The $\ell_p$ norm is always at most the $\ell_1$ norm; if the lengths are spread out, the right-hand side is asymptotically much smaller. Say that $X$ has nontrivial type if this holds for some $p > 1$.

For example, $L_p(\mu)$ has type $\min(p, 2)$.
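The example $L_p(\mu)$ having type $\min(p, 2)$ is sharp at $p = 1$. Here is the standard computation (my own summary, not from the lecture) showing that $\ell_1$, and hence $L_1$, has no nontrivial type: test the definition on the first $n$ standard basis vectors.

```latex
% In \ell_1, take y_i = e_i, the i-th standard basis vector, for i = 1, ..., n.
% Every sign pattern gives \|\sum_i \varepsilon_i e_i\|_1 = n, so
\[
\mathbb{E}_{\varepsilon}\Big\|\sum_{i=1}^n \varepsilon_i e_i\Big\|_{\ell_1} = n,
\qquad\text{while}\qquad
C\Big(\sum_{i=1}^n \|e_i\|_{\ell_1}^p\Big)^{1/p} = C\,n^{1/p}.
\]
% Type p would force n \le C n^{1/p} for every n, which fails for any p > 1.
% So \ell_1 has no nontrivial type, consistent with \min(p, 2) = 1 at p = 1.
```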
Later we'll see a version of "type" for metric spaces. How far the triangle inequality is from being an equality is a common theme in many questions. In the case of normed spaces, this controls a lot of the geometry. Proving a result for some $p > 1$ is hugely important.
Proposition 1.1.4: If $X$ is finitely representable in $Y$ and $Y$ has type $p$, then $X$ also has type $p$.
Proof. Let $x_1, \dots, x_n \in X$ and let $F = \mathrm{span}\{x_1, \dots, x_n\}$. Finite representability gives a linear operator $T: F \to Y$ with $\|x\|_X \le \|Tx\|_Y \le K\|x\|_X$. Let $y_i = Tx_i$. What can we say about $\mathbb{E}\|\sum_i \varepsilon_i y_i\|_Y$? On one hand, by linearity of $T$ and the lower bound,

$$\mathbb{E}_\varepsilon\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_Y = \mathbb{E}_\varepsilon\Big\|T\Big(\sum_{i=1}^n \varepsilon_i x_i\Big)\Big\|_Y \ge \mathbb{E}_\varepsilon\Big\|\sum_{i=1}^n \varepsilon_i x_i\Big\|_X.$$

On the other hand, since $Y$ has type $p$ with constant $C$,

$$\mathbb{E}_\varepsilon\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_Y \le C\Big(\sum_{i=1}^n \|Tx_i\|_Y^p\Big)^{1/p} \le CK\Big(\sum_{i=1}^n \|x_i\|_X^p\Big)^{1/p}.$$

Putting these two inequalities together gives the result.
Theorem 1.1.5 (Kahane's inequality). For any normed space $X$ and any $q \ge 1$: for all $n$ and $y_1, \dots, y_n \in X$,

$$\mathbb{E}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_X \gtrsim_q \Big(\mathbb{E}\Big[\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_X^q\Big]\Big)^{1/q}.$$

Here $\gtrsim_q$ means "up to a constant"; subscripts say what the constant depends on. The constant here does not depend on the norm of $X$.
Kahane's theorem tells us that the left-hand side of the inequality in Definition 1.1.3 can be replaced by the $q$-th moment for any $q$, if we change $\le$ to $\lesssim$. In particular, having type $p$ is equivalent to

$$\mathbb{E}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_X^p \lesssim C^p \sum_{i=1}^n \|y_i\|_X^p.$$
Recall the parallelogram identity in a Hilbert space $H$:

$$\mathbb{E}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_H^2 = \sum_{i=1}^n \|y_i\|_H^2.$$

A different way to understand the inequality in the definition of "type" is: how far is a given norm from being a Euclidean norm? The Jordan-von Neumann theorem says that if the parallelogram identity holds, then the space is Euclidean. What happens if we turn it into an inequality,

$$\mathbb{E}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|^2 \;\ge\; \sum_{i=1}^n \|y_i\|^2 \qquad\text{or}\qquad \mathbb{E}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|^2 \;\le\; \sum_{i=1}^n \|y_i\|^2\,?$$
Either inequality (required for all choices of vectors) still characterizes a Euclidean space. What happens if we add constants or change the power? We recover the definitions of type and of cotype (which has the inequality going the other way):

$$\mathbb{E}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|^p \lesssim \sum_{i=1}^n \|y_i\|^p \qquad\text{or}\qquad \mathbb{E}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|^q \gtrsim \sum_{i=1}^n \|y_i\|^q.$$

Definition 1.1.6: Say that $X$ has cotype $q$ if there exists $C_q > 0$ such that

$$\sum_{i=1}^n \|y_i\|_X^q \le C_q^q\, \mathbb{E}\Big\|\sum_{i=1}^n \varepsilon_i y_i\Big\|_X^q.$$
R. C. James invented the local theory of Banach spaces: the study of geometric properties involving finitely many vectors ("for all $x_1, \dots, x_n$, $P(x_1, \dots, x_n)$ holds"). As a counterexample, reflexivity cannot be characterized using finitely many vectors (this is a theorem).
Ribe discovered a link between metric and linear structure. First, some terminology.
9
MAT529 Metric embeddings and geometric inequalities
Definition 1.1.7: Two Banach spaces $X$, $Y$ are uniformly homeomorphic if there exists $f: X \to Y$ that is one-to-one and onto such that $f$ and $f^{-1}$ are uniformly continuous.
Without the word "uniformly," if you think of the spaces as topological spaces, all of them are equivalent. Things become interesting when you quantify! "Uniformly" means you're controlling the quantity.
Theorem 1.1.8 (Kadec). Any two infinite-dimensional separable Banach spaces are homeomorphic.
This is an amazing fact: these spaces are all topologically equivalent to Hilbert space! Over time, people found more examples of Banach spaces that are homeomorphic but not uniformly homeomorphic. Ribe's rigidity theorem clarified a big chunk of what was happening.
Theorem 1.1.9 (Rigidity Theorem, Martin Ribe (1975)). Suppose that $X$, $Y$ are uniformly homeomorphic Banach spaces. Then $X$ is finitely representable in $Y$ and $Y$ is finitely representable in $X$.
For example, for $L_p$ and $L_q$ with $p \ne q$, it is always the case that one is not finitely representable in the other, and hence by Ribe's theorem $L_p$, $L_q$ are not uniformly homeomorphic. (When I write $L_p$, I mean $L_p(\mathbb{R})$.)
Theorem 1.1.10. For every $p \ge 1$, $p \ne 2$, $L_p$ and $\ell_p$ are finitely representable in each other, yet not uniformly homeomorphic.

(Here $\ell_p$ is the sequence space.)
Exercise 1.1.11: Prove the first part of this theorem: $L_p$ is finitely representable in $\ell_p$.
Hint: approximate using step functions; you'll need to remember some measure theory. When $p = 2$, $L_p$ and $\ell_p$ are separable and isometric. The theorem in the various cases was proved by:
1. $p = 1$: Enflo.

2. $1 < p < 2$: Bourgain.

3. $p > 2$: Gorelik, applying the Brouwer fixed point theorem (topology).
Every local linear property of a Banach space (type, cotype, etc.; anything involving summing, powers, and so on) is preserved under a general nonlinear deformation.
After Ribe's rigidity theorem, people wondered: can we reformulate the local theory of Banach spaces without mentioning anything about the linear structure? Ribe's rigidity theorem is more of an existence statement; it doesn't give an explicit dictionary which maps statements about linear structure into statements about metric spaces. So people started to wonder whether we could reformulate the local theory of Banach spaces so that it only looks at distances between pairs of points instead of summing things up. Local theory is one of the hugest subjects in analysis. If you could actually find a dictionary which takes one linear theorem at a time and restates it with only distances, there is huge potential here! Because such a reformulated definition of type only involves distances between points, we can talk about a metric space's type or cotype. So maybe we can use the intuition given by linear arguments, and then state things for metric spaces which, often for very different reasons, remain true beyond the linear domain. And then maybe you can apply these arguments to graphs, or groups! We would be able to prove things about much more general metric spaces. Thus, we end up applying theorems about linear spaces in situations that a priori have nothing to do with linear spaces. This is massively powerful.
There are very crucial entries missing in this dictionary. We don't even know how to define many of the properties! This program has many interesting proofs, and some of the most interesting conjectures are about how to define things!
Corollary 1.1.12. If $X$, $Y$ are uniformly homeomorphic and one of them has type $p$, then so does the other.
This follows from Ribe's Theorem 1.1.9 and Proposition 1.1.4. Can we prove something like this without using Ribe's Theorem 1.1.9? We want to reformulate the definition of type using only distances, so that this corollary becomes self-evident.
Enflo had an amazing idea. Suppose $X$ is a Banach space and $x_1, \dots, x_n \in X$. The type $p$ inequality says

$$\mathbb{E}\Big[\Big\|\sum_{i=1}^n \varepsilon_i x_i\Big\|_X^p\Big] \lesssim_X \sum_{i=1}^n \|x_i\|_X^p. \tag{1.1}$$
Let's rewrite this in a silly way. Define $f: \{\pm 1\}^n \to X$ by

$$f(\varepsilon_1, \dots, \varepsilon_n) = \sum_{i=1}^n \varepsilon_i x_i,$$

and write $\varepsilon = (\varepsilon_1, \dots, \varepsilon_n)$. Since $f(\varepsilon) - f(-\varepsilon) = 2\sum_{i=1}^n \varepsilon_i x_i$ and $f(\varepsilon) - f(\varepsilon_1, \dots, -\varepsilon_i, \dots, \varepsilon_n) = 2\varepsilon_i x_i$, multiplying (1.1) by $2^p$ lets us write it as

$$\mathbb{E}\big[\|f(\varepsilon) - f(-\varepsilon)\|_X^p\big] \lesssim_X \sum_{i=1}^n \mathbb{E}\big[\|f(\varepsilon) - f(\varepsilon_1, \dots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \dots, \varepsilon_n)\|_X^p\big]. \tag{1.2}$$
This inequality involves only distances between the points $f(\varepsilon)$, so it is the reformulation we seek.
Definition 1.1.13: A metric space $(X, d_X)$ has Enflo type $p$ if there exists $C > 0$ such that for every $n$ and every $f: \{\pm 1\}^n \to X$,

$$\mathbb{E}\big[d_X(f(\varepsilon), f(-\varepsilon))^p\big] \le C^p \sum_{i=1}^n \mathbb{E}\big[d_X(f(\varepsilon), f(\varepsilon_1, \dots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \dots, \varepsilon_n))^p\big].$$
This is bold: the inequality was not previously asserted for a general function! The discrete cube (Boolean hypercube) $\{\pm 1\}^n$ is the set of all sign vectors, of which there are $2^n$. Our function just assigns $2^n$ points arbitrarily, with no structure whatsoever. As they are indexed this way, you see nothing; you're free to label them by vertices of the cube however you want, and there are many labelings! In (1.2) the points had to be vertices of a cube; in Definition 1.1.13 they are arbitrary. The moment you choose a labeling, you impose a cube structure on the points: some pairs are diagonals of the cube, some are edges. $f(\varepsilon)$ and $f(-\varepsilon)$ are images of antipodal points, but the pair is not really a diagonal; these are just points of a metric space, and which pairs count as diagonals is a function of how you decided to label them. What the inequality says is that the sum over all diagonals of the $p$-th power of the length is, up to a constant, at most the corresponding sum over all edges (the pairs where exactly one $\varepsilon_i$ differs):

$$\sum_{\text{diagonals}} d_X^p \;\lesssim\; \sum_{\text{edges}} d_X^p.$$
This is a vast generalization of type; we don't even know whether every Banach space of type $p$ satisfies it. The following is one of my favorite conjectures.
Conjecture 1.1.14 (Enflo). If a Banach space has type $p$, then it also has Enflo type $p$.
This has been open for 40 years. We will prove the following.
Theorem 1.1.15 (Bourgain-Milman-Wolfson, Pisier). If $X$ is a Banach space of type $p > 1$, then $X$ also has Enflo type $p - \varepsilon$ for every $\varepsilon > 0$.
If you know the type inequality for parallelepipeds (the linear case), you get it for arbitrary sets of points, up to $\varepsilon$: basically, you get arbitrarily close to $p$ instead of the exact result. We also know that the conjecture stated before is true for a lot of specific Banach spaces, though we do not yet have the general result. For instance, it is true for $L_4$: indexing the points by vertices, some pairs are edges and some are diagonals, and the sum of fourth powers over the diagonals is at most a constant times that over the edges.
How do you go from knowing this for a linear function to deducing it for an arbitrary function?
Once you do this, you have a new entry in the hidden Ribe dictionary: if $X$ and $Y$ are uniformly homeomorphic Banach spaces and $X$ has Enflo type $p$, then so does $Y$. The minute you throw away the linear structure, Corollary 1.1.12 becomes easy; it requires only a tiny argument. Now you can take a completely arbitrary function $f: \{\pm 1\}^n \to Y$. There exists a homeomorphism $\psi: X \to Y$ such that $\psi, \psi^{-1}$ are uniformly continuous, and we want to deduce the inequality in $Y$ from the same inequality in $X$.
Proposition 1.1.16: If $X$, $Y$ are uniformly homeomorphic Banach spaces and $X$ has Enflo type $p$, then so does $Y$.
This is an example of making the abstract Ribe theorem explicit.
Lemma 1.1.17 (Corson-Klee). If $X$, $Y$ are Banach spaces and $f: X \to Y$ is uniformly continuous, then for every $\varepsilon > 0$ there exists $L(\varepsilon) > 0$ such that

$$\|x_1 - x_2\|_X \ge \varepsilon \implies \|f(x_1) - f(x_2)\|_Y \le L(\varepsilon)\,\|x_1 - x_2\|_X.$$

Proof sketch of Proposition 1.1.16 given Lemma 1.1.17. By the definition of uniformly homeomorphic, there exists a homeomorphism $\psi: X \to Y$ such that $\psi, \psi^{-1}$ are uniformly continuous. Lemma 1.1.17 tells us that $\psi$ preserves distances up to a constant, at scales bounded away from zero. Rescaling so that the smallest nonzero distance we see is at least 1, we get the same inequality in the image as in the preimage.
Proof details. Let $\varepsilon^i$ denote $\varepsilon$ with the $i$-th coordinate flipped. We need to prove that for every $f: \{\pm 1\}^n \to Y$,

$$\mathbb{E}\big[d_Y(f(\varepsilon), f(-\varepsilon))^p\big] \lesssim \sum_{i=1}^n \mathbb{E}\big[d_Y(f(\varepsilon), f(\varepsilon^i))^p\big].$$

Let $\psi: X \to Y$ be such that $\psi, \psi^{-1}$ are uniformly continuous, and set $g = \psi^{-1} \circ f: \{\pm 1\}^n \to X$, so that $f = \psi \circ g$. Because $\psi$ is uniformly continuous, there is $C'$ such that $\|x_1 - x_2\|_X \le 1$ implies $\|\psi(x_1) - \psi(x_2)\|_Y \le C'$. Without loss of generality, by scaling $f$ we may assume that the points $f(\varepsilon)$ are at mutual distance at least $\max(1, C')$ ($Y$ is a Banach space, so distances scale linearly, and scaling doesn't affect whether the inequality holds); then the points $g(\varepsilon) = \psi^{-1} \circ f(\varepsilon)$ are at mutual distance at least 1.

Since $X$ has Enflo type $p$ with constant $C$, we may apply the definition to $g$. Writing $L_\psi(1), L_{\psi^{-1}}(1)$ for the Corson-Klee constants (Lemma 1.1.17) of $\psi$ and $\psi^{-1}$ at scale 1, we get

$$\begin{aligned} \mathbb{E}\big[d_Y(f(\varepsilon), f(-\varepsilon))^p\big] &= \mathbb{E}\big[d_Y(\psi \circ g(\varepsilon), \psi \circ g(-\varepsilon))^p\big] \\ &\le L_\psi(1)^p\, \mathbb{E}\big[d_X(g(\varepsilon), g(-\varepsilon))^p\big] \\ &\le L_\psi(1)^p\, C^p \sum_{i=1}^n \mathbb{E}\big[d_X(g(\varepsilon), g(\varepsilon^i))^p\big] \\ &\le L_\psi(1)^p\, C^p\, L_{\psi^{-1}}(1)^p \sum_{i=1}^n \mathbb{E}\big[d_Y(f(\varepsilon), f(\varepsilon^i))^p\big], \end{aligned}$$

as needed; the first inequality uses $d_X(g(\varepsilon), g(-\varepsilon)) \ge 1$, and the last uses $d_Y(f(\varepsilon), f(\varepsilon^i)) \ge 1$.
The inequality for exponent 1 instead of 2 holds in any metric space: it follows from the triangle inequality, applied along a path of coordinate flips joining the endpoints of each diagonal. Type $p > 1$ is a genuine strengthening of the triangle inequality. For which metric spaces does it hold?
What's an example of a metric space where the inequality doesn't hold for any $p > 1$? The cube itself, with the $\ell_1$ metric: the Hamming cube $(\{\pm 1\}^n, \|\cdot\|_1)$ has only Enflo type 1. I will prove that this is the only obstruction: given a metric space that doesn't contain bi-Lipschitz embeddings of arbitrarily large cubes, the inequality holds.
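To see the obstruction concretely, here is the standard computation (my own summary): take $f$ to be the identity map on the cube with the $\ell_1$ metric.

```latex
% Take X = (\{\pm 1\}^n, \|\cdot\|_1) and f = identity. Every diagonal has
% length \|\varepsilon - (-\varepsilon)\|_1 = 2n, and every edge has length 2:
\[
\mathbb{E}\big[\|\varepsilon - (-\varepsilon)\|_1^p\big] = (2n)^p,
\qquad
\sum_{i=1}^n \mathbb{E}\big[\|\varepsilon - \varepsilon^i\|_1^p\big] = n\,2^p.
\]
% Enflo type p would force (2n)^p \le C^p n 2^p, i.e. n^{p-1} \le C^p for all n,
% which fails for every p > 1. So the Hamming cube has only Enflo type 1.
```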
We know an alternative inequality, involving only distances, that is equivalent to type, and I can prove it. It is, however, not a satisfactory solution to the Ribe program. There are other situations where we have complete success.
We will prove some things, then switch gears, slow down, and discuss Grothendieck's inequality and its applications. They will come up in the nonlinear theory later.
2 Bourgain's Theorem implies Ribe's Theorem
2-3-16

We will use the Corson-Klee Lemma 1.1.17.
Proof of Lemma 1.1.17. Suppose $x, y \in X$ with $\|x - y\| \ge \varepsilon$. Break up the line segment from $x$ to $y$ into intervals of length at most $\varepsilon$; let $x = x_0, x_1, \dots, x_k = y$ be the endpoints of those intervals, with

$$\|x_{i+1} - x_i\| \le \varepsilon.$$

The modulus of continuity is defined as

$$\omega_f(t) = \sup\big\{\|f(u) - f(v)\| : u, v \in X,\ \|u - v\| \le t\big\}.$$

Uniform continuity says $\lim_{t \to 0} \omega_f(t) = 0$. The number of intervals is

$$k \le \frac{\|x - y\|}{\varepsilon} + 1 \le \frac{2\|x - y\|}{\varepsilon}.$$

Then

$$\|f(x) - f(y)\| \le \sum_{i=1}^k \|f(x_i) - f(x_{i-1})\| \le k\,\omega_f(\varepsilon) \le \frac{2\omega_f(\varepsilon)}{\varepsilon}\,\|x - y\|,$$

so we can let $L(\varepsilon) = \frac{2\omega_f(\varepsilon)}{\varepsilon}$.
2.1 Bourgain's discretization theorem
There are three known proofs of Ribe's Theorem:

1. Ribe's original proof (1976).

2. HK, 1990, a genuinely different proof.

3. Bourgain's proof, a Fourier-analytic proof which gives a quantitative version. This is the version we'll prove.
Bourgain's proof goes through the Discretization Theorem 1.2.4, and there is an amazing open problem in this context. Saying that $\delta$ is big says that a not-too-fine net already suffices, so we are interested in lower bounds on $\delta$.
Definition 1.2.1 (Discretization modulus): Let $X$ be a finite-dimensional normed space, $\dim(X) = n < \infty$, and let the target space $Y$ be an arbitrary Banach space. Consider the unit ball $B_X$ in $X$ and take a maximal $\delta$-net $\mathcal{N}_\delta$ in $B_X$. Suppose we can embed $\mathcal{N}_\delta$ into $Y$ via $f: \mathcal{N}_\delta \to Y$, meaning that

$$\|x - y\| \le \|f(x) - f(y)\| \le D\,\|x - y\|$$

for all $x, y \in \mathcal{N}_\delta$. (We say that $\mathcal{N}_\delta$ embeds with distortion $D$ into $Y$.)
You can prove using a nice compactness argument that if this holds for small enough $\delta$, then the entire space $X$ embeds into $Y$ with roughly the same distortion. Bourgain's discretization theorem 1.2.4 says that you can choose $\delta = \delta_n$ to be independent of the geometry of $X$ and $Y$: if you embed a $\delta_n$-net of the unit ball of the $n$-dimensional space, you succeed in embedding the whole space.
I often use this theorem in this way: I use continuous methods to show that embedding $X$ into $Y$ requires big distortion, and immediately I get an example with a finite object. Let us now make the notion of distortion more precise.
Definition 1.2.2 (Distortion): Suppose $(X, d_X)$, $(Y, d_Y)$ are metric spaces and $D \ge 1$. We say that $X$ embeds into $Y$ with distortion $D$ if there exist $f: X \to Y$ and $s > 0$ such that for all $x, y \in X$,

$$s\,d_X(x, y) \le d_Y(f(x), f(y)) \le D\,s\,d_X(x, y).$$
The infimum over those $D \ge 1$ such that $X$ embeds into $Y$ with distortion $D$ is denoted $c_Y(X)$. This is a measure of how far $X$ is from being a subgeometry of $Y$.
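A classic warm-up computation with this definition (Enflo's 4-cycle example, reproduced here from memory, not from the lecture): the cycle $C_4$ with its graph metric embeds into Hilbert space with distortion exactly $\sqrt{2}$.

```latex
% Let x_1, x_2, x_3, x_4 be the vertices of C_4 in cyclic order: edges have
% graph distance 1, the two diagonals have graph distance 2. For any four
% points y_1, ..., y_4 in a Hilbert space, expanding \|y_1 - y_2 + y_3 - y_4\|^2 \ge 0
% gives the "short diagonals" inequality
\[
\|y_1 - y_3\|^2 + \|y_2 - y_4\|^2
\le \|y_1 - y_2\|^2 + \|y_2 - y_3\|^2 + \|y_3 - y_4\|^2 + \|y_4 - y_1\|^2.
\]
% If y_i = f(x_i) satisfies s\,d(x,y) \le \|f(x) - f(y)\| \le D s\,d(x,y), the
% left side is at least s^2 (2^2 + 2^2) = 8 s^2 and the right side is at most
% 4 D^2 s^2, so D \ge \sqrt 2. Mapping C_4 onto the vertices of a planar unit
% square attains \sqrt 2, hence c_{\ell_2}(C_4) = \sqrt 2.
```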
Definition 1.2.3 (Discretization modulus): Let $X$ be an $n$-dimensional normed space, let $Y$ be any Banach space, and let $\varepsilon \in (0, 1)$. Let $\delta_{X \hookrightarrow Y}(\varepsilon)$ be the supremum over all those $\delta > 0$ such that for every $\delta$-net $\mathcal{N}_\delta$ in $B_X$,

$$c_Y(\mathcal{N}_\delta) \ge (1 - \varepsilon)\,c_Y(X).$$

Here $B_X := \{x \in X : \|x\| \le 1\}$.
In other words, the distortion of the $\delta$-net is not much smaller than the distortion of the whole space. That is, the discrete $\delta$-net encodes almost all the information about the space when it comes to embedding into $Y$: if you get $c_Y(\mathcal{N}_\delta)$ to be small, then the distortion of the entire object is not much larger.
Theorem 1.2.4 (Bourgain's discretization theorem). For every $n$, every $\varepsilon \in (0, 1)$, and every $X, Y$ with $\dim X = n$,

$$\delta_{X \hookrightarrow Y}(\varepsilon) \ge e^{-\left(\frac{n}{\varepsilon}\right)^{Cn}}.$$

Moreover, for $\delta = e^{-(2n)^{Cn}}$, we have $c_Y(X) \le 2\,c_Y(\mathcal{N}_\delta)$.
Thus there is a $\delta$ depending only on the dimension such that a $\delta$-net of the unit ball of any $n$-dimensional normed space encodes all the information about embedding $X$ into anything else. The amount you have to discretize is just a function of the dimension, not of any of the other relevant geometry.
Remark 1.2.5: The inequality only asserts that you can find a function with the given distortion, but the proof will actually produce a linear operator.
The best known upper bound is

$$\delta_{X \hookrightarrow Y}\Big(\tfrac{1}{2}\Big) \lesssim \frac{1}{n}.$$

The latest progress on the lower bound was 1987; there isn't a better bound yet. You have a month to think about it before you get corrupted by Bourgain's proof.
There is a better bound when the target space is an $L_p$ space.
Theorem 1.2.6 (Giladi, Naor, Schechtman). For every $p \ge 1$ and $\varepsilon \in (0, 1)$, if $\dim X = n$, then

$$\delta_{X \hookrightarrow L_p}(\varepsilon) \gtrsim \frac{\varepsilon^2}{n^{5/2}}.$$
(We still don't know what the right power is.) The case $p = 1$ is important for applications. There are larger classes of spaces, defined by axioms, for which such bounds hold, but there are crazy Banach spaces which don't belong to these classes, so we're not done. We need more tools to prove this: Lipschitz extension theorems, etc.
2.2 Bourgain's Theorem implies Ribe's Theorem
Together with the "moreover" part, Bourgain's theorem implies Ribe's Theorem 1.1.9.
Proof of Ribe's Theorem 1.1.9 from Bourgain's Theorem 1.2.4. Let $X$, $Y$ be Banach spaces that are uniformly homeomorphic. By Corson-Klee 1.1.17 (applied to the homeomorphism and to its inverse, after normalization), there exist $K > 0$ and $f: X \to Y$ such that

$$\|x - y\| \ge 1 \implies \|x - y\| \le \|f(x) - f(y)\| \le K\,\|x - y\|.$$

In particular, if $R > 1$ and $\mathcal{N}$ is a 1-net in

$$R\,B_X = \{x \in X : \|x\| \le R\},$$

then $c_Y(\mathcal{N}) \le K$. Equivalently, by rescaling, for every $\delta > 0$, every $\delta$-net $\mathcal{N}_\delta$ in $B_X$ satisfies $c_Y(\mathcal{N}_\delta) \le K$. If $F \subseteq X$ is a finite-dimensional subspace and $\delta = e^{-(2\dim F)^{C\dim F}}$, then by the "moreover" part of Bourgain's Theorem 1.2.4, there exists a linear operator $T: F \to Y$ such that

$$\|x - y\| \le \|Tx - Ty\| \le 2K\,\|x - y\|$$

for all $x, y \in F$. This means that $X$ is finitely representable in $Y$ (and by symmetry, $Y$ in $X$).
The motivation for this program comes in passing from the continuous to the discrete. The theory has many applications, e.g. to computer science, which cares about finite things. I would like to see an improvement of Bourgain's Theorem 1.2.4.
First we'll prove a theorem that has nothing to do with Ribe's Theorem. It's an easier theorem, and it looks unrelated to the metric theory, but there are lemmas in its proof that we will be using later.
Chapter 2
Restricted invertibility principle
1 Restricted invertibility principle
1.1 The first restricted invertibility principles
We take basic facts in linear algebra and make them quantitative. This is the lesson of the course: when you make things quantitative, new mathematics appears.
Proposition 2.1.1: If $A: \mathbb{R}^m \to \mathbb{R}^m$ is a linear operator, then there exists a linear subspace $V \subseteq \mathbb{R}^m$ with $\dim(V) = \mathrm{rank}(A)$ such that $A: V \to A(V)$ is invertible.
What's the quantitative question we want to ask about this? Invertibility just says that an inverse exists. Can we find a large subspace where not only is $A$ invertible, but the inverse has small norm?
We insist that the subspace be a coordinate subspace. Let $e_1, \dots, e_m$ be the standard basis of $\mathbb{R}^m$, $e_j = (0, \dots, 0, 1, 0, \dots, 0)$ with the 1 in position $j$. The goal is to find a "large" subset $\sigma \subseteq \{1, \dots, m\}$ such that $A$ is invertible on $\mathbb{R}^\sigma$, where

$$\mathbb{R}^\sigma := \{(x_1, \dots, x_m) \in \mathbb{R}^m : x_j = 0 \text{ if } j \notin \sigma\},$$

and the norm of $A^{-1}: A(\mathbb{R}^\sigma) \to \mathbb{R}^\sigma$ is small.

A priori this seems a crazy thing to do; consider a small multiple of the identity. But we can find conditions that allow us to achieve this goal.
Theorem 2.1.2 (Bourgain-Tzafriri restricted invertibility principle, 1987). Let $A: \mathbb{R}^m \to \mathbb{R}^m$ be a linear operator such that

$$\|Ae_j\|_2 = 1$$

for every $j \in \{1, \dots, m\}$. Then there exists $\sigma \subseteq \{1, \dots, m\}$ such that

1. $|\sigma| \ge \frac{cm}{\|A\|^2}$, where $\|A\|$ is the operator norm of $A$;

2. $A$ is invertible on $\mathbb{R}^\sigma$ and the norm of $A^{-1}: A(\mathbb{R}^\sigma) \to \mathbb{R}^\sigma$ is at most $C'$ (i.e., $\|(AJ_\sigma)^{-1}\| \le C'$, in the notation introduced below).

Here $c, C'$ are universal constants.
Suppose the biggest singular value is at most 100. Then you can always find a coordinate subset of proportional size such that on this subset, $A$ is invertible and the inverse has norm bounded by a universal constant.
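As a toy numerical illustration of the statement (not of the proof; the matrix and the subset here are hand-picked rather than produced by the theorem, and Python indices are 0-based):

```python
import numpy as np

# A 4x4 matrix whose columns are the unit vectors e1, e1 (repeated), e2, e3.
# The repeated column makes A singular on all of R^4, yet A restricts to an
# isometry on the coordinate subspace R^sigma chosen below.
e = np.eye(4)
A = np.column_stack([e[:, 0], e[:, 0], e[:, 1], e[:, 2]])

# Every column has Euclidean norm 1, as in the Bourgain-Tzafriri hypothesis.
col_norms = np.linalg.norm(A, axis=0)

# Operator norm ||A|| = sqrt(2), coming from the doubled column e1.
op_norm = np.linalg.norm(A, ord=2)

# Restrict to the coordinate subset sigma = {0, 2, 3} (drop the duplicate).
sigma = [0, 2, 3]
A_sigma = A[:, sigma]
s = np.linalg.svd(A_sigma, compute_uv=False)

# The restricted matrix has orthonormal columns, so all singular values are 1
# and the inverse A^{-1} : A(R^sigma) -> R^sigma has norm 1/s_min = 1.
inverse_norm = 1.0 / s.min()
print(col_norms, op_norm, inverse_norm)
```

Note that $|\sigma| = 3 \ge cm/\|A\|^2 = 4c/2$ here, consistent with the theorem's proportional-size guarantee.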
All of the proofs use something very amazing. This proof is from three weeks ago; the result has been reproved many times, and I'll state a theorem that gives better bounds than the entire history. It was also extended to rectangular matrices (the extension is nontrivial).

Given a linear subspace $V \subseteq \mathbb{R}^m$ with $\dim V = k$ and a linear operator $A: V \to \mathbb{R}^m$, the singular values of $A$,

$$s_1(A) \ge s_2(A) \ge \cdots \ge s_k(A),$$

are the eigenvalues of $(A^*A)^{1/2}$. We can decompose

$$A = UDW,$$

where $D$ is a diagonal matrix with the $s_j(A)$ on the diagonal and $U, W$ are unitary.
Definition 2.1.3: For $p \ge 1$, the Schatten-von Neumann $p$-norm of $A$ is

$$\|A\|_{S_p} := \Big(\sum_{j} s_j(A)^p\Big)^{1/p} = \big(\mathrm{Tr}\big((A^*A)^{p/2}\big)\big)^{1/p} = \big(\mathrm{Tr}\big((AA^*)^{p/2}\big)\big)^{1/p}.$$

The cases $p = \infty, 2$ give the operator and Frobenius norms:

$$\|A\|_{S_\infty} = \text{operator norm}, \qquad \|A\|_{S_2} = \sqrt{\mathrm{Tr}(A^*A)} = \Big(\sum_{i,j} a_{ij}^2\Big)^{1/2}.$$
Exercise 2.1.4: Show that $\|\cdot\|_{S_p}$ is a norm on $M_{m \times m}(\mathbb{R})$. You have to prove that for $A, B$,

$$\big(\mathrm{Tr}\big([(A+B)^*(A+B)]^{p/2}\big)\big)^{1/p} \le \big(\mathrm{Tr}\big((A^*A)^{p/2}\big)\big)^{1/p} + \big(\mathrm{Tr}\big((B^*B)^{p/2}\big)\big)^{1/p}.$$

This requires an idea; note that if $A, B$ commute it is trivial. Apparently von Neumann wrote a paper called "Metric Spaces" in the 1930s in which he just proves this inequality and doesn't know what to do with it, so it was forgotten for a while, until the 1950s, when Schatten wrote books on its applications. When I was a student in grad school, I was taking a class on random matrices. There was a two-week break, and I was certain the inequality was trivial because the professor had not said it was not; it completely ruined my break, though I came up with a different proof of it. It's short, but not trivial: it's not typical linear algebra! This is like another triangle inequality, which we may need later on.
Spielman and Srivastava have a beautiful theorem.
Definition 2.1.5 (Stable rank): Let $A: \mathbb{R}^m \to \mathbb{R}^m$. The stable rank is defined as

$$\mathrm{srank}(A) = \left(\frac{\|A\|_{S_2}}{\|A\|_{S_\infty}}\right)^2.$$

The numerator is the sum of the squares of the singular values, and the denominator is the maximal one. Large stable rank means that many singular values are nonzero, and in fact large on average. Many people wanted to get the size of the subset in the restricted invertibility principle to be close to the stable rank.
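A quick numerical sketch of the definition (hypothetical matrices, chosen only to make the quantity visible): a matrix with $k$ singular values equal to 1 and the rest 0 has stable rank exactly $k$, while one dominant singular value drags the stable rank far below the rank.

```python
import numpy as np

def stable_rank(A: np.ndarray) -> float:
    """srank(A) = (||A||_{S_2} / ||A||_{S_infty})^2: the sum of squared
    singular values divided by the largest squared singular value."""
    s = np.linalg.svd(A, compute_uv=False)
    return float((s ** 2).sum() / (s ** 2).max())

# Projection onto the first 3 coordinates of R^5: singular values (1,1,1,0,0),
# so srank = 3 = rank.
P = np.diag([1.0, 1.0, 1.0, 0.0, 0.0])

# One dominant direction: singular values (10,1,1,1,1), so
# srank = (100 + 4) / 100 = 1.04, far below rank = 5.
D = np.diag([10.0, 1.0, 1.0, 1.0, 1.0])

print(stable_rank(P), stable_rank(D))
```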
Theorem 2.1.6 (Spielman-Srivastava). For every linear operator $A: \mathbb{R}^m \to \mathbb{R}^m$ and every $\varepsilon \in (0, 1)$, there exists $\sigma \subseteq \{1, \dots, m\}$ with $|\sigma| \ge (1 - \varepsilon)\,\mathrm{srank}(A)$ such that

$$\big\|(AJ_\sigma)^{-1}\big\|_{S_\infty} \lesssim \frac{\sqrt{m}}{\varepsilon\,\|A\|_{S_2}}.$$

Here $J_\sigma$ is the identity restricted to $\mathbb{R}^\sigma$, $J_\sigma: \mathbb{R}^\sigma \hookrightarrow \mathbb{R}^m$.
This is stronger than Bourgain-Tzafriri. In Bourgain-Tzafriri the columns were unitvectors.
Proof of Theorem 2.1.2 from Theorem 2.1.6. Let $A$ be as in Theorem 2.1.2. Then $\|A\|_{S_2} = \sqrt{\mathrm{Tr}(A^*A)} = \sqrt{m}$ (the columns are unit vectors) and $\mathrm{srank}(A) = \frac{m}{\|A\|_{S_\infty}^2}$. We obtain the existence of $\sigma$ with

$$|\sigma| \ge (1 - \varepsilon)\frac{m}{\|A\|_{S_\infty}^2} \qquad\text{and}\qquad \big\|(AJ_\sigma)^{-1}\big\|_{S_\infty} \lesssim \frac{\sqrt{m}}{\varepsilon\sqrt{m}} = \frac{1}{\varepsilon}.$$
This is a sharp dependence on $\varepsilon$. The proof introduced algebraic rather than analytic methods; it was eye-opening. Marcus even got sets bigger than the stable rank, by looking at a quantity involving the ratio of $\|A\|_{S_2}$ and $\|A\|_{S_4}$, which is much stronger.
1.2 A general restricted invertibility principle
I'll show a theorem that implies all these intermediate theorems. We use (classical) analysis and geometry instead of algebra. What matters is not the ratio of the norms, but the tail of the distribution of $s_1(A)^2, \dots, s_m(A)^2$.
Theorem 2.1.7. Let $A : \mathbb{R}^m \to \mathbb{R}^n$ be a linear operator. If $k < \operatorname{rank}(A)$, then there exists $\sigma \subseteq \{1,\dots,m\}$ with $|\sigma| = k$ such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \min_{k < r \le \operatorname{rank}(A)} \sqrt{\frac{mr}{(r-k)\sum_{i=r}^{m} s_i(A)^2}}.$$
You have to optimize over $r$. You can recover the ratio-of-norms bounds from the tail bounds. This implies all the known theorems in restricted invertibility. The subset can be as big as you want up to the rank, and we have sharp control in the entire range. This theorem generalizes Spielman-Srivastava (Theorem 2.1.6), which had generalized Bourgain-Tzafriri (Theorem 2.1.2). 2-8-16
Now we will go backwards a bit, and talk about a less general result. After Theorem 2.1.6, a subsequent theorem gave the same conclusion, but with something better than the stable rank.
Theorem 2.1.8 (Marcus, Spielman, Srivastava). If
$$k < \frac{1}{4}\left(\frac{\|A\|_{S_2}}{\|A\|_{S_4}}\right)^4,$$
then there exists $\sigma \subseteq \{1,\dots,m\}$ with $|\sigma| = k$ such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \frac{\sqrt{m}}{\|A\|_{S_2}}.$$
A lot of these quotients of norms started popping up in people's results. The correct generalization is the following notion.
Definition 2.1.9: For $p > 2$, define the stable $p$th rank by
$$\operatorname{srank}_p(A) = \left(\frac{\|A\|_{S_2}}{\|A\|_{S_p}}\right)^{\frac{2p}{p-2}}.$$
Exercise 2.1.10: Show that if $q \ge p > 2$, then
$$\operatorname{srank}_q(A) \le \operatorname{srank}_p(A).$$
(Hint: use Hölder's inequality.)
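A numerical check of the exercise (a sketch; the random seed and the test matrix are arbitrary choices, not from the notes):

```python
import numpy as np

def schatten_norm(s, p):
    # ||A||_{S_p} from the vector s of singular values.
    return (s ** p).sum() ** (1.0 / p)

def srank_p(A, p):
    # srank_p(A) = (||A||_{S_2} / ||A||_{S_p})^{2p/(p-2)}, defined for p > 2.
    s = np.linalg.svd(A, compute_uv=False)
    return (schatten_norm(s, 2) / schatten_norm(s, p)) ** (2.0 * p / (p - 2))

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
ranks = [srank_p(A, p) for p in (2.5, 3, 4, 8, 16, 64)]
# Exercise 2.1.10: q >= p > 2 should give srank_q(A) <= srank_p(A).
assert all(ranks[i] + 1e-9 >= ranks[i + 1] for i in range(len(ranks) - 1))
```

For $A$ the identity on $\mathbb{R}^n$, every $\operatorname{srank}_p$ equals $n$, so the monotonicity is only visible for matrices with spread-out singular values.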
Now we show how Theorem 2.1.7 implies the previously listed results:
Proof that Theorem 2.1.7 generalizes the earlier results. Using Hölder's inequality with exponent $p/2$,
$$\|A\|_{S_2}^2 = \sum_{i=1}^m s_i(A)^2 = \sum_{i=1}^{r-1} s_i(A)^2 + \sum_{i=r}^m s_i(A)^2 \le (r-1)^{1-\frac{2}{p}}\left(\sum_{i=1}^{r-1} s_i(A)^p\right)^{2/p} + \sum_{i=r}^m s_i(A)^2 \le (r-1)^{1-\frac{2}{p}}\|A\|_{S_p}^2 + \sum_{i=r}^m s_i(A)^2.$$
Rearranging,
$$\sum_{i=r}^m s_i(A)^2 \ge \|A\|_{S_2}^2\left(1 - (r-1)^{1-\frac{2}{p}}\,\frac{\|A\|_{S_p}^2}{\|A\|_{S_2}^2}\right) = \|A\|_{S_2}^2\left(1 - \left(\frac{r-1}{\operatorname{srank}_p(A)}\right)^{1-\frac{2}{p}}\right).$$
Now we can plug this calculation into Theorem 2.1.7 to see how it recovers the previous results:
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \min_{k+1 \le r \le \operatorname{rank}(A)} \sqrt{\frac{mr}{(r-k)\,\|A\|_{S_2}^2\left(1 - \left(\frac{r-1}{\operatorname{srank}_p(A)}\right)^{1-\frac{2}{p}}\right)}} = \frac{\sqrt{m}}{\|A\|_{S_2}} \min_{k+1 \le r \le \operatorname{rank}(A)} \sqrt{\frac{r}{(r-k)\left(1 - \left(\frac{r-1}{\operatorname{srank}_p(A)}\right)^{1-\frac{2}{p}}\right)}}.$$
This bound implies all the earlier theorems.
To optimize, fix the stable rank, differentiate in $r$, and set the derivative to $0$. All theorems in the literature follow from this theorem; in particular, we get all the bounds we got before. There was nothing special about the number $4$ in Theorem 2.1.8; what matters is the distribution of the singular values.
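To see the optimization over $r$ concretely, here is a hedged sketch that evaluates the bound of Theorem 2.1.7 numerically; optimizing over $r$ can only improve on any fixed choice, such as the endpoint $r = \operatorname{rank}(A)$ (the instance below is arbitrary):

```python
import numpy as np

def rip_bound(A, k):
    # min over k < r <= rank(A) of sqrt(m*r / ((r-k) * sum_{i=r}^m s_i(A)^2)),
    # with singular values 1-indexed in decreasing order.
    m = A.shape[1]
    s = np.linalg.svd(A, compute_uv=False)
    rank = int(np.linalg.matrix_rank(A))
    best = np.inf
    for r in range(k + 1, rank + 1):
        tail = (s[r - 1:] ** 2).sum()        # zeros beyond the rank contribute nothing
        best = min(best, np.sqrt(m * r / ((r - k) * tail)))
    return best

rng = np.random.default_rng(2)
A = rng.standard_normal((7, 7))
k = 3
rank = int(np.linalg.matrix_rank(A))
s = np.linalg.svd(A, compute_uv=False)
endpoint = np.sqrt(A.shape[1] * rank / ((rank - k) * (s[rank - 1:] ** 2).sum()))
assert rip_bound(A, k) <= endpoint + 1e-12
```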
2 Ky Fan maximum principle
We'll be doing linear algebra. It's mostly mechanical, except we'll need this lemma.
Lemma 2.2.1 (Ky Fan maximum principle). Suppose that $P : \mathbb{R}^n \to \mathbb{R}^n$ is a rank-$k$ orthogonal projection. Then
$$\operatorname{Tr}(A^*AP) \le \sum_{i=1}^k s_i(A)^2,$$
where $s_i(A)$ are the singular values, i.e., $s_i(A)^2$ are the eigenvalues of $B := A^*A$.
This material was given in class on 2-15. This lemma follows from the following general theorem.
Theorem 2.2.2 (A convex function of dot products achieves its maximum at the eigenvectors). Let $B : \mathbb{R}^n \to \mathbb{R}^n$ be symmetric with eigenvalues $\lambda_1, \dots, \lambda_n$, and let $f : \mathbb{R}^n \to \mathbb{R}$ be a convex function. Then for every orthonormal basis $u_1, \dots, u_n$ of $\mathbb{R}^n$, there exists a permutation $\pi$ such that
$$f(\langle Bu_1, u_1\rangle, \dots, \langle Bu_n, u_n\rangle) \le f(\lambda_{\pi(1)}, \dots, \lambda_{\pi(n)}). \tag{2.1}$$
Essentially, we're saying that evaluating at the eigenbasis maximizes the convex function.
Remark 2.2.3: We can weaken the condition on $f$ to the following: for every $i < j$,
$$t \mapsto f(x_1, \dots, x_{i-1}, x_i + t, x_{i+1}, \dots, x_{j-1}, x_j - t, x_{j+1}, \dots, x_n)$$
is convex as a function of $t$. If $f$ is smooth, this is equivalent to the second derivative in $t$ being nonnegative:
$$\frac{\partial^2 f}{\partial x_i^2} + \frac{\partial^2 f}{\partial x_j^2} - 2\,\frac{\partial^2 f}{\partial x_i \partial x_j} \ge 0. \tag{2.2}$$
(This is the quadratic form of the Hessian evaluated at the direction $e_i - e_j$, so it holds whenever the Hessian is positive semidefinite; requiring it only for these pairs $i, j$ is weaker than full convexity.)
Proof. We may assume without loss of generality that:

1. $f$ is smooth. If not, convolve with a smooth approximate identity.

2. Strict inequality holds in (2.2):
$$\frac{\partial^2 f}{\partial x_i^2} + \frac{\partial^2 f}{\partial x_j^2} - 2\,\frac{\partial^2 f}{\partial x_i \partial x_j} > 0.$$
To see this, note that for any $\varepsilon > 0$ we can perturb $f(x)$ into $f(x) + \varepsilon\|x\|_2^2$. The inequality to prove, (2.1), is only perturbed slightly by this change, and taking $\varepsilon \to 0$ gives the desired inequality.
Now let $u_1, \dots, u_n$ be an orthonormal basis at which $f(\langle Bu_1, u_1\rangle, \dots, \langle Bu_n, u_n\rangle)$ attains its maximum. Fix $i < j$; since $u_i, u_j$ span a two-dimensional subspace, we can rotate in the $i$-$j$ plane by an angle $\theta$ using the $2$-dimensional rotation matrix:
$$R_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}, \qquad \begin{pmatrix} u_i(\theta) \\ u_j(\theta) \end{pmatrix} = R_\theta \begin{pmatrix} u_i \\ u_j \end{pmatrix} = \begin{pmatrix} \cos(\theta)\,u_i + \sin(\theta)\,u_j \\ \sin(\theta)\,u_i - \cos(\theta)\,u_j \end{pmatrix}.$$
For each $\theta$ this again gives an orthonormal basis. Then we replace $f$ with
$$h(\theta) = f\big(\langle Bu_1, u_1\rangle, \dots, \langle Bu_i(\theta), u_i(\theta)\rangle, \dots, \langle Bu_j(\theta), u_j(\theta)\rangle, \dots, \langle Bu_n, u_n\rangle\big),$$
where we keep all other dot products the same. By assumption, $h$ attains its maximum at $\theta = 0$, so $h'(0) = 0$ and $h''(0) \le 0$. Expanding the rotated dot products explicitly, the $i$th argument of $h(\theta)$ is
$$\cos^2(\theta)\langle Bu_i, u_i\rangle + \sin^2(\theta)\langle Bu_j, u_j\rangle + \sin(2\theta)\langle Bu_i, u_j\rangle$$
and the $j$th argument is
$$\sin^2(\theta)\langle Bu_i, u_i\rangle + \cos^2(\theta)\langle Bu_j, u_j\rangle - \sin(2\theta)\langle Bu_i, u_j\rangle.$$
Then we can mechanically take the derivatives at $\theta = 0$ to get
$$0 = h'(0) = 2\langle Bu_i, u_j\rangle\left(\frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_j}\right),$$
$$0 \ge h''(0) = 2\big(\langle Bu_j, u_j\rangle - \langle Bu_i, u_i\rangle\big)\left(\frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_j}\right) + 4\langle Bu_i, u_j\rangle^2\left(\frac{\partial^2 f}{\partial x_i^2} + \frac{\partial^2 f}{\partial x_j^2} - 2\,\frac{\partial^2 f}{\partial x_i\partial x_j}\right).$$
From $h'(0) = 0$, either $\langle Bu_i, u_j\rangle = 0$ or $\partial f/\partial x_i = \partial f/\partial x_j$; in the latter case the first term of $h''(0)$ vanishes, and the strict positivity of the second factor forces $\langle Bu_i, u_j\rangle = 0$ anyway. This implies that $\langle Bu_i, u_j\rangle = 0$ for all $i \ne j$, which implies that $Bu_i = \lambda_i u_i$ for some $\lambda_i$. Thus the maximum of the function of dot products is attained at an eigenbasis, i.e., at a vector of eigenvalues. $\square$
Exercise 2.2.4: If $f : \mathbb{R}^n \to \mathbb{R}$ satisfies the conditions in Theorem 2.2.2 and $(u_1, \dots, u_n)$, $(v_1, \dots, v_n)$ are two orthonormal bases of $\mathbb{R}^n$, then for every $A : \mathbb{R}^n \to \mathbb{R}^n$, there exist $\pi \in S_n$ and $(\varepsilon_1, \dots, \varepsilon_n) \in \{\pm 1\}^n$ such that
$$f(\langle Au_1, v_1\rangle, \langle Au_2, v_2\rangle, \dots, \langle Au_n, v_n\rangle) \le f(\varepsilon_1 s_{\pi(1)}(A), \dots, \varepsilon_n s_{\pi(n)}(A)).$$
Show that choosing the $u_i, v_i$ to be the singular vectors maximizes $f$ (over all pairs of orthonormal bases).
To solve this problem, you can rotate both bases in the same direction and take derivatives, and also rotate them in opposite directions and take derivatives, to get enough information to prove that the singular values give the maximum.
Essentially, a lot of the inequalities you find in books follow from this. For instance, the fact that the Schatten $p$-norm is a norm follows directly from it.
Corollary 2.2.5. Let $\|\cdot\|$ be a norm on $\mathbb{R}^n$ that is invariant under permutations and signs:
$$\|(x_1, \dots, x_n)\| = \|(\varepsilon_1 x_{\pi(1)}, \dots, \varepsilon_n x_{\pi(n)})\|$$
for all $\varepsilon \in \{\pm 1\}^n$ and $\pi \in S_n$ (in the literature this is called a symmetric norm). This induces a norm on matrices $M_{n\times n}(\mathbb{R})$ via
$$\|A\| = \|(s_1(A), \dots, s_n(A))\|.$$
Then the triangle inequality holds for matrices $A, B$:
$$\|A + B\| \le \|A\| + \|B\|.$$
Proof. We have, by Exercise 2.2.4,
$$\|A + B\| = \max_{(u_i)\perp,\,(v_i)\perp} \big\|(\langle (A+B)u_i, v_i\rangle)_{i=1}^n\big\| \le \max_{(u_i)\perp,\,(v_i)\perp} \big\|(\langle Au_i, v_i\rangle)_{i=1}^n\big\| + \max_{(u_i)\perp,\,(v_i)\perp} \big\|(\langle Bu_i, v_i\rangle)_{i=1}^n\big\| \le \|A\| + \|B\|,$$
where the maxima run over pairs of orthonormal bases. $\square$
Remember Theorem 2.2.2! For many results, you simply need to apply the right convex function. Our lemma follows from setting $f(x) = \sum_{i=1}^k x_i$.
Proof of the Ky Fan Maximum Principle (Lemma 2.2.1). Take an orthonormal basis $u_1, \dots, u_n$ of $\mathbb{R}^n$ such that $u_1, \dots, u_k$ is a basis of the range of $P$. Applying Theorem 2.2.2 with $B = A^*A$ and $f(x) = \sum_{i=1}^k x_i$,
$$\operatorname{Tr}(BP) = \sum_{i=1}^k \langle Bu_i, u_i\rangle \le \sum_{i=1}^k \lambda_i(B) = \sum_{i=1}^k s_i(A)^2. \qquad\square$$
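A quick numerical illustration of the Ky Fan maximum principle (a sketch, not from the lecture); equality is attained when $P$ projects onto the span of the top $k$ right singular vectors of $A$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 3
A = rng.standard_normal((n, n))
s = np.linalg.svd(A, compute_uv=False)

# A random rank-k orthogonal projection P = Q Q^T.
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
P = Q @ Q.T

# Ky Fan: Tr(A^* A P) <= s_1(A)^2 + ... + s_k(A)^2.
assert np.trace(A.T @ A @ P) <= (s[:k] ** 2).sum() + 1e-9

# Equality for the projection onto the top k right singular vectors.
Vh = np.linalg.svd(A)[2]                  # rows are right singular vectors
P_top = Vh[:k].T @ Vh[:k]
assert abs(np.trace(A.T @ A @ P_top) - (s[:k] ** 2).sum()) < 1e-8
```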
3 Finding big subsets
We'll present four lemmas for finding big subsets with certain properties, and we'll put them together at the end.
3.1 Little Grothendieck inequality
Theorem 2.3.1 (Little Grothendieck inequality). Fix $m, n, k \in \mathbb{N}$. Suppose that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear operator. Then for every $x_1, \dots, x_k \in \mathbb{R}^n$,
$$\sum_{j=1}^k \|Tx_j\|_2^2 \le \frac{\pi}{2}\,\|T\|_{\ell_\infty^n \to \ell_2^m}^2 \max_{1 \le i \le n} \sum_{j=1}^k x_{ji}^2,$$
where $x_j = (x_{j1}, \dots, x_{jn})$.
Later we will show that $\pi/2$ is sharp. If we had only one vector, what does this say? That
$$\|Tx_1\|_2 \le \sqrt{\frac{\pi}{2}}\,\|T\|_{\ell_\infty^n\to\ell_2^m}\,\|x_1\|_\infty.$$
We know the inequality is true for $k = 1$ with constant $1$, by definition of the operator norm. The theorem is true for arbitrarily many vectors, losing a universal constant ($\pi/2$). After we see the proof, the example where $\pi/2$ is attained will be natural.
We give Grothendieck's original proof. The key claim is the following.

Claim 2.3.2.
$$\sum_{i=1}^n \left(\sum_{j=1}^k (T^*Tx_j)_i^2\right)^{1/2} \le \sqrt{\frac{\pi}{2}}\,\|T\|_{\ell_\infty^n \to \ell_2^m}\left(\sum_{j=1}^k \|Tx_j\|_2^2\right)^{1/2}. \tag{2.3}$$
Proof of Theorem 2.3.1. Assuming Claim 2.3.2, we prove the theorem.
$$\sum_{j=1}^k \|Tx_j\|_2^2 = \sum_{j=1}^k \langle Tx_j, Tx_j\rangle = \sum_{j=1}^k \langle x_j, T^*Tx_j\rangle = \sum_{j=1}^k\sum_{i=1}^n x_{ji}(T^*Tx_j)_i$$
$$\le \sum_{i=1}^n \left(\sum_{j=1}^k x_{ji}^2\right)^{1/2}\left(\sum_{j=1}^k (T^*Tx_j)_i^2\right)^{1/2} \qquad\text{by Cauchy-Schwarz}$$
$$\le \left(\max_{1\le i\le n}\left(\sum_{j=1}^k x_{ji}^2\right)^{1/2}\right)\sum_{i=1}^n\left(\sum_{j=1}^k (T^*Tx_j)_i^2\right)^{1/2} \le \max_{1\le i\le n}\left(\sum_{j=1}^k x_{ji}^2\right)^{1/2}\sqrt{\frac{\pi}{2}}\,\|T\|_{\ell_\infty^n\to\ell_2^m}\left(\sum_{j=1}^k \|Tx_j\|_2^2\right)^{1/2}.$$
We bounded the left-hand side by the square root of a multiple of itself, a bootstrapping argument. In the last step, divide by the square root and square to get
$$\sum_{j=1}^k \|Tx_j\|_2^2 \le \frac{\pi}{2}\,\|T\|_{\ell_\infty^n\to\ell_2^m}^2\max_{1\le i\le n}\sum_{j=1}^k x_{ji}^2. \qquad\square$$
Proof of Claim 2.3.2. Let $g_1, \dots, g_k$ be i.i.d. standard Gaussian random variables. For every fixed $i \in \{1, \dots, n\}$, consider
$$\sum_{j=1}^k g_j(T^*Tx_j)_i.$$
This is a Gaussian random variable with mean $0$ and variance $\sum_{j=1}^k (T^*Tx_j)_i^2$. Taking the expectation of its absolute value,
$$\mathbb{E}\left|\sum_{j=1}^k g_j(T^*Tx_j)_i\right| = \sqrt{\frac{2}{\pi}}\left(\sum_{j=1}^k (T^*Tx_j)_i^2\right)^{1/2},$$
using that $\sqrt{\tfrac{1}{2\pi}}\int_{-\infty}^{\infty}|x|e^{-x^2/2}\,dx = 2\sqrt{\tfrac{1}{2\pi}}\left[-e^{-x^2/2}\right]_0^\infty = \sqrt{\tfrac{2}{\pi}}$. Sum these over $i$:
$$\sum_{i=1}^n\left(\sum_{j=1}^k (T^*Tx_j)_i^2\right)^{1/2} = \sqrt{\frac{\pi}{2}}\,\mathbb{E}\left[\sum_{i=1}^n\left|\left(T^*\sum_{j=1}^k g_jTx_j\right)_i\right|\right]. \tag{2.4}$$
Define a random sign vector $z \in \{\pm 1\}^n$ by
$$z_i = \operatorname{sign}\left(\left(T^*\sum_{j=1}^k g_jTx_j\right)_i\right).$$
Then
$$\sum_{i=1}^n\left|\left(T^*\sum_{j=1}^k g_jTx_j\right)_i\right| = \left\langle z,\; T^*\sum_{j=1}^k g_jTx_j\right\rangle = \left\langle Tz,\; \sum_{j=1}^k g_jTx_j\right\rangle \le \|Tz\|_2\left\|\sum_{j=1}^k g_jTx_j\right\|_2 \le \|T\|_{\ell_\infty^n\to\ell_2^m}\left\|\sum_{j=1}^k g_jTx_j\right\|_2.$$
This is a pointwise inequality. Taking expectations and using Cauchy-Schwarz,
$$\mathbb{E}\left[\sum_{i=1}^n\left|\left(T^*\sum_{j=1}^k g_jTx_j\right)_i\right|\right] \le \|T\|_{\ell_\infty^n\to\ell_2^m}\left(\mathbb{E}\left\|\sum_{j=1}^k g_jTx_j\right\|_2^2\right)^{1/2}. \tag{2.5}$$
What is the second moment? Expand:
$$\mathbb{E}\left\|\sum_{j=1}^k g_jTx_j\right\|_2^2 = \mathbb{E}\left[\sum_{j,l} g_jg_l\langle Tx_j, Tx_l\rangle\right] = \sum_{j=1}^k \|Tx_j\|_2^2. \tag{2.6}$$
Chaining together (2.4), (2.5), and (2.6) gives the result. $\square$
Why use Gaussians? Rotation invariance characterizes the Gaussian distribution, and that is what makes the identity above exact. Using other random variables gives other constants that are not sharp.
There will be lots of geometric lemmas:
β A fact about restricting matrices.
β Another geometric argument to give a different method for selecting subsets.
β A combinatorial lemma for selecting subsets.
Finally we'll put them together in a crazy induction. 2-10-16: We were in the process of proving three or four subset-selection principles, which we will somehow use to prove the RIP. The little Grothendieck inequality (Theorem 2.3.1) is part of an amazing area of mathematics with many applications. It's little, but very useful. The proof is really Grothendieck's original proof, re-organized. For completeness, we'll show that the inequality is sharp (cannot be improved).
3.1.1 Tightness of Grothendieck's inequality
Corollary 2.3.3. $\pi/2$ is the best constant in Theorem 2.3.1.
From the proof, we reverse engineer vectors that make the inequality sharp. They are given in the following example.
Example 2.3.4: Let $g_1, g_2, \dots, g_n$ be i.i.d. standard Gaussians on the probability space $(\Omega, \mathbb{P})$. Let $T : L_\infty(\Omega, \mathbb{P}) \to \ell_2^n$ be
$$Tf = (\mathbb{E}[fg_1], \dots, \mathbb{E}[fg_n]),$$
and let $x_j \in L_\infty(\Omega, \mathbb{P})$ be
$$x_j = \frac{g_j}{\left(\sum_{i=1}^n g_i^2\right)^{1/2}}.$$
Proof. Let $g_1, \dots, g_n$, $T$, and $x_1, \dots, x_n$ be as in the example. Note that $(x_1(\omega), \dots, x_n(\omega))$ is nothing more than a point on the $n$-dimensional unit sphere, so the $x_j$ are bounded functions on the measure space $\Omega$. We can also write
$$\sum_{j=1}^n x_j(\omega)^2 = \sum_{j=1}^n \frac{g_j(\omega)^2}{\sum_{i=1}^n g_i(\omega)^2} = 1. \tag{2.7}$$
We use the Central Limit Theorem to replace $\Omega$ by a discrete space. Let $\varepsilon_{i,s}$ be i.i.d. $\pm 1$ random variables. Letting
$$g_i = \frac{\varepsilon_{i,1} + \dots + \varepsilon_{i,K}}{\sqrt{K}}$$
instead, each $g_i$ approaches a standard Gaussian in distribution as $K \to \infty$, so the statements we make will be asymptotically true. With this discretization, the random variables $\{g_i\}$ live on $\Omega = \{\pm 1\}^{nK}$, so $L_\infty(\Omega) = \ell_\infty^{2^{nK}}$, which is of large but finite dimension, and $\omega$ is really a coordinate in $\Omega$.
Now we show two things; they are nothing more than computations:

1. $\|T\|_{L_\infty(\Omega,\mathbb{P})\to\ell_2^n} = \sqrt{2/\pi}$;

2. $\sum_{j=1}^n \|Tx_j\|_2^2 \xrightarrow{\;n\to\infty\;} 1$.

From (2.7) and these two items, the little Grothendieck inequality is sharp in the limit.
For (1), we have
$$\|T\|_{L_\infty\to\ell_2} = \sup_{\|f\|_\infty\le 1}\left(\sum_{i=1}^n \mathbb{E}[fg_i]^2\right)^{1/2} = \sup_{\|f\|_\infty\le 1}\;\sup_{\sum_{i=1}^n\alpha_i^2=1}\;\sum_{i=1}^n \alpha_i\,\mathbb{E}[fg_i] = \sup_{\sum_{i=1}^n\alpha_i^2=1}\;\sup_{\|f\|_\infty\le 1}\;\mathbb{E}\left[f\sum_{i=1}^n \alpha_ig_i\right] = \sup_{\sum_{i=1}^n\alpha_i^2=1}\mathbb{E}\left|\sum_{i=1}^n \alpha_ig_i\right| = \mathbb{E}|g_1| = \sqrt{\frac{2}{\pi}} \tag{2.8}$$
as we claimed, since $\|\alpha\|_2 = 1$ implies that $\sum_{i=1}^n \alpha_ig_i$ is also a standard Gaussian.
Now for (2),
$$\sum_{j=1}^n \|Tx_j\|_2^2 = \sum_{j=1}^n\sum_{i=1}^n\left(\mathbb{E}\left[\frac{g_ig_j}{\left(\sum_{l=1}^n g_l^2\right)^{1/2}}\right]\right)^2 = \sum_{j=1}^n\left(\mathbb{E}\left[\frac{g_j^2}{\left(\sum_{l=1}^n g_l^2\right)^{1/2}}\right]\right)^2 = n\left(\frac{1}{n}\,\mathbb{E}\left[\frac{\sum_{l=1}^n g_l^2}{\left(\sum_{l=1}^n g_l^2\right)^{1/2}}\right]\right)^2 = \frac{1}{n}\left(\mathbb{E}\left[\left(\sum_{l=1}^n g_l^2\right)^{1/2}\right]\right)^2, \tag{2.9}$$
and you can use Stirling to finish: $(\sum_l g_l^2)^{1/2}$ is just a $\chi_n$-distributed random variable, and $\frac{1}{n}(\mathbb{E}\chi_n)^2 \to 1$.

Here the cross terms vanish: $\mathbb{E}\big[g_1g_2/(\sum_l g_l^2)^{1/2}\big] = 0$ by the symmetry $g_1 \mapsto -g_1$. Also note that if $(g_1, \dots, g_n) \in \mathbb{R}^n$ is a standard Gaussian vector, then $\frac{(g_1,\dots,g_n)}{(\sum_{l=1}^n g_l^2)^{1/2}}$ and $(\sum_{l=1}^n g_l^2)^{1/2}$ are independent. In other words, the length and the angle are independent: this is just polar coordinates; you can check it.
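The limit $\frac{1}{n}(\mathbb{E}\chi_n)^2 \to 1$ can be checked directly from the $\chi_n$ moment formula $\mathbb{E}\chi_n = \sqrt{2}\,\Gamma\!\left(\frac{n+1}{2}\right)/\Gamma\!\left(\frac{n}{2}\right)$ (a sketch using log-gamma to avoid overflow for large $n$):

```python
import math

def chi_mean(n):
    # E[chi_n] = sqrt(2) * Gamma((n+1)/2) / Gamma(n/2), computed via lgamma.
    return math.exp(0.5 * math.log(2.0) + math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

ratios = [chi_mean(n) ** 2 / n for n in (2, 10, 100, 1000)]
assert all(r < 1 for r in ratios)                                      # always below 1
assert all(ratios[i] < ratios[i + 1] for i in range(len(ratios) - 1))  # increasing
assert ratios[-1] > 0.999                                              # approaching 1
```

Indeed $\frac{1}{n}(\mathbb{E}\chi_n)^2 \approx 1 - \frac{1}{2n}$ for large $n$, which is the Stirling computation alluded to above.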
Now, how does this relate to the Restricted Invertibility Problem?
3.2 Pietsch Domination Theorem
Theorem 2.3.5 (Pietsch Domination Theorem). Fix $m, n \in \mathbb{N}$ and $M > 0$. Suppose that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear operator such that for every $k \in \mathbb{N}$ and every $x_1, \dots, x_k \in \mathbb{R}^n$ we have
$$\left(\sum_{j=1}^k \|Tx_j\|_2^2\right)^{1/2} \le M\max_{1\le i\le n}\left(\sum_{j=1}^k x_{ji}^2\right)^{1/2}. \tag{2.10}$$
Then there exists $\mu = (\mu_1, \dots, \mu_n) \in \mathbb{R}^n$ with $\mu_i \ge 0$ and $\sum_{i=1}^n \mu_i = 1$ such that for every $x \in \mathbb{R}^n$,
$$\|Tx\|_2 \le M\left(\sum_{i=1}^n \mu_ix_i^2\right)^{1/2}. \tag{2.11}$$
The theorem says that you can come up with a probability measure $\mu$ on $\{1, \dots, n\}$ such that the norm of $T$, viewed as an operator from $L_2(\mu)$ to $\ell_2^m$, is bounded by $M$.
Remark 2.3.6: The theorem is really an "iff": (2.11) is formally a stronger statement than (2.10), and in fact they are equivalent.
Proof. Define $K \subseteq \mathbb{R}^n$ by
$$K = \left\{y \in \mathbb{R}^n : y_i = \sum_{j=1}^k \|Tx_j\|_2^2 - M^2\sum_{j=1}^k x_{ji}^2 \;\text{ for some } k \text{ and } x_1, \dots, x_k \in \mathbb{R}^n\right\}.$$
Basically we cleverly select a convex set: every $k$-tuple of vectors in $\mathbb{R}^n$ gives a vector in $\mathbb{R}^n$. Let's check that $K$ is convex. We have to check that if $y, z \in K$, then all points on the segment between them are in $K$. $y, z \in K$ means that there exist $(x_j)_{j=1}^k$ and $(w_j)_{j=1}^l$ with
$$y_i = \sum_{j=1}^k \|Tx_j\|_2^2 - M^2\sum_{j=1}^k x_{ji}^2, \qquad z_i = \sum_{j=1}^l \|Tw_j\|_2^2 - M^2\sum_{j=1}^l w_{ji}^2$$
for all $i$. Then $\alpha y + (1-\alpha)z$ comes from the tuple $(\sqrt{\alpha}\,x_1, \dots, \sqrt{\alpha}\,x_k, \sqrt{1-\alpha}\,w_1, \dots, \sqrt{1-\alpha}\,w_l)$. So by design, $K$ is a convex set.

Now, the assumption (2.10) of the theorem says that
$$\left(\sum_{j=1}^k \|Tx_j\|_2^2\right)^{1/2} \le M\max_{1\le i\le n}\left(\sum_{j=1}^k x_{ji}^2\right)^{1/2},$$
which implies that for the maximizing index $i$,
$$\sum_{j=1}^k \|Tx_j\|_2^2 - M^2\sum_{j=1}^k x_{ji}^2 \le 0,$$
so every $y \in K$ has a nonpositive coordinate,
which implies $K \cap (0,\infty)^n = \emptyset$. By the hyperplane separation theorem (two disjoint convex sets in $\mathbb{R}^n$ can be separated by a hyperplane), there exists $\mu = (\mu_1, \dots, \mu_n) \ne 0$ in $\mathbb{R}^n$ with
$$\langle \mu, y\rangle \le \langle \mu, z\rangle$$
for all $y \in K$ and $z \in (0,\infty)^n$. Moreover, $\mu$ cannot have any strictly negative coordinate: otherwise you could take $z$ to have an arbitrarily large value at that coordinate, with arbitrarily small positive values everywhere else, making $\langle \mu, z\rangle$ unbounded from below, a contradiction. By renormalizing, we may assume $\sum_{i=1}^n \mu_i = 1$. Therefore $\mu$ is a probability vector, and since $\langle \mu, z\rangle$ can be made arbitrarily small (take $z \to 0$), we get $\sum_{i=1}^n \mu_iy_i \le 0$ for every $y \in K$. Taking $k = 1$ and $x_1 = x$, the vector with coordinates
$$y_i = \|Tx\|_2^2 - M^2x_i^2$$
belongs to $K$. Expanding $\sum_i \mu_iy_i \le 0$,
$$\|Tx\|_2^2 - M^2\sum_{i=1}^n \mu_ix_i^2 \le 0,$$
which is exactly what we wanted. $\square$
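The easy direction of Remark 2.3.6 can be checked numerically: given any probability vector $\mu$, the smallest $M$ for which (2.11) holds is the operator norm of $TD_\mu^{-1/2}$ (where $D_\mu = \operatorname{diag}(\mu)$), and (2.10) then follows. This is a sketch, not from the lecture; the instance is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(8)
m, n, k = 4, 5, 7
T = rng.standard_normal((m, n))
mu = rng.random(n)
mu /= mu.sum()                                # a probability vector on {1,...,n}

# Smallest M satisfying (2.11) for this mu: ||T D_mu^{-1/2}||_op, since
# ||T x||_2 = ||(T D^{-1/2})(D^{1/2} x)||_2 <= M * (sum_i mu_i x_i^2)^{1/2}.
M = np.linalg.norm(T / np.sqrt(mu), ord=2)

X = rng.standard_normal((k, n))               # rows x_1, ..., x_k
lhs = np.sqrt(sum(np.linalg.norm(T @ x) ** 2 for x in X))
rhs = M * np.sqrt((X ** 2).sum(axis=0).max())
assert lhs <= rhs + 1e-9                      # (2.11) implies (2.10)
```

The hard direction, producing $\mu$ from (2.10), is exactly the separation argument above.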
3.3 A projection bound
Lemma 2.3.7. Let $m, n \in \mathbb{N}$, $\varepsilon \in (0,1)$, and let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear operator. Then there exists $\sigma \subseteq \{1, \dots, m\}$ with $|\sigma| \ge (1-\varepsilon)m$ such that
$$\left\|\mathrm{Proj}_{\mathbb{R}^\sigma}T\right\|_{\ell_2^n\to\ell_2^m} \le \sqrt{\frac{\pi}{2\varepsilon m}}\,\|T\|_{\ell_2^n\to\ell_1^m}.$$
We will find ways to restrict a matrix to a big submatrix. We won't be able to control its operator norm directly, but we will be able to control the norm from $\ell_2^n$ to $\ell_1^m$. Then we pass to a further subset, on which this becomes a bound on the operator norm; this improvement is what Grothendieck gives us. This is the first really useful tool for finding big submatrices.
Proof. We have $T : \ell_2^n \to \ell_1^m$ and $T^* : \ell_\infty^m \to \ell_2^n$. Now some abstract nonsense gives us that for Banach spaces, the norm of an operator and its adjoint are equal, i.e.
$$\|T\|_{\ell_2^n\to\ell_1^m} = \|T^*\|_{\ell_\infty^m\to\ell_2^n}.$$
This statement follows from the Hahn-Banach theorem (come see me if you haven't seen this before; I'll tell you what book to read). From the little Grothendieck inequality (Theorem 2.3.1), $T^*$ satisfies the assumption of the Pietsch domination theorem (Theorem 2.3.5) with
$$M = \sqrt{\frac{\pi}{2}}\,\|T\|_{\ell_2^n\to\ell_1^m}$$
(we are applying it to $T^*$). By that theorem, there exists a probability vector $(\mu_1, \dots, \mu_m)$ such that for every $y \in \mathbb{R}^m$,
$$\|T^*y\|_2 \le M\left(\sum_{i=1}^m \mu_iy_i^2\right)^{1/2}.$$
Define $\sigma = \left\{i \in \{1,\dots,m\} : \mu_i \le \frac{1}{\varepsilon m}\right\}$; then $|\sigma| \ge (1-\varepsilon)m$ by Markov's inequality. We can also see this by writing
$$1 = \sum_{i=1}^m \mu_i = \sum_{i\in\sigma}\mu_i + \sum_{i\notin\sigma}\mu_i > \sum_{i\in\sigma}\mu_i + \frac{m - |\sigma|}{\varepsilon m},$$
which follows since $\mu_i > \frac{1}{\varepsilon m}$ for $i \notin \sigma$. Rearranging,
$$|\sigma| \ge (\varepsilon m)\sum_{i\in\sigma}\mu_i + m(1-\varepsilon),$$
and because $\mu$ is a probability vector, $(\varepsilon m)\sum_{i\in\sigma}\mu_i \ge 0$, so
$$|\sigma| \ge m(1-\varepsilon).$$
Now take $x \in \mathbb{R}^n$ and any $y \in \mathbb{R}^m$ with $\|y\|_2 = 1$. Then
$$\langle y, \mathrm{Proj}_{\mathbb{R}^\sigma}Tx\rangle^2 = \langle T^*\mathrm{Proj}_{\mathbb{R}^\sigma}y, x\rangle^2 \le \|T^*\mathrm{Proj}_{\mathbb{R}^\sigma}y\|_2^2\,\|x\|_2^2 \le M^2\left(\sum_{i\in\sigma}\mu_iy_i^2\right)\|x\|_2^2 \le \frac{\pi}{2}\,\|T\|_{\ell_2^n\to\ell_1^m}^2\,\frac{1}{\varepsilon m}\,\|x\|_2^2$$
by Cauchy-Schwarz. Taking square roots and the supremum over $y$ gives the desired result. $\square$
In the previous proof, we used a lot of duality to get an interesting subset.
Remark 2.3.8: In Lemma 2.3.7, I think that either the constant $\pi/2$ is sharp (no bigger subsets exist; the constant could come from the Gaussians), or there is a different constant here. If the constant were $1$, I think you could optimize the previous argument and get it arbitrarily close to $1$, which would have some nice applications: in other words, getting the constant in $\sqrt{\frac{\pi}{2\varepsilon m}}$ as close to $1$ as possible would be good. I didn't check before class, but you might want to see whether the Gaussian argument we made for the sharpness of $\pi/2$ in Grothendieck's inequality (Theorem 2.3.1) carries over. It's also possible that there is a different universal constant.
3.4 Sauer-Shelah Lemma
Now we will give another lemma which is very easy and which we will use a lot.
Lemma 2.3.9 (Sauer-Shelah). Take integers $m, n \in \mathbb{N}$ and suppose that we have a large set $\Omega \subseteq \{\pm 1\}^n$ with
$$|\Omega| > \sum_{k=0}^{m-1}\binom{n}{k}.$$
Then there exists $\sigma \subseteq \{1, \dots, n\}$ with $|\sigma| = m$ such that projecting $\Omega$ onto $\mathbb{R}^\sigma$ gives the entire cube: $\mathrm{Proj}_{\mathbb{R}^\sigma}(\Omega) = \{\pm 1\}^\sigma$. That is, for every $\varepsilon \in \{\pm 1\}^\sigma$ there exists $\delta = (\delta_1, \dots, \delta_n) \in \Omega$ such that $\delta_j = \varepsilon_j$ for $j \in \sigma$. (In other words, the VC dimension of $\Omega$ is at least $m$.)
Note that Lemma 2.3.9 is used in the proof of the Fundamental Theorem of Statistical Learning Theory.
Proof. We prove by induction on $n$ a claim about the family of shattered sets,
$$\mathrm{sh}(\Omega) = \left\{\sigma \subseteq \{1, \dots, n\} : \mathrm{Proj}_{\mathbb{R}^\sigma}\Omega = \{\pm 1\}^\sigma\right\}:$$
namely, $|\mathrm{sh}(\Omega)| \ge |\Omega|$. This gives the lemma: there are only $\sum_{k=0}^{m-1}\binom{n}{k}$ subsets of size less than $m$, so if $|\Omega|$ exceeds this number, some shattered set has size at least $m$, and every subset of a shattered set is shattered.

The case of the empty set is trivial. What happens when $n = 1$? Then $\emptyset \ne \Omega \subseteq \{-1, 1\}$, and $\mathrm{sh}(\Omega)$ contains $\emptyset$, and also $\{1\}$ when $|\Omega| = 2$. Assume that our claim holds for $n$, and take $\Omega \subseteq \{\pm 1\}^{n+1} = \{\pm 1\}^n \times \{\pm 1\}$. Define
$$\Omega^+ = \{\omega \in \{\pm 1\}^n : (\omega, 1) \in \Omega\}, \qquad \Omega^- = \{\omega \in \{\pm 1\}^n : (\omega, -1) \in \Omega\}.$$
Then $|\Omega| = |\Omega^+| + |\Omega^-|$. By the inductive hypothesis, $|\mathrm{sh}(\Omega^+)| \ge |\Omega^+|$ and $|\mathrm{sh}(\Omega^-)| \ge |\Omega^-|$. Note that any subset that shatters $\Omega^+$ or $\Omega^-$ also shatters $\Omega$, and if a set $\sigma$ shatters both of them, then $\sigma \cup \{n+1\}$ shatters $\Omega$. Therefore
$$\mathrm{sh}(\Omega^+) \cup \mathrm{sh}(\Omega^-) \cup \left\{\sigma \cup \{n+1\} : \sigma \in \mathrm{sh}(\Omega^+)\cap\mathrm{sh}(\Omega^-)\right\} \subseteq \mathrm{sh}(\Omega),$$
where the last family is disjoint from the first two, since its members contain $n+1$. Therefore, by inclusion-exclusion,
$$|\mathrm{sh}(\Omega)| \ge |\mathrm{sh}(\Omega^+)\cup\mathrm{sh}(\Omega^-)| + |\mathrm{sh}(\Omega^+)\cap\mathrm{sh}(\Omega^-)| = |\mathrm{sh}(\Omega^+)| + |\mathrm{sh}(\Omega^-)| \ge |\Omega^+| + |\Omega^-| = |\Omega|,$$
which completes the induction. $\square$
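The counting claim $|\mathrm{sh}(\Omega)| \ge |\Omega|$ can be verified by brute force in small dimension (a sketch, not from the lecture; the empty set counts as shattered):

```python
import random
from itertools import combinations, product

def shattered_sets(omega, n):
    # All sigma (as tuples of indices) with Proj_sigma(omega) = {+-1}^sigma.
    sh = []
    for r in range(n + 1):
        for sigma in combinations(range(n), r):
            if len({tuple(w[i] for i in sigma) for w in omega}) == 2 ** r:
                sh.append(sigma)
    return sh

random.seed(5)
n = 4
cube = list(product((-1, 1), repeat=n))
omega = random.sample(cube, 9)        # |Omega| = 9 > C(4,0) + C(4,1) = 5
sh = shattered_sets(omega, n)
assert len(sh) >= len(omega)          # the key claim |sh(Omega)| >= |Omega|
assert any(len(sigma) >= 2 for sigma in sh)   # so some set of size m = 2 is shattered
```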
We will primarily use the theorem through the following corollary, which says that if you have half of the points of the cube, you can fully project onto half of the coordinates.

Corollary 2.3.10. If $|\Omega| \ge 2^{n-1}$, then there exists $\sigma \subseteq \{1, \dots, n\}$ with $|\sigma| \ge \lceil n/2\rceil \ge \frac{n}{2}$ such that $\mathrm{Proj}_{\mathbb{R}^\sigma}\Omega = \{\pm 1\}^\sigma$.
2-15-16: Last time we left off with the proof of the Sauer-Shelah lemma. To remind you, we were finding ways to find interesting subsets where matrices behave well. Now recall the linear-algebraic fact that I owed you; I proved it in an analytic way, and the proof has been moved to Section 2.
4 Proof of RIP
4.1 Step 1
Now we need another geometric lemma for the proof of Theorem 2.1.7, the restricted invertibility principle.
Lemma 2.4.1 (Step 1). Fix $k, m, n \in \mathbb{N}$. Let $A : \mathbb{R}^m \to \mathbb{R}^n$ be a linear operator with $\mathrm{rank}(A) \ge k$. For every $\omega \subseteq \{1, \dots, m\}$, denote
$$E_\omega = \left(\mathrm{span}\left((Ae_i)_{i\in\omega}\right)\right)^\perp.$$
Then there exists $\sigma \subseteq \{1, \dots, m\}$ with $|\sigma| = k$ such that for all $j \in \sigma$,
$$\left\|\mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}Ae_j\right\|_2 \ge \frac{1}{\sqrt{m}}\left(\sum_{i=k}^m s_i(A)^2\right)^{1/2}.$$
Basically, we're taking the projection of the $j$th column onto the orthogonal complement of the span of the other columns in the set, and bounding its norm from below by a dimensional factor times the square root of a tail sum of the squared singular values. (This is sharp asymptotically, and may in fact even be sharp as written too; I need to check.)
Proof. For every $\omega \subseteq \{1, \dots, m\}$, denote
$$K_\omega = \mathrm{conv}\left(\{\pm Ae_j\}_{j\in\omega}\right).$$
The idea is to make the convex hull have big volume. Once we do that, we will get all these inequalities for free. Let $\sigma \subseteq \{1, \dots, m\}$ be a subset of size $k$ that maximizes $\mathrm{vol}_k(K_\sigma)$; since $\mathrm{rank}(A) \ge k$, we know that $\mathrm{vol}_k(K_\sigma) > 0$. Observe that for any $\beta \subseteq \{1, \dots, m\}$ of size $k-1$ and $j \notin \beta$, we have
$$K_{\beta\cup\{j\}} = \mathrm{conv}\left(K_\beta \cup \{\pm Ae_j\}\right),$$
which is a double cone over the base $K_\beta$. What is the height of this cone? It is $\|\mathrm{Proj}_{E_\beta}Ae_j\|_2$, as $E_\beta$ is the orthogonal complement of the space spanned by the columns indexed by $\beta$. Therefore, the $k$-dimensional volume is given by
$$\mathrm{vol}_k(K_{\beta\cup\{j\}}) = \frac{2\,\mathrm{vol}_{k-1}(K_\beta)\cdot\left\|\mathrm{Proj}_{E_\beta}Ae_j\right\|_2}{k}.$$
Because $\sigma$ is the maximizing subset of size $k$, for any $j \in \sigma$ and $i \in \{1, \dots, m\}$, choosing $\beta = \sigma \smallsetminus \{j\}$ we get
$$\mathrm{vol}_k(K_{\beta\cup\{j\}}) \ge \mathrm{vol}_k(K_{\beta\cup\{i\}}) \implies \left\|\mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}Ae_j\right\|_2 \ge \left\|\mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}Ae_i\right\|_2$$
for every $j \in \sigma$ and $i \in \{1, \dots, m\}$. Summing over $i$,
$$m\left\|\mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}Ae_j\right\|_2^2 \ge \sum_{i=1}^m\left\|\mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}Ae_i\right\|_2^2 = \left\|\mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}A\right\|_{S_2}^2.$$
Then, for all $j \in \sigma$,
$$\left\|\mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}Ae_j\right\|_2 \ge \frac{1}{\sqrt{m}}\left\|\mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}A\right\|_{S_2}. \tag{2.12}$$
Let's denote $P = \mathrm{Proj}_{E_{\sigma\smallsetminus\{j\}}}$; note that $P$ is an orthogonal projection of rank $n - k + 1$. Then
$$\|PA\|_{S_2}^2 = \operatorname{Tr}((PA)^*(PA)) = \operatorname{Tr}(A^*P^*PA) = \operatorname{Tr}(A^*PA) = \operatorname{Tr}(AA^*P) = \operatorname{Tr}(AA^*) - \operatorname{Tr}(AA^*(I-P)) \ge \sum_{i=1}^m s_i(A)^2 - \sum_{i=1}^{k-1}s_i(A)^2 = \sum_{i=k}^m s_i(A)^2 \tag{2.13}$$
using the Ky Fan maximum principle (Lemma 2.2.1), since $I - P$ is an orthogonal projection of rank $k - 1$. Putting (2.12) and (2.13) together gives the result. $\square$
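In small dimensions one can verify Step 1 by brute force: the cross-polytope $\mathrm{conv}\{\pm Ae_j\}_{j\in\sigma}$ has volume proportional to the square root of the Gram determinant of the chosen columns (with the same proportionality constant for every $\sigma$), so maximizing the volume is the same as maximizing that determinant. A sketch, not from the lecture; the sizes and tolerances are arbitrary:

```python
import numpy as np
from itertools import combinations

def resid_norm(A, cols, j):
    # || Proj_{E_{sigma\{j}}} A e_j ||_2: the least-squares residual of
    # column j against the remaining chosen columns.
    others = [c for c in cols if c != j]
    B, x = A[:, others], A[:, j]
    coef, *_ = np.linalg.lstsq(B, x, rcond=None)
    return np.linalg.norm(x - B @ coef)

rng = np.random.default_rng(6)
m, k = 6, 3
A = rng.standard_normal((m, m))
s = np.linalg.svd(A, compute_uv=False)

# Volume-maximizing subset = Gram-determinant-maximizing subset.
sigma = max(combinations(range(m), k),
            key=lambda c: np.linalg.det(A[:, c].T @ A[:, c]))

bound = np.sqrt((s[k - 1:] ** 2).sum() / m)   # (1/sqrt(m)) (sum_{i=k}^m s_i^2)^{1/2}
assert all(resid_norm(A, sigma, j) >= bound - 1e-9 for j in sigma)
```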
4.2 Step 2
Lemma 2.4.1 was the first step in our proof of the restricted invertibility principle. Before putting it to use, let me just tell you what the second step looks like.
Lemma 2.4.2 (Step 2). Let $m, n, k \in \mathbb{N}$ and $A : \mathbb{R}^m \to \mathbb{R}^n$ with $\mathrm{rank}(A) > k$. Let $\omega \subseteq \{1, \dots, m\}$ with $|\omega| = \mathrm{rank}(A)$ be such that $\{Ae_j\}_{j\in\omega}$ are linearly independent. Denote for every $j \in \omega$
$$F_j = E_{\omega\smallsetminus\{j\}} = \left(\mathrm{span}\left((Ae_i)_{i\in\omega\smallsetminus\{j\}}\right)\right)^\perp.$$
Then there exists $\sigma \subseteq \omega$ with $|\sigma| = k$ such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \sqrt{\frac{\mathrm{rank}(A)}{\mathrm{rank}(A)-k}}\cdot\max_{j\in\omega}\frac{1}{\left\|\mathrm{Proj}_{F_j}Ae_j\right\|_2}.$$
Most of the work is in the second step. First we pass to a subset where we have some control on the smallest possible orthogonal projection, and Step 1 saves us by bounding what this can be. Here we use the Grothendieck inequality, Sauer-Shelah, etc. Everything. It's simple, but it kills the restricted invertibility principle.
Proof of Theorem 2.1.7 given Steps 1 and 2. Take $A : \mathbb{R}^m \to \mathbb{R}^n$ and fix $k < r \le \mathrm{rank}(A)$. By Step 1 (Lemma 2.4.1), applied with $r$ in place of $k$, we can find a subset $\tau \subseteq \{1, \dots, m\}$ with $|\tau| = r$ such that for all $j \in \tau$,
$$\left\|\mathrm{Proj}_{E_{\tau\smallsetminus\{j\}}}Ae_j\right\|_2 \ge \frac{1}{\sqrt{m}}\left(\sum_{i=r}^m s_i(A)^2\right)^{1/2}. \tag{2.14}$$
In particular the columns $\{Ae_j\}_{j\in\tau}$ are linearly independent, so $\mathrm{rank}(AJ_\tau) = r$. Now we apply Step 2 (Lemma 2.4.2) to $AJ_\tau$ with $\omega = \tau$, and find a further subset $\sigma \subseteq \tau$ with $|\sigma| = k$ such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \sqrt{\frac{r}{r-k}}\,\max_{j\in\tau}\frac{1}{\left\|\mathrm{Proj}_{E_{\tau\smallsetminus\{j\}}}Ae_j\right\|_2} \le \sqrt{\frac{mr}{(r-k)\sum_{i=r}^m s_i(A)^2}},$$
where we plugged $r$ in for the rank and used (2.14) to bound the denominator. Minimizing over $r$ gives the theorem. $\square$
2-17: Now we prove Step 2 (Lemma 2.4.2). Note that we can assume $\omega = \{1, \dots, m\}$ and that the rank of $A$ is $m$. First we need some lemmas.
Lemma 2.4.3. Let $A : \mathbb{R}^m \to \mathbb{R}^n$ be such that $\{Ae_j\}_{j=1}^m$ are linearly independent, and let $\Omega \subseteq \{1, \dots, m\}$ and $t \in \mathbb{N}$. Then there exists $\tau \subseteq \Omega$ with
$$|\tau| \ge \left(1 - \frac{1}{2^t}\right)|\Omega|,$$
such that, denoting $\nu = \tau \cup (\{1, \dots, m\}\smallsetminus\Omega)$, $F_j = \left(\mathrm{span}\left(\{Ae_i\}_{i\in\nu\smallsetminus\{j\}}\right)\right)^\perp$, and
$$M = \max_{j\in\nu}\frac{1}{\left\|\mathrm{Proj}_{F_j}Ae_j\right\|_2},$$
we have for all $a \in \mathbb{R}^\nu$,
$$\sum_{j\in\tau}|a_j| \lesssim 2^{t/2}M\sqrt{|\Omega|}\,\left\|\sum_{j\in\nu}a_jAe_j\right\|_2.$$
This is proved by a nice inductive argument.

Proof. Deferred; see the stronger Lemma 2.4.5 below, which is proved by induction.
Lemma 2.4.4. Let $m, n, t \in \mathbb{N}$ and $J \subseteq \{1, \dots, m\}$. Let $A : \mathbb{R}^m \to \mathbb{R}^n$ be a linear operator such that $\{Ae_j\}_{j=1}^m$ are linearly independent. Then there exist two subsets $\sigma \subseteq \tau \subseteq J$ such that
$$|\tau| \ge \left(1 - \frac{1}{2^t}\right)|J|, \qquad |\tau\smallsetminus\sigma| \le \frac{|J|}{4},$$
and, denoting $\nu = \tau \cup (\{1, \dots, m\}\smallsetminus J)$ and $M = \max_{j\in\nu}\frac{1}{\|\mathrm{Proj}_{F_j}Ae_j\|_2}$ with $F_j$ as in Lemma 2.4.3,
$$\left\|\mathrm{Proj}_{\mathbb{R}^\sigma}(AJ_\nu)^{-1}\right\|_{S_\infty} \lesssim 2^{t/2}M.$$
Proof of Lemma 2.4.4 from Lemma 2.4.3. Apply Lemma 2.4.3 with $\Omega = J$. (The plan: we will eventually build, out of such sets, a subset of $\{1, \dots, m\}$ on which $A$ is invertible, controlling the $\ell_1$ norm on $\tau$ by the $\ell_2$ norm on the slightly bigger set $\nu$ via Lemma 2.4.3.)

We find $\tau \subseteq J$ with $|\tau| \ge \left(1 - \frac{1}{2^t}\right)|J|$ such that for all $a \in \mathbb{R}^\nu$,
$$\sum_{j\in\tau}|a_j| \lesssim 2^{t/2}M\sqrt{|J|}\left\|\sum_{j\in\nu}a_jAe_j\right\|_2.$$
Rewriting gives that for all $a \in \mathbb{R}^\nu$,
$$\left\|\mathrm{Proj}_{\mathbb{R}^\tau}a\right\|_1 \lesssim 2^{t/2}M\sqrt{|J|}\,\|AJ_\nu a\|_2, \quad\text{i.e.}\quad \left\|\mathrm{Proj}_{\mathbb{R}^\tau}(AJ_\nu)^{-1}\right\|_{\ell_2\to\ell_1} \lesssim 2^{t/2}M\sqrt{|J|}.$$
Denote $\varepsilon = \frac{|J|}{4|\tau|}$. By Lemma 2.3.7, there exists $\sigma \subseteq \tau$ with $|\sigma| \ge (1-\varepsilon)|\tau|$, so that $|\tau\smallsetminus\sigma| \le \varepsilon|\tau| = \frac{|J|}{4}$, such that
$$\left\|\mathrm{Proj}_{\mathbb{R}^\sigma}(AJ_\nu)^{-1}\right\|_{S_\infty} = \left\|\mathrm{Proj}_{\mathbb{R}^\sigma}\mathrm{Proj}_{\mathbb{R}^\tau}(AJ_\nu)^{-1}\right\|_{S_\infty} \lesssim \sqrt{\frac{\pi}{2\varepsilon|\tau|}}\cdot 2^{t/2}M\sqrt{|J|} \lesssim 2^{t/2}M. \qquad\square$$
Now we have basically finished making use of Lemma 2.4.3 in the proof of Lemma 2.4.4. It remains to use Lemma 2.4.4 to prove the full second step in our proof of the Restricted Invertibility Principle.
Proof of Lemma 2.4.2 (Step 2). Recall that we may assume $\omega = \{1, \dots, m\}$ and $\mathrm{rank}(A) = m$, and write $M = \max_{j}\frac{1}{\|\mathrm{Proj}_{F_j}Ae_j\|_2}$. Fix an integer $s$ such that
$$\frac{1}{2^{2s+1}} \le 1 - \frac{k}{m} \le \frac{1}{2^{2s}}.$$
Proceed inductively as follows. First set $J_0 = \{1, \dots, m\}$. Suppose $u \in \{1, \dots, s+1\}$ and we have constructed $\sigma_v, \tau_v \subseteq \{1, \dots, m\}$ for $v < u$ such that, denoting $J_v = \tau_v\smallsetminus\sigma_v$ and $\nu_v = \tau_v \cup (\{1, \dots, m\}\smallsetminus J_{v-1})$:

1. $\sigma_v \subseteq \tau_v \subseteq J_{v-1}$;

2. $|\tau_v| \ge \left(1 - \frac{1}{2^{2s-v+4}}\right)|J_{v-1}|$;

3. $|J_v| \le \frac{1}{4}|J_{v-1}|$;

4. $\left\|\mathrm{Proj}_{\mathbb{R}^{\sigma_v}}(AJ_{\nu_v})^{-1}\right\|_{S_\infty} \lesssim 2^{s-v/2}M$.

For instance, $|\tau_1| \ge \left(1 - \frac{1}{2^{2s+3}}\right)|J_0|$. For the inductive step, apply Lemma 2.4.4 to $J_{u-1}$ with $t = 2s - u + 4$ to get $\sigma_u \subseteq \tau_u \subseteq J_{u-1}$ such that $|\tau_u| \ge \left(1 - \frac{1}{2^{2s-u+4}}\right)|J_{u-1}|$, $|J_u| = |\tau_u\smallsetminus\sigma_u| \le \frac{|J_{u-1}|}{4}$, and
$$\left\|\mathrm{Proj}_{\mathbb{R}^{\sigma_u}}(AJ_{\nu_u})^{-1}\right\|_{S_\infty} \lesssim 2^{(2s-u+4)/2}M \lesssim 2^{s-u/2}M. \tag{2.15}$$
(The constant of Lemma 2.4.4, computed over $\nu_u$, is at most the global $M$: removing columns only enlarges the orthogonal complements $F_j$, hence only increases $\|\mathrm{Proj}_{F_j}Ae_j\|_2$.) As we induct, we build up more $\sigma_u$'s, which will eventually combine to produce the invertible subset over which we project. Note that the size of $J_u$ decreases geometrically as we proceed.
We know $|J_{u-1}| \le \frac{m}{4^{u-1}}$ by condition 3. Since $J_{u-1} = J_u \cup \sigma_u \cup (J_{u-1}\smallsetminus\tau_u)$ disjointly,
$$|\sigma_u| = |J_{u-1}| - |J_u| - (|J_{u-1}| - |\tau_u|) \ge |J_{u-1}| - |J_u| - \frac{|J_{u-1}|}{2^{2s-u+4}} \ge |J_{u-1}| - |J_u| - \frac{m}{2^{2s+u+2}}.$$
Our choice for the invertible subset is
$$\sigma = \bigcup_{u=1}^{s+1}\sigma_u.$$
Telescoping gives
$$|\sigma| = \sum_{u=1}^{s+1}|\sigma_u| \ge |J_0| - |J_{s+1}| - \frac{m}{2^{2s+2}}\sum_{u=1}^{\infty}\frac{1}{2^u} \ge m - \frac{m}{4^{s+1}} - \frac{m}{2^{2s+2}} = m\left(1 - \frac{1}{2^{2s+1}}\right) \ge m\cdot\frac{k}{m} = k.$$
(If $|\sigma| > k$, pass to any subset of size exactly $k$; the bound below is inherited.)
Observe that $\sigma \subseteq \bigcup_{u=1}^{s+1}\nu_u$; in fact, for every $u$,
$$\sigma_u, \dots, \sigma_{s+1} \subseteq \tau_u, \qquad \sigma_1, \dots, \sigma_{u-1} \subseteq \{1, \dots, m\}\smallsetminus J_{u-1},$$
so $\sigma \subseteq \nu_u$ for every $u$. This allows us to use the conclusion (2.15) for all the $\sigma_u$'s at once. For $a \in \mathbb{R}^\sigma$ we have $J_\sigma a \in \mathbb{R}^{\nu_u}$, hence $(AJ_{\nu_u})^{-1}(AJ_\sigma)a = J_\sigma a$ and
$$\mathrm{Proj}_{\mathbb{R}^{\sigma_u}}(AJ_{\nu_u})^{-1}(AJ_\sigma)a = \mathrm{Proj}_{\mathbb{R}^{\sigma_u}}J_\sigma a.$$
Then, breaking $J_\sigma a$ into orthogonal components,
$$\|J_\sigma a\|_2^2 = \sum_{u=1}^{s+1}\left\|\mathrm{Proj}_{\mathbb{R}^{\sigma_u}}J_\sigma a\right\|_2^2 = \sum_{u=1}^{s+1}\left\|\mathrm{Proj}_{\mathbb{R}^{\sigma_u}}(AJ_{\nu_u})^{-1}(AJ_\sigma)a\right\|_2^2 \lesssim \sum_{u=1}^{s+1}2^{2s-u}M^2\|AJ_\sigma a\|_2^2 \qquad\text{by (2.15)}$$
$$\lesssim 2^{2s}M^2\|AJ_\sigma a\|_2^2 \le \frac{M^2}{1 - \frac{k}{m}}\|AJ_\sigma a\|_2^2,$$
so
$$\|J_\sigma a\|_2 \lesssim \sqrt{\frac{m}{m-k}}\,M\,\|AJ_\sigma a\|_2.$$
Since this is true for all $a \in \mathbb{R}^\sigma$,
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \sqrt{\frac{m}{m-k}}\,M. \qquad\square$$
An important aspect of this whole proof is the way we took a first subset to get one bound, and then took another subset in order to apply the little Grothendieck inequality. In other words, we first took a subset on which we could control $\ell_1$ by $\ell_2$, and then took a further subset to get the operator norm bounds. Also, notice that in this approach we construct our "optimal subset" of the coordinates inductively; doing this in general is inefficient. It would be nice to construct the set greedily instead, for algorithmic reasons, if we wanted a constructive proof. I have a feeling that avoiding this kind of induction would involve not using Sauer-Shelah at all.
How can we make this theorem algorithmic? The way the Pietsch Domination Theorem (Theorem 2.3.5) worked was by duality: we looked at a certain explicitly defined convex set and found a separating hyperplane, which must be a probability measure; then we had a probabilistic construction. This part is fine.
The bottleneck for making this an algorithm (and I do believe it will become one) consists of two parts:

1. The Sauer-Shelah lemma (Lemma 2.3.9): we have some cloud of points in the Boolean cube, and we know there is some large subset of the coordinates (half of them) such that when you project onto it you see the full cube. I'm quite certain that finding it is NP-hard in general. (Work is necessary to formulate the correct algorithmic Sauer-Shelah lemma: how is the set given to you? In fact, formulating the right question for an algorithmic Sauer-Shelah is the biggest difficulty.) We only need to answer the algorithmic question tailored to our sets, which have a simple description: the intersection of an ellipsoid with a cube. There is a good chance that there is a polynomial-time algorithm in this case. This question has other applications as well (perhaps one could generalize to the intersection of the cube with other convex bodies).
2. The second bottleneck is finding a subset of maximal volume.

It's the other place where we chose a subset: the subset that maximizes the volume among all subsets of a given size (Lemma 2.4.1). Specifically, we want the set of columns

(There could be an algorithm out there, but I couldn't find one in the literature.)
of a matrix that maximizes the volume of the convex hull. Computing the volume of the convex hull requires some thought. Also, there are $\binom{m}{k}$ subsets of size $k$; if $k = m/2$ there are exponentially many. We need a way to find subsets of maximum volume fast. There might be a replacement algorithm which approximately maximizes this volume.
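One natural candidate for such a replacement algorithm is greedy column selection, i.e. column-pivoted Gram-Schmidt: at each step pick the column with the largest component orthogonal to the span of the columns already chosen, which greedily grows the Gram determinant. This is only a heuristic sketch (not from the lecture, and stated here without any approximation guarantee):

```python
import numpy as np

def greedy_volume_subset(A, k):
    # Column-pivoted Gram-Schmidt: repeatedly take the column with the largest
    # residual norm after projecting off the span of the chosen columns.
    R = A.astype(float).copy()
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        norms[chosen] = -1.0                  # never re-pick a column
        j = int(np.argmax(norms))
        chosen.append(j)
        q = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(q, q @ R)               # project every column off q
    return sorted(chosen)

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 8))
sigma = greedy_volume_subset(A, 3)
G = A[:, sigma].T @ A[:, sigma]
assert len(sigma) == 3 and np.linalg.det(G) > 0   # picked columns are independent
```

This runs in $O(mnk)$ time instead of enumerating $\binom{m}{k}$ subsets, which is the kind of trade-off the discussion above is asking about.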
2-22: We are at the final lemma, Lemma 2.4.3, of the Restricted Invertibility Theorem. Its proof is an inductive application of the Sauer-Shelah lemma; a very important idea comes from Giannopoulos. If you naively try to use Sauer-Shelah, it won't work out. We will give a stronger statement of the previous lemma, which we can prove by induction.
Lemma 2.4.5 (Stronger version of Lemma 2.4.3). Take $m, n \in \mathbb{N}$ and $A : \mathbb{R}^m \to \mathbb{R}^n$ a linear operator such that $\{Ae_j\}_{j=1}^m$ are linearly independent. Suppose that $k \ge 0$ is an integer and $\omega \subseteq \{1, \dots, m\}$. Then there exists $\tau \subseteq \omega$ with $|\tau| \ge (1 - \frac{1}{2^k})|\omega|$ such that for every $\sigma \subseteq \omega$ and every $a \in \mathbb{R}^m$ supported on $\sigma$ we have
$$\sum_{j \in \tau} |a_j| \le M\sqrt{|\omega|}\Big(\sum_{s=1}^{k} 2^{s/2}\Big)\Big\|\sum_{j \in \sigma} a_j Ae_j\Big\|_2 + (2^k - 1)\sum_{j \in \sigma \cap (\omega \setminus \tau)} |a_j|. \qquad (2.16)$$
Lemma 2.4.3 (all we need to complete the proof of the Restricted Invertibility Principle) is the case $\sigma = \tau$ and $\omega = \{1, \dots, m\}$, for which the last term of (2.16) vanishes.
We prove the stronger version via induction on $k$.
Proof. As $k$ becomes bigger, we're allowed to take more of the set $\omega$. The idea is to use Sauer-Shelah, taking half of $\omega$, and then half of what remains at each step.
For $k = 0$, the statement is vacuous, because we can take $\tau$ to be the empty set. By induction, assume that the statement holds for $k$: we have found $\tau \subseteq \omega$ such that $|\tau| \ge (1 - \frac{1}{2^k})|\omega|$ and (2.16) holds for every admissible $\sigma$ and $a$. If $\tau = \omega$ already, then $\tau$ satisfies (2.16) for $k+1$ as well, so WLOG $|\omega \setminus \tau| > 0$. Now define $v_j$ as the normalized projection
$$v_j = \frac{\mathrm{Proj}_{F_j} Ae_j}{\|\mathrm{Proj}_{F_j} Ae_j\|_2^2}.$$
Then $\langle v_i, Ae_j \rangle = \delta_{ij}$ by definition, since $(v_j)$ is a dual basis for the $Ae_j$'s. Now we want to use Sauer-Shelah, so we're going to define a certain subset of the cube.
Define
$$\Omega = \Big\{ \varepsilon \in \{\pm 1\}^{\omega \setminus \tau} : \Big\| \sum_{j \in \omega \setminus \tau} \varepsilon_j v_j \Big\|_2 \le M\sqrt{2|\omega \setminus \tau|} \Big\},$$
which is an ellipsoid intersected with the cube (it is not a sphere since the $v_j$'s are not orthogonal). Then we have
$$M^2 |\omega \setminus \tau| \ge \sum_{j \in \omega \setminus \tau} \frac{1}{\|\mathrm{Proj}_{F_j} Ae_j\|_2^2} = \sum_{j \in \omega \setminus \tau} \|v_j\|_2^2 = \frac{1}{2^{|\omega \setminus \tau|}} \sum_{\varepsilon \in \{\pm 1\}^{\omega \setminus \tau}} \Big\| \sum_{j \in \omega \setminus \tau} \varepsilon_j v_j \Big\|_2^2,$$
where the last step is true for any vectors (sum the squares and the pairwise correlations disappear).
Using Markov's inequality, this is
$$\ge \frac{2^{|\omega\setminus\tau|} - |\Omega|}{2^{|\omega\setminus\tau|}} \cdot 2M^2|\omega\setminus\tau|,$$
which gives
$$|\Omega| > 2^{|\omega\setminus\tau| - 1}.$$
Then by Sauer-Shelah (Lemma 2.3.9), there exists $J \subseteq \omega\setminus\tau$ such that
$$\mathrm{Proj}_{\mathbb{R}^J}\,\Omega = \{\pm 1\}^J \qquad \text{and} \qquad |J| \ge \frac{1}{2}|\omega\setminus\tau|.$$
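The way Sauer-Shelah enters here can be checked by brute force in tiny dimensions. The sketch below (a hypothetical helper `shattered_set`, exhaustive and exponential in $d$, so a correctness check rather than an algorithm) verifies that any family of more than $2^{d-1}$ sign vectors projects onto a full cube on at least half the coordinates:

```python
import itertools

def shattered_set(omega, d):
    """Exhaustively find a largest coordinate set J such that projecting
    the family omega (tuples of +/-1 of length d) onto J gives the full
    cube {+1,-1}^J; exponential in d, so only a correctness check."""
    for size in range(d, 0, -1):
        for J in itertools.combinations(range(d), size):
            proj = {tuple(x[j] for j in J) for x in omega}
            if len(proj) == 2 ** size:
                return J
    return ()

d = 4
cube = list(itertools.product([1, -1], repeat=d))
omega = cube[: 2 ** (d - 1) + 1]     # any family with |omega| > 2^(d-1)
J = shattered_set(omega, d)
assert len(J) >= d // 2              # the Sauer-Shelah guarantee
```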
Define $\tau^* = \tau \cup J$. We will show that $\tau^*$ satisfies the inductive hypothesis with $k+1$. Each time we find a certain set of coordinates to add to what we had before. $|\tau^*|$ is the correct size because
$$|\tau^*| = |\tau| + |J| \ge |\tau| + \frac{|\omega| - |\tau|}{2} = \frac{|\tau| + |\omega|}{2} \ge \Big(1 - \frac{1}{2^{k+1}}\Big)|\omega|,$$
where we used that $|\tau| \ge (1 - \frac{1}{2^k})|\omega|$. So at least $\tau^*$ is the right size.
Now fix $\sigma \subseteq \omega$ and $a \in \mathbb{R}^m$ with $\mathrm{supp}(a) \subseteq \sigma$. We claim there exists some $\varepsilon \in \Omega$ such that $\varepsilon_j = \mathrm{sign}(a_j)$ for all $j \in J$: for any sign pattern on $J$, we can find some vector in $\Omega$ with that pattern, since $\Omega$ projects onto the full cube $\{\pm 1\}^J$. What does being in $\Omega$ mean? It means that the dual basis is small there. $\varepsilon \in \Omega$ says that
$$\Big\|\sum_{j\in\omega\setminus\tau}\varepsilon_j v_j\Big\|_2 \le M\sqrt{2|\omega\setminus\tau|} \le \frac{M\sqrt{2|\omega|}}{2^{k/2}}. \qquad (2.17)$$
That was how we chose our ellipsoid. Now
$$\sum_{j\in J}|a_j| = \Big\langle \sum_{j\in J} a_j Ae_j,\ \sum_{j\in\omega\setminus\tau} \varepsilon_j v_j \Big\rangle,$$
because the $v_j$'s are a dual basis, so $\langle Ae_i, v_j\rangle = \delta_{ij}$, and $J \subseteq \omega\setminus\tau$. We only know the $\varepsilon_j$ are the signs when you're inside $J$. This equals
$$\Big\langle \sum_{j\in\sigma} a_j Ae_j,\ \sum_{j\in\omega\setminus\tau} \varepsilon_j v_j \Big\rangle - \sum_{j\in(\sigma\setminus J)\cap(\omega\setminus\tau)} \varepsilon_j a_j.$$
Note that $(\sigma\setminus J)\cap(\omega\setminus\tau) = \sigma\cap(\omega\setminus\tau^*)$. In this set we can't control the signs $\varepsilon_j$. By Cauchy-Schwarz and (2.17), this is
$$\le \Big\|\sum_{j\in\sigma} a_j Ae_j\Big\|_2 \cdot \Big\|\sum_{j\in\omega\setminus\tau}\varepsilon_j v_j\Big\|_2 + \sum_{j\in\sigma\cap(\omega\setminus\tau^*)}|a_j| \le \Big\|\sum_{j\in\sigma} a_j Ae_j\Big\|_2 \cdot \frac{M\sqrt{2|\omega|}}{2^{k/2}} + \sum_{j\in\sigma\cap(\omega\setminus\tau^*)}|a_j|.$$
Because the conclusion of Sauer-Shelah told us nothing about the signs of $\varepsilon_j$ outside $J$, we just take the worst possible thing.
Summarizing,
$$\sum_{j\in J}|a_j| \le \frac{M\sqrt{2|\omega|}}{2^{k/2}}\Big\|\sum_{j\in\sigma} a_j Ae_j\Big\|_2 + \sum_{j\in\sigma\cap(\omega\setminus\tau^*)}|a_j|.$$
Using the inductive step,
$$\sum_{j\in\tau^*}|a_j| = \sum_{j\in\tau}|a_j| + \sum_{j\in J}|a_j| \le M\sqrt{|\omega|}\Big(\sum_{s=1}^{k}2^{s/2}\Big)\Big\|\sum_{j\in\sigma}a_jAe_j\Big\|_2 + (2^k-1)\sum_{j\in\sigma\cap(\omega\setminus\tau)}|a_j| + \sum_{j\in J}|a_j|$$
$$= M\sqrt{|\omega|}\Big(\sum_{s=1}^{k}2^{s/2}\Big)\Big\|\sum_{j\in\sigma}a_jAe_j\Big\|_2 + (2^k-1)\sum_{j\in\sigma\cap(\omega\setminus\tau^*)}|a_j| + 2^k\sum_{j\in J}|a_j|.$$
In the last step, we moved $J$ from the second to the third term, using $\sigma\cap(\omega\setminus\tau) = (\sigma\cap(\omega\setminus\tau^*)) \cup (\sigma\cap J)$ and $\mathrm{supp}(a)\subseteq\sigma$. Now use the bound we got before on $\sum_{j\in J}|a_j|$ and plug it in to get
$$\sum_{j\in\tau^*}|a_j| \le M\sqrt{|\omega|}\Big(\sum_{s=1}^{k+1}2^{s/2}\Big)\Big\|\sum_{j\in\sigma}a_jAe_j\Big\|_2 + (2^{k+1}-1)\sum_{j\in\sigma\cap(\omega\setminus\tau^*)}|a_j|,$$
which is exactly the inductive hypothesis for $k+1$.4
Remark 2.4.6 (Algorithmic Sauer-Shelah): We only used Sauer-Shelah 2.3.9 for intersecting cubes with an ellipsoid, so to make RIP algorithmic, we need an algorithm for finding these particular intersections. Moreover, the ellipsoid is big because we only ask that the 2-norm is at most $\sqrt{2}$ times its expectation. An efficient algorithm probably exists.
Afterwards, we could ask for higher dimensional shapes. I've seen some references that worked for Sauer-Shelah when sets were of a special form, namely of size $O(n)$. This is something more geometric. I don't think there's literature about Sauer-Shelah for intersections of surfaces with small degree. This is a tiny motivation to do it, but it's still interesting independently.
4I looked through the original Giannopoulos paper, and it was clear he tried out many many things to find which inductive hypothesis makes everything go through cleanly. You want to bound an $\ell_1$ sum from above, so you want to use duality, and then use Sauer-Shelah to get signs such that the norm of the dual basis is small.
Chapter 3
Bourgain's Discretization Theorem
1 Outline
We will prove Bourgain's Discretization Theorem. This will take maybe two weeks, and has many interesting ingredients along the way. By doing this, we will also prove Ribe's theorem, which is what we stated at the beginning.
Let's remind ourselves of the definition.
Definition (Definition 1.2.3, discretization modulus): Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be Banach spaces and let $\varepsilon \in (0,1)$. Then $\delta_{X\hookrightarrow Y}(\varepsilon)$ is the supremum over $\delta > 0$ such that for every $\delta$-net $\mathcal{N}_\delta$ of the unit ball of $X$, the distortions satisfy
$$c_Y(X) \le \frac{c_Y(\mathcal{N}_\delta)}{1-\varepsilon}.$$
Here $c_Y(X)$ is the smallest bi-Lipschitz distortion with which you can embed $X$ into $Y$. There are ideas required to get $1-\varepsilon$. For example, for $\varepsilon = \frac{1}{2}$, the definition says that if we succeed in embedding a $\delta$-net into $Y$, then we succeed for the full space with a distortion at most twice as much. A priori it's not even clear that there exists such a $\delta$. There's a nontrivial compactness argument needed to prove its existence. We will just prove bounds on it assuming it exists.
Now, Bourgain's discretization theorem says
Theorem (Bourgain's discretization theorem 1.2.4). If $\dim(X) = n$, $\dim(Y) = \infty$, then
$$\delta_{X\hookrightarrow Y}(\varepsilon) \ge e^{-(n/\varepsilon)^{Cn}}$$
for $C$ a universal constant.
Remark 3.1.1: It doesn't matter what the maps are for the $\mathcal{N}_\delta$; the proof will give a linear mapping and we won't end up needing them. Assuming linearity in the definition is not necessary.
Rademacher's theorem says that any Lipschitz mapping $\mathbb{R}^n \to \mathbb{R}^m$ is differentiable almost everywhere. You can extend this to $m = \infty$, but you need some
additional properties: $Y$ must be a dual space, and the limit needs to be taken in the weak-* topology. In this topology there exists a sequence of norm-1 vectors that tends to 0; almost everywhere, this doesn't happen. The Principle of Local Reflexivity says that although $Y^{**}$ and $Y$ need not be the same for infinite dimensional $Y$, they agree at the level of finite dimensional phenomena. For instance, the double dual of $c_0$, the space of all sequences which tend to 0, is $\ell_\infty$, a bigger space, but the difference between these never appears in finite dimensional phenomena.
1.1 Reduction to smooth norm
From now on, $B_X = \{x \in X : \|x\|_X \le 1\}$ denotes the unit ball, and $S_X = \partial B_X = \{x \in X : \|x\|_X = 1\}$ denotes the unit sphere.
Later on we will be differentiating things without thinking about it, so we first prove that we can assume $\|\cdot\|_X$ is smooth on $X \setminus \{0\}$.
Lemma 3.1.2. For all $\delta \in (0,1)$ there exists some $\delta$-net $\mathcal{N}_\delta$ of $S_X$ with $|\mathcal{N}_\delta| \le (1 + \frac{2}{\delta})^n$.
Proof. Let $\mathcal{N}_\delta \subseteq S_X$ be maximal with respect to inclusion such that $\|x - y\|_X > \delta$ for every distinct $x, y \in \mathcal{N}_\delta$. Maximality means that if $z \in S_X$, there exists $x \in \mathcal{N}_\delta$ with $\|x - z\|_X \le \delta$ (otherwise $\mathcal{N}_\delta \cup \{z\}$ is a bigger separated set). The balls $\{x + \frac{\delta}{2}B_X\}_{x \in \mathcal{N}_\delta}$ are pairwise disjoint and contained in $(1 + \frac{\delta}{2})B_X$. Thus
$$\Big(1 + \frac{\delta}{2}\Big)^n \mathrm{vol}(B_X) = \mathrm{vol}\Big(\Big(1 + \frac{\delta}{2}\Big)B_X\Big) \ge \sum_{x\in\mathcal{N}_\delta} \mathrm{vol}\Big(x + \frac{\delta}{2}B_X\Big) = |\mathcal{N}_\delta|\Big(\frac{\delta}{2}\Big)^n \mathrm{vol}(B_X).$$
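The greedy maximal-separated-set construction in this proof is easy to simulate. A minimal sketch (my own illustration: the Euclidean norm in the plane stands in for $X$, and the greedy loop mirrors the maximality argument):

```python
import numpy as np

def greedy_net(points, delta):
    """Greedily extract a delta-separated subset (hence a delta-net of
    the input cloud): keep a point iff it is > delta away from every
    point kept so far, exactly as in the maximality argument."""
    net = []
    for p in points:
        if all(np.linalg.norm(p - q) > delta for q in net):
            net.append(p)
    return net

n, delta = 2, 0.5
theta = np.linspace(0, 2 * np.pi, 2000, endpoint=False)
sphere = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # unit circle
net = greedy_net(sphere, delta)
assert len(net) <= (1 + 2 / delta) ** n   # the volumetric bound of Lemma 3.1.2
```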
Reduction to smooth norm. Let $\mathcal{N}_\delta$ be a $\delta$-net of $S_X$ with $|\mathcal{N}_\delta| = N \le (1 + \frac{2}{\delta})^n$. Given $z \in \mathcal{N}_\delta$, by Hahn-Banach we can choose $z^* \in X^*$ such that
1. $\langle z^*, z\rangle = 1$,
2. $\|z^*\|_{X^*} = 1$.
Let $k$ be an integer such that $N^{1/(2k)} \le 1 + \delta$. Then define
$$\|x\| := \Big(\sum_{z\in\mathcal{N}_\delta} \langle z^*, x\rangle^{2k}\Big)^{1/(2k)}.$$
Each term satisfies $|\langle z^*, x\rangle| \le \|x\|_X$, so we know
$$\|x\| \le N^{1/(2k)}\|x\|_X \le (1+\delta)\|x\|_X.$$
1There are non-sharp bounds for the smallest size of a $\delta$-net. There is a lot of literature about the relations between these things, but we just need an upper bound.
For the lower bound, if $x \in S_X$, then choose $z \in \mathcal{N}_\delta$ such that $\|x - z\|_X \le \delta$. Then $1 - \langle z^*, x\rangle = \langle z^*, z - x\rangle \le \|z - x\|_X \le \delta$, so $\langle z^*, x\rangle \ge 1 - \delta$, giving
$$\|x\| \ge (1 - \delta)\|x\|_X.$$
Thus any norm is, up to a factor $1 + \delta$, equal to a smooth norm. $\delta$ was arbitrary, so without loss of generality we can assume in the proof of Bourgain's theorem that the norm is smooth.
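For the Euclidean plane, where the normalizing functional $z^*$ of a unit vector $z$ is $z$ itself, the construction can be checked numerically. A sketch under that assumption:

```python
import numpy as np

delta = 0.2
# delta-net of the Euclidean unit circle; for l2 the normalizing
# functional z* of a unit vector z is z itself
thetas = np.arange(0.0, 2 * np.pi, delta / 2)
net = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
N = len(net)
# pick k so that N**(1/(2k)) <= 1 + delta
k = int(np.ceil(np.log(N) / (2 * np.log(1 + delta))))

def smooth_norm(x):
    """|||x||| = (sum_z <z*, x>^{2k})^{1/(2k)}: smooth, and sandwiched
    between (1-delta)|x|_2 and (1+delta)|x|_2 on unit vectors."""
    return float(np.sum((net @ x) ** (2 * k)) ** (1.0 / (2 * k)))

for theta in np.linspace(0.0, 2 * np.pi, 17):
    x = np.array([np.cos(theta), np.sin(theta)])
    assert (1 - delta) <= smooth_norm(x) <= (1 + delta)
```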
The strategy to prove Bourgain's discretization theorem is as follows. We are given a $\delta$-net $\mathcal{N}_\delta \subseteq B_X$ with $\delta \le e^{-(n/\varepsilon)^{Cn}}$, and a map $f : \mathcal{N}_\delta \to Y$ such that $\frac{1}{D}\|x - y\|_X \le \|f(x) - f(y)\|_Y \le \|x - y\|_X$, i.e., we can embed the net with distortion $D$. Our goal is to show that this implies there exists an invertible linear operator $T : X \to Y$ with $\|T\| \cdot \|T^{-1}\| \lesssim D$.
The steps are as follows.
1. Find the correct coordinate system (John ellipsoid), which will give us a dot product structure and a natural Laplacian. (We will need a little background in convex geometry.)
2. A priori $f$ is defined on the net. Extend $f$ to the whole space in a nice way (Bourgain's almost extension theorem); the extension doesn't coincide with the function on the net, but is not too far away from it.
3. Solve the Laplace equation: start with some initial condition, and then evolve $F$ according to the Poisson semigroup. This extended function is smooth the instant it flows a little bit away from the discrete function.
4. There is a point where the derivative satisfies what we want. The point exists by a pigeonhole-style argument, but we won't be able to give it explicitly. This comes from estimates of the Poisson kernel. We will use Fourier analysis.
2-24
2 Bourgain's almost extension theorem
Theorem 3.2.1 (Bourgain's almost extension theorem). Let $X$ be an $n$-dimensional normed space, $Y$ a Banach space, $\mathcal{N}_\delta \subseteq S_X$ a $\delta$-net of $S_X$, and $\varepsilon \ge C\delta$. Suppose $f : \mathcal{N}_\delta \to Y$ is $L$-Lipschitz. Then there exists $F : X \to Y$ such that
1. $\|F\|_{\mathrm{Lip}} \lesssim (1 + \frac{\delta n}{\varepsilon})L$.
2. $\|F(x) - f(x)\|_Y \le \varepsilon L$ for all $x \in \mathcal{N}_\delta$.
3. $\mathrm{Supp}(F) \subseteq (2 + \varepsilon)B_X$.
4. $F$ is smooth.
Parts 3 and 4 will come "for free."
2.1 Lipschitz extension problem
Theorem 3.2.2 (Johnson-Lindenstrauss-Schechtman, 1986). Let $X$ be an $n$-dimensional normed space, $A \subseteq X$, $Y$ a Banach space, and $f : A \to Y$ be $L$-Lipschitz. There exists $F : X \to Y$ such that $F|_A = f$ and $\|F\|_{\mathrm{Lip}} \lesssim nL$.
We know a lower bound of $\sqrt{n}$; losing $\sqrt{n}$ is sometimes needed. (The lower bound for nets on the whole space is $\sqrt[4]{n}$.) A big open problem is what the true bound is.
This was done a year before Bourgain. Why didn't he use this theorem? This theorem is not sufficient because the Lipschitz constant grows with $n$.
We want to show that $\|T\| \cdot \|T^{-1}\| \lesssim D$. We can't bound the norm of $T^{-1}$ by anything less than the distortion $D$, so to prove Bourgain's theorem we can't lose anything in the Lipschitz constant of $F$; the Lipschitz constant can't go to $\infty$ as $n \to \infty$.
Bourgain had the idea of relaxing the requirement that the new function be strictly an extension (i.e., agree with the original function where it is defined). What's extremely important is that the new function be Lipschitz with a constant independent of $n$.
We need $\|F\|_{\mathrm{Lip}} \lesssim L$. Let's normalize so $L = 1$. When the parameter is $\varepsilon = n\delta$, we want $\|f(x) - F(x)\| \lesssim n\delta$. Note $\delta$ is small (less than the inverse of any polynomial), so losing $n$ is nothing.2
How sharp is Theorem 3.2.1? Given a 1-Lipschitz function on a $\delta$-net, if we want to almost extend it without losing anything in the Lipschitz constant, how close can we guarantee the extension to be to the original function?
Theorem 3.2.3. There exist an $n$-dimensional normed space $X$, a Banach space $Y$, $\delta > 0$, a $\delta$-net $\mathcal{N}_\delta \subseteq S_X$, and a 1-Lipschitz function $f : \mathcal{N}_\delta \to Y$ such that if $F : X \to Y$ has $\|F\|_{\mathrm{Lip}} \lesssim 1$, then there exists $x \in \mathcal{N}_\delta$ such that
$$\|F(x) - f(x)\| \gtrsim \frac{n\delta}{\sqrt{\ln n}}.$$
Thus, what Bourgain proved is essentially sharp. This is a fun construction using Grothendieck's inequality.
Our strategy is as follows. Consider $P_t * F$, where
$$P_t(x) = \frac{c_n t}{(t^2 + \|x\|_2^2)^{\frac{n+1}{2}}}, \qquad c_n = \frac{\Gamma(\frac{n+1}{2})}{\pi^{\frac{n+1}{2}}},$$
and $P_t$ is the Poisson kernel. Let $(TF)(x) = (P_t * F)'(x)$; $T$ is linear and $\|T\| \lesssim 1$. As $t \to 0$, $P_t * F$ becomes closer to $F$. We hope that when $t \to 0$ it is invertible. This is true. We give a proof by contradiction (we won't actually find what $t$ is) using the pigeonhole principle.
The Poisson kernel has a Euclidean norm in it. In order for the argument to work, we have to choose the right Euclidean norm.
2If people get $\delta$ to be polynomial, then we'll have to start caring about $n$.
2.2 John ellipsoid
Theorem 3.2.4 (John's theorem). Let $X$ be an $n$-dimensional normed space, identified with $\mathbb{R}^n$. Then there exists a linear operator $T : \mathbb{R}^n \to \mathbb{R}^n$ such that
$$\frac{1}{\sqrt{n}}\|Tx\|_2 \le \|x\|_X \le \|Tx\|_2$$
for all $x \in X$.
John's Theorem says we can always find a Euclidean norm which sandwiches the $X$-norm up to a factor of $\sqrt{n}$. "Everything is a Hilbert space up to a $\sqrt{n}$ factor."
Proof. Let $\mathcal{E}$ be an ellipsoid of maximal volume contained in $B_X$:
$$\max\{\mathrm{vol}(SB_2^n) : S \in M_n(\mathbb{R}),\ SB_2^n \subseteq B_X\}.$$
$\mathcal{E}$ exists by compactness. Take $T = S^{-1}$. The goal is to show that $B_X \subseteq \sqrt{n}\,\mathcal{E}$. We can choose the coordinate system such that $\mathcal{E} = B_2^n$.
Suppose by way of contradiction that $B_X \not\subseteq \sqrt{n}B_2^n$. Then there exists $x$ such that $\|x\|_X \le 1$ yet $\|x\|_2 > \sqrt{n}$.
Denote $R = \|x\|_2$, $y = \frac{x}{R}$. Then $\|y\|_2 = 1$ and $Ry \in B_X$. By applying a rotation, we can assume WLOG $y = e_1$.
Claim: Define for $a > 1$, $b < 1$
$$\mathcal{E}_{a,b} := \Big\{\sum_{i=1}^n t_i e_i : \Big(\frac{t_1}{a}\Big)^2 + \sum_{i=2}^n \Big(\frac{t_i}{b}\Big)^2 \le 1\Big\}.$$
Stretching by $a$ and squeezing by $b$ can make the ellipsoid grow while staying inside the body, contradicting that $\mathcal{E}$ is the maximizer. More precisely, we show that there exist $a > 1$ and $b < 1$ such that $\mathcal{E}_{a,b} \subseteq B_X$ and $\mathrm{vol}(\mathcal{E}_{a,b}) > \mathrm{vol}(B_2^n)$. This is equivalent to $ab^{n-1} > 1$.
We need a lemma with some trigonometry.
Lemma 3.2.5 (2-dimensional lemma). Suppose $a > 1$, $b \in (0,1)$, and
$$\frac{a^2}{R^2} + b^2\Big(1 - \frac{1}{R^2}\Big) \le 1.$$
Then $\mathcal{E}_{a,b} \subseteq B_X$.
Proof. Let $b = \sqrt{\frac{R^2 - a^2}{R^2 - 1}}$, which satisfies the hypothesis with equality. Then
$$\psi(a) := ab^{n-1} = a\Big(\frac{R^2 - a^2}{R^2 - 1}\Big)^{\frac{n-1}{2}}, \qquad \psi(1) = 1.$$
It is enough to show $\psi'(1) > 0$. Now
$$\psi'(a) = \Big(\frac{R^2 - a^2}{R^2 - 1}\Big)^{\frac{n-1}{2} - 1} \cdot \frac{R^2 - na^2}{R^2 - 1},$$
which is indeed $> 0$ at $a = 1$ for $R > \sqrt{n}$. Note this is really a 2-dimensional argument; there is 1 special direction. We shrunk the $y$ direction and expanded the $x$ direction. It's enough to show the new ellipse $\mathcal{E}_{a,b}$ is in the rhombus in the picture. Calculations are left to the reader.
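The one-variable computation at the heart of the proof can be sanity-checked numerically: $\psi(1) = 1$, and $\psi$ increases at $a = 1$ exactly when $R > \sqrt{n}$. A minimal sketch:

```python
import numpy as np

def psi(a, R, n):
    """Volume ratio a * b^(n-1) of the perturbed ellipsoid, with
    b = sqrt((R^2 - a^2) / (R^2 - 1))."""
    return a * ((R**2 - a**2) / (R**2 - 1)) ** ((n - 1) / 2)

n, h = 5, 1e-6
assert psi(1.0, 3.0, n) == 1.0
# R > sqrt(n): stretching pays, so the ellipsoid was not maximal
assert psi(1 + h, np.sqrt(n) + 0.5, n) > 1.0
# R < sqrt(n): no gain from stretching
assert psi(1 + h, np.sqrt(n) - 0.5, n) < 1.0
```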
2.3 Proof
Proof of Theorem 3.2.1. By translating $f$ (so that it is 0 at some point), without loss of generality we can assume that for all $x \in \mathcal{N}_\delta$, $\|f(x)\|_Y \le 2L$.
Step 1: (Rough extension $F_1$ on $S_X$.) We show that there exists $F_1 : S_X \to Y$ such that
1. $\|F_1(x) - f(x)\|_Y \le 2L\delta$ for all $x \in \mathcal{N}_\delta$.
2. $\forall x, y \in S_X$, $\|F_1(x) - F_1(y)\| \le L(\|x - y\|_X + 4\delta)$.
This is a partition of unity argument. Write $\mathcal{N}_\delta = \{x_1, \dots, x_N\}$. Consider
$$\{(x_i + 2\delta B_X) \cap S_X\}_{i=1}^N,$$
which is an open cover of $S_X$. Let $\{\phi_i : S_X \to [0,1]\}_{i=1}^N$ be a partition of unity subordinated to this open cover. This means that
1. $\mathrm{Supp}(\phi_i) \subseteq x_i + 2\delta B_X$,
2. $\sum_{i=1}^N \phi_i(x) = 1$ for all $x \in S_X$.
For all $x \in S_X$, define $F_1(x) = \sum_{i=1}^N \phi_i(x) f(x_i) \in Y$. Then as $F_1$ is a weighted average of the $f(x_i)$'s, $\|F_1\|_\infty \le 2L$. If $x \in \mathcal{N}_\delta$, because $\phi_i(x)$ is 0 when $\|x - x_i\|_X > 2\delta$,
$$\|F_1(x) - f(x)\|_Y = \Big\|\sum_{i : \|x - x_i\|_X \le 2\delta} \phi_i(x)(f(x_i) - f(x))\Big\| \le \sum_{\|x - x_i\|_X \le 2\delta} \phi_i(x)\,L\|x - x_i\|_X \le 2L\delta.$$
For $x, y \in S_X$,
$$\|F_1(x) - F_1(y)\| = \Big\|\sum_{\substack{\|x - x_i\| \le 2\delta \\ \|y - x_j\| \le 2\delta}} (f(x_i) - f(x_j))\phi_i(x)\phi_j(y)\Big\| \le \sum_{\substack{\|x - x_i\| \le 2\delta \\ \|y - x_j\| \le 2\delta}} L\|x_i - x_j\|\,\phi_i(x)\phi_j(y) \le L(\|x - y\| + 4\delta).$$
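Step 1 in miniature: a partition of unity subordinate to intervals around net points turns values on the net into a function on the whole space, moving each point's value by at most the local oscillation. A sketch on the unit interval (my own illustration, with hat functions standing in for the $\phi_i$):

```python
import numpy as np

delta = 0.1
net = np.arange(0.0, 1.0 + 1e-9, delta)   # delta-net of [0, 1]
f = np.sin(net)                           # 1-Lipschitz data on the net

def F1(x):
    """Partition-of-unity extension: hat functions of width 2*delta
    centered at the net points, normalized to sum to 1."""
    w = np.maximum(0.0, 1.0 - np.abs(x - net) / delta)
    w = w / np.sum(w)
    return float(np.sum(w * f))

# On the net, F1 agrees with f up to the 2*L*delta error of Step 1
for i, p in enumerate(net):
    assert abs(F1(p) - f[i]) <= 2 * delta
```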
2-29: We continue proving Bourgain's almost extension theorem.
Step 2: Extend $F_1$ to $F_2$ on the whole space such that
1. $\forall x \in \mathcal{N}_\delta$, $\|F_2(x) - f(x)\|_Y \lesssim L\delta$.
2. $\|F_2(x) - F_2(y)\|_Y \lesssim L(\|x - y\|_X + \delta)$.
3. $\mathrm{Supp}(F_2) \subseteq 2B_X$.
4. $F_2$ is smooth.
Denote $\beta(t) = \max\{1 - |1 - t|, 0\}$.
Let
$$F_2(x) = \beta(\|x\|_X)\,F_1\Big(\frac{x}{\|x\|_X}\Big).$$
$F_2$ still satisfies condition 1. As for condition 2,
$$\|F_2(x) - F_2(y)\|_Y = \Big\|\beta(\|x\|_X)F_1\Big(\frac{x}{\|x\|_X}\Big) - \beta(\|y\|_X)F_1\Big(\frac{y}{\|y\|_X}\Big)\Big\|$$
$$\le |\beta(\|x\|) - \beta(\|y\|)|\,\underbrace{\Big\|F_1\Big(\frac{x}{\|x\|_X}\Big)\Big\|}_{\le 2L} + \beta(\|y\|)\Big\|F_1\Big(\frac{x}{\|x\|_X}\Big) - F_1\Big(\frac{y}{\|y\|_X}\Big)\Big\|$$
$$\le 2L\|x - y\| + \beta(\|y\|)\,L\Big(\Big\|\frac{x}{\|x\|} - \frac{y}{\|y\|}\Big\| + 4\delta\Big)$$
$$\le 2L\|x - y\| + L\beta(\|y\|)\Big(\|x\|\Big|\frac{1}{\|x\|} - \frac{1}{\|y\|}\Big| + \frac{\|x - y\|}{\|y\|} + 4\delta\Big)$$
$$\le 2L\|x - y\| + L\beta(\|y\|)\Big(\frac{\|x - y\|}{\|y\|} + \frac{\|x - y\|}{\|y\|} + 4\delta\Big) \lesssim L(\|x - y\| + \delta),$$
where in the last step we used $\beta(\|y\|) \le \|y\|$ and $\beta(\|y\|) \le 1$.
Note $F_2$ is smooth because the sum for $F_1$ was against a partition of unity and $\|\cdot\|_X$ is smooth, although we don't have uniform bounds on the smoothness of $F_2$.
Step 3: We make $F_2$ smoother by convolving.
Lemma 3.2.6 (Begun, 1999). Let $F_2 : X \to Y$ satisfy $\|F_2(x) - F_2(y)\|_Y \le L(\|x - y\|_X + \delta)$. Let $\lambda \ge n\delta$. Define
$$F(x) = \frac{1}{\mathrm{vol}(\lambda B_X)}\int_{\lambda B_X} F_2(x + y)\,dy.$$
Then
$$\|F\|_{\mathrm{Lip}} \le L\Big(1 + \frac{\delta n}{2\lambda}\Big).$$
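The content of the lemma — averaging over a ball of radius $\lambda$ converts an "additive-$\delta$" Lipschitz condition into a genuine Lipschitz bound $L(1 + \frac{\delta}{2\lambda})$ — can already be seen in one dimension. A sketch (my own illustration: a line plus $\delta$-size jumps stands in for $F_2$, and the moving average plays the role of $F$):

```python
import numpy as np

L, delta, lam, dx = 1.0, 0.05, 0.5, 1e-3
x = np.arange(0.0, 4.0, dx)
# F2 is L-Lipschitz up to an additive delta: a line plus small jumps
F2 = L * x + (delta / 2) * np.sign(np.sin(40 * x))

N = round(2 * lam / dx)                     # averaging window of width 2*lam
F = np.convolve(F2, np.ones(N) / N, mode="valid")

slope_raw = np.max(np.abs(np.diff(F2))) / dx   # huge: the jumps survive
slope_avg = np.max(np.abs(np.diff(F))) / dx    # tamed by averaging
assert slope_raw > 10
assert slope_avg <= L * (1 + delta / (2 * lam)) + 1e-6
```

The discrete identity behind the assertion is exactly the fiber-shift computation in the proof below: each increment of the average is $(F_2(x + 2\lambda) - F_2(x))/(2\lambda) \le L(1 + \delta/(2\lambda))$.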
The lemma proves the almost extension theorem as follows. We passed from $f : \mathcal{N}_\delta \to Y$ to $F_1$ to $F_2$ to $F$. If $x \in \mathcal{N}_\delta$,
$$\|F(x) - f(x)\|_Y = \Big\|\frac{1}{\mathrm{vol}(\lambda B_X)}\int_{\lambda B_X} (F_2(x + y) - f(x))\,dy\Big\|$$
$$\le \frac{1}{\mathrm{vol}(\lambda B_X)}\int_{\lambda B_X} \Big(\|F_2(x + y) - F_2(x)\|_Y + \underbrace{\|F_2(x) - f(x)\|_Y}_{\lesssim L\delta}\Big)\,dy$$
$$\le \frac{1}{\mathrm{vol}(\lambda B_X)}\int_{\lambda B_X} \big(L(\underbrace{\|y\|_X}_{\le \lambda} + \delta) + CL\delta\big)\,dy \lesssim \lambda L.$$
Now we prove the lemma.
Proof. We need to show
$$\|F(x) - F(y)\|_Y \le L\Big(1 + \frac{\delta n}{2\lambda}\Big)\|x - y\|_X.$$
Without loss of generality $y = 0$ and $\mathrm{vol}(\lambda B_X) = 1$. Denote
$$M = \lambda B_X \setminus (x + \lambda B_X), \qquad M' = (x + \lambda B_X) \setminus \lambda B_X.$$
We have
$$F(0) - F(x) = \int_M F_2(y)\,dy - \int_{M'} F_2(y)\,dy.$$
Define $\ell(z)$ to be the Euclidean length of the interval $(z + \mathbb{R}x) \cap (\lambda B_X)$. By Fubini,
$$\int_{\mathrm{Proj}_{x^\perp}(\lambda B_X)} \ell(z)\,dz = \mathrm{vol}_n(\lambda B_X) = 1.$$
Denote
$$W = \{z \in \lambda B_X : (z + \mathbb{R}x) \cap (\lambda B_X) \cap (x + \lambda B_X) = \emptyset\}, \qquad V = \lambda B_X \setminus W.$$
Define $C : M \to M'$, a shift in direction $x$ on every fiber, that maps the interval $(z + \mathbb{R}x) \cap M$ onto $(z + \mathbb{R}x) \cap M'$.
$C$ is a measure preserving transformation with
$$\|z - C(z)\|_X = \begin{cases} \|x\|_X, & z \in M \cap W, \\[2pt] \ell(z)\dfrac{\|x\|_X}{\|x\|_2}, & z \in M \cap V. \end{cases}$$
(On fibers missing $x + \lambda B_X$ we translate by $x$ itself; on overlapping fibers we translate by $\ell(z)\frac{x}{\|x\|_2}$.) Then
$$\|F(0) - F(x)\|_Y = \Big\|\int_M F_2(y)\,dy - \int_{M'} F_2(y)\,dy\Big\|_Y = \Big\|\int_M (F_2(y) - F_2(C(y)))\,dy\Big\|_Y$$
$$\le \int_M L(\|y - C(y)\|_X + \delta)\,dy = L\delta\,\mathrm{vol}(M) + L\int_M \|y - C(y)\|_X\,dy,$$
and
$$\int_M \|y - C(y)\|_X\,dy = \int_{M\cap W} \|x\|_X\,dy + \int_{M\cap V} \ell(y)\frac{\|x\|_X}{\|x\|_2}\,dy$$
$$= \|x\|_X\,\mathrm{vol}(W) + \int_{\mathrm{Proj}_{x^\perp}(V)} \ell(z)\frac{\|x\|_X}{\|x\|_2}\,\|x\|_2\,dz \qquad \text{(orthogonal decomposition)}$$
$$= \|x\|_X\,\mathrm{vol}(W) + \|x\|_X\,\mathrm{vol}(V) = \|x\|_X\,\mathrm{vol}(\lambda B_X) = \|x\|_X.$$
We show $M = \lambda B_X \setminus (x + \lambda B_X) \subseteq \lambda B_X \setminus (1 - \frac{\|x\|_X}{\lambda})\lambda B_X$. Indeed, for $y \in M$,
$$\|y - x\|_X \ge \lambda \implies \|y\|_X \ge \lambda - \|x\|_X = \Big(1 - \frac{\|x\|_X}{\lambda}\Big)\lambda.$$
Then
$$\mathrm{vol}(M) \le \mathrm{vol}(\lambda B_X) - \mathrm{vol}\Big(\Big(1 - \frac{\|x\|_X}{\lambda}\Big)\lambda B_X\Big) = 1 - \Big(1 - \frac{\|x\|_X}{\lambda}\Big)^n \le \frac{n\|x\|_X}{\lambda}.$$
Combining the two displays, $\|F(0) - F(x)\|_Y \le L\|x\|_X(1 + \frac{n\delta}{\lambda})$, which is the claimed bound up to the constant in front of $\delta n/\lambda$.
Bourgain did it in a more complicated, analytic way, avoiding geometry. Begun noticed that careful geometry is sufficient.
Later we will show this theorem is sharp.
3 Proof of Bourgain's discretization theorem
At small distances there is no guarantee on the function $f$. Just taking derivatives is dangerous. It might be true that we can work with the initial function, but the only way Bourgain figured out how to prove the theorem was to make a 1-parameter family of functions.
3.1 The Poisson semigroup
Definition 3.3.1: The Poisson kernel $P_t : \mathbb{R}^n \to \mathbb{R}$ is given by
$$P_t(x) = \frac{c_n t}{(t^2 + \|x\|_2^2)^{\frac{n+1}{2}}}, \qquad c_n = \frac{\Gamma(\frac{n+1}{2})}{\pi^{\frac{n+1}{2}}}.$$
Proposition 3.3.2 (Properties of the Poisson kernel):
1. For all $t > 0$, $\int_{\mathbb{R}^n} P_t(x)\,dx = 1$.
2. (Semigroup property) $P_t * P_s = P_{t+s}$.
3. $\widehat{P_t}(x) = e^{-2\pi\|x\|_2 t}$.
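The first two properties are easy to check numerically in one dimension, where $c_1 = 1/\pi$. A sketch on a truncated grid (the tolerances absorb the truncated tails and the discretization):

```python
import numpy as np

def P(t, x):
    """1-d Poisson kernel: c_1 = Gamma(1)/pi = 1/pi."""
    return (t / np.pi) / (t**2 + x**2)

dx = 0.05
x = np.arange(-100.0, 100.0, dx)

# total mass is 1, up to the truncated tails
assert abs(np.sum(P(1.0, x)) * dx - 1.0) < 0.01

# semigroup property P_1 * P_2 = P_3, checked on the central region
conv = np.convolve(P(1.0, x), P(2.0, x), mode="same") * dx
assert np.max(np.abs(conv - P(3.0, x))[1000:3000]) < 5e-3
```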
Lemma 3.3.3. Let $F$ be the function obtained from Bourgain's almost extension theorem 3.2.1. For all $t > 0$, $\|P_t * F\|_{\mathrm{Lip}} \lesssim 1$.
Proof. We have
$$P_t * F(x) - P_t * F(y) = \int_{\mathbb{R}^n} P_t(z)(F(x - z) - F(y - z))\,dz.$$
Now use the fact that $F$ is Lipschitz.
Our goal is to show there exist $t_0 > 0$ and $x \in B_X$ such that if we define
$$T = (P_{t_0} * F)'(x) : X \to Y \qquad (3.1)$$
(i.e., $Te_j = \partial_j(P_{t_0} * F)(x)$), then $\|T^{-1}\| \lesssim D$. We showed $\|T\| \lesssim 1$, so then $T$ has distortion at most $O(D)$, and hence $T$ gives the desired embedding in Bourgain's discretization theorem 1.2.4.
3-2: Today we continue with Bourgain's Theorem. Summarizing our progress so far:
1. Initially we had a $\delta$-net $\mathcal{N}_\delta$ of $B_X$ and a function $f : \mathcal{N}_\delta \to Y$.
2. From $f$ we got a new function $F : X \to Y$ with $\mathrm{supp}(F) \subseteq 3B_X$, satisfying the following. (None of the constants matter.)
(a) $\|F\|_{\mathrm{Lip}} \le C$ for some constant.
(b) For all $x \in \mathcal{N}_\delta$, $\|F(x) - f(x)\|_Y \le n\delta$.
Let $P_t : \mathbb{R}^n \to \mathbb{R}$ be the Poisson kernel. What can be said about the convolution $P_t * F$? We know that for all $t$ and for all $x$, $\|(P_t * F)'(x)\|_{X\to Y} \lesssim 1$. The goal is to make this derivative invertible; more precisely, the norm of the inverse operator should not be too large.
We'll ensure this one direction at a time by investigating the directional derivatives. For $g : \mathbb{R}^n \to Y$, the directional derivative is defined, for $a \in S_X$, by
$$\partial_a g(x) = \lim_{t\to 0}\frac{g(x + ta) - g(x)}{t}.$$
There isn't any particular reason why we need to use the Poisson kernel; there are many other kernels which would work. We definitely need the semigroup property, in addition to all sorts of decay conditions.
We need a lot of lemmas.
Lemma 3.3.4 (Key lemma 1). There are universal constants $C, C', C'', C'''$ such that the following holds. Suppose $t \in (0, \frac{1}{2}]$, $R \in (0,\infty)$, $\delta \in (0, \frac{1}{100n})$ satisfy
$$\delta \le \frac{Ct\log(3/t)}{\sqrt{n}} \le \frac{C'}{n^{5/2}D^2} \qquad (3.2)$$
$$C''n^{3/2}D^2\log(3/t) \le R \le \frac{C''}{t\sqrt{n}}. \qquad (3.3)$$
Then for all $x \in \frac{1}{2}B_X$, $a \in S_X$, we have
$$\big(\|\partial_a(P_t * F)\|_Y * P_{Rt}\big)(x) \ge \frac{C'''}{D}.$$
This result says that, given a direction $a$, if you look at how long the vector $\partial_a(P_t * F)$ is, it is not only large, but large on average. Here the average is over the time period $Rt$.
Note that I haven't been very careful with constants; there are some points where a constant is missing or needs to be adjusted.
The first condition says that $t$ is not too small, but the lower bound it imposes is extremely small: the $t$ we eventually use is at most $(\frac{1}{CD})^{5n}$, whereas the lower bound is something like $(\frac{1}{CD})^{(CD)^{2n}}$, so there's a huge gap. The second condition says that $\log(\frac{1}{t})$ can still be exponential in $n$.
The role of this lemma will become clear later. From now on, let us assume this key lemma and I will show you how to finish. We will go back and prove it later.
Lemma 3.3.5 (Key lemma 2). There's a constant $C_1$ such that the following holds (we can take $C_1 = 8$). Let $\mu$ be any Borel probability measure on $S_X$. For every integer $m$ and every $A, R \in (0,\infty)$, there exists $\frac{A}{(R+1)^{m+1}} \le t \le A$ such that
$$\int_{S_X}\int_{\mathbb{R}^n} \|\partial_a(P_t * F)(x)\|_Y\,dx\,d\mu(a) \le \underbrace{\int_{S_X}\int_{\mathbb{R}^n} \|\partial_a(P_{(R+1)t} * F)(x)\|_Y\,dx\,d\mu(a)}_{(*)} + \frac{C_1\,\mathrm{vol}(3B_X)}{m}.$$
Basically, we can take the derivative under the convolution, and $P_{(R+1)t} * F = P_{Rt} * (P_t * F)$, so the integrand in (*) is an average of what we wrote on the left. Averaging only makes norms smaller, because norms are convex: by Jensen's inequality the right integral is at most the left integral.
The lemma says that after adding that small extra term, we also get a bound in the opposite direction. We will argue that there must be a scale at which Jensen's inequality stabilizes, i.e., $\|\partial_a(P_t * F)\|_Y * P_{Rt}(x)$ is close to $\|\partial_a(P_{(R+1)t} * F)(x)\|_Y$.
Note this is really the pigeonhole argument, geometrically.
Proof. Bourgain uses this technique a lot in his papers: find some point where the inequality is an equality, and then leverage that.
If the result does not hold, then for every $t$ in the range $\frac{A}{(R+1)^{m+1}} \le t \le A$, the reverse inequality holds. We'll use the inequality at every point of a geometric series: for all $s \in \{0, \dots, m\}$,
$$\int_{S_X}\int_{\mathbb{R}^n} \|\partial_a(P_{A(R+1)^{-s-1}} * F)(x)\|_Y\,dx\,d\mu(a) > \int_{S_X}\int_{\mathbb{R}^n} \|\partial_a(P_{A(R+1)^{-s}} * F)(x)\|_Y\,dx\,d\mu(a) + \frac{C_1\,\mathrm{vol}(3B_X)}{m}.$$
Summing up these inequalities and telescoping,
$$\int_{S_X}\int_{\mathbb{R}^n} \|\partial_a(P_{A(R+1)^{-m-1}} * F)(x)\|_Y\,dx\,d\mu(a) > \int_{S_X}\int_{\mathbb{R}^n} \|\partial_a(P_A * F)(x)\|_Y\,dx\,d\mu(a) + \frac{C_1(m+1)\,\mathrm{vol}(3B_X)}{m} \qquad (3.4)$$
(recall that we've been using the semigroup property of the Poisson kernel this whole time). Now why is this conclusion absurd? Take $C_1$ to be the bound on the Lipschitz constant of $F$, so that $\|\partial_a F\|_Y \le C_1$ pointwise. Since partial derivatives commute with the convolution, we get
$$\int_{S_X}\int_{\mathbb{R}^n} \|\partial_a(P_{A(R+1)^{-m-1}} * F)(x)\|_Y\,dx\,d\mu(a) = \int_{S_X}\int_{\mathbb{R}^n} \|(P_{A(R+1)^{-m-1}} * \partial_a F)(x)\|_Y\,dx\,d\mu(a) \qquad (3.5)$$
$$\le C_1\,\mathrm{vol}(3B_X), \qquad (3.6)$$
because $\partial_a F$ is bounded by $C_1$, supported in $3B_X$, and the Poisson kernel integrates to unity. Together (3.4) and (3.6) give a contradiction.
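Stripped of the analysis, Key Lemma 2 is a pigeonhole over geometric scales: a quantity trapped in $[0, C]$ cannot drop by more than $C/m$ at every one of $m+1$ consecutive scales. A sketch (the function `phi` and the helper `stable_scale` are my own illustration, not from the lecture):

```python
import math

def stable_scale(phi, A, R, m, C):
    """Return a scale t in [A/(R+1)**(m+1), A] with
    phi(t) <= phi((R+1)*t) + C/m.  If no such scale existed,
    the per-step losses would telescope past the bound C."""
    for s in range(1, m + 2):
        t = A / (R + 1) ** s
        if phi(t) <= phi((R + 1) * t) + C / m:
            return t
    raise AssertionError("impossible when 0 <= phi <= C")

# any quantity trapped in [0, 1] works
phi = lambda t: 1.0 / (1.0 + math.log(1.0 + 1.0 / t))
t = stable_scale(phi, A=0.5, R=3.0, m=20, C=1.0)
assert phi(t) <= phi(4.0 * t) + 1.0 / 20
```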
Now, assuming Lemma 3.3.4 (Key Lemma 1), let's complete the proof of Bourgain's discretization theorem. Assume from now on that
$$\delta \le \Big(\frac{1}{CD}\Big)^{(CD)^{2n}},$$
where $C$ is a large enough constant that we will choose ($C = 500$ or something).
Proof of Bourgain's Discretization Theorem 1.2.4. Let $\mathcal{F} \subseteq S_X$ be a $\frac{1}{C_2 D}$-net of $S_X$. Then $|\mathcal{F}| \le (C_3 D)^n$ for some $C_3$. We will apply Lemma 3.3.5 (Key Lemma 2) with $\mu$ the uniform measure on $\mathcal{F}$,
$$A = \Big(\frac{1}{CD}\Big)^{5n}, \qquad R + 1 = (CD)^{4n}, \qquad m = \lfloor (CD)^{n+1} \rfloor.$$
Then there exists $(\frac{1}{CD})^{(CD)^{2n}} \le t \le (\frac{1}{CD})^{5n}$ such that
$$\sum_{a\in\mathcal{F}}\int_{\mathbb{R}^n} \|\partial_a(P_t * F)(x)\|_Y\,dx \le \sum_{a\in\mathcal{F}}\int_{\mathbb{R}^n} \|\partial_a(P_{(R+1)t} * F)(x)\|_Y\,dx + \frac{8\,\mathrm{vol}(3B_X)}{m}|\mathcal{F}|. \qquad (3.7)$$
We check the conditions of Lemma 3.3.4 (Key Lemma 1).
1. For (3.2), note $t$ is exponentially small, so the right-hand inequality is satisfied. For the left-hand inequality, note $\delta \le t$ and
$$\frac{C\ln(3/t)}{\sqrt{n}} \ge \frac{C \cdot 5n\ln(CD)}{\sqrt{n}} \ge 1$$
(for $n$ large enough), so $\delta \le t \le \frac{Ct\log(3/t)}{\sqrt{n}}$.
2. For (3.3), note the left-hand side is dominated by $\ln(\frac{1}{t}) \le (CD)^{2n}\ln(CD)$ up to polynomial factors, which is much less than $R = (CD)^{4n} - 1$, while $\frac{1}{t}$, the dominating term on the right-hand side, is $\ge (CD)^{5n}$.
In my paper (reference?), I wrote what the exact constants are. They're written in the paper, and are not that important. We're choosing our constants big enough so that our inequalities hold true with room to spare.
Now we can use Key Lemma 1, which says
$$\big(\|\partial_a(P_t * F)\|_Y * P_{Rt}\big)(x) \ge \frac{1}{D}. \qquad (3.8)$$
Using $P_{(R+1)t} = P_{Rt} * P_t$ (semigroup property) and Jensen's inequality for the norm, which is a convex function, we have
$$\|\partial_a(P_{(R+1)t} * F)(x)\|_Y = \|(\partial_a(P_t * F)) * P_{Rt}(x)\|_Y \le \big(\|\partial_a(P_t * F)\|_Y * P_{Rt}\big)(x). \qquad (3.9)$$
Let
$$g_a(x) := \big(\|\partial_a(P_t * F)\|_Y * P_{Rt}\big)(x) - \|\partial_a(P_{(R+1)t} * F)(x)\|_Y.$$
From (3.9), $g_a(x) \ge 0$ pointwise. Using Markov's inequality,
$$\mathrm{vol}\Big\{x \in \frac{1}{2}B_X : g_a(x) > \frac{1}{D}\Big\} \le D\int_{\mathbb{R}^n} g_a(x)\,dx,$$
because we can upper bound the integral over the ball by an integral over $\mathbb{R}^n$. Note we are using the probabilistic method.
This inequality was for a fixed $a$. We now use the union bound over $\mathcal{F}$, our net of directions $a$, to get
$$\mathrm{vol}\Big\{x \in \frac{1}{2}B_X : \exists a \in \mathcal{F},\ g_a(x) > \frac{1}{D}\Big\} \qquad (3.10)$$
$$\le \sum_{a\in\mathcal{F}} \mathrm{vol}\Big\{x \in \frac{1}{2}B_X : g_a(x) > \frac{1}{D}\Big\} \qquad (3.11)$$
$$\le D\sum_{a\in\mathcal{F}}\int_{\mathbb{R}^n} \Big(\big(\|\partial_a(P_t * F)\|_Y * P_{Rt}\big)(x) - \|\partial_a(P_{(R+1)t} * F)(x)\|_Y\Big)\,dx \qquad (3.12)$$
$$= D\sum_{a\in\mathcal{F}}\int_{\mathbb{R}^n} \Big(\|\partial_a(P_t * F)(x)\|_Y - \|\partial_a(P_{(R+1)t} * F)(x)\|_Y\Big)\,dx \qquad (3.13)$$
$$\le \frac{8D\,\mathrm{vol}(3B_X)}{m}(C_3D)^n < \mathrm{vol}\Big(\frac{1}{2}B_X\Big), \qquad (3.14)$$
where (3.13) follows because convolving with a probability measure does not change the integral over $\mathbb{R}^n$, and (3.14) follows from (3.7) and our choice of $m$.
We've proved that there must exist a point $x$ in half the ball such that for every $a \in \mathcal{F}$, our net, we have
$$\frac{1}{D} \ge \big(\|\partial_a(P_t * F)\|_Y * P_{Rt}\big)(x) - \|\partial_a(P_{(R+1)t} * F)(x)\|_Y.$$
I think we want either this to be $\frac{1}{2D}$, or (3.8) to be $\frac{2}{D}$, in order to get the following bound (with $\frac{1}{2D}$ or $\frac{1}{D}$); this involves changing some constants in the proof.
From this and (3.8), we can conclude that for this $x$,
$$\|Ta\|_Y = \|\partial_a(P_{(R+1)t} * F)(x)\|_Y \gtrsim \frac{1}{D}$$
for every $a \in \mathcal{F}$ (and hence, since $\mathcal{F}$ is a fine net, for every $a \in S_X$), where we let $t_0 = (R+1)t$. This shows $\|T^{-1}\| \lesssim D$.
For the exact constants, see the paper (reference).
Note this is very much a probabilistic existence statement. Usually we estimate by hand the random variable we care about. Here we want to prove an existential bound, so we estimate the probability of the bad case. But we estimate the probability of the bad case using another proof by contradiction.
It remains to estimate some integrals.
3-7: Finishing Bourgain. There's a remaining lemma about the Poisson semigroup that we're going to do today (Lemma 3.3.4); it's nice, but it's not as nice as what came before. The last remaining lemma to prove is Lemma 3.3.4.
Remember that $\frac{1}{\sqrt{n}}\|x\|_2 \le \|x\|_X \le \|x\|_2$ for all $x \in X$. We need three facts about the Poisson kernel.
Proposition 3.3.6 (Fact 1):
$$\int_{\mathbb{R}^n\setminus(RB_X)} P_t(x)\,dx \le \frac{t\sqrt{n}}{R}.$$
Proof. First note that the Poisson kernel has the rescaling property
$$P_t(x) = \frac{1}{t^n}P_1\Big(\frac{x}{t}\Big), \qquad (3.15)$$
i.e., $P_t$ is just a rescaling of $P_1$. We have
$$\int_{\|x\|_X \ge R} P_t(x)\,dx \le \int_{\|x\|_2 \ge R} P_t(x)\,dx = \int_{\|x\|_2 \ge R/t} P_1(x)\,dx \qquad \text{(change of variables and (3.15))}$$
$$= c_n\omega_{n-1}\int_{R/t}^\infty \frac{s^{n-1}}{(1 + s^2)^{(n+1)/2}}\,ds \qquad \text{(polar coordinates, where } \omega_{n-1} = \mathrm{vol}(S^{n-1}))$$
$$\le c_n\omega_{n-1}\int_{R/t}^\infty \frac{1}{s^2}\,ds = \frac{t}{R}\,c_n\omega_{n-1} \le \frac{t\sqrt{n}}{R},$$
where the last inequality follows by expanding $c_n\omega_{n-1}$ in terms of the Gamma function and using Stirling's formula.
Above we changed to polar coordinates using
$$\int_{\|x\|_2 \le r} h(\|x\|_2)\,dx = \omega_{n-1}\int_0^r s^{n-1}h(s)\,ds.$$
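In one dimension the tail of the Poisson kernel is explicit, $1 - \frac{2}{\pi}\arctan(R/t)$, so Fact 1 can be verified exactly:

```python
import math

def poisson_tail_1d(R, t):
    """Exact mass of the 1-d Poisson kernel outside [-R, R]:
    1 - (2/pi) * arctan(R/t)."""
    return 1 - (2 / math.pi) * math.atan(R / t)

# Fact 1 with n = 1 predicts tail <= t * sqrt(n) / R = t / R
for t in [0.01, 0.1, 0.5]:
    for R in [1.0, 2.0, 10.0]:
        assert poisson_tail_1d(R, t) <= t / R
```

(The inequality holds because $1 - \frac{2}{\pi}\arctan u = \frac{2}{\pi}\arctan\frac{1}{u} \le \frac{2}{\pi u} < \frac{1}{u}$.)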
Proposition 3.3.7 (Fact 2): For all $y \in \mathbb{R}^n$,
$$\int_{\mathbb{R}^n} |P_t(x) - P_t(x + y)|\,dx \le \frac{\sqrt{n}\,\|y\|_2}{t}.$$
Proof. It's again enough to prove this when $t = 1$.
$$\int_{\mathbb{R}^n} |P_1(x) - P_1(x + y)|\,dx = \int_{\mathbb{R}^n} \Big|\int_0^1 \langle \nabla P_1(x + sy), y\rangle\,ds\Big|\,dx$$
$$\le \|y\|_2 \int_{\mathbb{R}^n} \|\nabla P_1(x)\|_2\,dx \qquad \text{(Cauchy-Schwarz)}$$
$$\le \|y\|_2 (n+1)c_n \int_{\mathbb{R}^n} \frac{\|x\|_2}{(1 + \|x\|_2^2)^{\frac{n+3}{2}}}\,dx \qquad \text{(computing the gradient)}$$
$$= \|y\|_2 (n+1)c_n\omega_{n-1}\int_0^\infty \frac{s^n}{(1 + s^2)^{\frac{n+3}{2}}}\,ds \qquad \text{(polar coordinates)}$$
$$\le \sqrt{n}\,\|y\|_2.$$
The integral and the multiplicative constants essentially cancel out the $n + 1$ (calculation omitted).
Proposition 3.3.8 (Fact 3): For all $0 < t < \frac{1}{2}$ and $x \in B_X$, we have
$$\|P_t * F(x) - F(x)\|_Y \lesssim \sqrt{n}\,t\log(3/t).$$
Proof. The left-hand side equals
$$\Big\|\int_{\mathbb{R}^n} (F(x - y) - F(x))P_t(y)\,dy\Big\|_Y \le \underbrace{\int_{x+3B_X} \|F(x - y) - F(x)\|_Y P_t(y)\,dy}_{(3.16)} + \underbrace{C\int_{\mathbb{R}^n\setminus(x+3B_X)} P_t(y)\,dy}_{(3.17)}.$$
Using $\|F(x - y) - F(x)\|_Y \le \|F\|_{\mathrm{Lip}}\|y\|_X$ and that $\|F\|_{\mathrm{Lip}}$ is a constant,
$$(3.16) \lesssim \int_{x+3B_X} \|y\|_X P_t(y)\,dy \le \int_{4\sqrt{n}B_2^n} \|y\|_2 P_t(y)\,dy \qquad \text{(using } x \in B_X \implies x + 3B_X \subseteq 4\sqrt{n}B_2^n)$$
$$= t\,c_n\omega_{n-1}\int_0^{4\sqrt{n}/t} \frac{s^n}{(1 + s^2)^{(n+1)/2}}\,ds \qquad \text{(polar coordinates)}$$
$$\le t\sqrt{n}\Big(\int_0^{\sqrt{n}} \frac{1}{\sqrt{n}}\,ds + \int_{\sqrt{n}}^{4\sqrt{n}/t} \frac{1}{s}\,ds\Big) \le t\sqrt{n}\Big(1 + \ln\Big(\frac{4}{t}\Big)\Big),$$
where we used the fact that $\frac{s^n}{(1+s^2)^{\frac{n+1}{2}}}$ is maximized when $s = \sqrt{n}$, and is always $\le \frac{1}{s}$.
For the second term, note that $2B_X \subseteq x + 3B_X$, so
$$(3.17) \lesssim \int_{\mathbb{R}^n\setminus 2B_X} P_t(y)\,dy \le \frac{t\sqrt{n}}{2},$$
where we used the tail bound of Fact 1 (Proposition 3.3.6). Adding these two bounds gives the result.
Now that we have these three facts, we can finish the proof of the lemma.
Proof of Lemma 3.3.4.
Claim 3.3.9 ($P_t * F$ separates points). Let $\xi = CD\sqrt{n}\,t\log(3/t)$. Let $w, y \in \frac{1}{2}B_X$ with $\|w - y\|_X \ge \xi$. Then
$$\|P_t * F(w) - P_t * F(y)\|_Y \gtrsim \frac{\|w - y\|_X}{D}.$$
We do not claim at all that these numbers are sharp.
Proof. We can find a point $p \in \mathcal{N}_\delta$ which is within $\delta$ of $w$, and $q \in \mathcal{N}_\delta$ which is within $\delta$ of $y$. We also know that $F$ is close to $f$, and we can use Fact 3 (Pr. 3.3.8) to bound the error of convolving.

By the triangle inequality,
\begin{align*}
\|P_t * F(w) - P_t * F(y)\|_Y
&\ge \|f(p) - f(q)\| - \|F(p) - f(p)\| - \|F(q) - f(q)\| \\
&\quad - \|F(w) - F(p)\| - \|F(y) - F(q)\| \\
&\quad - \|P_t * F(w) - F(w)\| - \|P_t * F(y) - F(y)\| \\
&\ge \frac{\|p - q\|}{D} - 2n\delta - O(\delta) - \sqrt{n}\, t \log(3/t) \\
&\ge \frac{\|y - w\| - 2\delta}{D} - 2n\delta - O(\delta) - \sqrt{n}\, t \log(3/t).
\end{align*}
Taking $\varepsilon = cD\sqrt{n}\, t \log(3/t)$, we find this is at least a constant times $\|y - w\|/D$; this is exactly why we need $\varepsilon = cD\sqrt{n}\, t \log(3/t)$. Note that $\sqrt{n}\, t \log(3/t)$ is the largest error term, because by the assumptions in the lemma it is greater than $2n\delta$. $\square$
Now consider $\|P_t * F(z + \varepsilon\sigma) - P_t * F(z)\|$. By the claim, if $z \in \frac14 B_X$, then we know
\begin{align}
\frac{\varepsilon}{D} &\lesssim \|P_t * F(z + \varepsilon\sigma) - P_t * F(z)\| \tag{3.18}\\
&\le \int_0^\varepsilon \|\partial_\sigma (P_t * F)(z + r\sigma)\|_Y\, dr. \tag{3.19}
\end{align}
Suppose $\|x - y\| \le \frac14$ (we will apply (3.19) with $z = x - y$), $x \in \frac12 B_X$, $y \in \frac18 B_X$. By throwing away part of the integral,
\begin{align}
\frac{1}{\varepsilon} \int_0^\varepsilon \int_{\mathbb{R}^n} \|\partial_\sigma(P_t * F)(x - y + r\sigma)\|_Y\, P_{st}(y)\, dy\, dr
&\ge \frac{1}{\varepsilon} \int_0^\varepsilon \int_{\frac18 B_X} \|\partial_\sigma(P_t * F)(x - y + r\sigma)\|_Y\, P_{st}(y)\, dy\, dr \tag{3.21}\\
&= \int_{\frac18 B_X} \Big( \frac{1}{\varepsilon} \int_0^\varepsilon \|\partial_\sigma(P_t * F)(x - y + r\sigma)\|_Y\, dr \Big)\, P_{st}(y)\, dy \quad \text{by Fubini} \tag{3.22}\\
&\gtrsim \frac{1}{D} \int_{\frac18 B_X} P_{st}(y)\, dy \quad \text{by (3.19)} \tag{3.23}\\
&\ge \frac{c}{D} \tag{3.24}
\end{align}
for some constant $c$. The calculation above says that the average length in every direction is big, and if you average the average length again, it is still big. How do we get rid of the additional averaging? We care about
\begin{align*}
\|\partial_\sigma(P_t * F)\| * P_{st}(x) &= \int_{\mathbb{R}^n} \|\partial_\sigma(P_t * F)(x - y)\|_Y\, P_{st}(y)\, dy \\
&= \frac{1}{\varepsilon} \int_0^\varepsilon \Big( \int_{\mathbb{R}^n} \|\partial_\sigma(P_t * F)(x - y + r\sigma)\|_Y\, P_{st}(y)\, dy \Big)\, dr \\
&\quad - \frac{1}{\varepsilon} \int_0^\varepsilon \int_{\mathbb{R}^n} \|\partial_\sigma(P_t * F)(x - y)\|_Y\, \big( P_{st}(y + r\sigma) - P_{st}(y) \big)\, dy\, dr \\
&> \frac{c}{D} - \frac{c'}{\varepsilon} \int_0^\varepsilon \int_{\mathbb{R}^n} |P_{st}(y + r\sigma) - P_{st}(y)|\, dy\, dr && \text{by (3.24) and } \|\partial_\sigma(P_t * F)\|_Y \lesssim \|F\|_{\mathrm{Lip}} \lesssim 1 \\
&\ge \frac{c}{D} - \frac{c'}{\varepsilon} \int_0^\varepsilon \frac{\sqrt{n}\, \|r\sigma\|_2}{st}\, dr && \text{by Fact 2 (Pr. 3.3.7).}
\end{align*}
This error term is
\[
\frac{c'}{\varepsilon} \int_0^\varepsilon \frac{\sqrt{n}\, \|r\sigma\|_2}{st}\, dr
= \frac{c'\sqrt{n}\, \|\sigma\|_2\, \varepsilon}{2st}
= O\Big( \frac{1}{D} \Big)
\]
by (3.2), so $\|\partial_\sigma(P_t * F)\| * P_{st}(x) = \Omega\big( \frac{1}{D} \big)$ (if we chose the constants correctly), as desired. [Scribe: is there an extra factor of $\sqrt{n}$?]

To summarize: you compute the error so that you have distance $\varepsilon$, then you look at the average derivative along each such line. The average derivative is roughly $\|\partial_\sigma(P_t * F)\| * P_{st}(x)$: averaging the derivative over a slightly bigger ball is the same, up to constants, as taking a random starting point and averaging over the ball. It is not unreasonable that taking a starting point, going in direction $\sigma$, and averaging over the bigger ball is equivalent to randomizing over starting points.
3/21: A better bound in Bourgain's discretization theorem
4 Upper bound on $\delta_{X\hookrightarrow Y}$

We give an example showing that the discretization bound $\delta$ must go to $0$ with $n$.

Theorem 3.4.1. There exist Banach spaces $X, Y$ with $\dim X = n$ such that if $\delta < 1$ and $\mathcal{N}_\delta$ is a $\delta$-net in the unit ball with $C_Y(\mathcal{N}_\delta) \gtrsim C_Y(X)$, then $\delta \lesssim \frac{1}{n}$.

Together with the lower bound we proved, this gives
\[
\frac{1}{e^{n^{Cn}}} < \delta_{X \hookrightarrow Y}(1/2) < \frac{1}{n}.
\]
The lower bound would be most interesting to improve.
The example will be $X = \ell_1^n$, $Y = \ell_2$.
We need two ingredients.
Firstly, we show that $(L_1, \sqrt{\|x - y\|_1})$ is isometric to a subset of $L_2$. More generally, we prove the following.

Lemma 3.4.2. Given a measure space $(\Omega, \mu)$, the metric space $(L_1(\mu), \sqrt{\|x - y\|_1})$ is isometric to a subset of $L_2(\Omega \times \mathbb{R}, \mu \times \lambda)$, where $\lambda$ is the Lebesgue measure on $\mathbb{R}$.

Note that all separable infinite-dimensional Hilbert spaces are the same, so $L_2(\Omega \times \mathbb{R}, \mu \times \lambda)$ is the same space whenever it is separable.

Proof. First we define $T: L_1(\mu) \to L_2(\mu \times \lambda)$ as follows. For $f \in L_1(\mu)$, let
\[
T_f(\omega, x) = \begin{cases}
1 & 0 \le x \le f(\omega) \\
-1 & f(\omega) \le x \le 0 \\
0 & \text{otherwise.}
\end{cases}
\]
Consider two functions $f_1, f_2 \in L_1(\mu)$. The function $|T_{f_1} - T_{f_2}|$ is the indicator of the region between the graph of $f_1$ and the graph of $f_2$.
Thus the $L_2$ norm is, letting $\mathbf{1}_{f_1(\omega), f_2(\omega)}$ be the signed indicator between the graphs,
\begin{align*}
\|T_{f_1} - T_{f_2}\|_{L_2(\mu \times \lambda)}
&= \left( \int_\Omega \int_{\mathbb{R}} \big( \mathbf{1}_{f_1(\omega), f_2(\omega)}(x) \big)^2\, dx\, d\mu \right)^{1/2} \\
&= \left( \int_\Omega |f_1(\omega) - f_2(\omega)|\, d\mu(\omega) \right)^{1/2}. \qquad\square
\end{align*}
More generally, if $p \le q < \infty$, then $(L_p, \|x - y\|_p^{p/q})$ is isometric to a subset of $L_q$. We may prove this later if we need it.
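Lemma 3.4.2 can be sanity-checked numerically on a purely atomic measure space. The sketch below (my own illustration, not from the lecture) discretizes the real line, builds the signed indicators $T_f$, and compares the $L_2$ distance with the square root of the $L_1$ distance; the grid size is an arbitrary choice.

```python
import numpy as np

def T(f_vals, xs):
    # Signed indicator T_f(omega, x): +1 where 0 <= x <= f(omega),
    # -1 where f(omega) <= x <= 0, and 0 otherwise.
    F = f_vals[:, None]
    X = xs[None, :]
    return np.where((X >= 0) & (X <= F), 1.0, 0.0) - np.where((X <= 0) & (X >= F), 1.0, 0.0)

rng = np.random.default_rng(0)
m = 50                                  # atoms of a uniform probability measure mu
xs = np.linspace(-5, 5, 20001)          # discretization of the real line
dx = xs[1] - xs[0]
f1, f2 = rng.uniform(-3, 3, m), rng.uniform(-3, 3, m)

# ||T_{f1} - T_{f2}||_{L2(mu x lambda)} vs sqrt(||f1 - f2||_{L1(mu)})
l2_dist = np.sqrt(np.sum((T(f1, xs) - T(f2, xs)) ** 2) * dx / m)
snowflake = np.sqrt(np.mean(np.abs(f1, f2) if False else np.abs(f1 - f2)))
assert abs(l2_dist - snowflake) < 1e-2
```

The agreement up to discretization error reflects that $|T_{f_1} - T_{f_2}|^2$ is exactly the indicator of the region between the two graphs.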
The second ingredient is due to Enflo.
Theorem 3.4.3 (Enflo, 1969). The best possible embedding of the hypercube into Hilbert space has distortion $\sqrt{n}$:
\[
C_2(\{0,1\}^n, \|\cdot\|_1) = \sqrt{n}.
\]
Note that we don't just calculate $C_2$ up to a constant here; we know it exactly. This is very rare.
Proof. We need to show an upper bound and a lower bound.
1. To show the upper bound, we show the identity mapping $\{0,1\}^n \to \ell_2^n$ has distortion $\sqrt{n}$. We have
\[
\|x - y\|_2 = \Big( \sum_{i=1}^n |x_i - y_i|^2 \Big)^{1/2} = \Big( \sum_{i=1}^n |x_i - y_i| \Big)^{1/2} = \sqrt{\|x - y\|_1},
\]
since the entries of $x, y$ are only $0, 1$. Then
\[
\frac{1}{\sqrt{n}} \|x - y\|_1 \le \|x - y\|_2 = \sqrt{\|x - y\|_1} \le \|x - y\|_1.
\]
Thus $C_2(\{0,1\}^n) \le \sqrt{n}$.

(So this theorem tells us that if you want to represent the Boolean hypercube in Euclidean space, nothing does better than the identity mapping.)
2. For the lower bound $C_2(\{0,1\}^n, \|\cdot\|_1) \ge \sqrt{n}$, we use the following.

Lemma 3.4.4 (Enflo's inequality). For every $f: \{0,1\}^n \to \ell_2$,
\[
\sum_{x \in \mathbb{F}_2^n} \|f(x) - f(x + e)\|_2^2 \le \sum_{j=1}^n \sum_{x \in \mathbb{F}_2^n} \|f(x + e_j) - f(x)\|_2^2,
\]
where $e = (1, \dots, 1)$ and the $e_j$ are the standard basis vectors of $\mathbb{R}^n$.

Here, addition is over $\mathbb{F}_2^n$.
This is specific to β2. See the next subsection for the proof.
In other words, Hilbert space is of Enflo-type 2 (Definition 1.1.13): the sum of thesquares of the lengths of the diagonals is at most the sum of the squares of the lengthsof the edges.
Suppose
\[
\frac{1}{D} \|x - y\|_1 \le \|f(x) - f(y)\|_2 \le \|x - y\|_1.
\]
Our goal is to show that $D \ge \sqrt{n}$. Letting $y = x + e$,
\[
\|f(x + e) - f(x)\|_2 \ge \frac{1}{D} \|(x + e) - x\|_1 = \frac{n}{D}.
\]
Plugging into Enflo's inequality 3.4.4, we have, since there are $2^n$ points in the space,
\[
2^n (n/D)^2 \le n \cdot 2^n \cdot 1^2 \implies D^2 \ge n,
\]
as desired. $\square$
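Enflo's inequality (Lemma 3.4.4) holds for arbitrary maps of the cube into Hilbert space, so it is easy to test numerically; a minimal sketch (my own, not part of the notes), with dimensions chosen arbitrarily:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 3
cube = np.array(list(itertools.product([0, 1], repeat=n)))
f = {tuple(x): rng.normal(size=d) for x in cube}   # arbitrary f: {0,1}^n -> R^d

def flip(x, js):
    # Flip the coordinates of x indexed by js (addition over F_2^n).
    y = np.array(x)
    y[list(js)] ^= 1
    return tuple(y)

# Sum of squared diagonals vs. sum of squared edges.
diag = sum(np.sum((f[tuple(x)] - f[flip(x, range(n))]) ** 2) for x in cube)
edges = sum(np.sum((f[tuple(x)] - f[flip(x, [j])]) ** 2)
            for x in cube for j in range(n))
assert diag <= edges + 1e-9
```

Running this with any random seed never violates the inequality, as the Fourier-analytic proof below explains.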
Proof of Theorem 3.4.1. Let $\mathcal{N}_\delta$ be a $\delta$-net in $B_{\ell_1^n}$. Let $f: \ell_1^n \to L_2$ satisfy $\|f(x) - f(y)\|_2 = \sqrt{\|x - y\|_1}$. Then $f$ restricted to $\mathcal{N}_\delta$ has distortion $\le \sqrt{2/\delta}$, because for $x, y \in \mathcal{N}_\delta$ with $x \ne y$ we have
\[
\frac{1}{\sqrt{2}} \|x - y\|_1 \le \sqrt{\|x - y\|_1} \le \frac{1}{\sqrt{\delta}} \|x - y\|_1.
\]
However, the distortion of $\ell_1^n$ in $\ell_2$ is $\sqrt{n}$. The condition $C_Y(\mathcal{N}_\delta) \gtrsim C_Y(X)$ means that for some constant $K$,
\[
\sqrt{\frac{2}{\delta}} \ge K\sqrt{n} \implies \delta \le \frac{2}{K^2 n}. \qquad\square
\]
4.1 Fourier analysis on the Boolean cube and Enfloβs inequality
We can prove this by induction on $n$ (exercise). Instead, I will show a proof that is Fourier-analytic and which generalizes to other situations.
Definition 3.4.5 (Walsh function): Let $A \subseteq \{1, \dots, n\}$. Define the Walsh function $W_A: \mathbb{F}_2^n \to \{\pm 1\}$ by
\[
W_A(x) = (-1)^{\sum_{j \in A} x_j}.
\]

Proposition 3.4.6 (Orthonormality): For $A, B \subseteq \{1, \dots, n\}$,
\[
\mathbb{E}_{x \in \mathbb{F}_2^n}\, W_A(x) W_B(x) = \delta_{AB}.
\]
Thus, $\{W_A\}_{A \subseteq \{1,\dots,n\}}$ is an orthonormal basis of $L_2(\mathbb{F}_2^n)$.

Proof. For $A = B$ the product is identically $1$. If $A \ne B$, take $j \in A \triangle B$; averaging over $x_j$ gives
\[
\mathbb{E}\, (-1)^{\sum_{i \in A} x_i + \sum_{i \in B} x_i} = 0. \qquad\square
\]
Corollary 3.4.7. Let $f: \mathbb{F}_2^n \to X$ where $X$ is a Banach space. Then
\[
f = \sum_{A \subseteq \{1,\dots,n\}} \hat{f}(A)\, W_A,
\]
where $\hat{f}(A) = \frac{1}{2^n} \sum_{x \in \mathbb{F}_2^n} f(x) (-1)^{\sum_{j \in A} x_j}$.

Proof. How do we prove two vectors are the same in a Banach space? It is enough to show that composition with any linear functional gives the same value; thus it suffices to prove the claim for $X = \mathbb{R}$. The case $X = \mathbb{R}$ holds by orthonormality. $\square$
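Both the orthonormality of the Walsh functions and the expansion of Corollary 3.4.7 can be verified directly on a small cube; a sketch (my own illustration, not from the notes):

```python
import itertools
import numpy as np

n = 3
pts = list(itertools.product([0, 1], repeat=n))                  # F_2^n
subsets = [A for r in range(n + 1) for A in itertools.combinations(range(n), r)]

def W(A, x):
    # Walsh function W_A(x) = (-1)^{sum_{j in A} x_j}
    return (-1) ** sum(x[j] for j in A)

rng = np.random.default_rng(2)
f = {x: rng.normal() for x in pts}                               # f: F_2^n -> R

# Orthonormality: E[W_A W_B] = delta_{AB}
for A in subsets:
    for B in subsets:
        inner = np.mean([W(A, x) * W(B, x) for x in pts])
        assert abs(inner - (1.0 if A == B else 0.0)) < 1e-12

# Fourier expansion: f(x) = sum_A fhat(A) W_A(x)
fhat = {A: np.mean([f[x] * W(A, x) for x in pts]) for A in subsets}
for x in pts:
    assert abs(f[x] - sum(fhat[A] * W(A, x) for A in subsets)) < 1e-9
```

For vector-valued $f$ the same check applies coordinatewise, which is exactly the reduction used in the proof.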
Proof of Lemma 3.4.4. It is sufficient to prove the claim for $X = \mathbb{R}$; we get the inequality for $\ell_2$ by summing over coordinates. (Note this is a luxury specific to $\ell_2$.)

The summand of the LHS of the inequality is
\begin{align*}
f(x) - f(x + e) &= \sum_{A \subseteq \{1,\dots,n\}} \hat{f}(A)\, \big( W_A(x) - W_A(x + e) \big) \\
&= \sum_{A \subseteq \{1,\dots,n\}} \hat{f}(A)\, W_A(x) \big( 1 - (-1)^{|A|} \big)
= \sum_{\substack{A \subseteq \{1,\dots,n\} \\ |A| \text{ odd}}} 2 \hat{f}(A)\, W_A(x).
\end{align*}
The summand of the RHS of the inequality is
\[
f(x) - f(x + e_j) = \sum_{A \subseteq \{1,\dots,n\}} \hat{f}(A)\, \big( W_A(x) - W_A(x + e_j) \big) = \sum_{A \ni j} 2 \hat{f}(A)\, W_A(x).
\]
Summing gives
\begin{align*}
\sum_{x \in \mathbb{F}_2^n} (f(x) - f(x + e))^2 &= 2^n \Big\| \sum_{|A| \text{ odd}} 2 \hat{f}(A) W_A \Big\|_{L_2(\mathbb{F}_2^n)}^2 = 2^n \sum_{|A| \text{ odd}} 4\, \hat{f}(A)^2, \\
\sum_{j=1}^n \sum_{x \in \mathbb{F}_2^n} (f(x + e_j) - f(x))^2 &= \sum_{j=1}^n 2^n \sum_{A \ni j} 4\, \hat{f}(A)^2 = 2^n \sum_A \sum_{j \in A} 4\, \hat{f}(A)^2 = 2^n \sum_A 4\, |A|\, \hat{f}(A)^2.
\end{align*}
From this we see that the claim is equivalent to
\[
\sum_{\substack{A \subseteq \{1,\dots,n\} \\ |A| \text{ odd}}} \hat{f}(A)^2 \le \sum_{A \subseteq \{1,\dots,n\}} |A|\, \hat{f}(A)^2,
\]
which is trivially true. $\square$

This inequality invites improvements, as there is a large gap; the Fourier-analytic proof made this gap clear.
5 Improvement for πΏπ
Theorem 3.5.1. Suppose $p \ge 1$ and $X$ is an $n$-dimensional normed space. Then
\[
\delta_{X \hookrightarrow L_p}(\varepsilon) \gtrsim \frac{\varepsilon^2}{n^{5/2}}.
\]
In the world of embeddings into $L_p$, we are in the right ballpark for the value of $\delta$: we know it is a power of $n$, if not exactly $1/n$. (It is an open question what the right power is.) This is a special case of a more general theorem, which I won't state. This proof doesn't use many properties of $L_p$.

Proof. This follows from Theorem 3.5.2 and Theorem 3.5.3. $\square$

We will prove the theorem for $\varepsilon = \frac12$. The proof needs to be modified for the $\varepsilon$-version.
Theorem 3.5.2. Let $X, Y$ be Banach spaces, with $\dim X = n < \infty$. Suppose that there exists a universal constant $c$ such that $\delta \le \frac{c\varepsilon^2}{n^{5/2}}$, and that $\mathcal{N}_\delta$ is a $\delta$-net of $B_X$.³ Then there exist a separable probability space $(\Omega, \mu)$,⁴ a finite-dimensional subspace $Z \subseteq Y$, and a linear operator $T: X \to L_\infty(\mu, Z)$ such that for every $x \in X$ with $\|x\|_X = 1$,
\[
\frac{1 - \varepsilon}{D} \le \int_\Omega \|Tx(\omega)\|_Y\, d\mu(\omega) \le \operatorname*{ess\,sup}_{\omega \in \Omega} \|(Tx)(\omega)\|_Y \le 1 + \varepsilon.
\]
Note that the inequality can also be written as
\[
\frac{1 - \varepsilon}{D} \le \|Tx\|_{L_1(\mu, Z)} \le \|Tx\|_{L_\infty(\mu, Z)} \le 1 + \varepsilon.
\]
Now what happens when $Y = L_p([0,1])$? We have
\[
L_p(\mu, Z) \subseteq L_p(\mu, Y) = L_p(\mu, L_p) = L_p(\mu \times \lambda).
\]
A different way to think of this is that a function in $L_p(\mu, L_p([0,1]))$ can be thought of as a function of $\omega \in \Omega$ and $s \in [0,1]$, i.e. a function $\Omega \times [0,1] \to \mathbb{R}$.
This is a very classical fact in measure theory:

Theorem 3.5.3 (Kolmogorov's representation theorem). Any separable $L_p(\mu)$ space is isometric to a subset of $L_p$.

This is an immediate corollary of the fact that any separable probability space admits a measure-preserving isomorphism $(\Omega, \mu) \cong ([0,1], \lambda)$, where $\lambda$ is Lebesgue measure, provided the probability measure is atom-free. In general, the possible cases are

• $(\Omega, \mu) \cong ([0,1], \lambda)$ if it is atom-free;

• $(\Omega, \mu) \cong [0,1] \sqcup \{1, \dots, m\}$ if there are finitely many ($m$) atoms;

• $(\Omega, \mu) \cong [0,1] \sqcup \mathbb{N}$ if there are countably many atoms;

• $(\Omega, \mu) \cong \{1, \dots, m\}$ or $\mathbb{N}$ if it is completely atomic.

See Halmos's book for details. This is just called Kolmogorov's representation theorem.

Now, for this to be an improvement of Bourgain's discretization theorem for $L_p$, we need that if $Y \subseteq L_p$ is a finite-dimensional subspace and $Z \subseteq L_\infty(\mu, Y)$ is a finite-dimensional subspace on which the $L_\infty$ and $L_1$ norms are equivalent (a very strong restriction), then $Z$ embeds into $Y$. That's the only property of $L_p$ we're using.

However, not every $Y$ has this property.
³Let me briefly say something about vector-valued $L_p$ spaces. Whenever you write $L_p(\mu, Y)$, this means all functions $f: \Omega \to Y$ such that $\big( \int \|f(\omega)\|_Y^p\, d\mu(\omega) \big)^{1/p} < \infty$ (the norm is finite).

⁴It will in the end be the uniform measure on half the ball.
Theorem 3.5.4 (Ostrovskii and Randrianantoanina). Not every $Y$ has this property.

You can construct spaces of functions, taking values in a finite-dimensional subspace of $Y$, which cannot be embedded back into $Y$. Nevertheless, you have to work to find bad counterexamples. This is not published yet.

Next class we'll prove Theorem 3.5.2, which implies our first theorem. What is the advantage of formulating the theorem in this way? We can use Bourgain's almost-extension theorem on the ball. The extended function is Lipschitz, so we can already differentiate it. The measure will live on the unit ball: we will look at the derivative of the extended function, at a point $x$, in every possible direction $y$; that is what our $T$ will be. Then the lower bound here says that the derivative of the function is invertible only on average, not at every point of the sphere. We will work for that, but you see the advantage: in this formulation we only need an average lower bound, not an everywhere lower bound, and in this way we will be able to shave off two exponents. This is not for Banach spaces in general; the advantage we have here is that $L_p$ of $L_p$ is $L_p$.
3/23: We will prove the improvement of Bourgainβs Discretization Theorem.
Proof of Theorem 3.5.2. There exists $f: \mathcal{N}_\delta \to Y$ such that
\[
\frac{1}{D} \|x - y\|_X \le \|f(x) - f(y)\|_Y \le \|x - y\|_X
\]
for $x, y \in \mathcal{N}_\delta$. By Bourgain's almost extension theorem, letting $Y = \operatorname{span}(f(\mathcal{N}_\delta))$, there exists $F: X \to Y$ such that

1. $\|F\|_{\mathrm{Lip}} \lesssim 1$;

2. for every $x \in \mathcal{N}_\delta$, $\|F(x) - f(x)\| \le n\delta$;

3. $F$ is smooth.

Let $\Omega = \frac12 B_X$, with $\mu$ the normalized Lebesgue measure on $\Omega$:
\[
\mu(A) = \frac{\operatorname{vol}\big( A \cap \frac12 B_X \big)}{\operatorname{vol}\big( \frac12 B_X \big)}.
\]
Define $T: X \to L_\infty(\mu, Y)$ as follows. For all $y \in X$,
\[
(Ty)(x) = F'(x)(y) = \partial_y F(x).
\]
This function is in $L_\infty$ because
\[
\|(Ty)(x)\| = \|F'(x)(y)\| \le \|F'(x)\|\, \|y\| \lesssim \|y\|.
\]
This proves the upper bound.
We need
\[
\frac{1}{D} \|y\| \lesssim \|Ty\|_{L_1(\mu, Y)} = \int_{\frac12 B_X} \|F'(x) y\|\, d\mu(x) = \frac{1}{\operatorname{vol}(\frac12 B_X)} \int_{\frac12 B_X} \|F'(x) y\|\, dx.
\]
We need two simple and general lemmas.
Lemma 3.5.5. There exists an isometric embedding π½ : π β ββ.
This is a restatement of the Hahn-Banach theorem.
Proof. $X^*$ is separable, so let $\{x_k^*\}_{k=1}^\infty$ be a set of linear functionals that is dense in $S_{X^*} = \partial B_{X^*}$. Let
\[
Jx = (x_k^*(x))_{k=1}^\infty \in \ell_\infty.
\]
For all $x$, $|x_k^*(x)| \le \|x_k^*\|\, \|x\| = \|x\|$. By Hahn--Banach, since every $x$ admits a normalizing functional,
\[
\|Jx\|_{\ell_\infty} = \sup_k |x_k^*(x)| = \sup_{x^* \in S_{X^*}} |x^*(x)| = \|x\|. \qquad\square
\]
This may seem a triviality, but there are four situations where this theorem will make its appearance.

Remark 3.5.6: Every separable metric space $(X, d)$ admits an isometric embedding into $\ell_\infty$, given as follows. Take $\{x_k\}_{k=1}^\infty$ dense in $X$, and let
\[
f(x) = \big( d(x, x_k) - d(x_k, x_0) \big)_{k=1}^\infty.
\]
The proof is left as an exercise.
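The exercise can be checked numerically on a finite metric space, where the space itself serves as the dense sequence; a sketch (my own, with an arbitrary random point set):

```python
import numpy as np

# Finite metric space: random points in the plane with Euclidean distance.
rng = np.random.default_rng(3)
pts = rng.normal(size=(30, 2))
d = lambda a, b: float(np.linalg.norm(pts[a] - pts[b]))

# Frechet-style embedding into l_inf: f(a)_k = d(a, x_k) - d(x_k, x_0).
def emb(a):
    return np.array([d(a, k) - d(k, 0) for k in range(len(pts))])

# The sup-distance between images recovers the metric exactly:
# |d(a,k) - d(b,k)| <= d(a,b) by the triangle inequality, with equality at k = b.
for a in range(len(pts)):
    for b in range(len(pts)):
        linf = np.max(np.abs(emb(a) - emb(b)))
        assert abs(linf - d(a, b)) < 1e-9
```

The comment in the loop is essentially the whole proof of the remark.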
Lemma 3.5.7 (Nonlinear Hahn--Banach theorem). Let $(X, d)$ be any metric space and $A \subseteq X$ any nonempty subset. If $f: A \to \mathbb{R}$ is $L$-Lipschitz, then there exists $F: X \to \mathbb{R}$ that extends $f$ with
\[
\|F\|_{\mathrm{Lip}} = L = \|f\|_{\mathrm{Lip}}.
\]
The lemma says we can always extend the function and not lose anything in the Lipschitz constant. Recall the Hahn--Banach theorem says that we can extend functionals from a subspace and preserve the norm. Extension in vector spaces and extension in $\mathbb{R}$ are different.
This lemma seems general, but it is very useful.
We mimic the proof of the Hahn-Banach Theorem.
Proof. We will prove that one can extend $f$ to one additional point while preserving the Lipschitz constant; then use Zorn's Lemma on the poset of all consistent extensions to supersets, ordered by inclusion, to finish.

Let $A \subseteq X$, $x \in X \setminus A$. We must define $t = F(x) \in \mathbb{R}$. To make $F$ Lipschitz, for all $a \in A$ we need
\[
|t - f(a)| \le L\, d(x, a),
\]
that is,
\[
t \in \bigcap_{a \in A} [f(a) - L\, d(x, a),\ f(a) + L\, d(x, a)].
\]
The existence of $t$ is equivalent to
\[
\bigcap_{a \in A} [f(a) - L\, d(x, a),\ f(a) + L\, d(x, a)] \ne \emptyset.
\]
By compactness it is enough to check this for finite intersections, and because we are on the real line (which is totally ordered), it is in fact enough to check pairwise intersections:
\[
[f(a) - L\, d(x, a),\ f(a) + L\, d(x, a)] \cap [f(b) - L\, d(x, b),\ f(b) + L\, d(x, b)] \ne \emptyset.
\]
We need to check
\[
f(b) + L\, d(x, b) \ge f(a) - L\, d(x, a) \iff |f(a) - f(b)| \le L\, (d(x, a) + d(x, b)).
\]
This is true by the triangle inequality:
\[
|f(a) - f(b)| \le L\, d(a, b) \le L\, (d(x, a) + d(x, b)). \qquad\square
\]
This theorem is used all over the place. Most of the time people just give a closed formula: define $F$ on all the points at once by
\[
F(x) = \inf \{ f(a) + L\, d(x, a) : a \in A \},
\]
and check directly that this is $L$-Lipschitz. From our proof you see exactly why they define $F$ this way: it comes from taking the infimum of the upper endpoints of the intervals (we can also take the supremum of the lower endpoints). The extension is not unique; there are many formulas for the extension.
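The closed formula is easy to test on a finite example; a sketch (my own, with an arbitrary sample function and planar points):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.uniform(-1, 1, size=(20, 2))            # the subset A of the plane
fA = np.sin(3 * A[:, 0]) + A[:, 1]              # some function defined on A
d = lambda x, y: float(np.linalg.norm(x - y))

# Lipschitz constant of f on A
L = max(abs(fA[i] - fA[j]) / d(A[i], A[j])
        for i in range(len(A)) for j in range(len(A)) if i != j)

def F(x):
    # Inf-convolution extension: F(x) = min_a ( f(a) + L d(x, a) )
    return min(fA[i] + L * d(x, A[i]) for i in range(len(A)))

# F extends f on A ...
for i in range(len(A)):
    assert abs(F(A[i]) - fA[i]) < 1e-9
# ... and is L-Lipschitz at random test points
X = rng.uniform(-2, 2, size=(50, 2))
for i in range(len(X)):
    for j in range(len(X)):
        assert abs(F(X[i]) - F(X[j])) <= L * d(X[i], X[j]) + 1e-9
```

The extension property holds because the minimum over $a$ is attained at $a = x$ when $x \in A$; the Lipschitz bound holds because a minimum of $L$-Lipschitz functions is $L$-Lipschitz.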
There is a whole isometric theory about exact extensions: among all possible extensions, what is the best one? The formula above is the pointwise smallest extension; the other one is the pointwise largest. The theory of absolutely minimizing extensions is related to an interesting nonlinear PDE called the infinity Laplacian. This is different from what we're doing, because we lose constants in many places.
Corollary 3.5.8. Let $(X, d)$ be any metric space, $A \subseteq X$ a nonempty subset, and $f: A \to \ell_\infty$ Lipschitz. Then there exists $F: X \to \ell_\infty$ that extends $f$ with $\|F\|_{\mathrm{Lip}} = \|f\|_{\mathrm{Lip}}$.

Proof. Note that saying $f: A \to \ell_\infty$, $f(a) = (f_1(a), f_2(a), \dots)$ has $\|f\|_{\mathrm{Lip}} = 1$ means that each $f_k$ is $1$-Lipschitz; extend each coordinate separately using Lemma 3.5.7. $\square$
The map $f$ takes $\mathcal{N}_\delta \subseteq B_X$ to $f(\mathcal{N}_\delta) \subseteq Y$; $f^{-1}$ takes it back, and $J$ is the isometric embedding of $X$ into $\ell_\infty$:
\[
f(\mathcal{N}_\delta) \xrightarrow{\ f^{-1}|_{f(\mathcal{N}_\delta)}\ } \mathcal{N}_\delta \xrightarrow{\ J\ } \ell_\infty.
\]
Note $J \circ f^{-1}|_{f(\mathcal{N}_\delta)}$ is $D$-Lipschitz. By nonlinear Hahn--Banach (Corollary 3.5.8), there exists $G: Y \to \ell_\infty$ with $\|G\|_{\mathrm{Lip}} \le D$ such that
\[
G(f(x)) = J(x)
\]
for all $x \in \mathcal{N}_\delta$.

We want $G$ to be differentiable. We achieve this by convolving with a smooth bump function with small support, to get $H: Y \to \ell_\infty$ such that

1. $H$ is smooth;

2. $\|H\|_{\mathrm{Lip}} \le D$;

3. for all $x \in F(B_X)$, $\|G(x) - H(x)\| \le nD\delta$.
Define a linear operator $S: L_1(\mu, Y) \to \ell_\infty$ by setting, for all $h \in L_1(\mu, Y)$, $h: \frac12 B_X \to Y$,
\[
Sh = \int_{\frac12 B_X} H'(F(x))(h(x))\, d\mu(x).
\]
(Type checking: $F(x)$ is a point in $Y$; we can differentiate $H$ at this point and get a linear operator $Y \to \ell_\infty$. $h(x)$ is a point in $Y$, so we can plug it into this linear operator and get a point in $\ell_\infty$. This is a vector-valued integration; do the integration coordinatewise. So $Sh$ is in $\ell_\infty$.)
Now
\begin{align*}
\|Sh\|_{\ell_\infty} &\le \int_{\frac12 B_X} \|H'(F(x))(h(x))\|_{\ell_\infty}\, d\mu(x)
\le \int_{\frac12 B_X} \underbrace{\|H'(F(x))\|_{Y \to \ell_\infty}}_{\le D}\, \|h(x)\|\, d\mu(x) \\
&\le D \int_{\frac12 B_X} \|h(x)\|\, d\mu(x) = D\, \|h\|_{L_1(\mu, Y)},
\end{align*}
so $\|S\|_{L_1(\mu, Y) \to \ell_\infty} \le D$. Also,
\[
S(Ty) = \int_{\frac12 B_X} H'(F(x))((Ty)(x))\, d\mu(x)
= \int_{\frac12 B_X} H'(F(x))(F'(x)(y))\, d\mu(x)
= \int_{\frac12 B_X} (H \circ F)'(x)(y)\, d\mu(x),
\]
using the chain rule in the last step.

Now we show $H \circ F$ is very close to $J$, hence close to being invertible. (Recall $G(f(x)) = Jx$.) This does not a priori mean the derivative is close: we use a geometric argument to say that if a function is sufficiently close to being an isometry, then the average of its derivative over a finite-dimensional ball is close to being invertible. $H \circ F$ is a proxy for $J$.
Check that $H \circ F$ is close to $J$. For $y \in \mathcal{N}_\delta$, by the choice of $H$,
\begin{align*}
\|H(F(y)) - Jy\|_{\ell_\infty} &\le \|H(F(y)) - G(F(y))\|_{\ell_\infty} + \|G(F(y)) - G(f(y))\|_{\ell_\infty} \\
&\le nD\delta + D\, \|F(y) - f(y)\|_Y \lesssim nD\delta.
\end{align*}
For general $x \in \frac12 B_X$, there exists $y \in \mathcal{N}_\delta$ such that $\|x - y\|_X \le 2\delta$. Then
\[
\|H(F(x)) - Jx\|_{\ell_\infty} \lesssim nD\delta + D\delta + \delta \lesssim nD\delta.
\]
So for all $x \in \frac12 B_X$, $\|H \circ F(x) - Jx\| \le C n D \delta$. Define $g(x) = H \circ F(x) - Jx$; then $\|g\|_{L_\infty(\frac12 B_X)} \le C n D \delta$.
We need a geometric lemma.
Lemma 3.5.9. Suppose $(Z, \|\cdot\|_Z)$ is any Banach space and $V = (\mathbb{R}^n, \|\cdot\|_V)$ is an $n$-dimensional Banach space. Let $g: B_V \to Z$ be continuous on $B_V$ and differentiable on $\operatorname{int}(B_V)$. Then
\[
\Big\| \frac{1}{\operatorname{vol}(B_V)} \int_{B_V} g'(u)\, du \Big\|_{V \to Z} \le n\, \|g\|_{L_\infty(\partial B_V)}.
\]
3/28: Finishing up Bourgain's theorem

Previously we had $X, Y$ Banach spaces with $\dim(X) = n$. Let $\mathcal{N}_\delta \subseteq B_X$ be a $\delta$-net, $\delta \le \varepsilon/(n^2 D)$, and $f: \mathcal{N}_\delta \to Y$ with
\[
\frac{1}{D} \|x - y\| \le \|f(x) - f(y)\| \le \|x - y\|.
\]
WLOG $Y = \operatorname{span}(f(\mathcal{N}_\delta))$. The almost extension theorem gave us a smooth map $F: X \to Y$ with $\|F\|_{\mathrm{Lip}} \lesssim 1$ and, for all $x \in \mathcal{N}_\delta$, $\|F(x) - f(x)\| \lesssim n\delta$. Letting $\mu$ be the normalized Lebesgue measure on $\frac12 B_X$, define $T: X \to L_\infty(\mu, Y)$ by
\[
(Ty)(x) = F'(x) y
\]
for all $y \in X$. The goal was to prove
\[
\|Ty\|_{L_1(\mu, Y)} \gtrsim \frac{1}{D} \|y\|,
\]
and the approach was to show, for $y \in X$, that the average $\frac{1}{\operatorname{vol}(\frac12 B_X)} \int_{\frac12 B_X} \|F'(x) y\|\, dx$ is big.

Fix a linear isometric embedding $J: X \to \ell_\infty$. We proved that there exists $G: Y \to \ell_\infty$ such that $G(f(x)) = J(x)$ for all $x \in \mathcal{N}_\delta$, with $\|G\|_{\mathrm{Lip}} \le D$. Fix $H: Y \to \ell_\infty$ smooth, with $\|H\|_{\mathrm{Lip}} \le D$ and $\|H(x) - G(x)\| \le nD\delta$ for all $x \in F(B_X)$.

Define $S: L_1(\mu, Y) \to \ell_\infty$ by
\[
Sh = \int_{\frac12 B_X} H'(F(x))(h(x))\, d\mu(x)
\]
for all $h \in L_1(\mu, Y)$. Note that $\|S\|_{L_1(\mu, Y) \to \ell_\infty} \le D$. By the chain rule, $STy = \int_{\frac12 B_X} (H \circ F)'(x) y\, d\mu(x)$. We checked that for every $x \in \frac12 B_X$, $\|H(F(x)) - Jx\|_{\ell_\infty} \lesssim nD\delta$.

(Suppose you are at a point of the net: then $H$ is very close to $G$, $F$ is $n\delta$-close to $f$, and $G$ is $D$-Lipschitz, which is how we get $n\delta D$. For a general point, find the closest net point and use that the functions are $D$-Lipschitz.)
This is where we stopped. Now how do we finish? We need a small geometric lemma.

Lemma 3.5.10 (Geometric lemma). Let $Z$ be a Banach space and $V = (\mathbb{R}^n, \|\cdot\|_V)$ an $n$-dimensional Banach space. Suppose that $g: B_V \to Z$ is continuous on $B_V$ and smooth on the interior. Then
\[
\Big\| \frac{1}{\operatorname{vol}(B_V)} \int_{B_V} g'(u)\, du \Big\|_{V \to Z} \le n\, \|g\|_{L_\infty(\partial B_V)},
\]
where the inside of the left-hand norm should be thought of as an operator: letting $R$ be the normalized integral operator,
\[
Rx = \frac{1}{\operatorname{vol}(B_V)} \int_{B_V} g'(u) x\, du.
\]
The way to think of using this: if you have an $L_\infty$ bound on the function, you get a bound on the averaged derivative. Assuming this lemma, let's apply it to $g = H \circ F - J$. We get
\[
\Big\| \int_{\frac12 B_X} \big( (H \circ F)'(x) - J \big)\, d\mu(x) \Big\|_{X \to \ell_\infty} \lesssim n^2 D \delta,
\]
that is, $\|ST - J\|_{X \to \ell_\infty} \lesssim n^2 D \delta$.

Now we want to bound $\|Ty\|_{L_1(\mu, Y)}$ from below. We have
\begin{align*}
\|Ty\|_{L_1(\mu, Y)} &\ge \frac{\|STy\|_{\ell_\infty}}{\|S\|_{L_1(\mu, Y) \to \ell_\infty}} \ge \frac{1}{D} \|STy\|_{\ell_\infty} \ge \frac{1}{D} \big( \|Jy\| - \|ST - J\|_{X \to \ell_\infty}\, \|y\| \big) \\
&= \frac{1}{D} \big( \|y\| - O(n^2 D \delta)\, \|y\| \big).
\end{align*}
There is a bit of magic in the punchline. We want to bound the operator from below, and we understand it as an average of derivatives. We succeeded in showing that the function itself is close to an isometry, and the geometric lemma converts a bound on the function into a bound on the averaged derivative.
Now letβs prove the lemma.
Proof of Lemma 3.5.10. Fix a direction $y \in \mathbb{R}^n$ and normalize so that $\|y\|_2 = 1$. For every $u \in \operatorname{Proj}_{y^\perp}(B_V)$, let $s_u \le t_u \in \mathbb{R}$ be such that $(u + \mathbb{R} y) \cap B_V = u + [s_u, t_u]\, y$ (this is the intersection of the line through $u$ in direction $y$ with the ball).

Using Fubini,
\begin{align*}
\Big\| \frac{1}{\operatorname{vol}(B_V)} \int_{B_V} g'(u)\, du\ y \Big\|_Z
&= \frac{1}{\operatorname{vol}(B_V)} \Big\| \int_{\operatorname{Proj}_{y^\perp}(B_V)} \int_{s_u}^{t_u} \frac{d}{ds}\, g(u + s y)\, ds\, du \Big\|_Z \\
&= \frac{1}{\operatorname{vol}(B_V)} \Big\| \int_{\operatorname{Proj}_{y^\perp}(B_V)} \big( g(u + t_u y) - g(u + s_u y) \big)\, du \Big\|_Z \\
&\le \frac{2\, \operatorname{vol}_{n-1}\big( \operatorname{Proj}_{y^\perp}(B_V) \big)}{\operatorname{vol}_n(B_V)}\, \|g\|_{L_\infty(\partial B_V,\, Z)}.
\end{align*}
We need to show that
\[
\frac{2\, \operatorname{vol}_{n-1}\big( \operatorname{Proj}_{y^\perp}(B_V) \big)}{\operatorname{vol}_n(B_V)} \le n\, \|y\|_V.
\]
The convex hull of $\frac{y}{\|y\|_V}$ with the projection is the cone over the projection:
\[
\operatorname{vol}_n\Big( \operatorname{conv}\Big( \Big\{ \frac{y}{\|y\|_V} \Big\} \cup \operatorname{Proj}_{y^\perp}(B_V) \Big) \Big) = \frac{1}{n\, \|y\|_V}\, \operatorname{vol}_{n-1}\big( \operatorname{Proj}_{y^\perp}(B_V) \big).
\]
Letting $K = \operatorname{conv}\big( \{ \pm \frac{y}{\|y\|_V} \} \cup \operatorname{Proj}_{y^\perp}(B_V) \big)$, the desired inequality is the same as saying that $\operatorname{vol}_n(K) \le \operatorname{vol}_n(B_V)$. Note that $K$ is the double cone; that is where the factor of $2$ got absorbed. This is an application of Fubini's theorem.
Geometrically, the idea is that when we look along the lines in direction $y$, whatever part of the double cone is not in the ball can be matched with a part of the ball of at least the same length. In formulas, for $u \in \operatorname{Proj}_{y^\perp}(B_V)$ let $m_u \ge 1$ be the largest multiple of $u$ that still lies in the projection (so $m_u u$ is on its boundary). Then
\[
K = \bigcup_{u \in \operatorname{Proj}_{y^\perp}(B_V)} \Big( u + \Big[ -\frac{m_u - 1}{m_u \|y\|_V},\ +\frac{m_u - 1}{m_u \|y\|_V} \Big]\, y \Big).
\]
We also have, by convexity,
\[
\frac{1}{m_u} \big( m_u u + t_{m_u u}\, y \big) \pm \Big( 1 - \frac{1}{m_u} \Big) \frac{y}{\|y\|_V} \in B_V,
\]
so on each such line the section of $B_V$ is at least as long as the section of $K$, and we get $\operatorname{vol}_n(K) \le \operatorname{vol}_n(B_V)$ by Fubini. $\square$
Thus, we've completed the proof of Bourgain's theorem for the semester. The big remaining question is whether we can do something like this in the general case.
6 Kirszbraunβs Extension Theorem
Last time we proved the nonlinear Hahn--Banach theorem. I want to prove one more Lipschitz extension theorem, which I can do in ten minutes, and which we will need later in the semester.

Theorem 3.6.1 (Kirszbraun's extension theorem (1934)). Let $H_1, H_2$ be Hilbert spaces and $A \subseteq H_1$. Let $f: A \to H_2$ be Lipschitz. Then there exists a function $F: H_1 \to H_2$ that extends $f$ and has the same Lipschitz constant ($\|F\|_{\mathrm{Lip}} = \|f\|_{\mathrm{Lip}}$).

We did this for real-valued functions and $\ell_\infty$-valued functions. This version is non-trivial, and relates to many open problems.
Proof. There is an equivalent geometric formulation. Let $H_1, H_2$ be Hilbert spaces, $\{x_i\}_{i \in I} \subseteq H_1$, $\{y_i\}_{i \in I} \subseteq H_2$, $\{r_i\}_{i \in I} \subseteq (0, \infty)$. Suppose that for all $i, j \in I$, $\|y_i - y_j\|_{H_2} \le \|x_i - x_j\|_{H_1}$. If
\[
\bigcap_{i \in I} B_{H_1}(x_i, r_i) \ne \emptyset,
\]
then
\[
\bigcap_{i \in I} B_{H_2}(y_i, r_i) \ne \emptyset
\]
as well.

Intuitively, this says the configuration of points in $H_2$ is a squeezed version of the configuration in $H_1$, and then we're saying something plausible: if there is some point lying in all the balls in $H_1$, then with the same radii the intersection in the squeezed version is also nonempty.

The geometric formulation implies the extension theorem. We have $f: A \to H_2$, WLOG $\|f\|_{\mathrm{Lip}} = 1$, so for all $a, b \in A$, $\|f(a) - f(b)\|_{H_2} \le \|a - b\|_{H_1}$. Fix any $x \in H_1 \setminus A$. What can we say about the intersection $\bigcap_{a \in A} B_{H_1}(a, \|a - x\|_{H_1})$? By design it is not empty, since $x$ is in it. So the conclusion of the geometric formulation says
\[
\bigcap_{a \in A} B_{H_2}\big( f(a),\, \|a - x\|_{H_1} \big) \ne \emptyset.
\]
So take some $y$ in this set. Then $\|y - f(a)\|_{H_2} \le \|x - a\|_{H_1}$ for all $a \in A$. Define $F(x) = y$. Then we can do the one-more-point argument with Zorn's lemma to finish.

Let us now prove the geometric formulation. It is enough to prove it when $|I| < \infty$ and $H_1, H_2$ are finite dimensional: to show all the balls intersect, it is enough to show that finitely many of them intersect (this is the finite intersection property; balls are weakly compact). Once $I$ is finite, we can write $I = \{1, \dots, m\}$, $H_1' = \operatorname{span}\{x_1, \dots, x_m\}$, $H_2' = \operatorname{span}\{y_1, \dots, y_m\}$, which reduces everything to finite dimensions. The proof of the finite-dimensional case is completed below (Theorem 3.6.4).

Remark 3.6.2: In other norms, the geometric formulation in the previous proof is just not true; this property effectively characterizes Hilbert spaces. Related is the Kneser--Poulsen conjecture, which effectively says the same thing in terms of volumes:
Conjecture 3.6.3 (Kneser--Poulsen). Let $x_1, \dots, x_m,\ y_1, \dots, y_m \in \mathbb{R}^n$ with $\|y_i - y_j\| \le \|x_i - x_j\|$ for all $i, j$. Then for all $r_i \ge 0$,
\[
\operatorname{vol}\Big( \bigcap_{i=1}^m B(y_i, r_i) \Big) \ge \operatorname{vol}\Big( \bigcap_{i=1}^m B(x_i, r_i) \Big).
\]
This is known for $n = 2$, where volume is area, using trigonometry and all kinds of ad hoc arguments. It is also known for $m = n + 2$.
3/30/16: Here is an equivalent formulation of the Kirszbraun extension theorem.
Theorem 3.6.4. Let $H_1, H_2$ be Hilbert spaces, $\{x_i\}_{i \in I} \subseteq H_1$, $\{y_i\}_{i \in I} \subseteq H_2$, $\{r_i\}_{i \in I} \subseteq (0, \infty)$. Suppose that $\|y_i - y_j\|_{H_2} \le \|x_i - x_j\|_{H_1}$ for all $i, j \in I$. Then $\bigcap_{i \in I} B_{H_1}(x_i, r_i) \ne \emptyset$ implies $\bigcap_{i \in I} B_{H_2}(y_i, r_i) \ne \emptyset$.
Proof. By weak compactness and orthogonal projection, it is enough to prove this when $I = \{1, \dots, m\}$ and $H_1, H_2$ are both finite dimensional. Fix any $x \in \bigcap_{i=1}^m B_{H_1}(x_i, r_i)$. If $x = x_{i_0}$ for some $i_0 \in \{1, \dots, m\}$, then $\|y_{i_0} - y_i\|_{H_2} \le \|x_{i_0} - x_i\|_{H_1} \le r_i$, so $y_{i_0} \in \bigcap_{i=1}^m B_{H_2}(y_i, r_i)$. So assume $x \notin \{x_1, \dots, x_m\}$.

Define $\phi: H_2 \to \mathbb{R}$ by
\[
\phi(y) = \max \left\{ \frac{\|y - y_1\|_{H_2}}{\|x - x_1\|_{H_1}}, \dots, \frac{\|y - y_m\|_{H_2}}{\|x - x_m\|_{H_1}} \right\}.
\]
Note that $\phi$ is continuous and $\lim_{\|y\|_{H_2} \to \infty} \phi(y) = \infty$, so the minimum of $\phi$ is attained. Let
\[
c = \min_{y \in H_2} \phi(y),
\]
and fix $y \in H_2$ with $\phi(y) = c$.
Observe that we are done if $c \le 1$. Suppose by way of contradiction that $c > 1$. Let $J$ be the set of indices where the ratio attains the maximum:
\[
J = \left\{ i \in \{1, \dots, m\} : \frac{\|y - y_i\|_{H_2}}{\|x - x_i\|_{H_1}} = c \right\}.
\]
By definition, if $i \in J$ then $\|y - y_i\|_{H_2} = c\, \|x - x_i\|_{H_1}$, and for $i \notin J$, $\|y - y_i\|_{H_2} < c\, \|x - x_i\|_{H_1}$.

Claim 3.6.5. $y \in \operatorname{conv}(\{y_i\}_{i \in J})$.

Proof. If not, find a separating hyperplane. If we move $y$ a small enough distance towards $\operatorname{conv}(\{y_i\}_{i \in J})$, perpendicular to the separating hyperplane, to a point $y'$, then for $i \in J$,
\[
\|y' - y_i\|_{H_2} < \|y - y_i\|_{H_2} = c\, \|x - x_i\|_{H_1},
\]
and for $i \notin J$ it is still true that $\|y' - y_i\|_{H_2} < c\, \|x - x_i\|_{H_1}$. Then $\phi(y') < c$, contradicting that $y$ is a minimizer. $\square$
By the claim, there exist $\{\lambda_i\}_{i \in J}$ with $\lambda_i \ge 0$, $\sum_{i \in J} \lambda_i = 1$, and $y = \sum_{i \in J} \lambda_i y_i$. We use the coefficients of this convex combination to define a probability distribution: define a random vector $Z$ in $H_1$ by
\[
\mathbb{P}(Z = x_i) = \lambda_i, \qquad i \in J.
\]
For all $i \in \{1, \dots, m\}$, write $y_i = h(x_i)$. Then $\mathbb{E}[h(Z)] = \sum_{i \in J} \lambda_i y_i = y$. Let $Z'$ be an independent copy of $Z$. Using $\|h(Z) - y\|_{H_2} = c\, \|Z - x\|_{H_1}$ (valid because $Z$ takes values in $\{x_i\}_{i \in J}$), we have
\begin{align}
\mathbb{E}\, \|h(Z) - \mathbb{E}\, h(Z)\|_{H_2}^2 &= \mathbb{E}\, \|h(Z) - y\|_{H_2}^2 \tag{3.25}\\
&= c^2\, \mathbb{E}\, \|Z - x\|_{H_1}^2 \tag{3.26}\\
&> \mathbb{E}\, \|Z - x\|_{H_1}^2 \tag{3.27}\\
&\ge \mathbb{E}\, \|Z - \mathbb{E} Z\|_{H_1}^2, \tag{3.28}
\end{align}
where (3.28) holds because the mean minimizes the expected squared distance. In Hilbert space, it is always true that
\[
\mathbb{E}\, \|Z - \mathbb{E} Z\|^2 = \frac12\, \mathbb{E}\, \|Z - Z'\|^2.
\]
Using this on both sides of the above,
\[
\frac12\, \mathbb{E}\, \|h(Z) - h(Z')\|_{H_2}^2 > \frac12\, \mathbb{E}\, \|Z - Z'\|_{H_1}^2.
\]
So far we haven't used the only assumption on the points; by that assumption, $\|Z - Z'\|_{H_1} \ge \|h(Z) - h(Z')\|_{H_2}$ pointwise. This is a contradiction. $\square$
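The Hilbert-space identity $\mathbb{E}\|Z - \mathbb{E}Z\|^2 = \frac12 \mathbb{E}\|Z - Z'\|^2$ used above is easy to confirm numerically for a finitely supported random vector; a quick sketch (my own, with arbitrary atoms and weights):

```python
import numpy as np

rng = np.random.default_rng(5)
pts = rng.normal(size=(6, 4))                # atoms of the random vector Z in R^4
lam = rng.dirichlet(np.ones(6))              # probabilities lambda_i

EZ = lam @ pts
lhs = sum(l * np.sum((p - EZ) ** 2) for l, p in zip(lam, pts))        # E||Z - EZ||^2
rhs = 0.5 * sum(li * lj * np.sum((pi - pj) ** 2)
                for li, pi in zip(lam, pts) for lj, pj in zip(lam, pts))
assert abs(lhs - rhs) < 1e-9
```

The identity follows by expanding $\|Z - Z'\|^2$ and using independence, which is why it is specific to inner-product norms.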
This is not true in $L_p$ spaces when $p \ne 2$, but there are other theorems one can formulate (e.g. with different radii); one has to ask what happens to each of the inequalities in the proof. As stated, the theorem is a very "Hilbertian" phenomenon.

Definition 3.6.6: Let $(X, \|\cdot\|_X)$ be a Banach space and $p \ge 1$. We say that $X$ has Rademacher type $p$ if there is a constant $T$ such that for every $m$ and every $x_1, \dots, x_m \in X$,
\[
\Big( \mathbb{E}_{\varepsilon = (\varepsilon_1, \dots, \varepsilon_m) \in \{\pm 1\}^m} \Big\| \sum_{i=1}^m \varepsilon_i x_i \Big\|_X^p \Big)^{\frac1p} \le T \Big( \sum_{i=1}^m \|x_i\|_X^p \Big)^{\frac1p}.
\]
The smallest such $T$ is denoted $T_p(X)$.
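For illustration, in Hilbert space the type-2 inequality holds with constant $T = 1$, and is in fact an equality, by expanding the square and using independence of the signs; a quick numeric check (my own, not from the notes):

```python
import itertools
import numpy as np

rng = np.random.default_rng(7)
xs = rng.normal(size=(5, 3))                  # x_1, ..., x_5 in R^3

# E || sum_i eps_i x_i ||_2^2 over all sign patterns equals sum_i ||x_i||_2^2
signs = list(itertools.product([-1, 1], repeat=len(xs)))
lhs = np.mean([np.sum((np.array(e) @ xs) ** 2) for e in signs])
rhs = np.sum(xs ** 2)
assert abs(lhs - rhs) < 1e-9
```

The cross terms $\mathbb{E}[\varepsilon_i \varepsilon_j]\langle x_i, x_j\rangle$ vanish for $i \ne j$, which is the whole computation.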
Definition (Definition 1.1.13): A metric space $(X, d_X)$ is said to have Enflo type $p$ if for all $f: \{\pm 1\}^n \to X$,
\[
\Big( \mathbb{E}_\varepsilon \big[ d_X(f(\varepsilon), f(-\varepsilon))^p \big] \Big)^{\frac1p} \lesssim_X \Big( \sum_{j=1}^n \mathbb{E}_\varepsilon \big[ d_X\big( f(\varepsilon),\, f(\varepsilon_1, \dots, \varepsilon_{j-1}, -\varepsilon_j, \varepsilon_{j+1}, \dots, \varepsilon_n) \big)^p \big] \Big)^{\frac1p}.
\]
If a Banach space $X$ has Enflo type $p$, then it also has Rademacher type $p$.

Question 3.6.7: Let $X$ be a Banach space of Rademacher type $p$. Does $X$ also have Enflo type $p$?

We know special Banach spaces for which this is true, like $L_p$ spaces, but we don't know the answer in full generality.

Theorem 3.6.8 (Pisier). Let $X$ be a Banach space of Rademacher type $p$. Then $X$ has Enflo type $q$ for every $1 < q < p$.
We first need the following.
Theorem 3.6.9 (Kahane's inequality). For every $q > p \ge 1$, there exists $K_{p,q}$ such that for every Banach space $(X, \|\cdot\|_X)$ and every $x_1, \dots, x_n \in X$,
\[
\Big( \mathbb{E} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X^q \Big)^{\frac1q} \le K_{p,q} \Big( \mathbb{E} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X^p \Big)^{\frac1p}.
\]
In the real case this is Khintchine's inequality. By Kahane's inequality, $X$ has type $p$ iff
\[
\Big( \mathbb{E} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X^q \Big)^{\frac1q} \lesssim_{p,q,X} \Big( \sum_{i=1}^n \|x_i\|_X^p \Big)^{\frac1p}.
\]
Define $T: \ell_p^n(X) \to L_q(\{\pm 1\}^n, X)$ by $T(x_1, \dots, x_n)(\varepsilon) = \sum_{i=1}^n \varepsilon_i x_i$.
(Here $\|f\|_{L_q(\{\pm 1\}^n, X)} = \big( \frac{1}{2^n} \sum_{\varepsilon \in \{\pm 1\}^n} \|f(\varepsilon)\|_X^q \big)^{\frac1q}$.) Rademacher type $p$ means
\[
\|T\|_{\ell_p^n(X) \to L_q(\{\pm 1\}^n, X)} \lesssim_{p,q,X} 1.
\]
For an operator $T: U \to V$, the adjoint $T^*: V^* \to U^*$ satisfies
\[
\|T\|_{U \to V} = \|T^*\|_{V^* \to U^*}.
\]
Let $p^* = \frac{p}{p-1}$ be the dual exponent of $p$ ($\frac1p + \frac{1}{p^*} = 1$). Now
\[
L_q(\{\pm 1\}^n, X)^* = L_{q^*}(\{\pm 1\}^n, X^*),
\]
where $g^*: \{\pm 1\}^n \to X^*$ acts on $f: \{\pm 1\}^n \to X$ by $g^*(f) = \mathbb{E}\, g^*(\varepsilon)(f(\varepsilon))$, and $\ell_p^n(X)^* = \ell_{p^*}^n(X^*)$. Note
\[
\|T^*\|_{L_{q^*}(\{\pm 1\}^n, X^*) \to \ell_{p^*}^n(X^*)} \lesssim 1.
\]
For $T: U \to V$, $T^*: V^* \to U^*$ is defined by $T^*(z^*)(u) = z^*(Tu)$. Expand $g^*: \{\pm 1\}^n \to X^*$ as $g^* = \sum_{A \subseteq \{1,\dots,n\}} \widehat{g^*}(A)\, W_A$. We claim
\[
T^* g^* = \big( \widehat{g^*}(\{1\}), \dots, \widehat{g^*}(\{n\}) \big).
\]
We check:
\begin{align*}
T^* g^*(x_1, \dots, x_n) &= g^*(T(x_1, \dots, x_n)) = \mathbb{E}\, g^*(\varepsilon)\Big( \sum_{i=1}^n \varepsilon_i x_i \Big) = \sum_{i=1}^n \big( \mathbb{E}\, g^*(\varepsilon)\, \varepsilon_i \big)(x_i) \\
&= \sum_{i=1}^n \Big( \sum_{A \subseteq \{1,\dots,n\}} \widehat{g^*}(A)\, \mathbb{E}\big[ W_A(\varepsilon)\, \varepsilon_i \big] \Big)(x_i) = \sum_{i=1}^n \widehat{g^*}(\{i\})(x_i) \\
&= \big( \widehat{g^*}(\{1\}), \dots, \widehat{g^*}(\{n\}) \big)(x_1, \dots, x_n),
\end{align*}
using orthonormality in the second-to-last step.
Rademacher type $p$ thus means: for all $g^*: \{\pm 1\}^n \to X^*$ and all $q \in (1, \infty)$,
\[
\Big( \sum_{i=1}^n \|\widehat{g^*}(\{i\})\|_{X^*}^{p^*} \Big)^{\frac{1}{p^*}} \lesssim \|g^*\|_{L_{q^*}(\{\pm 1\}^n, X^*)}.
\]
Theorem 3.6.10 (Pisier). If $X$ has Rademacher type $p$ and $1 \le q < p$, then for every $n \ge 1$ and all $f: \{\pm 1\}^n \to X$ with $\mathbb{E} f = 0$,
\[
\|f\|_q \lesssim \bigg\| \Big( \sum_{j=1}^n \|\partial_j f\|_X^q \Big)^{\frac1q} \bigg\|_q.
\]
Here, for $f: \{\pm 1\}^n \to X$, the partial difference $\partial_j f: \{\pm 1\}^n \to X$ is defined by
\[
\partial_j f(\varepsilon) = \frac{f(\varepsilon) - f(\varepsilon_1, \dots, \varepsilon_{j-1}, -\varepsilon_j, \varepsilon_{j+1}, \dots, \varepsilon_n)}{2} = \sum_{\substack{A \subseteq \{1,\dots,n\} \\ j \in A}} \hat{f}(A)\, W_A.
\]
Note $\partial_j^2 = \partial_j$. The Laplacian is $\Delta = \sum_{j=1}^n \partial_j = \sum_{j=1}^n \partial_j^2$. We've proved that
\[
\Delta f = \sum_{A \subseteq \{1,\dots,n\}} |A|\, \hat{f}(A)\, W_A.
\]
The heat semigroup on the cube⁵ is, for $t > 0$,
\[
e^{-t\Delta} f = \sum_{A \subseteq \{1,\dots,n\}} e^{-t|A|}\, \hat{f}(A)\, W_A.
\]
The right-hand side of Theorem 3.6.10 involves the function on the cube
\[
\varepsilon \mapsto \Big( \sum_{j=1}^n \|\partial_j f(\varepsilon)\|_X^q \Big)^{\frac1q} \in [0, \infty).
\]
For $q = p$ we get the weaker statement
\[
\|f\|_p \lesssim \Big( \sum_{j=1}^n \|\partial_j f\|_{L_p(\{\pm 1\}^n, X)}^p \Big)^{\frac1p}.
\]
To see how the theorem gives Enflo type, let $f: \{\pm 1\}^n \to X$; by the triangle inequality,
\[
\|f(\varepsilon) - f(-\varepsilon)\|_X \le \|f(\varepsilon) - \mathbb{E} f\|_X + \|\mathbb{E} f - f(-\varepsilon)\|_X,
\]
and applying the theorem to $f - \mathbb{E} f$ gives
\[
\big( \mathbb{E}\, \|f(\varepsilon) - f(-\varepsilon)\|_X^q \big)^{\frac1q} \lesssim \Big( \sum_{j=1}^n \mathbb{E}\, \|f(\varepsilon) - f(\varepsilon_1, \dots, -\varepsilon_j, \dots, \varepsilon_n)\|_X^q \Big)^{\frac1q}.
\]
We need some facts about the heat semigroup.

Fact 3.6.11 (Fact 1): Heat flow is a contraction:
\[
\big\| e^{-t\Delta} f \big\|_{L_q(\{\pm 1\}^n, X)} \le \|f\|_{L_q(\{\pm 1\}^n, X)}.
\]
Proof. We have
πβπ‘Ξπ =β
π΄β{1,...,π}πβπ‘|π΄| π(π΄)ππ΄ =
βπ π‘ * π
5also known as the noise operator
84
MAT529 Metric embeddings and geometric inequalities
Here the convolution is defined as 12π
βπΏβ{Β±1}π π π‘(ππΏ)π(πΏ). This is a vector valued function.
In the real case itβs Parseval, multiplying the Fourier coefficients. Itβs enough to prove forthe real line. Identities on the level of vectors. Here
π π‘(π) =β
π΄β{1,...,π}πβπ‘|π΄|ππ΄(π);
this is the Riesz product.
The function is
=πβπ=1
(1 + πβπ‘ππ) = 0, π‘ > 0.
Eπ π‘ = 1 so π π‘ is a probability measure and heat flow is an averaging operator.
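The two descriptions of the heat semigroup above, as the Fourier multiplier $e^{-t|A|}$ and as convolution with the Riesz product, are easy to check numerically. The following is a minimal sketch for real-valued functions on $\{\pm1\}^3$; all names are ad hoc, and the contraction is checked in $L_2$ only.

```python
import itertools, math, random

n, t = 3, 0.7
cube = list(itertools.product([-1, 1], repeat=n))
subsets = [A for r in range(n + 1) for A in itertools.combinations(range(n), r)]

def walsh(A, eps):
    # Walsh function W_A(eps) = prod_{j in A} eps_j
    p = 1
    for j in A:
        p *= eps[j]
    return p

def fourier(f):
    # hat f(A) = E[f(eps) W_A(eps)]
    return {A: sum(f[e] * walsh(A, e) for e in cube) / 2 ** n for A in subsets}

random.seed(0)
f = {e: random.uniform(-1, 1) for e in cube}   # a random function on the cube
fh = fourier(f)

# heat semigroup as a Fourier multiplier: hat f(A) -> e^{-t|A|} hat f(A)
heat_mult = {e: sum(math.exp(-t * len(A)) * fh[A] * walsh(A, e) for A in subsets)
             for e in cube}

# the same operator as convolution with the Riesz product R_t
def riesz(e):
    p = 1.0
    for ej in e:
        p *= 1 + math.exp(-t) * ej
    return p

heat_conv = {e: sum(riesz(tuple(a * b for a, b in zip(e, d))) * f[d] for d in cube) / 2 ** n
             for e in cube}

l2 = lambda g: (sum(g[e] ** 2 for e in cube) / 2 ** n) ** 0.5

assert all(abs(heat_mult[e] - heat_conv[e]) < 1e-12 for e in cube)
assert all(riesz(e) >= 0 for e in cube)                       # R_t >= 0
assert abs(sum(riesz(e) for e in cube) / 2 ** n - 1) < 1e-12  # E R_t = 1
assert l2(heat_mult) <= l2(f) + 1e-12                         # contraction
```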
4/11: Continue Theorem 3.6.8. The dual to having Rademacher type $p$: for all $f^*:\{\pm1\}^n\to X^*$,
$$\Big(\sum_{j=1}^n\|\widehat{f^*}(\{j\})\|_{X^*}^{p^*}\Big)^{1/p^*}\lesssim\|f^*\|_{L_{p^*}(\{\pm1\}^n,X^*)}.$$
Note that for the purpose of proving Theorem 3.6.8 we only need $p=q$ in Kahane's inequality. However, I recommend that you read up on the full inequality.
For $f:\{\pm1\}^n\to X$, define $\partial_jf:\{\pm1\}^n\to X$ by $\partial_jf(\varepsilon)=\frac{f(\varepsilon)-f(\varepsilon_1,\dots,-\varepsilon_j,\dots,\varepsilon_n)}2$. Then $\partial_j^2=\partial_j=\partial_j^*$ and
$$\partial_jW_A=\begin{cases}W_A,& j\in A,\\ 0,& j\notin A.\end{cases}$$
The Laplacian is $\Delta=\sum_{j=1}^n\partial_j=\sum_{j=1}^n\partial_j^2$, with
$$\Delta f=\sum_{A\subseteq[n]}|A|\hat f(A)W_A.$$
If $t>0$ then
$$e^{-t\Delta}f=\sum_{A\subseteq[n]}e^{-t|A|}\hat f(A)W_A$$
is the heat semigroup. It is a contraction, and $e^{-t\Delta}f=\big(\prod_{j=1}^n(1+e^{-t}\varepsilon_j)\big)*f$.

We have
\begin{align}
\|e^{-t\Delta}f\|_{L_p(\{\pm1\}^n,X)}&\ge e^{-nt}\|f\|_{L_p(\{\pm1\}^n,X)},\tag{3.29}\\
e^{-t\Delta}\big(W_{[n]}e^{-t\Delta}f\big)&=e^{-tn}W_{[n]}f,\tag{3.30}
\end{align}
where $W_{[n]}(\varepsilon)=\prod_{j=1}^n\varepsilon_j$. For $f=\sum_{A\subseteq[n]}\hat f(A)W_A$, using $W_{[n]}W_A=W_{[n]\setminus A}$, we have
\begin{align}
W_{[n]}e^{-t\Delta}f&=\sum_{A\subseteq[n]}e^{-t|A|}\hat f(A)W_{[n]\setminus A},\tag{3.31}\\
e^{-t\Delta}\big(W_{[n]}e^{-t\Delta}f\big)&=\sum_{A\subseteq[n]}e^{-t|A|}\hat f(A)e^{-t(n-|A|)}W_{[n]\setminus A}\tag{3.32}\\
&=e^{-tn}\sum_{A\subseteq[n]}\hat f(A)W_{[n]\setminus A}\tag{3.33}\\
&=e^{-tn}W_{[n]}f.\tag{3.34}
\end{align}
Hence, since $|W_{[n]}|\equiv1$ and heat flow is a contraction,
\begin{align}
e^{-tn}\|f\|_{L_p(\{\pm1\}^n,X)}=\big\|e^{-tn}W_{[n]}f\big\|_{L_p}&=\big\|e^{-t\Delta}\big(W_{[n]}e^{-t\Delta}f\big)\big\|_{L_p}\tag{3.35}\\
&\le\big\|W_{[n]}e^{-t\Delta}f\big\|_{L_p}=\big\|e^{-t\Delta}f\big\|_{L_p},\tag{3.36}
\end{align}
which proves (3.29).
The key claim is the following.
Claim 3.6.12. Suppose $X$ has Rademacher type $p$. For all $1<q<\infty$, for all $t>0$, for all $f^*:\{\pm1\}^n\to X^*$,
$$\Big\|\Big(\sum_{j=1}^n\|e^{-t\Delta}\partial_jf^*(\varepsilon)\|_{X^*}^{q^*}\Big)^{1/q^*}\Big\|_{L_{q^*}(\{\pm1\}^n)}\lesssim_{p,q,X}\frac{\|f^*\|_{L_{q^*}(\{\pm1\}^n,X^*)}}{(e^t-1)^{p^*/q^*}}.$$
Assume the claim. We prove the following.
Claim 3.6.13. clm:pis2 If π has Rademacher type π, 1 < π < π, and 1 < π < β, then forevery function π : {Β±1}π β π with Eπ = 0 = π(π) we have
βπβπΏπ({Β±1}π,π) .1
πβ π
οΏ½
πβπ=1
βπππ(π)βππ
οΏ½ 1π
πΏπ({Β±1}π,π)
.
When $q=p$ in the norms, this gives
\begin{align}
\|f-\mathbb Ef\|_{L_p}^p&\lesssim\Big(\frac1{p-q}\Big)^p\,\mathbb E\sum_{j=1}^n\|\partial_jf(\varepsilon)\|_X^p\tag{3.37}\\
&=\Big(\frac1{p-q}\Big)^p\frac1{2^p}\sum_{j=1}^n\mathbb E\big\|f(\varepsilon)-f(\varepsilon_1,\dots,-\varepsilon_j,\dots,\varepsilon_n)\big\|_X^p.\tag{3.38}
\end{align}
Since, by symmetry of $\varepsilon\mapsto-\varepsilon$,
\begin{align}
\|f(\varepsilon)-f(-\varepsilon)\|_{L_p}&\le\|f(\varepsilon)-\mathbb Ef\|_{L_p}+\|\mathbb Ef-f(-\varepsilon)\|_{L_p}\tag{3.39}\\
&=2\,\|f(\varepsilon)-\mathbb Ef\|_{L_p},\tag{3.40}
\end{align}
we conclude
$$\mathbb E\|f(\varepsilon)-f(-\varepsilon)\|_X^p\lesssim\Big(\frac1{p-q}\Big)^p\sum_{j=1}^n\mathbb E\big\|f(\varepsilon)-f(\varepsilon_1,\dots,-\varepsilon_j,\dots,\varepsilon_n)\big\|_X^p,\tag{3.41}$$
which is the definition of Enflo type $p$ (with a constant depending on $p-q$).

We show that the Key Claim 3.6.12 implies Claim 3.6.13.
Proof. Normalize so that
$$\Big\|\Big(\sum_{j=1}^n\|\partial_jf(\varepsilon)\|_X^q\Big)^{1/q}\Big\|_{L_q(\{\pm1\}^n)}=1.$$
For every $t>0$, by Hahn--Banach, take $f_t^*:\{\pm1\}^n\to X^*$ such that $\|f_t^*\|_{L_{q^*}(\{\pm1\}^n,X^*)}=1$ and
$$\mathbb E\big[f_t^*(\varepsilon)\big(\Delta e^{-t\Delta}f(\varepsilon)\big)\big]=\big\|\Delta e^{-t\Delta}f\big\|_{L_q(\{\pm1\}^n,X)}.$$
Write this as $\langle f_t^*,\Delta e^{-t\Delta}f\rangle$.

Recall that $L_q(\{\pm1\}^n,X)^*=L_{q^*}(\{\pm1\}^n,X^*)$: for $h\in L_q(\{\pm1\}^n,X)$ and $g^*\in L_{q^*}(\{\pm1\}^n,X^*)$, we have
$$g^*(h)=\langle g^*,h\rangle=\mathbb E\big[g^*(\varepsilon)(h(\varepsilon))\big].$$
We have
$$\Delta e^{-t\Delta}f=\sum_{A\subseteq[n]}|A|e^{-t|A|}\hat f(A)W_A,\tag{3.42}$$
and, using $\Delta=\sum_j\partial_j^2$ together with the self-adjointness of $\partial_j$ and $e^{-t\Delta}$,
\begin{align}
\big\|\Delta e^{-t\Delta}f\big\|_{L_q(X)}&=\Big\langle f_t^*,\sum_{j=1}^n\partial_j^2e^{-t\Delta}f\Big\rangle\tag{3.43}\\
&=\sum_{j=1}^n\big\langle f_t^*,\partial_je^{-t\Delta}\partial_jf\big\rangle\tag{3.44}\\
&=\sum_{j=1}^n\big\langle e^{-t\Delta}\partial_jf_t^*,\partial_jf\big\rangle\tag{3.45}\\
&=\Big\langle\underbrace{(e^{-t\Delta}\partial_jf_t^*)_{j=1}^n}_{\in L_{q^*}(\ell_{q^*}^n(X^*))},\underbrace{(\partial_jf)_{j=1}^n}_{\in L_q(\ell_q^n(X))}\Big\rangle.\tag{3.46}
\end{align}
Note that $(L_q(\ell_q^n(X)))^*=L_{q^*}(\ell_{q^*}^n(X^*))$, so we have a pairing here. By H\"older,
\begin{align}
&\le\Big\|\Big(\sum_{j=1}^n\|e^{-t\Delta}\partial_jf_t^*(\varepsilon)\|_{X^*}^{q^*}\Big)^{1/q^*}\Big\|_{L_{q^*}}\Big\|\Big(\sum_{j=1}^n\|\partial_jf(\varepsilon)\|_X^q\Big)^{1/q}\Big\|_{L_q}\tag{3.47}\\
&\le\frac{\|f_t^*\|_{L_{q^*}(X^*)}}{(e^t-1)^{p^*/q^*}}\qquad\text{by Key Claim 3.6.12 and the normalization}\tag{3.48}\\
&\lesssim_{p,q,X}\frac1{(e^t-1)^{p^*/q^*}}.\tag{3.49}
\end{align}
Integrating, and using $\hat f(\emptyset)=0$ together with $\int_0^\infty|A|e^{-s|A|}\,ds=1$ for $A\neq\emptyset$,
$$\int_0^\infty\Delta e^{-s\Delta}f\,ds=\sum_{A\subseteq[n]}\Big(\int_0^\infty|A|e^{-s|A|}\,ds\Big)\hat f(A)W_A=\sum_{A\subseteq[n]}\hat f(A)W_A=f.\tag{3.50}$$
Then
\begin{align}
\|f\|_{L_q(X)}&=\Big\|\int_0^\infty\Delta e^{-s\Delta}f\,ds\Big\|_{L_q(X)}\tag{3.51}\\
&\le\int_0^\infty\big\|\Delta e^{-s\Delta}f\big\|_{L_q(X)}\,ds\tag{3.52}\\
&\lesssim\int_0^\infty\frac{ds}{(e^s-1)^{p^*/q^*}}<\infty,\tag{3.53}
\end{align}
since $p^*/q^*<1$ (recall $q<p$, so $q^*>p^*$): near $s=0$ the integrand behaves like $s^{-p^*/q^*}$, and it decays exponentially at infinity. $\square$

We now prove the Key Claim 3.6.12.
Proof of Key Claim 3.6.12. This is an interpolation argument. We first introduce a natural operation.

Given $f^*:\{\pm1\}^n\to X^*$, for every $t>0$ define a mapping $f_t^*:\{\pm1\}^n\times\{\pm1\}^n\to X^*$ as follows. For $\varepsilon,\delta\in\{\pm1\}^n$,
\begin{align}
f_t^*(\varepsilon,\delta)&=\sum_{A\subseteq[n]}\widehat{f^*}(A)\prod_{j\in A}\big(e^{-t}\varepsilon_j+(1-e^{-t})\delta_j\big)\tag{3.54}\\
&=f^*\big(e^{-t}\varepsilon+(1-e^{-t})\delta\big).\tag{3.55}
\end{align}
Think of $f^*(x)=\sum_{A\subseteq[n]}\widehat{f^*}(A)\prod_{j\in A}x_j$ (extended outside $\{\pm1\}^n$ multilinearly).

Observe
$$f_t^*(\varepsilon,\delta)=\sum_{B\subseteq[n]}e^{-t|B|}(1-e^{-t})^{n-|B|}f^*\Big(\sum_{j\in B}\varepsilon_je_j+\sum_{j\notin B}\delta_je_j\Big),\tag{3.56}$$
since
$$f^*\Big(\sum_{j\in B}\varepsilon_je_j+\sum_{j\notin B}\delta_je_j\Big)=\sum_{A\subseteq[n]}\widehat{f^*}(A)W_{A\cap B}(\varepsilon)W_{A\setminus B}(\delta)\tag{3.57}$$
and hence
\begin{align}
\sum_{B\subseteq[n]}e^{-t|B|}(1-e^{-t})^{n-|B|}f^*\Big(\sum_{j\in B}\varepsilon_je_j+\sum_{j\notin B}\delta_je_j\Big)&=\sum_{B\subseteq[n]}e^{-t|B|}(1-e^{-t})^{n-|B|}\sum_{A\subseteq[n]}\widehat{f^*}(A)W_{A\cap B}(\varepsilon)W_{A\setminus B}(\delta)\tag{3.59}\\
&=\sum_{A\subseteq[n]}\widehat{f^*}(A)\sum_{B\subseteq[n]}e^{-t|B|}(1-e^{-t})^{n-|B|}W_{A\cap B}(\varepsilon)W_{A\setminus B}(\delta).\tag{3.60}
\end{align}
Indeed, let $\gamma_j=\begin{cases}\varepsilon_j&\text{w.p. }e^{-t},\\ \delta_j&\text{w.p. }1-e^{-t},\end{cases}$ independently. Then
\begin{align}
\mathbb E\Big[\prod_{j\in A}\gamma_j\Big]&=\prod_{j\in A}\big(e^{-t}\varepsilon_j+(1-e^{-t})\delta_j\big),\tag{3.61}\\
\sum_{A\subseteq[n]}\widehat{f^*}(A)\,\mathbb E[W_A(\gamma)]&=\mathbb E\Big[\sum_{A\subseteq[n]}\widehat{f^*}(A)W_A(\gamma)\Big]=\mathbb E[f^*(\gamma)]\tag{3.62}\\
&=\sum_{B\subseteq[n]}e^{-t|B|}(1-e^{-t})^{n-|B|}f^*\Big(\sum_{j\in B}\varepsilon_je_j+\sum_{j\notin B}\delta_je_j\Big).\tag{3.63}
\end{align}
Instead of a linear interpolation, we did a random interpolation. Since $\gamma$ is uniform on $\{\pm1\}^n$ when $\varepsilon,\delta$ are, averaging gives
$$\|f_t^*\|_{L_{q^*}(\{\pm1\}^n\times\{\pm1\}^n,X^*)}\le\|f^*\|_{L_{q^*}(\{\pm1\}^n,X^*)}.$$
4/13: Continue Theorem 3.6.8. We're proving a famous and nontrivial theorem, so there are more computations (this is the only proof that's known). We previously reduced everything to the Key Claim 3.6.12. Fix $t>0$ and define $f_t^*:\{\pm1\}^n\times\{\pm1\}^n\to X^*$ by
$$f_t^*(\varepsilon,\delta)=f^*\big(e^{-t}\varepsilon+(1-e^{-t})\delta\big)=\sum_{A\subseteq\{1,\dots,n\}}\widehat{f^*}(A)\prod_{j\in A}\big(e^{-t}\varepsilon_j+(1-e^{-t})\delta_j\big).$$
Then it's a fact that $\|f_t^*\|_{L_{q^*}(\{\pm1\}^n\times\{\pm1\}^n,X^*)}\le\|f^*\|_{L_{q^*}(\{\pm1\}^n,X^*)}$, which is true for any Banach space, from the Bernoulli (random-interpolation) interpretation above.

Let us examine the Fourier expansion of $f_t^*$ in the variables $\delta_j$. What is the linear part in $\delta$? We have
$$f_t^*(\varepsilon,\delta)=\sum_{j=1}^n\delta_j\Big[(1-e^{-t})\sum_{\substack{A\subseteq\{1,\dots,n\}\\ j\in A}}e^{-t(|A|-1)}\widehat{f^*}(A)W_{A\setminus\{j\}}(\varepsilon)\Big]+\Phi(\varepsilon,\delta),$$
where $\Phi$ is the remaining part. The way to say this is that $\mathbb E_\delta[\Phi(\varepsilon,\delta)\delta_j]=0$ for all $j=1,\dots,n$, since the extra part is orthogonal to all the linear parts (the Walsh functions form an orthogonal basis).

Using $W_{A\setminus\{j\}}(\varepsilon)=\varepsilon_jW_A(\varepsilon)$ for $j\in A$,
$$(1-e^{-t})\sum_{A\ni j}e^{-t(|A|-1)}\widehat{f^*}(A)W_{A\setminus\{j\}}(\varepsilon)=(e^t-1)\,\varepsilon_j\,e^{-t\Delta}\partial_jf^*(\varepsilon).$$
What does $\partial_j$ do to $f^*$? It only keeps the $A$'s which contain $j$, with the same coefficients. Then you hit it with $e^{-t\Delta}$, which corresponds to $e^{-t|A|}$; the last part is just the Walsh function.

All in all, you get that $f_t^*(\varepsilon,\delta)=(e^t-1)\sum_{j=1}^n\delta_j\varepsilon_j\,e^{-t\Delta}\partial_jf^*(\varepsilon)+\Phi(\varepsilon,\delta)$.
Now fix $\varepsilon\in\{\pm1\}^n$. Recall the dual formulation of Rademacher type $p$: for every function $h^*:\{\pm1\}^n\to X^*$,
$$\Big(\sum_{j=1}^n\|\widehat{h^*}(\{j\})\|_{X^*}^{p^*}\Big)^{1/p^*}\lesssim\|h^*\|_{L_{p^*}(\{\pm1\}^n,X^*)}.$$
Applying this inequality to $\delta\mapsto f_t^*(\varepsilon,\delta)$ with $\varepsilon$ fixed (its coefficient of $\delta_j$ is $(e^t-1)\varepsilon_je^{-t\Delta}\partial_jf^*(\varepsilon)$), we get
$$\Big(\sum_{j=1}^n\|e^{-t\Delta}\partial_jf^*(\varepsilon)\|_{X^*}^{p^*}\Big)^{1/p^*}\lesssim(e^t-1)^{-1}\Big(\mathbb E_\delta\|f_t^*(\varepsilon,\delta)\|_{X^*}^{p^*}\Big)^{1/p^*}.$$
Then, taking $L_{p^*}$ norms in $\varepsilon$, we get
$$\Big\|\Big(\sum_{j=1}^n\|e^{-t\Delta}\partial_jf^*\|_{X^*}^{p^*}\Big)^{1/p^*}\Big\|_{L_{p^*}(\{\pm1\}^n)}\lesssim(e^t-1)^{-1}\|f_t^*\|_{L_{p^*}(\{\pm1\}^n\times\{\pm1\}^n,X^*)}\le(e^t-1)^{-1}\|f^*\|_{L_{p^*}(\{\pm1\}^n,X^*)}.$$
Now we look at the endpoint $\infty$. Consider $\|e^{-t\Delta}\partial_jf^*\|_{L_\infty(\{\pm1\}^n,X^*)}$: $e^{-t\Delta}$ is a contraction, and so is $\partial_j$. We proved the first is an averaging operator, which does not increase norms, and $\partial_j$ averages a difference of two values, so it also does not increase norms. Therefore we can put a max on it and still get our inequality:
$$\max_{1\le j\le n}\|e^{-t\Delta}\partial_jf^*\|_{L_\infty(\{\pm1\}^n,X^*)}\le\|f^*\|_{L_\infty(\{\pm1\}^n,X^*)}.$$
Also note
$$\Big\|\Big(\sum_{j=1}^n\|e^{-t\Delta}\partial_jf^*\|_{X^*}^{p^*}\Big)^{1/p^*}\Big\|_{L_{p^*}(\{\pm1\}^n)}=\big\|(e^{-t\Delta}\partial_jf^*)_{j=1}^n\big\|_{L_{p^*}(\ell_{p^*}^n(X^*))}.$$
Define $T:L_{p^*}(X^*)\to L_{p^*}(\ell_{p^*}^n(X^*))$ by
$$T(f^*)=\big(e^{-t\Delta}\partial_jf^*\big)_{j=1}^n.$$
The first inequality says that the operator norm of $T$ satisfies
$$\|T\|_{L_{p^*}(X^*)\to L_{p^*}(\ell_{p^*}^n(X^*))}\lesssim\frac1{e^t-1},$$
and we also know that
$$\|T\|_{L_\infty(X^*)\to L_\infty(\ell_\infty^n(X^*))}\le1.$$
We now want to interpolate these two statements (between $p^*$ and $\infty$) to arrive at $q^*$, which is the last remaining portion of the proof.

Define $\theta\in(0,1]$ by $\frac1{q^*}=\frac\theta{p^*}+\frac{1-\theta}\infty=\frac\theta{p^*}$, so $\theta=\frac{p^*}{q^*}$.

By the vector-valued Riesz interpolation theorem (the proof is identical to the one for real-valued functions; this is in textbooks),
$$\|T\|_{L_{q^*}(X^*)\to L_{q^*}(\ell_{q^*}^n(X^*))}\lesssim\frac1{(e^t-1)^\theta}=\frac1{(e^t-1)^{p^*/q^*}},$$
provided $\frac1{q^*}=\frac\theta{p^*}$. Note $q^*>p^*$ (equivalently $q<p$), so $\theta=p^*/q^*<1$; as $q$ ranges over $(1,p)$, $q^*$ ranges over $(p^*,\infty)$. In Pisier's paper he says one can get any exponent that one wants; however, this seems to be false. But this does not affect us: choosing $q<p$ (so $q^*>p^*$ in Claim 3.6.12) is all we needed to finish the proof. Later I will either prove or give a reference for the interpolation portion of the argument.

Remember what we proved here and the definition of Enflo type $p$: you assign $2^n$ points in a Banach space, one to every vertex of the hypercube, and compare the diagonals to the edges, for any number of points. The open question is whether one really needs to pass to the smaller exponent $q<p$; we needed it to get the integral $\int_0^\infty(e^s-1)^{-p^*/q^*}\,ds$ to converge.
Chapter 4

Grothendieck's Inequality

1 Grothendieck's Inequality
Next week is student presentations, so let's give some facts about Grothendieck's inequality and prove the big Grothendieck theorem (it has books and books of consequences; the applications of Grothendieck's inequality would make a great topic for a separate course).
The big Grothendieck inequality is the following:

Theorem 4.1.1 (Grothendieck). There exists a universal constant $K_G$ (the Grothendieck constant) such that the following holds. Let $A=(a_{ij})\in M_n(\mathbb R)$, and let $x_1,\dots,x_n,y_1,\dots,y_n$ be unit vectors in a Hilbert space $H$. Then there exist signs $\varepsilon_1,\dots,\varepsilon_n,\delta_1,\dots,\delta_n\in\{\pm1\}$ such that
$$\sum_{i,j=1}^na_{ij}\langle x_i,y_j\rangle\le K_G\sum_{i,j=1}^na_{ij}\varepsilon_i\delta_j.$$

So whenever you have a matrix and two sets of high-dimensional vectors in a Hilbert space, the bilinear form is at most a universal constant (less than 2) times the same form evaluated at signs.
Let's give a geometric interpretation. Consider the following convex bodies in $M_n(\mathbb R)\cong\mathbb R^{n^2}$. Let
$$\mathcal A=\mathrm{conv}\big(\{(\varepsilon_i\delta_j)_{i,j}:\varepsilon_1,\dots,\varepsilon_n,\delta_1,\dots,\delta_n\in\{\pm1\}\}\big)\subseteq M_n(\mathbb R);$$
these matrices all have entries $\pm1$, so $\mathcal A$ is a polytope. Let
$$\mathcal B=\mathrm{conv}\big(\{(\langle x_i,y_j\rangle)_{i,j}:x_i,y_j\text{ unit vectors in a Hilbert space}\}\big)\subseteq M_n(\mathbb R).$$
It's obvious that $\mathcal B$ contains $\mathcal A$, since signs are inner products of one-dimensional unit vectors.

Grothendieck's theorem gives the second half of the containment $\mathcal A\subseteq\mathcal B\subseteq K_G\mathcal A$: if $\mathcal B\not\subseteq K_G\mathcal A$, then there is a point $(\langle x_i,y_j\rangle)_{i,j}\notin K_G\mathcal A$, which means by separation that there exists a matrix $(a_{ij})$ such that
$$\sum_{i,j}a_{ij}\langle x_i,y_j\rangle>K_G\sum_{i,j}a_{ij}m_{ij}\quad\text{for all }(m_{ij})\in\mathcal A,$$
which is a contradiction to Grothendieck's theorem on taking $m_{ij}=\varepsilon_i\delta_j$.
We previously proved the little Grothendieck inequality, for operators into $\ell_2$; the following is a harder theorem:
Lemma 4.1.2. Every linear operator $T:\ell_\infty^m\to\ell_1^n$ satisfies, for all $x_1,\dots,x_k\in\ell_\infty^m$,
$$\Big(\sum_{i=1}^k\|Tx_i\|_{\ell_1^n}^2\Big)^{1/2}\le K_G\,\|T\|_{\ell_\infty^m\to\ell_1^n}\max_{\substack{x\in\ell_1^m\\ \|x\|_1\le1}}\Big(\sum_{i=1}^k\langle x,x_i\rangle^2\Big)^{1/2}.$$

As an exercise, see how the lemma follows from Grothendieck's inequality, and how the little Grothendieck inequality follows from the big one in the case where $(a_{ij})$ is positive semidefinite (so it's just a special case; there the best constant is $\sqrt{\pi/2}$).

Note that $\max_{\|x\|_1\le1}\langle x,x_i\rangle$ just gives $\|x_i\|_\infty$. But Grothendieck says that for any number of points $x_1,\dots,x_k$, we can for free improve our norm bound by a universal constant factor.
In the little Grothendieck inequality, what we did was prove the above fact for an operator into $\ell_2$, and we deduced Pietsch domination from that conclusion: there is a probability measure $\mu$ on the unit ball of $\ell_1^m$ such that
$$\|Tx\|_2^2\le K_G^2\,\|T\|_{\ell_\infty^m\to\ell_2^n}^2\int_{B_{\ell_1^m}}\langle y^*,x\rangle^2\,d\mu(y^*).$$
If you know there exists such a probability measure, you know the above lemma: you just do this for each $x_i$, and each integral is at most the maximum, so you can sum.

This is an amazing fact, though: look at $T:\ell_\infty^m\to\ell_1^n$, and look at $L_2(\mu)$. It says that if you think of an element of $\ell_\infty^m$ as a bounded function on the unit ball of its dual (you can think like this for any Banach space), then we have a diagram: map $\ell_\infty^m\to L_2(\mu)$ by the formal identity, let $H\subseteq L_2(\mu)$ be the closure of the image, and let $S:H\to\ell_1^n$ complete the diagram. You got the operator to factor through a Hilbert space $H$, so we have a commutative diagram, and the norm satisfies $\|S\|\le K_G\|T\|$.

This is how this is used. These theorems give you for free a Hilbert space out of nothing, and we use this a lot. After the presentations, from just this factorization consequence, I'll prove one or two results in harmonic analysis and another result in geometry, and it really looks like magic. You can end up getting things like Parseval for free. We already saw the power of this in Restricted Invertibility.
Let's begin a proof of Grothendieck's inequality.

Proof. We will give a proof achieving $K_G\le\frac{\pi}{2\ln(1+\sqrt2)}=1.782\dots$ (this is not from the original paper). We don't actually know what $K_G$ is, and people don't really care what $K_G$ is beyond the first digit. Grothendieck was interested in the worst configuration of points on the sphere, but the bound alone does not tell us what that configuration is. We want to know the actual number, or some description of the extremal points. We know this is not the best bound since it doesn't give the worst configuration of points.
You do the following. The input was points $x_1,\dots,x_n,y_1,\dots,y_n$ in the unit sphere of a Hilbert space $H$. We will construct a new Hilbert space $H'$ and new unit vectors $x_i',y_j'$ such that if $z$ is uniform over the sphere of $H'$ according to surface area measure, then
$$\mathbb E\big[\mathrm{sign}(\langle z,x_i'\rangle)\,\mathrm{sign}(\langle z,y_j'\rangle)\big]=\frac{2\ln(1+\sqrt2)}{\pi}\langle x_i,y_j\rangle\quad\text{for all }i,j.$$
So we have a sphere with points $x_i,y_j$, and there is a way to nonlinearly transform it into a sphere with points $x_i',y_j'$. On the new sphere, if you take a uniformly random direction $z$ and look at the hyperplane orthogonal to $z$, then the signs just indicate whether $x_i'$ and $y_j'$ are on the same or opposite sides. Let's call $\varepsilon_i=\mathrm{sign}(\langle z,x_i'\rangle)$ the first "random sign" and $\delta_j=\mathrm{sign}(\langle z,y_j'\rangle)$ the second "random sign". Then
$$\mathbb E\Big[\sum_{i,j}a_{ij}\varepsilon_i\delta_j\Big]=\frac{2\ln(1+\sqrt2)}{\pi}\sum_{i,j}a_{ij}\langle x_i,y_j\rangle,$$
so there exists an instance (a random construction) of the $\varepsilon_i,\delta_j$ such that
$$\sum_{i,j}a_{ij}\langle x_i,y_j\rangle\le\frac{\pi}{2\ln(1+\sqrt2)}\sum_{i,j}a_{ij}\varepsilon_i\delta_j.$$
If you could succeed in making a bigger constant when bounding the expectation of the product of the signs, you would get a better Grothendieck constant. For the argument we will give, we will see that the number we get is the best constant for this kind of scheme.
Undergrad Presentations.
2 Grothendieck's Inequality for Graphs (by Arka and Yuval)
We're going to talk about upper bounds for the Grothendieck constant for quadratic forms on graphs.
Theorem 4.2.1 (Grothendieck's inequality).
$$\sum_{i=1}^n\sum_{j=1}^na_{ij}\langle x_i,y_j\rangle\le K_G\sum_{i=1}^n\sum_{j=1}^na_{ij}\varepsilon_i\delta_j,$$
where the $x_i,y_j$ are on the sphere. This is a bipartite graph where the $x_i$'s form one side and the $y_j$'s form the other side; then you assign arbitrary weights $a_{ij}$ to the edges. So one can consider the analogous problem for arbitrary graphs on $n$ vertices.

For arbitrary (not necessarily bipartite) forms, for every $x_i,y_j$ there exist signs $\varepsilon_i,\delta_j$ such that
$$\sum_{i,j\in\{1,\dots,n\}}a_{ij}\langle x_i,y_j\rangle\le C\log n\sum_{i,j}a_{ij}\varepsilon_i\delta_j.$$
We will prove a tight bound on the complete graph, where the Grothendieck constant is of order $\log n$. For a general graph, we will prove that $K(G)=O(\log\vartheta(G))$, where we will define the $\vartheta$ function of a graph later.
Definition 4.2.2: $K(G)$ is the least constant $K$ such that for all matrices $A:V\times V\to\mathbb R$,
$$\sup_{f:V\to S^{|V|-1}}\sum_{(u,v)\in E}A(u,v)\langle f(u),f(v)\rangle\le K\sup_{\varphi:V\to\{-1,1\}}\sum_{(u,v)\in E}A(u,v)\varphi(u)\varphi(v).$$

Definition 4.2.3 (The Gram representation constant of $G$): Denote by $R(G)$ the infimum over constants $R$ such that for every $f:V\to S^{|V|-1}$ there exists $F:V\to L_\infty[0,1]$ such that $\|F(v)\|_\infty\le R$ for every $v\in V$, and for every edge $(u,v)\in E$,
$$\langle f(u),f(v)\rangle=\langle F(u),F(v)\rangle=\int_0^1F(u)(t)F(v)(t)\,dt.$$
That is, you embed the vectors $f$ as functions $F$ whose $L_\infty$ norm is at most an explicit constant, preserving inner products along edges.
Lemma 4.2.4. Let $G$ be a loopless graph. Then $K(G)=R(G)^2$.

Proof. Fix $R>R(G)$ and $f:V\to S^{|V|-1}$. Then there exists $F:V\to L_\infty[0,1]$ such that for every $v\in V$ we have $\|F(v)\|_\infty\le R$, and for $(u,v)\in E$, $\langle f(u),f(v)\rangle=\langle F(u),F(v)\rangle$. Then
$$\sum_{(u,v)\in E}A(u,v)\langle f(u),f(v)\rangle=\sum_{(u,v)\in E}A(u,v)\langle F(u),F(v)\rangle=\int_0^1\sum_{(u,v)\in E}A(u,v)F(u)(t)F(v)(t)\,dt\le\int_0^1\sup_{\varphi:V\to[-R,R]}\sum_{(u,v)\in E}A(u,v)\varphi(u)\varphi(v)\,dt.$$
We used the definition and linearity to exchange the sum and the integral. Now we use the loopless assumption to pass from $[-R,R]$ to signs: since there are no diagonal terms, the quadratic form is affine in each variable $\varphi(u)$ separately, so the supremum over $[-R,R]^V$ is attained at extreme points. Rescaling $[-R,R]$ to $[-1,1]$, the last term equals
$$R^2\sup_{\varphi:V\to\{-1,1\}}\sum_{(u,v)\in E}A(u,v)\varphi(u)\varphi(v),$$
which proves $K(G)\le R(G)^2$.
Now, for each $f:V\to S^{|V|-1}$, consider $P(f)=(\langle f(u),f(v)\rangle)_{(u,v)\in E}\in\mathbb R^{|E|}$. For each $\varphi:V\to\{-1,1\}$, define $P(\varphi)=(\varphi(u)\varphi(v))_{(u,v)\in E}\in\mathbb R^{|E|}$. Now we use the convex-geometry interpretation of Grothendieck's inequality from last class. Mimicking it, we write
$$\mathrm{conv}\{P(\varphi):\varphi:V\to\{-1,1\}\}\subseteq\mathrm{conv}\{P(f):f:V\to S^{|V|-1}\}.$$
The implication of Grothendieck's inequality gives us
$$\mathrm{conv}\{P(f):f:V\to S^{|V|-1}\}\subseteq K(G)\cdot\mathrm{conv}\{P(\varphi):\varphi:V\to\{-1,1\}\}.$$
So there exist weights $\{\lambda_\varphi:\varphi:V\to\{-1,1\}\}$ which satisfy $\sum_{\varphi}\lambda_\varphi=1$, $\lambda_\varphi\ge0$, and for all $(u,v)\in E$,
$$\langle f(u),f(v)\rangle=K(G)\sum_{\varphi:V\to\{-1,1\}}\lambda_\varphi\varphi(u)\varphi(v).$$
Now enumerate the sign patterns as $\varphi_1,\varphi_2,\dots$ and consider
$$F(u)=\sqrt{K(G)}\cdot\varphi_k(u)\quad\text{on the interval }\big[\lambda_{\varphi_1}+\dots+\lambda_{\varphi_{k-1}},\ \lambda_{\varphi_1}+\dots+\lambda_{\varphi_k}\big).$$
Then
$$\langle F(u),F(v)\rangle=\sum_{\varphi:V\to\{-1,1\}}K(G)\lambda_\varphi\varphi(u)\varphi(v)=\langle f(u),f(v)\rangle$$
for every edge, and $F$ takes values in $[-\sqrt{K(G)},\sqrt{K(G)}]$; thus $R(G)\le\sqrt{K(G)}$ and $R(G)^2\le K(G)$. (For graphs with loops the statement is not quite true, though the proof can be adapted.)
An obvious corollary is that if $H$ is a subgraph of $G$, then $R(H)\le R(G)$, and hence $K(H)\le K(G)$. This inequality is not obvious from the Grothendieck inequality directly, but is obvious from our point of view.
Lemma 4.2.5. Let $K_n^\circ$ denote the complete graph on $n$ vertices with loops. Then $R(K_n^\circ)=O(\sqrt{\log n})$.
Proof. Let $\sigma$ be the normalized surface measure on $S^{n-1}$. By computation, there exists $c$ such that
$$\sigma\Big(\Big\{x\in S^{n-1}:\|x\|_\infty\le c\sqrt{\tfrac{\log n}{n}}\Big\}\Big)\ge1-\frac1{2n},$$
which can be calculated through integration.

Given a function $f$ from the vertices to $S^{n-1}$, we want to find a rotation so that all these vectors have small coordinates; we basically use the union bound. For each of the $n$ vectors, the probability that the rotated vector has all coordinates small is at least $1-1/(2n)$; do this for all the vectors and you get probability greater than $1/2$.

Indeed, for every $x\in S^{n-1}$, the random variable on the orthogonal group $O(n)$ given by $O\mapsto Ox$ is uniformly distributed on $S^{n-1}$. Thus for every $f:V\to S^{n-1}$ there is a rotation $O\in O(n)$ such that
$$\forall v\in V,\qquad\|O(f(v))\|_\infty\le c\sqrt{\frac{\log n}{n}}.$$
We let $F(v)$ be the rescaled $i$-th coordinate of $Of(v)$ on an interval of length $1/n$: set
$$F(v)(t)=\big(O(f(v))\big)_i\sqrt n\quad\text{for }\frac{i-1}{n}\le t<\frac in.$$
Then $\langle F(u),F(v)\rangle=\langle f(u),f(v)\rangle$ and $\|F(v)\|_\infty\le c\sqrt{\log n}$; thus $R(K_n^\circ)\le c\sqrt{\log n}$. $\square$
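The concentration step above, that a uniform point of $S^{n-1}$ has all coordinates of size $O(\sqrt{\log n/n})$ except with probability at most $1/(2n)$, can be checked by simulation. A minimal sketch (the constant $c=2$ and the sample size are illustrative choices, not the ones from the proof):

```python
import math, random

random.seed(3)
n = 50

def rand_sphere(n):
    # uniform point on S^{n-1}: normalize a standard Gaussian vector
    v = [random.gauss(0, 1) for _ in range(n)]
    r = math.sqrt(sum(c * c for c in v))
    return [c / r for c in v]

c = 2.0
thr = c * math.sqrt(math.log(n) / n)
N = 2000
good = sum(1 for _ in range(N) if max(abs(x) for x in rand_sphere(n)) <= thr)
# empirically at least a 1 - 1/(2n) fraction has small sup-norm
assert good / N >= 1 - 1 / (2 * n)
```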
Now I will prove the bound mentioned at the beginning:

Theorem 4.2.6. $K(G)\lesssim\log\chi(G)$, where $\chi(G)$ is the chromatic number.

This theorem generalizes the classical inequality, since on bipartite graphs $\chi(G)$ is a constant.

One thing we want to observe about the Grothendieck inequality is that it only cares about the Hilbert space structure, so we will just prove it in one specific (nice) Hilbert space and then we'll be done. Fix a probability space $(\Omega,P)$ and let $g_1,g_2,\dots$ be i.i.d.\ standard Gaussians on $\Omega$ (for instance, $\Omega$ is an infinite product of copies of $\mathbb R$).

Now define the Gaussian Hilbert space $H=\{\sum_{j=1}^\infty a_jg_j:\sum_ja_j^2<\infty\}\subseteq L_2(\Omega)$. Every function in $H$ has mean zero, since these are (limits of) linear combinations of Gaussians. Note that the unit ball $B(H)$ consists of Gaussians with mean zero and variance at most $1$ (since the variance is the norm squared).
Let $\Lambda$ be the left-hand side of the Grothendieck inequality,
$$\Lambda=\sup_{f:V\to S(H)}\sum_{(u,v)\in E}A(u,v)\langle f(u),f(v)\rangle,$$
and let
$$\lambda=\sup_{\varphi:V\to\{-1,1\}}\sum_{(u,v)\in E}A(u,v)\varphi(u)\varphi(v).$$

Definition 4.2.7 (Truncation): For all $M>0$ and $g\in L_2(\Omega)$, define the truncation of $g$ at $M$ to be
$$g_M(x)=\begin{cases}g(x)&|g(x)|\le M,\\ M&g(x)\ge M,\\ -M&g(x)\le-M.\end{cases}$$
We're just cutting things off at the interval $[-M,M]$. Now fix $f$ (nearly) maximizing $\Lambda$.
Lemma 4.2.8. There exist a Hilbert space $H'$, a map $h:V\to H'$, and $M\lesssim\sqrt{\log\chi(G)}$, with $\|h(v)\|_{H'}^2\le1/2$ for all vertices $v\in V$, such that we can write $\Lambda$ as a sum over all edges:
$$\Lambda=\sum_{(u,v)\in E}A(u,v)\langle f(u)_M,f(v)_M\rangle+\sum_{(u,v)\in E}A(u,v)\langle h(u),h(v)\rangle,$$
where the second term is an error term.
Proof. Let $k=\chi(G)$. We use the fact that there is $c:V\to\ell_2$ with $\|c(u)\|=1$ and $\langle c(u),c(v)\rangle=-\frac1{k-1}$ for all $(u,v)\in E$ (map the $k$ color classes to the vertices of a regular simplex).

Define another Hilbert space $U=\ell_2\oplus\mathbb R$, and two functions $s,\bar s:V\to U$:
$$s(u)=\Big(\sqrt{\tfrac{k-1}{k}}\,c(u)\Big)\oplus\frac1{\sqrt k},\qquad\bar s(u)=\Big(-\sqrt{\tfrac{k-1}{k}}\,c(u)\Big)\oplus\frac1{\sqrt k}.$$
Now $s(u),\bar s(u)$ are unit vectors, and for all $(u,v)\in E$ we have $\langle s(u),s(v)\rangle=\langle\bar s(u),\bar s(v)\rangle=0$, while $\langle s(u),\bar s(v)\rangle=\frac2k$.

Our goal is to prove that the function $h$ of the lemma exists. Set $H'=U\otimes L_2(\Omega)$. We now write down the function
$$h(u)=\frac14\,s(u)\otimes\big(f(u)+f(u)_M\big)+M\,\bar s(u)\otimes\big(f(u)-f(u)_M\big).$$
It turns out that this function does everything we want. A couple of key facts: it's easy to check that the identity for $\Lambda$ in the lemma holds with this $h$ (for the error term), and checking $\|h(v)\|_{H'}^2\le1/2$ is also possible: after evaluating inner products one bounds
$$\|h(u)\|^2\le\Big(\frac12+M\,\|f(u)-f(u)_M\|\Big)^2.$$
Now we use the property of the Gaussian Hilbert space: $f(u)$ is in the ball of $H$, a Gaussian with mean zero and variance at most $1$, so $\|f(v)-f(v)_M\|^2$ is controlled by the probability that the Gaussian falls outside the interval $[-M,M]$. From basic probability theory,
$$\|f(v)-f(v)_M\|^2\le\sqrt{\frac2\pi}\int_M^\infty x^2e^{-x^2/2}\,dx\le2Me^{-M^2/2},$$
where the last part follows by calculus. This is great, since if we pick $M\sim\sqrt{\log k}$ this decays super quickly; picking $M=8\sqrt{\log k}$ makes $M\|f(u)-f(u)_M\|$ small enough that the entire bound is $\le1/2$.
That proves the lemma, so weβre done.
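The Gaussian tail estimate used in the proof can be checked numerically. Below is a minimal sketch using a pure-Python trapezoid rule; the step size and cutoff are arbitrary choices.

```python
import math

def tail(M, step=1e-4, upper=40.0):
    # sqrt(2/pi) * integral_M^upper x^2 e^{-x^2/2} dx via the trapezoid rule
    n_steps = int((upper - M) / step)
    xs = [M + k * step for k in range(n_steps + 1)]
    vals = [x * x * math.exp(-x * x / 2) for x in xs]
    integral = step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return math.sqrt(2 / math.pi) * integral

# the bound ||f(v) - f(v)_M||^2 <= 2 M e^{-M^2/2} used above
for M in [1.0, 2.0, 3.0]:
    assert tail(M) <= 2 * M * math.exp(-M * M / 2)
```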
This lemma is actually all we need. Suppose we have proved it. We can bound the first term by taking expectations: since $f(u)_M/M$ takes values in $[-1,1]$,
$$\sum_{(u,v)\in E}A(u,v)\langle f(u)_M,f(v)_M\rangle=\mathbb E\sum_{(u,v)\in E}A(u,v)f(u)_Mf(v)_M\le M^2\lambda$$
by the definition of $\lambda$.

For the second term, we write
$$\sum_{(u,v)\in E}A(u,v)\langle h(u),h(v)\rangle\le\Big(\max_{v\in V}\|h(v)\|^2\Big)\Lambda\le\frac12\Lambda$$
by rescaling: multiplying each $h(v)$ by $\sqrt2$ puts it in $B(H')$ (by the conditions of the lemma, $h$ has norm at most $1/\sqrt2$), and since $\Lambda$ is Hilbert-space independent,
$$\sum_{(u,v)\in E}A(u,v)\langle h(u),h(v)\rangle=\frac12\sum_{(u,v)\in E}A(u,v)\langle\sqrt2\,h(u),\sqrt2\,h(v)\rangle\le\frac12\Lambda.$$
Thus
$$\Lambda\le M^2\lambda+\frac12\Lambda\implies\Lambda\le2M^2\lambda.$$
This implies $\Lambda\lesssim(\log\chi(G))\,\lambda$, which is exactly what we wanted, since the left-hand side of the Grothendieck inequality is $\Lambda$ and the right-hand side is $\lambda$, with Grothendieck constant $\lesssim\log\chi(G)$.
3 Noncommutative Version of Grothendieck's Inequality (by Thomas and Fred)
First we'll phrase the classical Grothendieck inequality in terms of an optimization problem. Given $A\in M_n(\mathbb R)$, we can consider
$$\max_{\varepsilon,\delta\in\{-1,1\}^n}\sum_{i,j=1}^nA_{ij}\varepsilon_i\delta_j.$$
In general this is hard to solve, so we do a semidefinite relaxation,
$$\sup_{d\in\mathbb N}\ \sup_{x,y\in(S^{d-1})^n}\sum_{i,j=1}^nA_{ij}\langle x_i,y_j\rangle,$$
which is polynomially solvable, and the Grothendieck inequality ensures you're only a constant factor off from the best.
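Here is a toy instance of the two problems above, assuming nothing beyond the statements in the text: for $A=\begin{pmatrix}1&1\\1&-1\end{pmatrix}$ the sign optimum is $2$, while unit vectors reach $2\sqrt2$ (at $45^\circ$ angles), so the relaxation really can beat the signs, though only within a constant factor. For fixed $y_j$'s the optimal $x_i$ is the normalized $\sum_jA_{ij}y_j$, which reduces the search to the $y$ angles.

```python
import math

A = [[1.0, 1.0], [1.0, -1.0]]
n = 2

# exact sign optimum: max over eps, delta in {-1,1}^2
signs = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
sign_opt = max(
    sum(A[i][j] * e[i] * d[j] for i in range(n) for j in range(n))
    for e in signs for d in signs
)

# relaxation over unit vectors in R^2: for fixed y's the optimal x_i is the
# normalized sum_j A[i][j] y_j, so the value is sum_i ||sum_j A[i][j] y_j||
def relax_value(t1, t2):
    ys = [(math.cos(t1), math.sin(t1)), (math.cos(t2), math.sin(t2))]
    val = 0.0
    for i in range(n):
        sx = sum(A[i][j] * ys[j][0] for j in range(n))
        sy = sum(A[i][j] * ys[j][1] for j in range(n))
        val += math.hypot(sx, sy)
    return val

grid = [k * math.pi / 180 for k in range(360)]
vec_opt = max(relax_value(t1, t2) for t1 in grid for t2 in grid)

assert sign_opt == 2.0
assert abs(vec_opt - 2 * math.sqrt(2)) < 1e-3  # vectors beat signs by sqrt(2) here
assert vec_opt <= 1.79 * sign_opt              # consistent with Krivine's bound on K_G
```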
We can give a generalization to tensors. Given $M\in M_n(M_n(\mathbb R))$, consider
$$\sup_{U,V\in O_n}\sum_{i,j,k,l=1}^nM_{ijkl}U_{ij}V_{kl}.$$
Setting $M_{ijkl}=A_{ik}$ when $i=j$ and $k=l$ (and $0$ otherwise), we recover the classical problem:
$$\sup_{U,V\in O_n}\sum_{i,k=1}^nA_{ik}U_{ii}V_{kk}=\max_{\varepsilon,\delta\in\{-1,1\}^n}\sum_{i,k}A_{ik}\varepsilon_i\delta_k,$$
since the diagonal of an orthogonal matrix lies in $[-1,1]^n$ and diagonal sign matrices are orthogonal.

We relax to an SDP over vector-valued orthogonal matrices:
$$\sup_{d\in\mathbb N}\ \sup_{X,Y\in O_n(\mathbb R^d)}\sum_{i,j,k,l=1}^nM_{ijkl}\langle X_{ij},Y_{kl}\rangle.$$
Recall that $U\in O_n$ means that
$$\sum_{k=1}^nU_{ik}U_{jk}=\sum_{k=1}^nU_{ki}U_{kj}=\delta_{ij}.$$
If $X\in M_n(\mathbb R^d)$, let $XX^*,X^*X\in M_n(\mathbb R)$ be defined by
$$(XX^*)_{ij}=\sum_{k=1}^n\langle X_{ik},X_{jk}\rangle,\qquad(X^*X)_{ij}=\sum_{k=1}^n\langle X_{ki},X_{kj}\rangle.$$
Then $O_n(\mathbb R^d)=\big\{X\in M_n(\mathbb R^d):XX^*=X^*X=I\big\}$.
We want to show that when we relax the search space, we get within a constant factor of the non-relaxed version of the program. We will prove this in the complex case. We write
$$\mathrm{Opt}_{\mathbb C}(M)=\sup_{U,V\in U_n}\Big|\sum_{i,j,k,l=1}^nM_{ijkl}U_{ij}\overline{V_{kl}}\Big|.$$
That $\mathrm{Opt}_{\mathbb C}(M)$ is at most the SDP value is clear; that the SDP value is at most a constant times it, Pisier proved thirty years after Grothendieck conjectured it. What we will actually prove is that the SDP solution is at most twice the optimum:
$$\mathrm{SDP}_{\mathbb C}(M)\le2\,\mathrm{Opt}_{\mathbb C}(M).$$
Theorem 4.3.1. Fix $n,d\in\mathbb N$ and $\varepsilon\in(0,1)$. Suppose we have $M\in M_n(M_n(\mathbb C))$ and $X,Y\in O_n(\mathbb C^d)$ such that
$$\Big|\sum_{i,j,k,l=1}^nM_{ijkl}\langle X_{ij},Y_{kl}\rangle\Big|\ge(1-\varepsilon)\,\mathrm{SDP}_{\mathbb C}(M).$$
So what we're saying is: suppose some SDP algorithm gave a solution $X,Y\in O_n(\mathbb C^d)$ satisfying this. Then we can give a rounding algorithm which will output $A,B\in U_n$ such that
$$\mathbb E\Big|\sum_{i,j,k,l=1}^nM_{ijkl}A_{ij}\overline{B_{kl}}\Big|\ge\Big(\frac12-\varepsilon\Big)\mathrm{SDP}_{\mathbb C}(M).$$
Now what's the algorithm? It's a slightly clever version of projection. We have $X,Y\in O_n(\mathbb C^d)$, and we want to get to actual unitary matrices. First sample $z\in\{1,-1,i,-i\}^d$ (the complex unit cube) uniformly. Then set $X_z=\frac1{\sqrt2}\langle X,z\rangle$ (take inner products entrywise), and similarly $Y_z=\frac1{\sqrt2}\langle Y,z\rangle$; these lie in $M_n(\mathbb C)$. Recall the polar decomposition: from the singular value decomposition $A=W\Sigma V^*$ with $W,V$ unitary, write $A=(WV^*)(V\Sigma V^*)$; the first factor is unitary and the second is positive semidefinite (think of it as $A=U|A|$ with $|A|=(A^*A)^{1/2}$).

Write the polar decompositions $X_z=U_z|X_z|$ and $Y_z=V_z|Y_z|$, and output
$$(A,B)=\big(U_z|X_z|^{it},\ V_z|Y_z|^{-it}\big),$$
where $t$ is sampled from the hyperbolic secant distribution. It's very similar to a normal distribution; the PDF is, more precisely,
$$p(t)=\frac12\,\mathrm{sech}\Big(\frac\pi2t\Big)=\frac1{e^{\pi t/2}+e^{-\pi t/2}}.$$
When you have a positive semidefinite matrix, you can raise it to imaginary powers (its eigenvalues get raised to the $it$, so they have magnitude $1$), so we're randomly rotating only the positive semidefinite part and keeping the unitary part $U_z$. Note that the resulting $(A,B)$ are unitary.
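The hyperbolic secant density can be sampled by inverting its CDF, $F(t)=\frac2\pi\arctan(e^{\pi t/2})$, and the identity $\mathbb E[a^{it}]=\frac{2a}{1+a^2}$ that drives the analysis below can then be checked by Monte Carlo. A sketch; the sampler, sample size, and tolerances are ad hoc.

```python
import cmath, math, random

random.seed(1)

def sample_sech():
    # inverse-CDF sampling: the CDF of p(t) = 1/(e^{pi t/2} + e^{-pi t/2})
    # is F(t) = (2/pi) * arctan(e^{pi t/2}), so t = (2/pi) * log(tan(pi u/2))
    u = 1 - random.random()   # u in (0, 1]
    return (2 / math.pi) * math.log(math.tan(math.pi * u / 2))

N = 200000
ts = [sample_sech() for _ in range(N)]

# characteristic-function identity E[a^{it}] = 2a / (1 + a^2) for a > 0
for a in [0.5, 1.0, 3.0]:
    emp = sum(cmath.exp(1j * t * math.log(a)) for t in ts) / N
    assert abs(emp - 2 * a / (1 + a * a)) < 0.01
```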
4-20-16. For $X,Y\in O_n(\mathbb C^d)$, do the following.

1. Sample $z\in\{\pm1,\pm i\}^d$ uniformly. Sample $t$ from the hyperbolic secant distribution.

2. Let $X_z=\frac1{\sqrt2}\langle X,z\rangle$ and $Y_z=\frac1{\sqrt2}\langle Y,z\rangle$.

3. Let $(A,B):=\big(U_z|X_z|^{it},V_z|Y_z|^{-it}\big)\in U_n\times U_n$, where $X_z=U_z|X_z|$ and $Y_z=V_z|Y_z|$ are the polar decompositions.

Write $M(X,Y)=\sum_{i,j,k,l=1}^nM_{ijkl}\langle X_{ij},Y_{kl}\rangle$. We want to show
$$\mathbb E\big[M(A,B)\big]\ge\Big(\frac12-\varepsilon\Big)M(X,Y)\ge\Big(\frac12-\varepsilon\Big)\mathrm{SDP}_{\mathbb C}(M),$$
that is, the rounded solution still has a large value.

It's possible that $|X_z|$ has $0$ eigenvalues. One solution is to add some Gaussian noise to the original $X,Y$, because the set of non-invertible matrices is an algebraic set of measure $0$; alternatively, replace the zero eigenvalues by some $\epsilon\neq0$.
We have
$$\mathbb E_z\big[M(X_z,Y_z)\big]=\frac12\,\mathbb E_z\Big[\sum_{p,q=1}^dz_p\bar z_q\sum_{i,j,k,l=1}^nM_{ijkl}(X_{ij})_p\overline{(Y_{kl})_q}\Big]=\frac12M(X,Y),$$
since $\mathbb E[z_p\bar z_q]=\delta_{pq}$.
Now we analyze step 3. The characteristic function of the hyperbolic secant distribution is, for $a>0$,
$$\mathbb E[a^{it}]=\int_{\mathbb R}a^{it}p(t)\,dt=\frac{2a}{1+a^2}$$
(by doing a contour integral). Then
\begin{align}
\mathbb E_t[a^{it}]&=2a-\mathbb E_t[a^{2+it}],\tag{4.1}\\
\mathbb E_t[A^{it}]&=2A-\mathbb E_t[A^{2+it}]\tag{4.2}
\end{align}
for positive semidefinite $A$ (apply (4.1) to each eigenvalue). Hence
\begin{align}
\mathbb E_t[A\otimes\bar B]&=\mathbb E_t\big[(U_z|X_z|^{it})\otimes\overline{(V_z|Y_z|^{-it})}\big]\tag{4.3}\\
&=(U_z\otimes\bar V_z)\,\mathbb E_t\big[(|X_z|\otimes\overline{|Y_z|})^{it}\big]\tag{4.4}\\
&=2\,X_z\otimes\bar Y_z-\mathbb E_t\big[(U_z|X_z|^{2+it})\otimes\overline{(V_z|Y_z|^{2-it})}\big].\tag{4.5}
\end{align}
We apply $M$ (which is linear in the entries of $A\otimes\bar B$) and average over $z$:
$$\mathbb E_{z,t}\big[M(A,B)\big]=M(X,Y)-\mathbb E_{z,t}\big[M\big(U_z|X_z|^{2+it},V_z|Y_z|^{2-it}\big)\big].$$
Because $A,B\in M_n(\mathbb C)$, we can write $M(A,B)$ in terms of the tensor product,
\begin{align}
M(A,B)&=\sum_{i,j,k,l=1}^nM_{ijkl}A_{ij}\overline{B_{kl}}\tag{4.6}\\
&=\sum_{i,j,k,l=1}^nM_{ijkl}\,(A\otimes\bar B)_{(ij),(kl)}.\tag{4.7}
\end{align}

Claim 4.3.2 (Key claim). For all $t\in\mathbb R$,
$$\mathbb E_z\big[\big|M\big(U_z|X_z|^{2+it},V_z|Y_z|^{2-it}\big)\big|\big]\le\frac12\,\mathrm{SDP}_{\mathbb C}(M).$$
Proof. Define $F(t),G(t)\in M_n\big(\mathbb C^{\{\pm1,\pm i\}^d}\big)$ (matrices whose entries are vectors indexed by $z$) by
\begin{align}
\big(F(t)_{ij}\big)_z&=\frac1{2^d}\big(U_z|X_z|^{2+it}\big)_{ij},\tag{4.8}\\
\big(G(t)_{kl}\big)_z&=\frac1{2^d}\big(V_z|Y_z|^{2-it}\big)_{kl}.
\end{align}
Then
\begin{align}
M(F(t),G(t))&=\frac1{4^d}\sum_{z\in\{\pm1,\pm i\}^d}M\big(U_z|X_z|^{2+it},V_z|Y_z|^{2-it}\big)\tag{4.9}\\
&=\mathbb E_z\big[M\big(U_z|X_z|^{2+it},V_z|Y_z|^{2-it}\big)\big].\tag{4.10}
\end{align}
Lemma 4.3.3. $F(t),G(t)$ satisfy
$$\max\big\{\|F(t)F(t)^*\|,\|F(t)^*F(t)\|,\|G(t)G(t)^*\|,\|G(t)^*G(t)\|\big\}\le\frac12.$$

Lemma 4.3.4. Suppose $X,Y\in M_n(\mathbb C^d)$ and $\max\{\|XX^*\|,\|X^*X\|,\|YY^*\|,\|Y^*Y\|\}\le1$. Then there exist $R,S\in O_n(\mathbb C^{d+2n^2})$ such that $M(R,S)=M(X,Y)$ for every $M\in M_n(M_n(\mathbb C))$.
From the two lemmas, there exist $R(t),S(t)\in O_n(\mathbb C^{d+2n^2})$ such that $M(R(t),S(t))=M(\sqrt2F(t),\sqrt2G(t))$. Then
$$|M(F(t),G(t))|=\frac12|M(R(t),S(t))|\le\frac12\,\mathrm{SDP}_{\mathbb C}(M).\tag{4.11}$$

To prove Lemma 4.3.3, compute
\begin{align}
F(t)F(t)^*&=\frac1{4^d}\sum_{z\in\{\pm1,\pm i\}^d}U_z|X_z|^{2+it}\big(U_z|X_z|^{2+it}\big)^*\tag{4.12}\\
&=\mathbb E_z\big[U_z|X_z|^4U_z^*\big]=\mathbb E_z\big[(X_zX_z^*)^2\big].\tag{4.13}
\end{align}
Claim 4.3.5. For $X\in M_n(\mathbb C^d)$, define for each $p\in[d]$ the matrix $X_p\in M_n(\mathbb C)$ by $(X_p)_{ij}=(X_{ij})_p$, and let $X_z=\langle X,z\rangle=\sum_{p=1}^dz_pX_p$. Then
$$\mathbb E_z\big[(X_zX_z^*)^2\big]=(XX^*)^2+\sum_{p=1}^dX_p\big(X^*X-X_p^*X_p\big)X_p^*.$$

Note that
$$XX^*=\sum_{p=1}^dX_pX_p^*,\qquad X^*X=\sum_{p=1}^dX_p^*X_p.\tag{4.14}$$
We compute, using that for i.i.d.\ uniform $z_p\in\{\pm1,\pm i\}$ the only surviving moments $\mathbb E[z_p\bar z_qz_r\bar z_s]$ come from $p=q,r=s$ and $p=s,q=r$ (note $\mathbb E[z_p^2]=0$):
\begin{align}
\mathbb E_z\big[(X_zX_z^*)^2\big]&=\mathbb E_z\Big[\sum_{p,q,r,s=1}^dz_p\bar z_qz_r\bar z_s\,X_pX_q^*X_rX_s^*\Big]\tag{4.15}\\
&=\sum_{p=1}^dX_pX_p^*X_pX_p^*+\sum_{\substack{p,q=1\\ p\ne q}}^d\big(X_pX_p^*X_qX_q^*+X_pX_q^*X_qX_p^*\big)\tag{4.16}\\
&=\sum_{p,q=1}^dX_pX_p^*X_qX_q^*+\sum_{p,q=1}^dX_pX_q^*X_qX_p^*-\sum_{p=1}^dX_pX_p^*X_pX_p^*\\
&=\Big(\sum_{p=1}^dX_pX_p^*\Big)^2+\sum_{p=1}^dX_p\Big(\sum_{q=1}^dX_q^*X_q\Big)X_p^*-\sum_{p=1}^dX_pX_p^*X_pX_p^*.\tag{4.17}
\end{align}
This proves the claim.
Apply the claim with $X$ replaced by $\frac1{\sqrt2}X$ (recall $X_z=\frac1{\sqrt2}\langle X,z\rangle$), and recall that $XX^*=X^*X=I$. Then
$$F(t)F(t)^*=\mathbb E_z\big[(X_zX_z^*)^2\big]=\frac14\Big(I+\sum_{p=1}^dX_pX_p^*-\sum_{p=1}^dX_pX_p^*X_pX_p^*\Big),$$
i.e.
$$F(t)F(t)^*+\frac14\sum_{p=1}^dX_pX_p^*X_pX_p^*=\frac12I=F(t)^*F(t)+\frac14\sum_{p=1}^dX_p^*X_pX_p^*X_p,$$
and similarly for $G$. Since the subtracted sums are positive semidefinite, Lemma 4.3.3 follows.
Proof of Lemma 4.3.4. Let $A=I-XX^*$ and $B=I-X^*X$. Note $A,B\succeq0$ and $\mathrm{Tr}(A)=\mathrm{Tr}(B)$ (since $\mathrm{Tr}(XX^*)=\mathrm{Tr}(X^*X)$). Write the spectral decompositions
\begin{align}
A&=\sum_{k=1}^n\lambda_k\,u_ku_k^*,\tag{4.18}\\
B&=\sum_{k=1}^n\mu_k\,v_kv_k^*,\tag{4.19}\\
m&=\sum_{k=1}^n\lambda_k=\sum_{k=1}^n\mu_k.\tag{4.20}
\end{align}
Define
\begin{align}
R&=X\oplus\Big(\sum_{k,l=1}^n\sqrt{\frac{\lambda_k\mu_l}{m}}\,u_kv_l^*\Big)\oplus0_{M_n(\mathbb C^{n^2})}\ \in M_n\big(\mathbb C^d\oplus\mathbb C^{n^2}\oplus\mathbb C^{n^2}\big),\tag{4.21}\\
S&=Y\oplus0_{M_n(\mathbb C^{n^2})}\oplus\Big(\sum_{k,l=1}^n\sqrt{\frac{\mu_k\lambda_l}{m}}\,v_ku_l^*\Big)\tag{4.22}
\end{align}
(with the analogous spectral data of $I-YY^*$ and $I-Y^*Y$ in the last block of $S$). One checks $R,S\in O_n(\mathbb C^{d+2n^2})$:
\begin{align}
RR^*&=XX^*+A=I,\tag{4.23}\\
R^*R&=X^*X+B=I,\tag{4.24}
\end{align}
and similarly for $S$; the $m$ disappears because $\sum_k\lambda_k=m$. Moreover, since the correction blocks of $R$ and $S$ live in disjoint groups of coordinates,
$$M(R,S)=M(X,Y).\tag{4.25}$$
The factor 2 in the noncommutative inequality is sharp. That the answer is 2 (rather than some strange constant) means there is something going on algebraically; the hyperbolic secant is the unique distribution that makes this work.
4-27
4 Improving the Grothendieck constant
Theorem 4.4.1 (Krivine (1977)).
$$K_G\le\frac{\pi}{2\ln(1+\sqrt2)}=1.782\dots$$
The strategy is preprocessing.

There exist new vectors $x_1',\dots,x_n',y_1',\dots,y_n'\in S^{2n-1}$ such that if $z\in S^{2n-1}$ is chosen uniformly at random, then for all $i,j$,
$$\mathbb E_z\big[\underbrace{\mathrm{sign}(\langle z,x_i'\rangle)}_{\varepsilon_i}\,\underbrace{\mathrm{sign}(\langle z,y_j'\rangle)}_{\delta_j}\big]=\frac{2\ln(1+\sqrt2)}{\pi}\langle x_i,y_j\rangle.$$
Then
$$\sum_{i,j}a_{ij}\langle x_i,y_j\rangle=\mathbb E\Big[\frac{\pi}{2\ln(1+\sqrt2)}\sum_{i,j}a_{ij}\varepsilon_i\delta_j\Big].$$
We will take the vectors, transform them in this way, take a random point on the sphere and the hyperplane orthogonal to it, and then see which side the points fall on.
Theorem 4.4.2 (Grothendieck's identity). For $x,y\in S^{n-1}$ and $z$ uniformly random on $S^{n-1}$,
$$\mathbb E_z\big[\mathrm{sign}(\langle x,z\rangle)\,\mathrm{sign}(\langle y,z\rangle)\big]=\frac2\pi\arcsin(\langle x,y\rangle).$$

Proof. This is 2-D plane geometry: project $z$ onto the plane spanned by $x$ and $y$, where its direction is uniform. The expression is $1$ if both $x,y$ are on the same side of the random line, and $-1$ if the line cuts between them, which happens with probability $\theta/\pi$ where $\theta$ is the angle between $x$ and $y$. So the expectation is $1-2\theta/\pi=\frac2\pi(\frac\pi2-\theta)=\frac2\pi\arcsin(\langle x,y\rangle)$. $\square$
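Grothendieck's identity is easy to test by Monte Carlo; a minimal sketch, where the dimension, sample size, and tolerance are arbitrary choices:

```python
import math, random

random.seed(2)

def rand_sphere(n):
    # uniform on S^{n-1} via a normalized Gaussian vector
    v = [random.gauss(0, 1) for _ in range(n)]
    r = math.sqrt(sum(c * c for c in v))
    return [c / r for c in v]

def sign(x):
    return 1.0 if x >= 0 else -1.0

n = 5
x, y = rand_sphere(n), rand_sphere(n)
dot = lambda u, v: sum(a * b for a, b in zip(u, v))

N = 200000
emp = sum(sign(dot(x, z)) * sign(dot(y, z))
          for z in (rand_sphere(n) for _ in range(N))) / N
# E[sign<x,z> sign<y,z>] = (2/pi) arcsin(<x,y>)
assert abs(emp - (2 / math.pi) * math.asin(dot(x, y))) < 0.01
```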
Once we have the idea to pre-process, the rest of the proof is natural.
Proof. Abbreviate $c=\ln(1+\sqrt2)$. By Grothendieck's identity,
$$\mathbb E_z\big[\mathrm{sign}(\langle z,x_i'\rangle)\,\mathrm{sign}(\langle z,y_j'\rangle)\big]=\frac2\pi\arcsin(\langle x_i',y_j'\rangle),\tag{4.26}$$
so the required identity $\mathbb E[\cdots]=\frac{2c}\pi\langle x_i,y_j\rangle$ amounts to
$$\langle x_i',y_j'\rangle=\sin\big(c\,\langle x_i,y_j\rangle\big).\tag{4.27}$$
We write the Taylor series expansion for $\sin$:
\begin{align}
\langle x_i',y_j'\rangle=\sin(c\langle x_i,y_j\rangle)&=\sum_{k=0}^\infty(-1)^k\frac{(c\langle x_i,y_j\rangle)^{2k+1}}{(2k+1)!}\tag{4.29}\\
&=\sum_{k=0}^\infty(-1)^k\frac{c^{2k+1}}{(2k+1)!}\big\langle x_i^{\otimes(2k+1)},y_j^{\otimes(2k+1)}\big\rangle.\tag{4.30}
\end{align}
(For $a,b\in\ell_2$: $a\otimes b=(a_ib_j)$, $a^{\otimes2}=(a_ia_j)$, $b^{\otimes2}=(b_ib_j)$, and $\langle a^{\otimes2},b^{\otimes2}\rangle=\sum_{i,j}a_ia_jb_ib_j=\langle a,b\rangle^2$.)
Define an infinite direct sum corresponding to these coordinates:
\begin{align}
x_i'&=\bigoplus_{k=0}^\infty(-1)^k\frac{c^{\frac{2k+1}2}}{\sqrt{(2k+1)!}}\,x_i^{\otimes(2k+1)},\tag{4.31}\\
y_j'&=\bigoplus_{k=0}^\infty\frac{c^{\frac{2k+1}2}}{\sqrt{(2k+1)!}}\,y_j^{\otimes(2k+1)}\ \in\ \bigoplus_{k=0}^\infty(\mathbb R^n)^{\otimes(2k+1)},\tag{4.32}
\end{align}
so that
$$\langle x_i',y_j'\rangle=\sum_{k=0}^\infty(-1)^k\frac{c^{2k+1}}{(2k+1)!}\big\langle x_i^{\otimes(2k+1)},y_j^{\otimes(2k+1)}\big\rangle=\sin(c\langle x_i,y_j\rangle).\tag{4.33}$$
The infinite series expansion of $\sin$ generated an infinite-dimensional space for us. We check that these are unit vectors:
$$\|x_i'\|^2=\sum_{k=0}^\infty\frac{c^{2k+1}}{(2k+1)!}=\sinh(c)=\frac{e^c-e^{-c}}2=\frac{(1+\sqrt2)-(\sqrt2-1)}2=1,\tag{4.34}$$
which is exactly why we chose $c=\ln(1+\sqrt2)=\sinh^{-1}(1)$.
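The numerology here can be checked directly: $c=\ln(1+\sqrt2)=\sinh^{-1}(1)$ makes the tensor series sum to $1$, and yields Krivine's constant.

```python
import math

c = math.log(1 + math.sqrt(2))

# sinh(c) = 1, so the preprocessed vectors x'_i, y'_j are unit vectors
assert abs(math.sinh(c) - 1) < 1e-12

# the norm is the summed Taylor series sum_k c^(2k+1) / (2k+1)!
series = sum(c ** (2 * k + 1) / math.factorial(2 * k + 1) for k in range(30))
assert abs(series - 1) < 1e-12

# Krivine's bound on the Grothendieck constant
K = math.pi / (2 * c)
assert abs(K - 1.7822) < 1e-3
```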
We donβt care about the construction, we care about the identity. All I want is to findπ₯β²π, π¦
β²π with given
Β¬π₯β²π, π¦
β²π
); this can be done by a SDP. Weβre using this infinite argument just
to show existence.A different way to see this is given by the following picture. Take a uniformly random
line and look at the orthogonal projection of points on the line. Is it positive or negative. Apriori we have vectors in Rπ that donβt have an order.
We want to choose orientations. A natural thing to do is to take a random projectiononto R and use the order on R. Krevine conjectured his bound was optimal.
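A small numerical sketch of the truncated tensor construction above (not from the lecture; the helper names are mine): truncating the direct sum at a few terms already reproduces $\langle x', y'\rangle = \sin(c\langle x,y\rangle)$ and $\|x'\| = 1$ to machine precision.

```python
import math

c = math.log(1 + math.sqrt(2))          # chosen so that sinh(c) = 1

def kron(a, b):                          # tensor product of flat vectors
    return [ai * bi for ai in a for bi in b]

def primed(v, alternate, terms=8):
    """Truncation of  ⊕_k s_k c^{(2k+1)/2}/sqrt((2k+1)!) v^{⊗(2k+1)},
    with s_k = (-1)^k iff alternate; returns one flat block per k."""
    blocks = []
    for k in range(terms):
        m = 2 * k + 1
        t = list(v)
        for _ in range(m - 1):
            t = kron(t, v)               # builds v^{⊗m}, flattened
        coeff = c ** (m / 2) / math.sqrt(math.factorial(m))
        s = (-1) ** k if alternate else 1
        blocks.append([s * coeff * ti for ti in t])
    return blocks

def dot(u, w):                           # inner product in the direct sum
    return sum(sum(a * b for a, b in zip(p, q)) for p, q in zip(u, w))

x = [0.6, 0.8]                           # unit vectors in R^2, <x,y> = 0.6
y = [1.0, 0.0]
xp, yp = primed(x, True), primed(y, False)

ip = sum(a * b for a, b in zip(x, y))
err_sin = abs(dot(xp, yp) - math.sin(c * ip))   # should be ~0 (eq. 4.33)
err_norm = abs(dot(xp, xp) - 1.0)               # should be ~0 (eq. 4.34)
print(err_sin, err_norm)
```

Note that the explicit tensors are only for illustration; as remarked above, in practice one only needs the Gram matrix, which an SDP provides.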
What is so special about positive or negative? We can partition $\mathbb{R}$ into any 2 measurable sets: any partition into 2 measurable sets produces a sign. It's a nontrivial fact that no such partition beats this constant; this is a fact about measure theory.

The moment you choose a partition, it forces the construction of the $x'$, $y'$. The whole idea is the partition; then the preprocessing is uniquely determined. That is how the theorem was reverse-engineered.

Here's another thing you can do. What's so special about a random line? The whole problem is about finding an orientation. Consider a randomly chosen plane (a floor) and look at shadows. If you partition the plane into 2 half-planes, you get back the same constant. We can generalize to arbitrary partitions of the plane; this is equivalent to an isoperimetric problem, and in many such problems the best set is a half-space. We can also look at higher-dimensional projections. It seems unnatural to do this, except that you gain!

In $\mathbb{R}^2$ there is a more clever partition that beats the half-space! Moreover, as you increase the dimension of the projection, the resulting constants converge to the Grothendieck constant. The optimal partitions look like fractal objects!
5 Application to Fourier series
This is a classical theorem about Fourier series. Helgason generalized it to general abelian groups.

Definition 4.5.1: Let $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ (which we identify with $S^1$). The space of continuous functions is $C(\mathbb{T})$. Given $m : \mathbb{Z} \to \mathbb{R}$ (the multiplier), define
$$\Lambda_m : C(\mathbb{T}) \to \ell_\infty, \qquad \Lambda_m(f) = \big(m(n)\hat{f}(n)\big)_{n\in\mathbb{Z}}. \tag{4.35--4.36}$$
For which multipliers $m$ is $\Lambda_m(f) \in \ell_1$ for every $f \in C(\mathbb{T})$?

A more general question is: what are the possible Fourier coefficients of a continuous function? Note that if a multiplier $m$ works, then $\sum_n |m(n)|\,|\hat{f}(n)| < \infty$ for every continuous $f$. This is a classical topic.
An obvious sufficient condition is that $m \in \ell_2$, by Cauchy–Schwarz and Parseval:
$$\sum_{n\in\mathbb{Z}} |m(n)\hat{f}(n)| \le \Big(\sum_n m(n)^2\Big)^{1/2}\Big(\sum_n |\hat{f}(n)|^2\Big)^{1/2} = \|m\|_2\,\|f\|_2 \le \|m\|_2\,\|f\|_\infty, \tag{4.37}$$
that is,
$$\|\Lambda_m(f)\|_{\ell_1} \le \|m\|_2\,\|f\|_\infty. \tag{4.38}$$
The following theorem is the converse.

Theorem 4.5.2 (Orlicz–Paley–Sidon). If $\Lambda_m(f) \in \ell_1$ for all $f \in C(\mathbb{T})$, then $m \in \ell_2$.

So we know exactly where the fine line lies for absolute convergence of Fourier coefficients of continuous functions.
We can make this theorem quantitative. Observe that if $\Lambda_m : C(\mathbb{T}) \to \ell_1$, then $\Lambda_m$ has a closed graph (exercise). The closed graph theorem (a linear operator between Banach spaces with closed graph is bounded) then says that $\|\Lambda_m\| < \infty$.

Corollary 4.5.3. $\sum_n |m(n)\hat{f}(n)| \le K\|f\|_\infty$, with $K = \|\Lambda_m\|$.

We will show $\|\Lambda_m\| \le \|m\|_2 \le K_G\,\|\Lambda_m\|$. We will use the following consequence of Grothendieck's inequality (proof omitted).
Corollary 4.5.4. Let $T : \ell_\infty^m \to \ell_1$. For all $x_1,\dots,x_k \in \ell_\infty^m$,
$$\Big(\sum_{i=1}^k \|Tx_i\|_1^2\Big)^{1/2} \le K_G\,\|T\|\,\sup_{y \in B_{\ell_1^m}} \Big(\sum_{i=1}^k \langle y, x_i\rangle^2\Big)^{1/2}.$$
Equivalently, there exists a probability measure $\mu$ on $[m]$ such that
$$\|Tx\|_1 \le K_G\,\|T\|\,\Big(\int x^2\, d\mu\Big)^{1/2} \quad \text{for all } x.$$
Proof. Use duality. To get the equivalence, use the same proof as in Pietsch Domination. $\square$
Proof (of Theorem 4.5.2, quantitatively). Given $\Lambda_m : C(S^1) \to \ell_1$, Corollary 4.5.4 produces a probability measure $\mu$ on $S^1$ such that for every $f$ and every $\theta$, letting $f_\theta(x) = f(e^{2\pi i\theta}x)$,
$$\Big(\sum_n |m(n)\hat{f}_\theta(n)|\Big)^2 \le K_G^2\,\|\Lambda_m\|^2 \int_{S^1} f_\theta(x)^2\, d\mu(x). \tag{4.39}$$
Since $|\hat{f}_\theta(n)| = |\hat{f}(n)|$, averaging over $\theta$ gives
$$\Big(\sum_n |m(n)\hat{f}(n)|\Big)^2 \le K_G^2\,\|\Lambda_m\|^2\,\|f\|_2^2. \tag{4.40}$$
Apply this to the trigonometric sum
$$f(x) = \sum_{n=-N}^{N} m(n)e^{2\pi i n x}$$
to get
$$\Big(\sum_{n=-N}^{N} m(n)^2\Big)^2 \le K_G^2\,\|\Lambda_m\|^2 \sum_{n=-N}^{N} m(n)^2,$$
so $\sum_{|n|\le N} m(n)^2 \le K_G^2\|\Lambda_m\|^2$ for every $N$, i.e. $\|m\|_2 \le K_G\|\Lambda_m\|$. $\square$

RIP also had this kind of magic.
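The easy direction (4.37)–(4.38) can be checked numerically for a concrete trigonometric polynomial; this sketch (not from the lecture; all names are mine) recovers the Fourier coefficients with a DFT, which is exact for trigonometric polynomials once the grid is fine enough.

```python
import cmath
import math

# A real trigonometric polynomial f(x) = sum_n a_n e^{2 pi i n x} on T = R/Z,
# and a multiplier m in l_2 supported on the same frequencies.
coeffs = {-2: 0.5, -1: 1.0, 0: 0.25, 1: 1.0, 2: 0.5}      # a_n = a_{-n}: f is real
m = {n: 1.0 / (1 + abs(n)) for n in coeffs}

N = 64                                                     # grid size, > 2 * degree
samples = [sum(a * cmath.exp(2j * math.pi * n * k / N)
               for n, a in coeffs.items()).real for k in range(N)]

# Fourier coefficients recovered by the DFT (exact here: no aliasing).
fhat = {n: sum(samples[k] * cmath.exp(-2j * math.pi * n * k / N)
               for k in range(N)) / N for n in coeffs}

lhs = sum(abs(m[n] * fhat[n]) for n in coeffs)             # ||Lambda_m(f)||_{l_1}
m_l2 = math.sqrt(sum(v * v for v in m.values()))
f_sup = max(abs(s) for s in samples)                       # ||f||_inf on the grid
print(lhs <= m_l2 * f_sup + 1e-12)                         # prints True
```

This is exactly the Cauchy–Schwarz plus Parseval chain (4.37); the hard content of Orlicz–Paley–Sidon is that no multiplier outside $\ell_2$ can work.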
6 ? presentation
Theorem 4.6.1. Let $(X,d)$ be a metric space with $X = A \cup B$ such that

- $A$ embeds into $\ell_2^a$ with distortion $D_A$, and

- $B$ embeds into $\ell_2^b$ with distortion $D_B$.

Then $X$ embeds into $\ell_2^{a+b+1}$ with distortion at most $7D_AD_B + 5(D_A+D_B)$.

Furthermore, given noncontracting embeddings $\varphi_A : A \to \ell_2^a$ and $\varphi_B : B \to \ell_2^b$ with $\|\varphi_A\|_{\mathrm{Lip}} \le D_A$ and $\|\varphi_B\|_{\mathrm{Lip}} \le D_B$, there is an embedding $\Psi : X \to \ell_2^{a+b+1}$ with distortion $7D_AD_B + 5(D_A+D_B)$ such that $\|\Psi(u)-\Psi(v)\| \ge \|\varphi_A(u)-\varphi_A(v)\|$ for all $u,v \in A$.
For $a \in A$, let $R_a = d(a,B)$, and for $b \in B$, let $R_b = d(A,b)$.

Definition 4.6.2: For $\alpha > 0$, $A' \subseteq A$ is an $\alpha$-cover for $A$ with respect to $B$ if

1. for every $a \in A$, there exists $a' \in A'$ with $R_{a'} \le R_a$ and $d(a,a') \le \alpha R_a$;

2. for distinct $a_1', a_2' \in A'$, $d(a_1',a_2') \ge \alpha\,\min(R_{a_1'}, R_{a_2'})$.

This is like a net, but the $\varepsilon$ of the net is a function of the distance to the other set: the requirement is weaker for points that are farther away from $B$.
Lemma 4.6.3. For all $\alpha > 0$, there is an $\alpha$-cover $A'$ for $A$ with respect to $B$.

Proof. Induct on $|A|$. The base case $A = \emptyset$ is clear. Assume the lemma holds for $|A| < n$; we'll show it holds for $|A| = n$. Let $u \in A$ be the point closest to $B$. Let $W = A \setminus B_{\alpha R_u}(u)$; then $|W| < |A|$. Let $W'$ be an $\alpha$-cover of $W$ with respect to $B$, and let $A' = W' \cup \{u\}$.

We show that $A'$ is an $\alpha$-cover, checking the two properties.

1. Divide into cases based on whether $a \in W$ or $a \in B_{\alpha R_u}(u)$. In the first case use the inductive hypothesis; in the second take $a' = u$: then $R_u \le R_a$ (as $u$ is closest to $B$) and $d(a,u) \le \alpha R_u \le \alpha R_a$.

2. For $a_1', a_2' \in A'$: if both are in $W'$, we're done by induction. Otherwise, without loss of generality, $a_1' = u$ and $a_2' \in W'$, so $d(u, a_2') > \alpha R_u = \alpha\min(R_u, R_{a_2'})$, since $R_u$ is minimal. $\square$
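The inductive proof is effectively a greedy algorithm; here is a small sketch of it on a toy example (not from the lecture; the names are mine), verifying both cover properties.

```python
def alpha_cover(A, B, dist, alpha):
    """Greedy alpha-cover of A with respect to B, following the proof:
    repeatedly take the remaining point closest to B and discard the
    ball of radius alpha * R_u around it."""
    R = {a: min(dist(a, b) for b in B) for a in A}
    remaining = sorted(A, key=lambda a: R[a])    # closest to B first
    cover = []
    while remaining:
        u = remaining[0]                         # closest remaining point to B
        cover.append(u)
        remaining = [a for a in remaining[1:] if dist(a, u) > alpha * R[u]]
    return cover, R

# Points on the line; A to the left, B to the right.
A = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
B = [10.0, 11.0]
dist = lambda x, y: abs(x - y)
alpha = 0.5
cover, R = alpha_cover(A, B, dist, alpha)

# Property 1: each a has a cover point a' with R_{a'} <= R_a, d(a,a') <= alpha*R_a.
ok1 = all(any(R[c] <= R[a] and dist(a, c) <= alpha * R[a] for c in cover) for a in A)
# Property 2: distinct cover points are alpha*min(R, R')-separated.
ok2 = all(dist(c1, c2) >= alpha * min(R[c1], R[c2])
          for i, c1 in enumerate(cover) for c2 in cover[:i])
print(ok1, ok2)   # expect True True
```

Note how the cover is sparse far from $B$ (where $R_a$ is large) and dense near it, matching the remark that the net requirement weakens with distance to the other set.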
Lemma 4.6.4. Define $g : A' \to B$ to send every point of the $\alpha$-cover to a closest point of $B$, so $d(a', g(a')) = R_{a'}$. Then $\|g\|_{\mathrm{Lip}} \le 2\left(1+\frac{1}{\alpha}\right)$.

Proof. By the triangle inequality,
$$d(g(a_1'), g(a_2')) \le d(g(a_1'), a_1') + d(a_1', a_2') + d(a_2', g(a_2')) = R_{a_1'} + R_{a_2'} + d(a_1', a_2').$$
We have
$$R_{a_1'} + R_{a_2'} + d(a_1',a_2') = 2\min(R_{a_1'}, R_{a_2'}) + |R_{a_1'} - R_{a_2'}| + d(a_1',a_2') \tag{4.41}$$
$$\le \frac{2}{\alpha}\,d(a_1',a_2') + d(a_1',a_2') + d(a_1',a_2') \tag{4.42}$$
$$= 2\left(1+\frac{1}{\alpha}\right) d(a_1',a_2'), \tag{4.43}$$
using the separation property of the cover for the first term and the $1$-Lipschitzness of $a \mapsto R_a$ for the second. $\square$

Let $\varphi_B : B \to \ell_2^b$ be an embedding with $\|\varphi_B\|_{\mathrm{Lip}} \le D_B$; it is noncontracting.
Lemma 4.6.5. There is $\psi : X \to \ell_2^b$ such that

1. for all $a_1, a_2 \in A$,
$$\|\psi(a_1)-\psi(a_2)\| \le 2\left(1+\frac{1}{\alpha}\right)D_A D_B\, d(a_1,a_2);$$

2. for all $b_1, b_2 \in B$,
$$d(b_1,b_2) \le \|\psi(b_1)-\psi(b_2)\| = \|\varphi_B(b_1)-\varphi_B(b_2)\| \le D_B\, d(b_1,b_2);$$

3. for all $a \in A$, $b \in B$,
$$d(a,b) - (1+\alpha)(2D_AD_B+1)R_a \le \|\psi(a)-\psi(b)\| \le \big(2(1+\alpha)D_AD_B + (2+\alpha)D_B\big)\, d(a,b).$$
Proof. Let $h = \varphi_B \circ g \circ \varphi_A^{-1}$, defined on $\varphi_A(A') \subseteq \ell_2^a$. (Here $A' \subseteq A$ is an $\alpha$-cover, $g : A' \to B$ is the closest-point map of Lemma 4.6.4, and $\varphi_A, \varphi_B$ are the given embeddings.) Since $\varphi_A$ is noncontracting, $\varphi_A^{-1}$ is $1$-Lipschitz on $\varphi_A(A')$, so
$$\|h\|_{\mathrm{Lip}} \le \|\varphi_B\|_{\mathrm{Lip}}\,\|g\|_{\mathrm{Lip}}\,\|\varphi_A^{-1}\|_{\mathrm{Lip}} \le 2D_B\left(1+\frac{1}{\alpha}\right).$$
By the Kirszbraun extension theorem 3.6.1, extend $h$ to $K : \ell_2^a \to \ell_2^b$ with
$$\|K\|_{\mathrm{Lip}} \le 2D_B\left(1+\frac{1}{\alpha}\right).$$
Define
$$\psi(x) = \begin{cases} K(\varphi_A(x)), & x \in A,\\ \varphi_B(x), & x \in B.\end{cases}$$
We show the three parts.
1. For $a_1,a_2 \in A$,
$$\|\psi(a_1)-\psi(a_2)\| = \|K(\varphi_A(a_1))-K(\varphi_A(a_2))\| \le \|K\|_{\mathrm{Lip}}\,\|\varphi_A\|_{\mathrm{Lip}}\, d(a_1,a_2) \le 2\left(1+\frac{1}{\alpha}\right)D_AD_B\,d(a_1,a_2). \tag{4.44--4.45}$$

2. This is clear from the definition of $\psi$ on $B$.

3. Let $a' \in A'$ be a cover point for $a$ and $b' = g(a')$, so $d(a',b') = R_{a'} \le R_a$ and $\psi(a') = K(\varphi_A(a')) = \varphi_B(g(a')) = \psi(b')$. We have
$$\|\psi(a)-\psi(b)\| \le \|\psi(a)-\psi(a')\| + \|\psi(b')-\psi(b)\| \tag{4.46}$$
$$\le 2\left(1+\frac{1}{\alpha}\right)D_AD_B\, d(a,a') + D_B\, d(b',b), \tag{4.47}$$
$$d(a,a') \le \alpha R_a \le \alpha\, d(a,b), \tag{4.48}$$
$$d(b,b') \le d(b,a) + d(a,a') + d(a',b') \le (2+\alpha)\, d(a,b). \tag{4.49--4.50}$$
This shows the upper half of (3), since $2(1+\frac{1}{\alpha})D_AD_B\cdot\alpha = 2(1+\alpha)D_AD_B$.

For the other inequality, use the triangle inequality again:
$$\|\psi(a)-\psi(b)\| \ge \|\psi(b)-\psi(b')\| - \|\psi(b')-\psi(a)\| \ge d(b,b') - 2\left(1+\frac{1}{\alpha}\right)D_AD_B\,\alpha R_a \tag{4.51}$$
$$\ge d(a,b) - (1+\alpha)R_a - 2(1+\alpha)D_AD_B\,R_a = d(a,b) - (1+\alpha)(2D_AD_B+1)R_a, \tag{4.52}$$
using $d(b,b') \ge d(a,b) - d(a,a') - d(a',b') \ge d(a,b) - (1+\alpha)R_a$. $\square$
Let $\psi_B = \psi$, and let $\psi_A : X \to \ell_2^a$ be the analogous map with the roles of $A$ and $B$ reversed (so $\psi_A$ restricted to $A$ is $\varphi_A$). Set
$$\beta = (1+\alpha)(2D_AD_B+1), \qquad \gamma = \frac{\beta}{\sqrt{2}},$$
$$\varphi_\Delta : X \to \mathbb{R}, \qquad \varphi_\Delta(a) = \gamma R_a \ \text{ for } a \in A, \qquad \varphi_\Delta(b) = -\gamma R_b \ \text{ for } b \in B,$$
$$\Psi : X \to \ell_2^{a+b+1}, \qquad \Psi = \psi_A \oplus \psi_B \oplus \varphi_\Delta.$$
For $a_1,a_2 \in A$,
$$\|\Psi(a_1)-\Psi(a_2)\| \ge \|\psi_A(a_1)-\psi_A(a_2)\| = \|\varphi_A(a_1)-\varphi_A(a_2)\| \ge d(a_1,a_2).$$
For $a \in A$, $b \in B$,
$$\|\Psi(a)-\Psi(b)\|^2 = \|\psi_A(a)-\psi_A(b)\|^2 + \|\psi_B(a)-\psi_B(b)\|^2 + \|\varphi_\Delta(a)-\varphi_\Delta(b)\|^2,$$
with
$$\|\psi_A(a)-\psi_A(b)\| \ge d(a,b) - \beta R_b, \tag{4.61}$$
$$\|\psi_B(a)-\psi_B(b)\| \ge d(a,b) - \beta R_a, \tag{4.62}$$
$$\|\varphi_\Delta(a)-\varphi_\Delta(b)\| = \gamma(R_a + R_b). \tag{4.63}$$
Claim 4.6.6. We have $\|\Psi(a)-\Psi(b)\| \ge d(a,b)$ for $a \in A$, $b \in B$.

Proof. Without loss of generality $R_a \le R_b$. Consider 3 cases:

1. $\beta R_b \le d(a,b)$;

2. $\beta R_a \le d(a,b) \le \beta R_b$;

3. $d(a,b) \le \beta R_a$.

Consider case 2; the other cases are similar. Using (4.62) and (4.63),
$$\|\Psi(a)-\Psi(b)\|^2 \ge (d(a,b)-\beta R_a)^2 + \frac{\beta^2(R_a+R_b)^2}{2} \tag{4.64}$$
$$= d(a,b)^2 - 2\beta\, d(a,b) R_a + \frac{\beta^2}{2}\big(3R_a^2 + 2R_aR_b + R_b^2\big) \tag{4.65}$$
$$\ge d(a,b)^2 - 2\beta^2 R_aR_b + \frac{\beta^2}{2}\big(3R_a^2 + 2R_aR_b + R_b^2\big) \tag{4.66}$$
$$= d(a,b)^2 + \frac{\beta^2}{2}\Big(\big(\sqrt{3}R_a - R_b\big)^2 + 2\big(\sqrt{3}-1\big)R_aR_b\Big) \tag{4.67}$$
$$\ge d(a,b)^2, \tag{4.68}$$
where (4.66) uses $d(a,b) \le \beta R_b$. $\square$
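The case-2 algebra above can be stress-tested numerically over a grid of admissible parameters (a sanity check of my own, not part of the presentation):

```python
def lower_bound_sq(d, beta, Ra, Rb):
    # The right-hand side of (4.64): (d - beta*R_a)^2 + beta^2 (R_a + R_b)^2 / 2.
    return (d - beta * Ra) ** 2 + beta ** 2 * (Ra + Rb) ** 2 / 2

checked = 0
for beta in (1.0, 2.5, 7.0):
    for Ra in (0.1, 0.5, 1.0, 2.0):
        for Rb in (x * Ra for x in (1.0, 1.5, 3.0, 10.0)):   # R_a <= R_b
            for s in (0.0, 0.25, 0.5, 0.75, 1.0):
                d = beta * Ra + s * beta * (Rb - Ra)          # case 2 range
                assert lower_bound_sq(d, beta, Ra, Rb) >= d * d - 1e-12
                checked += 1
print(checked)  # prints 240
```

This reflects the positivity of $3R_a^2 - 2R_aR_b + R_b^2$ in (4.67): as a quadratic in $R_a$ it has negative discriminant, so it is always nonnegative.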
It remains to bound $\|\Psi\|_{\mathrm{Lip}}$. For $x_1, x_2$ both in $A$,
$$\|\Psi(x_1)-\Psi(x_2)\|^2 = \|\psi_A(x_1)-\psi_A(x_2)\|^2 + \|\psi_B(x_1)-\psi_B(x_2)\|^2 + \|\varphi_\Delta(x_1)-\varphi_\Delta(x_2)\|^2 \tag{4.69}$$
$$\le \left(D_A^2 + 4\left(1+\frac{1}{\alpha}\right)^2 D_A^2D_B^2\right) d(x_1,x_2)^2 + \gamma^2\big(R_{x_1}-R_{x_2}\big)^2 \tag{4.70}$$
$$\le \left(D_A^2 + 4\left(1+\frac{1}{\alpha}\right)^2 D_A^2D_B^2 + \gamma^2\right) d(x_1,x_2)^2, \tag{4.71}$$
since
$$|R_{x_1}-R_{x_2}| \le d(x_1,x_2). \tag{4.72}$$
The same computation with $D_A$ and $D_B$ exchanged handles $x_1,x_2 \in B$. \tag{4.73} For $a \in A$, $b \in B$, the decomposition applies with $\|\varphi_\Delta(a)-\varphi_\Delta(b)\|^2 = \gamma^2(R_a+R_b)^2$ and $R_a, R_b \le d(a,b)$. \tag{4.74--4.75}

Together with Claim 4.6.6, choosing $\alpha$ appropriately yields the distortion bound $7D_AD_B + 5(D_A+D_B)$ of the theorem.
Chapter 5
Lipschitz extension
1 Introduction and Problem Setup
We give a presentation of the paper "On Lipschitz Extension From Finite Subsets" by Assaf Naor and Yuval Rabani (2015). For the convenience of the reader referencing the original paper, we have kept the numbering of the lemmas the same.

Consider the setup where we have a metric space $(X,d_X)$ and a Banach space $(Z,\|\cdot\|_Z)$. For a subset $S \subseteq X$, consider a 1-Lipschitz function $f : S \to Z$. Our goal is to extend $f$ to $F : X \to Z$ without experiencing too much growth in the Lipschitz constant $\|F\|_{\mathrm{Lip}}$ over $\|f\|_{\mathrm{Lip}}$.
Definition 5.1.1: $e(X,Z,S)$ and its variants.
Define $e(X,Z,S)$ to be the infimum over those $K$ such that every 1-Lipschitz $f : S \to Z$ admits an extension $F : X \to Z$ with $\|F\|_{\mathrm{Lip}} \le K\|f\|_{\mathrm{Lip}}$ (i.e., $e(X,Z,S)$ is the least upper bound on the ratio $\|F\|_{\mathrm{Lip}}/\|f\|_{\mathrm{Lip}}$ needed for the particular $X, Z, S$).

Then define $e(X,Z)$ to be the supremum of $e(X,Z,S)$ over all subsets $S$: of all subsets, what's the largest least upper bound for the ratio of Lipschitz constants?

We may also want to take suprema of $e(X,Z,S)$ over $S$ of a fixed size. We can formulate this in two ways. $e_n(X,Z)$ is the supremum of $e(X,Z,S)$ over all $S$ with $|S| = n$. We can also define $e_\varepsilon(X,Z)$ as the supremum of $e(X,Z,S)$ over all $S$ which are $\varepsilon$-discrete, in the sense that $d_X(x,y) \ge \varepsilon\cdot\mathrm{diam}(S)$ for distinct $x,y \in S$, where $\varepsilon \in (0,1]$.
Definition 5.1.2: Absolute extendability.
We define $\mathrm{ae}(n)$ to be the supremum of $e_n(X,Z)$ over all possible metric spaces $(X,d_X)$ and Banach spaces $(Z,\|\cdot\|_Z)$. Analogously, $\mathrm{ae}(\varepsilon)$ is the supremum of $e_\varepsilon(X,Z)$ over all $(X,d_X)$ and $(Z,\|\cdot\|_Z)$.
From now on, we will primarily discuss subsets $S \subseteq X$ with size $|S| = n$. A bound $\mathrm{ae}(n) < K$ on the absolute extendability allows us to make general claims about the extendability of maps from metric spaces into Banach spaces: any Banach-space-valued 1-Lipschitz function defined on an $n$-point metric space $(X,d_X)$ can be extended to any metric space $X'$ containing $X$ (up to isometry; as long as you can embed $X$ in $X'$ with an injective distance-preserving map), with the Lipschitz constant of the extension at most $K$.
Therefore, for the last thirty years it has been of interest to understand upper and lower bounds on $\mathrm{ae}(n)$, as we want to understand the asymptotic behavior as $n \to \infty$. In the 1980s, the following upper and lower bounds were given by Johnson and Lindenstrauss and by Schechtman:
$$\sqrt{\frac{\log n}{\log\log n}} \lesssim \mathrm{ae}(n) \lesssim \log n.$$
In 2005, the upper bound was improved:
$$\sqrt{\frac{\log n}{\log\log n}} \lesssim \mathrm{ae}(n) \lesssim \frac{\log n}{\log\log n}.$$
In this talk, we improve the lower bound for the first time since 1984, to
$$\sqrt{\log n} \lesssim \mathrm{ae}(n) \lesssim \frac{\log n}{\log\log n}.$$
1.1 Why do we care?
This improvement is of interest not primarily because of the removal of a $\sqrt{\log\log n}$ term in the denominator. It is because the approach behind the Johnson–Lindenstrauss (1984) lower bound has an inherent limitation: their approach is to prove the nonexistence of linear projections of small norm. By considering a specific choice of the spaces and the subset, we can get a lower bound on $\mathrm{ae}(n)$. Consider a Banach space $(Y,\|\cdot\|_Y)$ and let $X \subseteq Y$ be a $k$-dimensional linear subspace of $Y$, with $N_\varepsilon$ an $\varepsilon$-net in the unit sphere of $X$; define $S = N_\varepsilon \cup \{0\}$ and fix $\varepsilon \in (0,1/2)$. We take $f : S \to X$ to be the identity mapping and wish to find an extension $F : Y \to X$. We seek to bound the magnitude of the Lipschitz constant of $F$, call it $L$. Johnson–Lindenstrauss prove that for $\varepsilon \lesssim \frac{1}{k^2}$ there exists a linear projection $P : Y \to X$ with $\|P\| \lesssim L$. We can now try to lower bound $L$ by lower-bounding $\|P\|$ over all projections $P$. But the classical Kadec–Snobar theorem says that there always exists a projection with $\|P\| \le \sqrt{k}$, so the best (largest) possible lower bound we could get this way is $L \gtrsim \sqrt{k}$. And this is bad: taking $n = |N_\varepsilon|$, by standard bounds on $\varepsilon$-nets we get $k \asymp \frac{\log n}{\log(1/\varepsilon)}$, which implies
$$L \gtrsim \sqrt{\frac{\log n}{\log(1/\varepsilon)}}.$$
In order to get the lower bound $\mathrm{ae}(n) \gtrsim \sqrt{\log n}$, we would have to take $\varepsilon$ to be a universal constant. However, by a lemma of Benyamini (in our current setting), $L \lesssim e_\varepsilon(X,Z) \lesssim 1/\varepsilon = O(1)$, which means that any lower bound we get on $L$ this way is too small (it won't even tend to $\infty$). Therefore, we must make use of nonlinear theory to get the $\sqrt{\log n}$ lower bound on $\mathrm{ae}(n)$.
1.2 Outline of the Approach

Let us formally state the theorem, and then give the approach to the proof.

Theorem 5.1.3 (Theorem 1). For every $n \in \mathbb{N}$ we have $\mathrm{ae}(n) \gtrsim \sqrt{\log n}$.

We give a metric space $X$, a Banach space $Z$, a subset $S \subseteq X$, and a 1-Lipschitz function $f : S \to Z$ such that every extension $F : X \to Z$ of $f$ must have a large Lipschitz constant.

Let $V_G$ be the vertex set of a finite graph $G$, with the shortest path metric $d_G$ in which all edges have length 1.

We define our metric space to be $X = (V_G, d_{G_t(S)})$, where $G_t(S)$ denotes the $t$-magnification (defined below) of the shortest path metric on $V_G$, and $S$ is an $n$-vertex subset of $(V_G, d_{G_t(S)})$. Our Banach space is $Z = \big(\mathbb{R}_0^S, \|\cdot\|_{W_1(S,\,d_{G_t(S)})}\big)$, equipped with the Wasserstein-1 norm induced by the $t$-magnification of the shortest path metric on the graph. Note that $\mathbb{R}_0^S$ is just the set of weight distributions on the vertices of $S$ which sum to zero. Our $f : S \to \mathbb{R}_0^S = Z$, and we extend to $F : X \to Z$. We will show how to choose $t$ and $|S|$ optimally to get the result.

The rest of my section of the talk will give the requisite definitions and lemmas needed to understand the full proof.
2 $t$-Magnification

Definition 5.2.1: $t$-magnification of a metric space.
Given a metric space $(X,d_X)$ and $t > 0$, for every subset $S \subseteq X$ we define $X_t(S)$ as the metric space on the points of $X$ equipped with the metric
$$d_{X_t(S)}(x,y) = d_X(x,y) + t\,|\{x,y\} \cap S| \quad (x \ne y),$$
and $d_{X_t(S)}(x,x) = 0$. All this says is that for distinct points $x,y \in S$ the metric is $2t + d_X(x,y)$; when one point is in $S$ and one is outside, it is $t + d_X(x,y)$; and when both $x,y$ are outside, the metric is unchanged.

The significance of this definition is as follows: it is easier for functions on $S$ to be 1-Lipschitz (we enlarge the denominator) without affecting functions on $X \setminus S$. Thus there are more potential $f$ satisfying 1-Lipschitzness, which can have potentially large Lipschitz extension constants (i.e., large $K$), since we don't make it easier to be Lipschitz on $X \setminus S$ (which we must deal with in the extension).
However, we can't make $t$ too large: the minimum distance between points of $S$ becomes comparable to the diameter under $t$-magnification as $t$ increases. Let us assume the minimum positive distance in $X$ is 1 (as it would be in an unweighted graph under the shortest path metric). Then, since $\mathrm{diam}(X, d_{X_t(S)}) = 2t + \mathrm{diam}(X, d_X)$, for distinct $x,y \in S$,
$$d_{X_t(S)}(x,y) \ge 2t+1 = \frac{2t+1}{2t+\mathrm{diam}(X,d_X)}\cdot\mathrm{diam}\big(X, d_{X_t(S)}\big).$$
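Since $t$-magnification only adds a constant per endpoint, it stays a metric; here is a minimal sketch (my own illustration, not from the talk) checking the triangle inequality and the $(2t+1)$-separation of $S$ on a tiny path graph.

```python
import itertools

def magnified(d, S, t):
    """d_{X_t(S)}(x,y) = d(x,y) + t * |{x,y} ∩ S| off the diagonal."""
    def dt(x, y):
        if x == y:
            return 0.0
        return d(x, y) + t * ((x in S) + (y in S))
    return dt

# A 4-point path metric 0-1-2-3 with unit edges.
X = [0, 1, 2, 3]
d = lambda x, y: abs(x - y)
S = {0, 3}
t = 2.0
dt = magnified(d, S, t)

# The magnified distance is still a metric: check all triangle inequalities.
ok = all(dt(x, z) <= dt(x, y) + dt(y, z)
         for x, y, z in itertools.product(X, repeat=3))
# Distinct points of S are (2t + d)-separated; here d(0,3) = 3 and 2t = 4.
sep = min(dt(x, y) for x in S for y in S if x != y)
print(ok, sep)   # prints True 7.0
```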
Then recall that $e_\varepsilon(X,Z)$ is the supremum over $S$ which are $\varepsilon$-discrete, where here $\varepsilon = \frac{2t+1}{2t+\mathrm{diam}(X,d_X)}$. Earlier we saw the bound
$$e_\varepsilon(X,Z) \lesssim \frac{1}{\varepsilon} = \frac{2t+\mathrm{diam}(X,d_X)}{2t+1} \le 1 + \frac{\mathrm{diam}(X,d_X)}{t}.$$
Thus, if we make $t$ too large, we are again bounding $e_\varepsilon(X,Z) \lesssim 1 = O(1)$, which means our choice of $S$ and $t$ cannot give a large lower bound (again, we're not even tending to $\infty$). Thus we must balance our choice of $t$ appropriately.
3 Wasserstein-1 norm

Now we come to the second part of our choice of $Z$. Define $\mathbb{R}_0^S$ to be the set of functions on the points of $S$ such that each $f \in \mathbb{R}_0^S$ satisfies $\sum_{x\in S} f(x) = 0$. We use $\delta_x$ to denote the indicator weight map with 1 at the point $x$ and 0 everywhere else.

Definition 5.3.1: Wasserstein-1 norm.
The Wasserstein-1 norm on a finite metric space $(X,d_X)$ is the norm induced by the origin-symmetric convex body
$$K_{(X,d_X)} = \mathrm{conv}\left\{\frac{\delta_x - \delta_y}{d_X(x,y)} : x,y \in X,\ x \ne y\right\}.$$
This is a unit ball on $\mathbb{R}_0^X$. We denote the induced norm by $\|\cdot\|_{W_1(X,d_X)}$.
We can give an equivalent definition of the Wasserstein-1 distance (the equivalence is proven with the Kantorovich–Rubinstein duality theorem):

Definition 5.3.2: Wasserstein-1 distance and norm.
Let $\Pi(\mu,\nu)$ be the set of all measures $\pi$ on $X \times X$ such that
$$\sum_{y\in X}\pi(y,z) = \nu(z) \ \text{ for all } z \in X, \qquad \sum_{z\in X}\pi(y,z) = \mu(y) \ \text{ for all } y \in X.$$
Then the Wasserstein-1 (earthmover) distance is
$$W_1^{d_X}(\mu,\nu) = \inf_{\pi\in\Pi(\mu,\nu)} \sum_{x,y\in X} d_X(x,y)\,\pi(x,y).$$
When $\mu(X) = \nu(X)$, we automatically have $(\mu\times\nu)/\mu(X) \in \Pi(\mu,\nu)$ (normalizing by one total mass trivially gives the other), so $\Pi(\mu,\nu)$ is nonempty. The norm induced by this distance for $f \in \mathbb{R}_0^X$ is
$$\|f\|_{W_1(X,d_X)} = W_1^{d_X}(f^+, f^-),$$
114
MAT529 Metric embeddings and geometric inequalities
where $f^+ = \max(f,0)$ and $f^- = \max(-f,0)$ (since $\sum_{x\in X} f(x) = 0$, we split $f$ into two nonnegative measures). Furthermore, we have $\sum_{x\in X} f^+(x) = \sum_{x\in X} f^-(x)$ (same total mass), which means that $\Pi(f^+, f^-)$ is nonempty.

Definition 5.3.3: $\ell_1(X)$ norm.
Using standard notation, we can also define an $\ell_1$ norm on $X$ by
$$\|f\|_{\ell_1(X)} = \sum_{x\in X} |f(x)| = \sum_{x\in X}\big(f^+(x) + f^-(x)\big).$$
Thus, for our $f \in \mathbb{R}_0^X$, $\sum_{x\in X} f^+(x) = \sum_{x\in X} f^-(x) = \|f\|_{\ell_1(X)}/2$.
Now we give a simple lemma bounding the Wasserstein-1 norm induced by the $t$-magnification of a metric on $X$.

Lemma 5.3.4 (Lemma 7). For $(X,d_X)$ a finite metric space:

1. $\|\delta_x - \delta_y\|_{W_1(X,d_X)} = d_X(x,y)$ for every $x,y \in X$.

2. For all $f \in \mathbb{R}_0^X$,
$$\frac{1}{2}\min_{x,y\in X;\,x\ne y} d_X(x,y)\,\|f\|_{\ell_1(X)} \le \|f\|_{W_1(X,d_X)} \le \frac{1}{2}\,\mathrm{diam}(X,d_X)\,\|f\|_{\ell_1(X)}.$$

3. For every $t > 0$ and $S \subseteq X$, every $f \in \mathbb{R}_0^S$ satisfies
$$t\,\|f\|_{\ell_1(S)} \le \|f\|_{W_1(S,\,d_{X_t(S)})} \le \left(t + \frac{\mathrm{diam}(X,d_X)}{2}\right)\|f\|_{\ell_1(S)}.$$
Proof. 1. This follows directly from the unit-ball interpretation of the Wasserstein-1 norm, since $\frac{\delta_x-\delta_y}{d_X(x,y)}$ is, by the first definition, on the unit ball.

2. Let $m = \min_{x\ne y} d_X(x,y) > 0$. For distinct $x,y \in X$ we have
$$\left\|\frac{\delta_x-\delta_y}{d_X(x,y)}\right\|_{\ell_1(X)} \le \frac{\|\delta_x-\delta_y\|_{\ell_1(X)}}{m} = \frac{2}{m},$$
since $0 < m \le d_X(x,y)$ and $1+1=2$. Therefore $\frac{\delta_x-\delta_y}{d_X(x,y)} \in \frac{2}{m}B_{\ell_1(X)}$. These elements generate $K_{(X,d_X)}$, so $K_{(X,d_X)} \subseteq \frac{2}{m}B_{\ell_1(X)}$, and we get the first inequality. The second inequality follows from
$$\|f\|_{W_1(X,d_X)} = \inf_{\pi\in\Pi(f^+,f^-)} \sum_{x,y\in X} d_X(x,y)\,\pi(x,y) \le \mathrm{diam}(X,d_X)\sum_{x\in X} f^+(x) = \mathrm{diam}(X,d_X)\,\|f\|_{\ell_1(X)}/2.$$

3. This is a special case of the previous inequality: in $d_{X_t(S)}$ the minimum distance between distinct points of $S$ is at least $2t$ (so $\frac{1}{2}\min \ge t$), and $\frac{1}{2}\mathrm{diam}(X, d_{X_t(S)}) \le \frac{1}{2}\big(2t + \mathrm{diam}(X,d_X)\big) = t + \frac{\mathrm{diam}(X,d_X)}{2}$. Plugging in these estimates gives the inequality. $\square$
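For very small supports the Wasserstein-1 norm of Definition 5.3.2 can be computed by brute force: split $f^+$ and $f^-$ into equal unit atoms and minimize over matchings (optimal transport between atomic measures of equal unit masses is attained at a permutation). This is an illustration of mine, not the paper's algorithm, and the helper names are hypothetical.

```python
import itertools
import math
from fractions import Fraction

def w1_norm(f, d):
    """||f||_{W_1(X,d)} for f in R_0^X with Fraction masses, by brute force:
    split f+ and f- into q equal atoms each and try all matchings
    (fine only for tiny supports)."""
    pos = [(x, m) for x, m in f.items() if m > 0]
    neg = [(x, -m) for x, m in f.items() if m < 0]
    q = math.lcm(*[m.denominator for _, m in pos + neg])
    src = [x for x, m in pos for _ in range(int(m * q))]
    dst = [x for x, m in neg for _ in range(int(m * q))]
    return min(sum(d(s, t) for s, t in zip(src, perm))
               for perm in itertools.permutations(dst)) / q

# 4-point path metric 0-1-2-3.
d = lambda x, y: abs(x - y)
half = Fraction(1, 2)

# Lemma 7(1): ||delta_x - delta_y||_{W_1} = d(x, y).
f = {0: Fraction(1), 3: Fraction(-1)}
print(w1_norm(f, d))          # prints 3.0

# A balanced f: (1/2)(delta_0 + delta_1) - (1/2)(delta_2 + delta_3).
g = {0: half, 1: half, 2: -half, 3: -half}
print(w1_norm(g, d))          # prints 2.0
```

Both answers sit between the $\ell_1$-based bounds of Lemma 7(2), as they must.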
4 Graph theoretic lemmas

4.1 Expanders

We will need several properties of edge expanders in our proof of the main theorem. For this section, we fix $n$ and $d \ge 3$, and let $G$ be a connected $n$-vertex $d$-regular graph. We can imagine $d = 3$ in this section; all that matters is that $d$ is fixed.

First we record two basic average bounds on distances in the shortest-path metric, denoted $d_G$.
Lemma 5.4.1 (averages of the shortest-path metric and its $t$-magnification).

1. $d_G$ lower bound: for nonempty $S \subseteq V_G$,
$$\frac{1}{|S|^2}\sum_{x,y\in S} d_G(x,y) \ge \frac{\log|S|}{4\log d}.$$

2. $d_{G_t(S)}$ equality: for every $S \subseteq V_G$ and $t > 0$,
$$\frac{1}{|E_G|}\sum_{(x,y)\in E_G} d_{G_t(S)}(x,y) = 1 + \frac{2t|S|}{n}.$$
Proof. 1. The smallest nonzero distance in $G$ is at least 1, so the average is bounded below by $\frac{1}{|S|^2}|S|(|S|-1) = 1 - \frac{1}{|S|}$ (using that $G$ is connected; the worst case is the complete graph). Since $1 - 1/s \ge (\log s)/(4\log 3)$ for $s \le 15$ and $d \ge 3$, we may proceed assuming $|S| \ge 16$. Let us bound the distances appearing in the average. Since $G$ is $d$-regular, for every $x \in V_G$ the number of vertices $y$ with $d_G(x,y) \le r-1$ is at most $\sum_{i=0}^{r-1} d^i$; since $1 + d + \cdots + d^{r-2} < d^{r-1}$, we have $\#\{y : d_G(x,y) \le r-1\} \le 2d^{r-1}$. Choosing $r = 1 + \lfloor \log_d(|S|/4)\rfloor$ gives $2d^{r-1} \le |S|/2$, so for each $x$ at least $|S|/2$ of the points $y \in S$ satisfy $d_G(x,y) \ge r$. Therefore
$$\frac{1}{|S|^2}\sum_{x,y\in S} d_G(x,y) \ge \frac{1}{|S|^2}\cdot|S|\cdot\frac{|S|}{2}\cdot r \ge \frac{\log_d(|S|/4)}{2} = \frac{\log(|S|/4)}{2\log d} \ge \frac{\log|S|}{4\log d},$$
where the last inequality uses $|S| \ge 16$.
2. Let $E_1$ be the set of edges completely contained in $S$ and $E_2$ the set of edges with exactly one endpoint in $S$. Because $G$ is $d$-regular, $2|E_1| + |E_2| = d|S|$ (count edge-endpoint incidences at the vertices of $S$: each vertex of $S$ meets $d$ edges, each $E_1$ edge is counted twice and each $E_2$ edge once), and $|E_G| = nd/2$ by double counting. Each edge in $E_1$ gains $2t$, each edge in $E_2$ gains $t$, and every other edge gains 0 over the base length 1. Therefore
$$\frac{1}{|E_G|}\sum_{(x,y)\in E_G} d_{G_t(S)}(x,y) = \frac{|E_G| + 2t|E_1| + t|E_2|}{|E_G|} = 1 + \frac{t\,d|S|}{nd/2} = 1 + \frac{2t|S|}{n}. \qquad\square$$
Now we introduce the definition of edge expansion.

Definition 5.4.2: Edge expansion $\phi(G)$.
Let $G$ be a connected $n$-vertex $d$-regular graph. For disjoint $S, T \subseteq V_G$, let $E_G(S,T) \subseteq E_G$ denote the set of edges with one endpoint in $S$ and the other in $T$. The edge expansion $\phi(G)$ is defined by
$$\phi(G) = \sup\left\{\phi \in [0,\infty) : |E_G(S, V_G\setminus S)| \ge \phi\,\frac{|S|(n-|S|)}{n^2}\,|E_G| \ \text{ for all } S \subseteq V_G\right\}.$$
We give an equivalent formulation of edge expansion via the cut-cone decomposition of subsets of $\ell_1$:

Lemma 5.4.3 (cut-cone formulation). $\phi(G)$ is the largest $\phi$ such that for all $h : V_G \to \ell_1$,
$$\frac{\phi}{n^2}\sum_{x,y\in V_G} \|h(x)-h(y)\|_1 \le \frac{1}{|E_G|}\sum_{(x,y)\in E_G} \|h(x)-h(y)\|_1.$$
Proof. We will assume this for this talk. $\square$
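For a tiny graph, $\phi(G)$ in the sense of Definition 5.4.2 can be computed by brute force over all cuts; this sketch of mine evaluates it for $K_4$.

```python
import itertools

def edge_expansion(V, E):
    """phi(G): the largest phi with
    |E(S, V \\ S)| >= phi * |S|(n-|S|)/n^2 * |E| for all nonempty S."""
    n, m = len(V), len(E)
    best = float("inf")
    for r in range(1, n):
        for S in itertools.combinations(V, r):
            S = set(S)
            cut = sum(1 for x, y in E if (x in S) != (y in S))
            best = min(best, cut * n * n / (len(S) * (n - len(S)) * m))
    return best

V = list(range(4))
E = [(x, y) for x, y in itertools.combinations(V, 2)]   # K4
print(edge_expansion(V, E))   # 8/3 ≈ 2.667: K4 expands very well
```

With this normalization $\phi(G) \le$ a constant for good expanders, which is why Lemma 8 and the main proof treat $\phi \in (0,1]$ as fixed.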
Now we combine Lemma 7 and the cut-cone decomposition to get Lemma 8:

Lemma 5.4.4 (Lemma 8). Fix $n \in \mathbb{N}$ and $\phi \in (0,1]$. Suppose $G$ is an $n$-vertex graph with edge expansion $\phi(G) \ge \phi$. For all nonempty $S \subseteq V_G$ and $t > 0$, every $F : V_G \to \mathbb{R}_0^S$ satisfies
$$\frac{1}{n^2}\sum_{x,y\in V_G} \|F(x)-F(y)\|_{W_1(S,d_{G_t(S)})} \le \frac{2t+\mathrm{diam}(V_G,d_G)}{(2t+1)\,\phi}\cdot\frac{1}{|E_G|}\sum_{(x,y)\in E_G} \|F(x)-F(y)\|_{W_1(S,d_{G_t(S)})}.$$
Basically, whereas before we were bounding average distances (plain and $t$-magnified) in the domain $V_G$, we are now bounding average norms in the image space, using the Wasserstein-1 norm induced by the $t$-magnification of the shortest path metric on the graph.
Proof. First, plug $h = F$ into the cut-cone decomposition, where the norm is now $\ell_1(S)$. Then we know that $\mathrm{diam}(S, d_{G_t(S)}) \le 2t + \mathrm{diam}(V_G, d_G)$ and the smallest positive distance between points of $S$ in $d_{G_t(S)}$ is $2t+1$. Applying Lemma 7, every $x,y \in V_G$ satisfy
$$\frac{2t+1}{2}\,\|F(x)-F(y)\|_{\ell_1(S)} \le \|F(x)-F(y)\|_{W_1(S,d_{G_t(S)})} \le \frac{2t+\mathrm{diam}(V_G,d_G)}{2}\,\|F(x)-F(y)\|_{\ell_1(S)}.$$
Plugging these estimates directly into the cut-cone decomposition
$$\frac{1}{n^2}\sum_{x,y\in V_G} \|F(x)-F(y)\|_{\ell_1(S)} \le \frac{1}{\phi\,|E_G|}\sum_{(x,y)\in E_G} \|F(x)-F(y)\|_{\ell_1(S)}$$
gives the desired result. $\square$
4.2 An Application of Menger's Theorem

Here we want to lower bound the number of edge-disjoint paths between two disjoint vertex sets.

Lemma 5.4.5 (Lemma 9). Let $G$ be an $n$-vertex graph and $A, B \subseteq V_G$ disjoint. Fix $\phi \in (0,\infty)$ and suppose $\phi(G) \ge \phi$. Then
$$\#\{\text{edge-disjoint paths joining } A \text{ and } B\} \ge \frac{\phi\,|E_G|}{2n}\cdot\min\{|A|,|B|\}.$$
Proof. Let $M$ be the maximal number of edge-disjoint paths joining $A$ and $B$. By Menger's theorem from classical graph theory, there exists a subset of edges $E^* \subseteq E_G$ with $|E^*| = M$ such that every path in $G$ joining $a \in A$ to $b \in B$ contains an edge from $E^*$.

Now consider the graph $G^* = (V_G, E_G \setminus E^*)$. In this graph there are no paths between $A$ and $B$. If we let $C \subseteq V_G$ be the union of all connected components of $G^*$ containing an element of $A$, then $A \subseteq C$ and $B \cap C = \emptyset$. Since $C$ contains every vertex reachable from $A$ in $G^*$, all edges between $C$ and $V_G\setminus C$ are in $E^*$; therefore $|E_G(C, V_G\setminus C)| \le |E^*| = M$. By the definition of expansion,
$$M \ge |E_G(C, V_G\setminus C)| \ge \phi\,\frac{\max\{|C|, n-|C|\}\cdot\min\{|C|, n-|C|\}}{n^2}\,|E_G| \ge \frac{\phi\,\min\{|A|,|B|\}\cdot|E_G|}{2n},$$
since $\max\{|C|, n-|C|\} \ge n/2$, and since $A \subseteq C$ and $B \subseteq V_G\setminus C$ we have $\min\{|C|, n-|C|\} \ge \min\{|A|,|B|\}$. $\square$
5 Main Proof

We fix $n, d \in \mathbb{N}$, $\phi \in (0,1)$, and let $G$ be a $d$-regular graph on $n$ vertices with $\phi(G) \ge \phi$. We also fix a nonempty subset $S \subseteq V_G$ and $t > 0$, and define the mapping
$$f : (S, d_{G_t(S)}) \to \big(\mathbb{R}_0^S, \|\cdot\|_{W_1(S,d_{G_t(S)})}\big), \qquad f(x) = \delta_x - \frac{1}{|S|}\sum_{z\in S}\delta_z \ \text{ for } x \in S.$$
Then suppose that some $F : V_G \to \mathbb{R}_0^S$ extends $f$, and that for some $L \in (0,\infty)$ we have
$$\|F(x)-F(y)\|_{W_1(S,d_{G_t(S)})} \le L\, d_{G_t(S)}(x,y).$$
For $x \in V_G$ and $R \in (0,\infty)$, define
$$B_R(x) = \{y \in V_G : \|F(x)-F(y)\|_{W_1(S,d_{G_t(S)})} \le R\},$$
i.e., $B_R(x)$ is the inverse image under $F$ of the ball (in the Wasserstein-1 norm) of radius $R$ centered at $F(x)$. By the Menger-based Lemma 9 applied to the disjoint sets $S\setminus B_R(x)$ and $B_R(x)$, since $\phi(G) \ge \phi$ and $|E_G| = nd/2$, the maximal number $M$ of edge-disjoint paths between them satisfies
$$M \ge \frac{\phi d}{4}\,\min\{|S\setminus B_R(x)|, |B_R(x)|\}. \tag{1}$$
That is, we can find lengths $m_1,\dots,m_M \in \mathbb{N}$ and vertex sequences $\{z_{i,1},\dots,z_{i,m_i}\}_{i=1}^M \subseteq V_G$ such that $\{z_{i,1}\}_{i=1}^M \subseteq S\setminus B_R(x)$ and $\{z_{i,m_i}\}_{i=1}^M \subseteq B_R(x)$ (the beginnings and ends of the paths lie in the two disjoint sets), and such that $\{\{z_{i,j}, z_{i,j+1}\} : i \in \{1,\dots,M\},\ j \in \{1,\dots,m_i-1\}\}$ are distinct edges of $E_G$ (edge-disjointness). Now take an index subset $J \subseteq \{1,\dots,M\}$ such that $\{z_{i,1}\}_{i\in J}$ are distinct and $\{z_{i,1}\}_{i\in J} = \{z_{i,1}\}_{i=1}^M$ as sets. For $i \in J$, denote by $k_i$ the number of $j \in \{1,\dots,M\}$ for which $z_{j,1} = z_{i,1}$. Then, since $G$ is $d$-regular and the first edges $\{\{z_{i,1}, z_{i,2}\}\}_{i=1}^M$ are distinct, $\max_{i\in J} k_i \le d$. Since $\sum_{i\in J} k_i = M$, from (1) we have
$$|J| \ge \frac{M}{d} \ge \frac{\phi}{4}\,\min\{|S\setminus B_R(x)|, |B_R(x)|\}. \tag{2}$$
The quantity $|J|$ can be upper bounded as follows.

Lemma 5.5.1 (Lemma 10).
$$|J| \le \max\left\{d^{16(R-t)},\ \frac{16\,d\,L\,n\log d}{\log n}\left(1+\frac{2t|S|}{n}\right)\right\}.$$
Proof. Assume $|J| > d^{16(R-t)}$ (otherwise we are done). This is equivalent to
$$R - t < \frac{\log|J|}{16\log d}.$$
Now, $\{z_{i,1}\}_{i\in J} \subseteq S$, and $F$ restricted to $S$ equals $f$, which is an isometry on $(S, d_{G_t(S)})$ (by Lemma 7(1)); so, by the definition of the $t$-magnified metric, for $i,j \in J$,
$$\|F(z_{i,1})-F(z_{j,1})\|_{W_1(S,d_{G_t(S)})} = d_{G_t(S)}(z_{i,1}, z_{j,1}) = 2t + d_G(z_{i,1}, z_{j,1}).$$
This gives us (all norms below are $\|\cdot\|_{W_1(S,d_{G_t(S)})}$)
$$\sum_{i\in J} \|F(z_{i,1})-F(z_{i,m_i})\| = \frac{1}{2|J|}\sum_{i,j\in J}\Big(\|F(z_{i,1})-F(z_{i,m_i})\| + \|F(z_{j,1})-F(z_{j,m_j})\|\Big)$$
$$\ge \frac{1}{2|J|}\sum_{i,j\in J}\Big(\|F(z_{i,1})-F(z_{j,1})\| - \|F(z_{i,m_i})-F(z_{j,m_j})\|\Big).$$
Since $\{z_{i,m_i}\}_{i\in J} \subseteq B_R(x)$, by the definition of $B_R(x)$ we have, for $i,j \in J$,
$$\|F(z_{i,m_i})-F(z_{j,m_j})\| \le \|F(z_{i,m_i})-F(x)\| + \|F(x)-F(z_{j,m_j})\| \le 2R.$$
So the previous inequality can be further continued as
$$\sum_{i\in J}\|F(z_{i,1})-F(z_{i,m_i})\| \ge \frac{1}{2|J|}\sum_{i,j\in J} d_G(z_{i,1},z_{j,1}) - (R-t)|J| \ge \frac{|J|\log|J|}{8\log d} - (R-t)|J| > \frac{|J|\log|J|}{16\log d},$$
where the second inequality is the expander distance-average bound (Lemma 5.4.1(1)) and the third uses the standing assumption $R - t < \frac{\log|J|}{16\log d}$. The same quantity can be bounded from above using the Lipschitz condition:
$$\sum_{i\in J}\|F(z_{i,1})-F(z_{i,m_i})\| \le L\sum_{i\in J} d_{G_t(S)}(z_{i,1}, z_{i,m_i}) \le L\sum_{i\in J}\sum_{j=1}^{m_i-1} d_{G_t(S)}(z_{i,j}, z_{i,j+1}).$$
By the edge-disjointness of the paths (specifically, since $\{\{z_{i,j}, z_{i,j+1}\} : i \in J \wedge j \in \{1,\dots,m_i-1\}\}$ are distinct edges in $E_G$), Lemma 5.4.1(2) gives
$$\sum_{i\in J}\sum_{j=1}^{m_i-1} d_{G_t(S)}(z_{i,j}, z_{i,j+1}) \le \sum_{\{u,v\}\in E_G} d_{G_t(S)}(u,v) = \frac{nd}{2}\left(1+\frac{2t|S|}{n}\right).$$
Everything together gives
$$\frac{L\,nd}{2}\left(1+\frac{2t|S|}{n}\right) \ge \frac{|J|\log|J|}{16\log d}.$$
By the simple fact that $x\log x \le y \implies x \le \frac{2y}{\log y}$ for $x \in [1,\infty)$, $y \in (1,\infty)$, applied with $x = |J|$ and
$$y = 8\,L\,n\,d\,\log d\left(1+\frac{2t|S|}{n}\right) \ge n,$$
we have
$$|J| \le \frac{16\,d\,L\,n\log d}{\log n}\left(1+\frac{2t|S|}{n}\right),$$
completing the proof. $\square$
The lemma has two corollaries, both depending on the following pair of conditions:
$$d^{16(R-t)} \le \frac{\phi|S|}{8}, \qquad L \le \frac{\phi\,|S|\log n}{128\left(1+\frac{2t|S|}{n}\right) d\,n\log d}. \tag{3}$$
Corollary 5.5.2 (Corollary 11). $\max_{x\in V_G} |B_R(x)| \le \frac{|S|}{2}$.

Proof. Pick $x \in V_G$. If $B_R(x)\cap S$ is nonempty, then by the expander average bound (Lemma 5.4.1(1)) there exist $y,z \in B_R(x)\cap S$ with
$$d_G(y,z) \ge \frac{\log|B_R(x)\cap S|}{4\log d}.$$
Since $y,z \in S$ and $F$ restricted to $S$ equals $f$, an isometry on $(S, d_{G_t(S)})$, we have, similarly to before,
$$d_G(y,z)+2t = d_{G_t(S)}(y,z) = \|F(y)-F(z)\| \le \|F(y)-F(x)\| + \|F(x)-F(z)\| \le 2R,$$
where we used first the triangle inequality and second the fact that $y,z \in B_R(x)$. Combining the two displays,
$$|B_R(x)\cap S| \le d^{8(R-t)} \le \sqrt{\frac{\phi|S|}{8}} \le \frac{2|S|}{5},$$
where the middle inequality comes from the first assumption in (3). This implies $|S\setminus B_R(x)| \ge \frac{3|S|}{5}$, which combined with (2) and Lemma 10 yields
$$\min\left\{\frac{3|S|}{5},\, |B_R(x)|\right\} \le \frac{4}{\phi}|J| \le \max\left\{\frac{4\,d^{16(R-t)}}{\phi},\ \frac{64\,d\,L\,n\log d}{\phi\log n}\left(1+\frac{2t|S|}{n}\right)\right\}.$$
However, the two assumptions in (3) imply that $\frac{3|S|}{5}$ is in fact greater than either value in the maximum, so
$$|B_R(x)| \le \max\left\{\frac{4\,d^{16(R-t)}}{\phi},\ \frac{64\,d\,L\,n\log d}{\phi\log n}\left(1+\frac{2t|S|}{n}\right)\right\} \le \frac{|S|}{2}. \qquad\square$$
Corollary 5.5.3 (Corollary 12).
$$L \ge \frac{R\,\phi}{2\left(1+\frac{\mathrm{diam}(G,d_G)}{2t}\right)\left(1+\frac{2t|S|}{n}\right)}.$$
Proof. By the definition of $B_R(x)$, for $x \in V_G$ and $y \in V_G\setminus B_R(x)$ we have $\|F(x)-F(y)\|_{W_1(S,d_{G_t(S)})} > R$, so
$$\frac{1}{n^2}\sum_{x,y\in V_G} \|F(x)-F(y)\| \ge \frac{1}{n^2}\sum_{x\in V_G}\ \sum_{y\in V_G\setminus B_R(x)} \|F(x)-F(y)\| \ge \frac{R}{n^2}\sum_{x\in V_G}\big(n - |B_R(x)|\big) \ge R\left(1-\frac{\max_{x} |B_R(x)|}{n}\right) \ge \frac{R}{2},$$
where the last inequality comes from Corollary 11 (and $|S| \le n$). On the other hand, Lemma 8 gives
$$\frac{1}{n^2}\sum_{x,y\in V_G}\|F(x)-F(y)\| \le \frac{2t+\mathrm{diam}(V_G,d_G)}{(2t+1)\,\phi\,|E_G|}\sum_{(x,y)\in E_G}\|F(x)-F(y)\| \le \frac{L\left(1+\frac{\mathrm{diam}(G,d_G)}{2t}\right)}{\phi\,|E_G|}\sum_{\{x,y\}\in E_G} d_{G_t(S)}(x,y) = \frac{L\left(1+\frac{\mathrm{diam}(G,d_G)}{2t}\right)\left(1+\frac{2t|S|}{n}\right)}{\phi},$$
where the inequality on the second step uses the Lipschitz condition on $F$ together with $\frac{2t+\mathrm{diam}}{2t+1} \le 1+\frac{\mathrm{diam}}{2t}$, and the equality is the average edge length of the $t$-magnification (Lemma 5.4.1(2)). Combining these inequalities with the previous ones yields the corollary. $\square$
Theorem 5.5.4 (Theorem 13). If $0 < t \le \mathrm{diam}(G,d_G)$, then for a universal constant $C > 0$,
$$L \ge \frac{C\,\phi}{1+\frac{t|S|}{n}}\,\min\left\{\frac{|S|\log n}{d\,n\log d},\ \frac{16t^2\log d + t\log\left(\frac{\phi|S|}{8}\right)}{\mathrm{diam}(G,d_G)\log d}\right\}.$$
Proof. Assume $16t\log d + \log\left(\frac{\phi|S|}{8}\right) > 0$ (otherwise we are done), and choose
$$R = t + \frac{\log\left(\frac{\phi|S|}{8}\right)}{16\log d},$$
so that $R > 0$ and $d^{16(R-t)} = \frac{\phi|S|}{8}$. Then the first inequality in (3) is satisfied. So either the second fails, and $L$ then has its right-hand side as a lower bound (the first term of the minimum); or both are satisfied, and we have the lower bound of Corollary 12, which for this choice of $R$ and $t \le \mathrm{diam}(G,d_G)$ gives the second term of the minimum. $\square$
Theorem 5.5.5 (Theorem 1). For every $n \in \mathbb{N}$ we have $\mathrm{ae}(n) \gtrsim \sqrt{\log n}$.

Proof. Substitute $\phi \asymp 1$ and $\mathrm{diam}(G,d_G) \asymp \frac{\log n}{\log d}$ (the $\asymp$ indicating asymptotics for large $n$; constant-degree expanders achieve both), and use the assumption $0 < t \le \mathrm{diam}(G,d_G)$. The lower bound of Theorem 13 becomes
$$L \gtrsim \frac{1}{1+\frac{t|S|}{n}}\,\min\left\{\frac{|S|\log n}{d\,n\log d},\ \frac{t\,(t\log d + \log|S|)}{\log n}\right\}.$$
Take $S \subseteq V_G$ with
$$|S| = \left\lceil \frac{n\sqrt{d\log d}}{\sqrt{\log n}}\right\rceil$$
(we must have $d\log d \lesssim \log n$ to ensure $|S| \le n$) and $t \asymp \sqrt{\frac{\log n}{d\log d}}$. Then $t|S|/n \asymp 1$ and both terms of the minimum are $\asymp \sqrt{\frac{\log n}{d\log d}}$, which gives the lower bound $L \gtrsim \sqrt{\frac{\log n}{d\log d}}$. Therefore
$$e_{\left\lceil n\sqrt{d\log d}/\sqrt{\log n}\right\rceil}\Big((V_G, d_{G_t(S)}),\ W_1(S, d_{G_t(S)})\Big) \gtrsim \sqrt{\frac{\log n}{d\log d}},$$
which completes the proof for fixed $d$: taking $d = O(1)$, a set of $m \asymp n/\sqrt{\log n}$ points witnesses $\mathrm{ae}(m) \gtrsim \sqrt{\log n} \asymp \sqrt{\log m}$. $\square$
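The balancing in the last proof can be checked numerically: with the stated choices of $t$ and $|S|$ (before rounding), the prefactor $t|S|/n$ is exactly 1 and the first term of the minimum equals the target $\sqrt{\log n/(d\log d)}$. A quick sketch of mine:

```python
import math

n, d = 10 ** 9, 3                                   # a large n and a fixed degree
t = math.sqrt(math.log(n) / (d * math.log(d)))      # the chosen magnification
S = n * math.sqrt(d * math.log(d) / math.log(n))    # |S| before rounding

prefactor = t * S / n                               # the t|S|/n term; equals 1
term1 = S * math.log(n) / (d * n * math.log(d))     # first term of the min
target = math.sqrt(math.log(n) / (d * math.log(d)))
print(prefactor, abs(term1 - target))               # ~1.0 and ~0.0
```

So the choice of parameters makes the two competing constraints (the condition (3) failing vs. Corollary 12) bite simultaneously, which is exactly what optimizing a min requires.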
List of Notations

$\delta_{X\hookrightarrow Y}(\varepsilon)$ — supremum over all those $\delta > 0$ such that for every $\delta$-net $\mathcal{N}_\delta$ in $B_X$, $c_Y(\mathcal{N}_\delta) \ge (1-\varepsilon)c_Y(X)$, 16

$\mathbb{R}^A$ — $\{(x_1,\dots,x_n) \in \mathbb{R}^n : x_i = 0 \text{ if } i \notin A\}$, 19

$\mathrm{srank}(A)$ — stable rank, $\left(\frac{\|A\|_{S_2}}{\|A\|_{S_\infty}}\right)^2$, 21

$\|A\|_{S_p}$ — Schatten–von Neumann $p$-norm, 20

$c_Y(X)$ — infimum of $D$ such that $X$ embeds into $Y$ with distortion $D$, 16
Index

Bourgain's almost extension theorem, 49
Corson–Klee Lemma, 13
cotype $q$, 9
discretization modulus, 15, 16
distortion, 15
Enflo type, 82
Enflo type $p$, 11
finitely representable, 7
Jordan–von Neumann Theorem, 9
Kadec's Theorem, 10
Kahane's inequality, 9
Ky Fan maximum principle, 23
local properties, 7
nonlinear Hahn–Banach Theorem, 73
nontrivial type, 8
parallelogram identity, 9
Pietsch Domination Theorem, 30
Poisson kernel, 50, 57
preprocessing, 102
Rademacher type, 82
restricted invertibility principle, 19
rigidity theorem, 10
Sauer–Shelah, 33
Schatten–von Neumann $p$-norm, 20
shattering set, 33
stable $p$th rank, 22
stable rank, 21
symmetric norm, 25
type, 8
uniformly homeomorphic, 10