Geometric Probability


  • Introduction to Geometric Probability

    This is the first modern introduction to geometric probability, also known as integral geometry. The subject is presented at an elementary level, requiring little more than first-year graduate mathematics. The theory of intrinsic volumes due to Hadwiger, McMullen, Santaló and others is presented, along with a complete and elementary proof of Hadwiger's characterization theorem of invariant measures in Euclidean n-space. The theory of the Euler characteristic is developed from an integral-geometric point of view. The authors then prove the fundamental theorem of integral geometry, namely the kinematic formula. Finally the analogies between invariant measures on polyconvex sets and measures on order ideals of finite partially ordered sets are investigated. The relationship between convex geometry and enumerative combinatorics motivates much of the presentation. Every chapter concludes with a list of unsolved problems. Geometers and combinatorialists will find this a stimulating and fruitful tale.

    Daniel A. Klain is Assistant Professor of Mathematics at Georgia Institute of Technology.

    Gian-Carlo Rota is Professor of Applied Mathematics and Philosophy, Massachusetts Institute of Technology.

  • Lezioni Lincee. Sponsored by Fondazione IBM Italia. Editor: Luigi A. Radicati di Brozolo, Scuola Normale Superiore, Pisa

    This series of books arises from lectures given under the auspices of the Accademia Nazionale dei Lincei and is sponsored by Fondazione IBM Italia. The lectures, given by international authorities, will range on scientific topics from mathematics and physics through to biology and economics. The books are intended for a broad audience of graduate students and faculty members, and are meant to provide a 'mise au point' for the subject with which they deal. The symbol of the Accademia, the lynx, is noted for its sharp-sightedness; the volumes in this series will be penetrating studies of scientific topics of contemporary interest.

    Already published

    Chaotic Evolution and Strange Attractors: D. Ruelle
    Introduction to Polymer Dynamics: P. de Gennes
    The Geometry and Physics of Knots: M. Atiyah
    Attractors for Semigroups and Evolution Equations: O. Ladyzhenskaya
    Asymptotic Behaviour of Solutions of Evolutionary Equations: M. I. Vishik
    Half a Century of Free Radical Chemistry: D. H. R. Barton in collaboration with S. I. Parekh
    Bound Carbohydrates in Nature: L. Warren
    Neural Activity and the Growth of the Brain: D. Purves
    Perspectives in Astrophysical Cosmology: M. Rees
    Molecular Mechanisms in Striated Muscle: S. V. Perry
    Some Asymptotic Problems in the Theory of Partial Differential Equations: O. Oleinik

  • PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE The Pitt Building, Trumpington Street, Cambridge CB2 1RP, United Kingdom

    CAMBRIDGE UNIVERSITY PRESS
    The Edinburgh Building, Cambridge CB2 2RU, UK http://www.cup.cam.ac.uk
    40 West 20th Street, New York, NY 10011-4211, USA http://www.cup.org
    10 Stamford Road, Oakleigh, Melbourne 3166, Australia

    © Cambridge University Press 1997

    This book is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

    First published 1997

    Printed in the United Kingdom at the University Press, Cambridge

    Typeset in Computer Modern 10/13pt

    A catalogue record for this book is available from the British Library

    ISBN 0 521 59362 X hardback
    ISBN 0 521 59654 8 paperback

  • Contents

    Preface
    Using this book

    1 The Buffon needle problem
      1.1 The classical problem
      1.2 The space of lines
      1.3 Notes
    2 Valuation and integral
      2.1 Valuations
      2.2 Groemer's integral theorem
      2.3 Notes
    3 A discrete lattice
      3.1 Subsets of a finite set
      3.2 Valuations on a simplicial complex
      3.3 A discrete analogue of Helly's theorem
      3.4 Notes
    4 The intrinsic volumes for parallelotopes
      4.1 The lattice of parallelotopes
      4.2 Invariant valuations on parallelotopes
      4.3 Notes
    5 The lattice of polyconvex sets
      5.1 Polyconvex sets
      5.2 The Euler characteristic
      5.3 Helly's theorem
      5.4 Lutwak's containment theorem
      5.5 Cauchy's surface area formula
      5.6 Notes
    6 Invariant measures on Grassmannians
      6.1 The lattice of subspaces
      6.2 Computing the flag coefficients
      6.3 Properties of the flag coefficients
      6.4 A continuous analogue of Sperner's theorem
      6.5 A continuous analogue of Meshalkin's theorem
      6.6 Helly's theorem for subspaces
      6.7 Notes
    7 The intrinsic volumes for polyconvex sets
      7.1 The affine Grassmannian
      7.2 The intrinsic volumes and Hadwiger's formula
      7.3 An Euler relation for the intrinsic volumes
      7.4 The mean projection formula
      7.5 Notes
    8 A characterization theorem for volume
      8.1 Simple valuations on polyconvex sets
      8.2 Even and odd valuations
      8.3 The volume theorem
      8.4 The normalization of the intrinsic volumes
      8.5 Lattice points and volume
      8.6 Remarks on Hilbert's third problem
      8.7 Notes
    9 Hadwiger's characterization theorem
      9.1 A proof of Hadwiger's characterization theorem
      9.2 The intrinsic volumes of the unit ball
      9.3 Crofton's formula
      9.4 The mean projection formula revisited
      9.5 Mean cross-sectional volume
      9.6 The Buffon needle problem revisited
      9.7 Intrinsic volumes on products
      9.8 Computing the intrinsic volumes
      9.9 Notes
    10 Kinematic formulas for polyconvex sets
      10.1 The principal kinematic formula
      10.2 Hadwiger's containment theorem
      10.3 Higher kinematic formulas
      10.4 Notes
    11 Polyconvex sets in the sphere
      11.1 Convexity in the sphere
      11.2 A characterization for spherical area
      11.3 Invariant valuations on spherical polytopes
      11.4 Spherical kinematic formulas
      11.5 Remarks on higher dimensional spheres
      11.6 Notes
    Bibliography
    Index of symbols
    Index

  • Preface

    If we were allowed to rename the field of geometric probability - sometimes already renamed integral geometry - then we would be tempted to choose the oxymoron 'continuous combinatorics.' On more than one occasion the two fields, geometric probability and enumerative combinatorics, are brought together by mathematical analogy, that most effective breaker of barriers.

    Like combinatorial enumeration, where sequences of objects bearing a common feature are unified by the idea of a generating function, geometric probability studies sets of geometric objects bearing a common feature, which are unified by the idea of an invariant measure. The basic idea is extremely simple. When considering straight lines, pairs of points, or triangles in space, one determines the invariant measure on the variety of straight lines, of pairs of points, of triangles. This idea is strangely reminiscent of the underlying idea of enumerative geometry, with one major difference: whereas enumerative geometry is bound to the counting of finite sets, geometric probability is given greater freedom, by extending the concept of enumeration to allow the assigning of invariant measures. Invariant measures are far easier to compute and, we dare add, more useful than the curiously large integers that are computed in enumerative geometry. This basic idea goes back to Crofton's article in the ninth edition of the Encyclopaedia Britannica, an article that created the subject from scratch and that is still worth reading today. The one other brilliant contribution to geometric probability in the past century was Barbier's solution of the Buffon needle problem, which remains to this day the basic trick of the subject, still being secretly exploited in ever unsuspected ways.

    Geometric probability has suffered in this century the fate of other fields that would have enjoyed a healthy autonomous development, had it not been for the overpowering development of representation theory. One can reduce integral geometry to the study of actions of Lie groups, to symmetric spaces, to the Radon transform; in so doing, however, the authentic problematic of the subject is lost. Geometric probability is a customer of representation theory, in the same sense that mechanics is a customer of the calculus.

    The purpose of this book is to present the three basic ideas of geometric probability, stripped of all reliance on group-theoretic techniques. First, we investigate measures on polyconvex sets (i.e., finite unions of compact convex sets) in Euclidean spaces of arbitrary dimension that are invariant under the group of Euclidean motions. A great many mathematicians are still basking in the illusion that there is only one such measure, namely, the volume. We merrily destroy this illusion by proving what is at present the fundamental result of the field (due to Hadwiger), stating that the space of such invariant measures is of dimension n + 1 in a Euclidean space of dimension n. The proof of this fundamental result given in the text is new, due to the first author. It becomes clear, on reading the applications of the fundamental theorem, that the basic invariant measure to be singled out from such a bounty is not the volume, but the Euler characteristic (as Steve Schanuel was first to realize). Here again we meet with wide ignorance on the part of the mathematical public: the fundamental fact that the Euler characteristic is an invariant measure (in fact, it is the only integer-valued invariant measure) is not as well known as it should be. It leads to one-line proofs of most of the fundamental theorems on convex sets. We develop the theory of the Euler characteristic from scratch, in a way that makes it look like an ordinary integral.

    Second, we prove the fundamental formula of integral geometry, viz., the kinematic formula. Here we displace the common device of Minkowski sums from its typically central role, not merely as a display of mathematical machismo, but with an ulterior motive.

    Third, we try to bring out from the beginning the striking analogy between the computation of invariant measures and certain combinatorial properties of finite partially ordered sets. The second author pointed out in 1967 that the notion of Euler characteristic could be extended to such partially ordered sets by means of the Möbius function. We now go one step further and show that an analogue of the theory of invariant measures in Euclidean space can be worked out in partially ordered sets, including finite analogues of the kinematic formula and even of Helly's theorem. This analogy brings out in stark contrast the unexplored terrain of classical geometric probability, namely, a thorough understanding of the integral geometric structure of the lattice of subspaces of Euclidean space under the action of the orthogonal group. It also brings us closer to the current outer limits of mathematics, to the theory of Hecke algebras, to Schubert varieties and to the quantum world.

    We hope that the reading of this introduction to the field of geometric probability will encourage further development of these analogies.

    The text is based on the 'Lezioni Lincee' given by the second author in 1986 at the Scuola Normale Superiore in Pisa. The authors wish to thank Ennio De Giorgi, Edoardo Vesentini, and Luigi Radicati for providing an interested audience for the original lectures. Thanks are also due to Stefano Mortola for his careful reading of the initial draft, and to Beifang Chen, Steve Fisk, Joseph Fu, Steven Holt, Erwin Lutwak, and three anonymous referees for their valuable comments and suggestions.

  • Using this book

    Although parts of this book assume a knowledge of basic point-set topology, measure theory, and elementary probability theory, the greater part of the text should be accessible to advanced undergraduates. Proofs are either given in full or else sketched to the point from which the reader will be able to reconstruct them without effort. Only on certain technical measure-theoretic points have we felt the need to omit details that, although indispensable in a detailed treatment, are of questionable relevance in an exposition that is meant to stress geometric insight and combinatorial analogy. Some notions that appear vague in the early sections will be revisited later on, after language has been developed for a treatment in clear and rigorous terms. References and open problems are deferred to the notes at the end of each chapter.

  • 1

    The Buffon needle problem

    We begin with what is probably the best-known problem of geometric probability, the Buffon needle problem. This solution of the needle problem via the characterization of an additive set functional serves to motivate the study of valuations on lattices, the topic of Chapter 2. Variations and generalizations of the Buffon needle problem are presented in Chapters 8 and 9.

    1.1 The classical problem

    Parallel straight lines are drawn on the plane $\mathbf{R}^2$, at a distance $d$ from each other. A needle of length $L$ is dropped at random on the plane. What is the probability that the needle shall meet at least one of the lines?

    This problem can be solved by computations with conditional probability (Feller, for example, solved it in this way in his well-known treatise [23, p. 61]). It is, however, more instructive to solve it by another method, one that minimizes the amount of computation and maximizes the role of probabilistic reasoning.

    Let $X_1$ be the number of intersections of a randomly dropped needle of length $L_1$ with any of the parallel straight lines. If the needle is long enough, the random variable $X_1$ can take several integer values, whereas if the needle is short, it can take only the values 0 or 1.

    If $p_n$ is the probability that the needle meets exactly $n$ of the straight lines, and if $E(X_1)$ denotes the expectation of the random variable $X_1$, then we have

    \[ E(X_1) = \sum_{n \ge 0} n\,p_n. \]

    Thus, if $L_1 < d$, then

    \[ E(X_1) = p_1, \]

    and $p_1$ is the probability we seek. Therefore, it is sufficient to compute the expectation $E(X_1)$.

    Suppose that another needle of length $L_2$ is dropped at random. The number of intersections of this second needle with any of the parallel straight lines drawn on $\mathbf{R}^2$ is another random variable, say $X_2$. The random variables $X_1$ and $X_2$ are independent, unless the needles are welded together. Suppose that the needles are rigidly bound at one of their endpoints. They may form a straight line, or they may be at an angle. In either case, if the two rigidly bound needles are simultaneously dropped on $\mathbf{R}^2$, their total number of intersections will still be $X_1 + X_2$. The random variables $X_1$ and $X_2$ will no longer be independent, but their expectation will remain additive:

    \[ E(X_1 + X_2) = E(X_1) + E(X_2). \tag{1.1} \]

    The same reasoning applies to the random variable $X_1 + X_2 + \cdots + X_k$, for the case in which $k$ needles are welded together to form a polygonal line of arbitrary shape.
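    The claim that the expected number of crossings of a welded pair of needles does not depend on the angle between them can be checked numerically. The following is a minimal sketch, not part of the original text. It assumes the parallel lines are y = kd for integer k, and that a random drop means a uniform height in [0, d) and a uniform direction; the function names and the parameters `bend` and `trials` are illustrative choices.

```python
import math
import random


def crossings(y_start, y_end, d):
    """Number of lines y = k*d (k an integer) crossed by a segment
    whose endpoints have heights y_start and y_end."""
    return abs(math.floor(y_end / d) - math.floor(y_start / d))


def mean_crossings_bent(L1, L2, bend, d, trials, rng):
    """Average of X1 + X2 for two needles of lengths L1 and L2 welded
    at one endpoint, the second turned by the fixed angle `bend`."""
    total = 0
    for _ in range(trials):
        y0 = rng.uniform(0.0, d)                 # height of the free end of needle 1
        theta = rng.uniform(0.0, 2.0 * math.pi)  # direction of needle 1
        y1 = y0 + L1 * math.sin(theta)           # height of the weld point
        y2 = y1 + L2 * math.sin(theta + bend)    # height of the free end of needle 2
        total += crossings(y0, y1, d) + crossings(y1, y2, d)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    for bend in (0.0, math.pi / 3, math.pi / 2):
        print(bend, mean_crossings_bent(0.6, 0.9, bend, 1.0, 200_000, rng))
```

    Under these assumptions the printed averages should agree up to sampling noise, whatever the bend angle, which is what additivity of expectation predicts.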

    Since $E(X_1)$ clearly depends on the length $L_1$, we can write $E(X_1) = f(L_1)$, where $f$ is a function to be determined. By welding together two needles so that they form one straight line we find that $E(X_1 + X_2) = f(L_1 + L_2)$, and we infer from (1.1)

    \[ f(L_1 + L_2) = f(L_1) + f(L_2). \]

    It then follows that $f$ is linear when restricted to rational values of $L$. Since $f$ is clearly a monotonically increasing function with respect to $L$, we infer that $f(L) = rL$ for all $L \in \mathbf{R}$, where the constant $r$ is to be determined.
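    The passage from additivity to linearity can be spelled out as follows. This is a standard argument supplied here as a sketch; the symbols $m$, $n$, $q_1$, $q_2$ are introduced only for this illustration, and lengths are taken to be positive.

```latex
\begin{align*}
f(nL) &= \underbrace{f(L) + \cdots + f(L)}_{n \text{ terms}} = n\,f(L)
  && \text{for every positive integer } n,\\
f(L) &= f\!\left(n \cdot \tfrac{L}{n}\right) = n\,f\!\left(\tfrac{L}{n}\right)
  && \text{so that } f\!\left(\tfrac{L}{n}\right) = \tfrac{1}{n}\,f(L),\\
f\!\left(\tfrac{m}{n}\right) &= m\,f\!\left(\tfrac{1}{n}\right) = \tfrac{m}{n}\,f(1) = r\,\tfrac{m}{n}
  && \text{with } r = f(1).
\end{align*}
% Monotonicity then extends the formula from rational to real lengths:
% for rationals q_1 \le L \le q_2 we have
\[
  r\,q_1 = f(q_1) \le f(L) \le f(q_2) = r\,q_2 ,
\]
% and letting q_1 and q_2 tend to L gives f(L) = rL.
```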

    If $C$ is a rigid wire of length $L$, dropped randomly on $\mathbf{R}^2$, and if $Y$ is the number of intersections of $C$ with any of the straight lines, then $C$ can be approximated by polygonal wires, so that $Y$ is approximately equal to $X_1 + X_2 + \cdots + X_k$. Passing to the limit, we find that

    \[ E(Y) = rL. \tag{1.2} \]

    This allows us to determine the value of the constant $r$, by choosing a wire of suitable shape. Let $C$ be a circular wire of diameter $d$. Obviously $E(Y) = 2$, and $L = \pi d$. It then follows from (1.2) that

    \[ 2 = r \pi d, \]

    whence $r = 2/(\pi d)$. Thus, for a short needle, we have

    \[ E(X_1) = p_1 = \frac{2L}{\pi d}. \]

    This result has been used (rather inefficiently) to compute the value of $\pi$. Instead, we shall use it as the theorem leading into the heart of geometric probability, following the ideas of Crofton and Sylvester.
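    The formula for a short needle is easy to test by simulation. The sketch below is an illustration added here, not the authors' method. It again assumes lines y = kd and a uniformly random height and direction, and the function name `buffon_hit_fraction` is hypothetical. Note that the sampler itself uses `math.pi` to draw the angle, so the resulting estimate of π is a consistency check rather than an independent computation.

```python
import math
import random


def buffon_hit_fraction(L, d, trials, rng):
    """Fraction of random drops in which a needle of length L < d
    meets one of the lines y = k*d."""
    hits = 0
    for _ in range(trials):
        y0 = rng.uniform(0.0, d)            # lower endpoint, between line 0 and line 1
        theta = rng.uniform(0.0, math.pi)   # direction, so that sin(theta) >= 0
        y1 = y0 + L * math.sin(theta)       # upper endpoint
        if math.floor(y1 / d) != math.floor(y0 / d):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(1)
    L, d = 0.5, 1.0
    p1 = buffon_hit_fraction(L, d, 200_000, rng)
    print("estimated p1:", p1)
    print("2L/(pi d):   ", 2 * L / (math.pi * d))
    print("implied pi:  ", 2 * L / (d * p1))
```

    The slow convergence of the implied value of π, whose error shrinks only like the inverse square root of the number of trials, is one way to see why the text calls the method inefficient.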

    1.2 The space of lines

    Let Graff(2,1) denote the set of all straight lines in $\mathbf{R}^2$ (the reason for this notation shall be made clearer in Chapter 7). It is well known that this set enjoys some notable properties.

    To this end, denote by $Z_1$ the number of intersections of a straight line taken at random with a straight line segment of length $L_1$, and let $\lambda_1^2$ denote the invariant measure on Graff(2,1). The integral

    \[ \int_{\mathrm{Graff}(2,1)} Z_1 \, d\lambda_1^2 \]

    depends only on $L_1$. Since $Z_1$ takes only the values 0 or 1, this integral is equal to the measure of the set of all straight lines that meet the given straight line segment. Since the value of the integral depends only on the length $L_1$ of the straight line segment, denote this value by $f(L_1)$. We can now repeat the argument we used for the Buffon needle problem: given a polygonal line consisting of segments of length $L_1, L_2, \ldots$, the number of intersections of a randomly chosen straight line with the polygonal line is

    \[ \int_{\mathrm{Graff}(2,1)} (Z_1 + Z_2 + \cdots) \, d\lambda_1^2 = f(L_1 + L_2 + \cdots). \]

    Since integrals are linear, this becomes

    \[ \int_{\mathrm{Graff}(2,1)} Z_1 \, d\lambda_1^2 + \int_{\mathrm{Graff}(2,1)} Z_2 \, d\lambda_1^2 + \cdots = f(L_1) + f(L_2) + \cdots, \]

    and we again conclude that $f(L) = rL$. We shall not normalize the measure $\lambda_1^2$ by setting $r = 1$; rather, we shall decide later what the 'right' normalization should be.

    Again we may pass to the limit. Recall that a subset $K$ of the plane is convex if any two points $x$ and $y$ in $K$ are the endpoints of a line segment lying inside $K$. A curve $C$ in the plane is called convex if $C$ encloses a convex subset. Let $C$ be a convex curve in the plane of length $L$, and let $Z_C$ be the number of intersections of $C$ with a randomly chosen straight line. Then

    \[ \int_{\mathrm{Graff}(2,1)} Z_C \, d\lambda_1^2 = rL. \]

    In particular, let $K_1$ and $K_2$ be compact convex sets in the plane with non-empty interiors, and with boundaries $C_1 = \partial K_1$ and $C_2 = \partial K_2$ of length $L_1$ and $L_2$. For each $i$, we have

    \[ \int_{\mathrm{Graff}(2,1)} Z_{C_i} \, d\lambda_1^2 = rL_i. \]

    On the other hand, since $K_i$ is convex, a straight line meets $\partial K_i$ either twice or not at all (excluding the limiting cases of tangents, which can be shown to have measure zero). Thus, the function $Z_{C_i}$ takes either the value 2 or the value 0. If we denote by $D_i$ the set of all straight lines in $\mathbf{R}^2$ that meet $K_i$, then we have

    \[ \int_{\mathrm{Graff}(2,1)} Z_{C_i} \, d\lambda_1^2 = 2\lambda_1^2(D_i). \]

    To restate these results in terms of probability, assume that $K_1 \subseteq K_2$. The conditional probability that a straight line shall meet the compact convex set $K_1$, given that it meets $K_2$, is the ratio

    \[ \frac{\lambda_1^2(D_1)}{\lambda_1^2(D_2)}. \]

    The computation above shows that this ratio is equal to

    \[ \frac{L_1}{L_2} = \frac{\operatorname{length}(\partial K_1)}{\operatorname{length}(\partial K_2)}. \]

    Note that the value of the normalization constant $r$ is irrelevant to the computation of this conditional probability.

    The results above (sometimes designated Sylvester's theorem) can be compared to the analogous result for points: if $K_1 \subseteq K_2$, the conditional probability that a point taken at random shall belong to $K_1$, given that it belongs to $K_2$, is

    \[ \frac{\operatorname{area}(K_1)}{\operatorname{area}(K_2)}. \]

    Thus, we see a striking analogy: replacing every occurrence of the word 'point' by the word 'line' corresponds to replacing the word 'area' by the word 'perimeter.' This analogy suggests that a generalization of Sylvester's theorem to arbitrary dimension may prove worthwhile.
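    The perimeter ratio can be observed in a small experiment. The sketch below is an added illustration under an assumption the text has not yet established: a line is described by its unit normal direction θ in [0, π) and signed distance p from the origin, and the motion-invariant measure is taken to be proportional to dθ dp. Here K_2 is the disc of radius R about the origin and K_1 is the square [-a, a]² inside it; all function names are hypothetical.

```python
import math
import random


def line_meets_square(theta, p, a):
    """True if the line {x : x . (cos theta, sin theta) = p} meets [-a, a]^2.
    The half-width of the square in direction theta is a(|cos| + |sin|)."""
    return abs(p) <= a * (abs(math.cos(theta)) + abs(math.sin(theta)))


def conditional_hit_fraction(a, R, trials, rng):
    """Among random lines meeting the disc of radius R, the fraction
    that also meet the square [-a, a]^2 (requires a * sqrt(2) <= R)."""
    hits = 0
    for _ in range(trials):
        theta = rng.uniform(0.0, math.pi)
        p = rng.uniform(-R, R)      # |p| <= R: the line meets the disc
        if line_meets_square(theta, p, a):
            hits += 1
    return hits / trials


def perimeter_ratio(a, R):
    """Perimeter of the square divided by the circumference of the disc."""
    return (8.0 * a) / (2.0 * math.pi * R)


if __name__ == "__main__":
    rng = random.Random(2)
    print("simulated:", conditional_hit_fraction(0.5, 1.0, 200_000, rng))
    print("perimeters:", perimeter_ratio(0.5, 1.0))
```

    For comparison, the area ratio for the same pair of sets is 1/π, which is noticeably smaller than the perimeter ratio 2/π; lines and points see the same convex sets differently.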

    1.3 Notes

    The solution to Buffon's needle problem presented here is due to Barbier [5], and was later generalized still further by Crofton in [14, 15, 16]. Crofton's main paper, which set geometric probability on its modern footing, is the Encyclopaedia Britannica article [17]. It is still an excellent reference.

    In [95] Sylvester considered a variation of the Buffon needle problem in which the needle is replaced by a finite rigid collection of compact convex (and possibly disjoint) sets $K_1, \ldots, K_m$ tossed randomly into a plane tiled by evenly spaced lines. Sylvester then considered the cases in which a line meets one, some, or all of the sets $K_i$. In the previous section we measured the set of all lines meeting a compact convex set $K$ in the plane. When dealing with multiple convex sets Sylvester was led to consider also the measure of the set of lines that separate two disjoint compact convex regions of the plane. This theme has also been pursued extensively in the work of Ambartzumian [1, 2].

    Buffon's result gives a very inefficient means of approximating the number $\pi$; for a history of this technique, see [30]. For additional modern treatments of geometric probability in the plane, see also [1, 2, 49, 82, 90].

  • 2

    Valuation and integral

    In Chapter 1 we expressed the Buffon needle problem in terms of a set functional (1.1) on a certain collection of sets in the plane satisfying a certain kind of additivity. We then solved the problem by characterizing this additive functional in (1.2), using in this case the fact that the functional was monotonically increasing and invariant with respect to certain motions of sets in the plane.

    In this chapter we make more precise the notion of 'additive set functional', or valuation, on a lattice of sets. The abstract notions developed in this chapter will then be specialized to several different specific lattices in the chapters following, leading in turn to similarly elegant solutions to generalizations and analogues of Buffon's original problem. Section 2.2 is devoted to Groemer's integral theorem, which is needed to prove Groemer's extension theorems in Sections 4.1, 5.1, and 11.1.

    2.1 Valuations

    We now introduce a class of set functions that comprise the most basic and important tools of geometric probability, namely valuations. We begin with partially ordered sets and lattices. A partial ordering $\le$ on a set $L$ is a relation satisfying the following conditions for all $x, y, z \in L$.

    (i) $x \le x$.
    (ii) If $x \le y$ and $y \le x$ then $x = y$.
    (iii) If $x \le y$ and $y \le z$ then $x \le z$.

    The partially ordered set $L$ is called a lattice if, for all $x, y \in L$, there exist a greatest lower bound (or meet) $x \wedge y \in L$ and a least upper bound (or join) $x \vee y \in L$. A lattice $L$ is said to be distributive if, for all $x, y, z \in L$, we have the following.

    (i) $x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z)$.
    (ii) $x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z)$.

    Let $S$ be a set, and let $L$ be a family of subsets of $S$ closed under finite unions and finite intersections. Such a family is clearly a distributive lattice, in which the partial ordering is given by subset inclusion, while the meet and join are given by intersection and union of sets, respectively.

    A valuation on a lattice $L$ of sets is a function $\mu$ defined on $L$ that takes real values, and that satisfies the following conditions:

    \[ \mu(A \cup B) = \mu(A) + \mu(B) - \mu(A \cap B), \tag{2.1} \]

    \[ \mu(\emptyset) = 0, \quad \text{where } \emptyset \text{ is the empty set.} \tag{2.2} \]

    By iterating the identity (2.1) we obtain the inclusion-exclusion principle for a valuation $\mu$ on a lattice $L$, namely

    \[ \mu(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_{i} \mu(A_i) - \sum_{i<j} \mu(A_i \cap A_j) + \sum_{i<j<k} \mu(A_i \cap A_j \cap A_k) + \cdots \tag{2.3} \]
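    A concrete valuation makes (2.1) and (2.3) easy to test. The sketch below is an added illustration: it takes the lattice of all subsets of a small finite set and a weighted counting function as the valuation. The weights and the names `weighted` and `inclusion_exclusion` are hypothetical choices made for this example.

```python
from itertools import combinations

# Illustrative weights on the points 1..6; negative weights are allowed,
# since a valuation need not be non-negative.
WEIGHTS = {1: 2, 2: -1, 3: 5, 4: 0, 5: 3, 6: 7}


def weighted(A):
    """A valuation on subsets of {1, ..., 6}: the total weight of A."""
    return sum(WEIGHTS[x] for x in A)


def satisfies_valuation_identity(mu, A, B):
    """Check identity (2.1) for one pair of sets."""
    return mu(A | B) == mu(A) + mu(B) - mu(A & B)


def inclusion_exclusion(mu, sets):
    """Right-hand side of (2.3): alternating sum over all non-empty
    subfamilies of `sets` of mu applied to their intersection."""
    total = 0
    for k in range(1, len(sets) + 1):
        sign = (-1) ** (k + 1)
        for group in combinations(sets, k):
            total += sign * mu(frozenset.intersection(*group))
    return total


if __name__ == "__main__":
    A = frozenset({1, 2, 3})
    B = frozenset({3, 4})
    C = frozenset({1, 4, 5, 6})
    print(satisfies_valuation_identity(weighted, A, B))
    print(weighted(A | B | C), inclusion_exclusion(weighted, [A, B, C]))
```

    Replacing `weighted` by `len` gives the counting valuation, for which (2.3) is the familiar inclusion-exclusion formula for the size of a union.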

    \[ I_{A_1 \cup A_2 \cup \cdots \cup A_n} = 1 - (1 - I_{A_1})(1 - I_{A_2}) \cdots (1 - I_{A_n}) = \sum_{i} I_{A_i} - \sum_{i<j} I_{A_i \cap A_j} + \sum_{i<j<k} I_{A_i \cap A_j \cap A_k} + \cdots \tag{2.7} \]

  • 2.2 Groemer's integral theorem

    Theorem 2.2.1 (Groemer's integral theorem) Let $G$ be a generating set for a lattice $L$, and let $\mu$ be a valuation on $G$. The following statements are equivalent.

    (i) $\mu$ extends uniquely to a valuation on $L$.
    (ii) $\mu$ satisfies the inclusion-exclusion identities

    \[ \mu(B_1 \cup B_2 \cup \cdots \cup B_n) = \sum_{i} \mu(B_i) - \sum_{i<j} \mu(B_i \cap B_j) + \cdots, \tag{2.11} \]

    Suppose that $x \in L_q - \bigcup_{j=q+1}^{p} L_j$. Then (2.14) implies that

    \[ \alpha_q = \sum_{i=q}^{p} \alpha_i h_i(x) = 0, \]

    contradicting our assumption. It follows that

    so that

    For $i > q$, note that $L_q \cap L_i = L_j$, where $j > q$. Using the principle of inclusion-exclusion (ii) we obtain

    so that

    \[ \sum_{i=q+1}^{p} \beta_i \mu(L_i) \ne 0 \tag{2.16} \]

    by (2.15), where each $\beta_i$ is obtained by collecting the terms containing $\mu(L_i)$. Meanwhile, application of the same inclusion-exclusion procedure to the indicator functions yields

    so that

    \[ \sum_{i=q+1}^{p} \beta_i h_i = 0 \tag{2.17} \]

    by (2.14). Together (2.16) and (2.17) contradict the maximality of q. This completes the proof that (ii) implies (iii).

    To show that (iii) implies (i), suppose that the function $\mu$ defines an integral on the space of $L$-simple functions. For $A \in L$ define

    \[ \mu(A) = \int I_A \, d\mu. \]

    The linearity of the integral together with the identity (2.6) implies that this extension of $\mu$ is a valuation on $L$. □

    A linear functional $T$ on the vector space of simple functions determines a valuation $\mu$ by setting

    \[ \mu(A) = T(I_A) \]

    for every $A \in L$. It is easily verified that, for a simple function $f$,

    \[ T(f) = \int f \, d\mu. \]

    Thus, insofar as simple functions are concerned, there is a bijective correspondence between linear functionals and valuations.

    Let $B(L)$ be the relative Boolean algebra generated by the distributive lattice $L$; that is, the smallest family of subsets of $S$ containing $L$ that is closed under finite unions, finite intersections, and relative complements. Note that for $A, B \in L$

    (2.18)

    Let $I(L)$ denote the algebra of simple functions generated by finite sums, products, and differences of indicator functions of sets in $L$. It follows from (2.5), (2.6), and (2.18) that $I_C \in I(L)$ for all $C \in B(L)$.

    Corollary 2.2.2 A valuation $\mu$ defined on a distributive lattice $L$ has a uniquely defined extension to the Boolean algebra $B(L)$.

    Proof By Theorem 2.2.1, $\mu$ defines an integral on the space of indicator functions $I(L)$. For $C \in B(L)$ define

    \[ \mu(C) = \int I_C \, d\mu. \]

    The linearity of the integral together with identity (2.7) implies that this extension of $\mu$ is a valuation on $B(L)$. □

    2.3 Notes

    The study of valuations, while natural enough on its own as a finitely additive precursor to the measure theory of modern probability, was invigorated especially by interest in dissection problems on polytopes, and Hilbert's third problem in particular [8, 71, 72, 81] (see also Section 8.6). In the sections that follow we shall see that most of the interesting functionals of geometric probability satisfy the valuation property in some respect.

    The integral and extension theorems of Groemer may be found in [32]. McMullen and Schneider gave a thorough survey of the modern theory of valuations on convex bodies in [72], later updated by McMullen in [71].

  • 3

    A discrete lattice

    In this chapter we focus on combinatorial properties of the lattice of subsets of a finite set, properties which carry over in analogous forms to the lattice of parallelotopes in Chapter 4, of subspaces in Chapter 6, of polyconvex sets in Chapters 5, 7-10, and of spherical polyconvex sets in Chapter 11. An especially important result of this chapter is the characterization of valuations invariant under the permutation group. The idea of characterizing valuations invariant with respect to a group action or a set of symmetries is central to our treatment of geometric probability; this theme will recur frequently in the chapters following.

    3.1 Subsets of a finite set

    Let $S$ be a non-empty set with $n$ elements, and denote by $P(S)$ the set of all subsets of $S$, partially ordered by subset inclusion. The set $P(S)$ is a (finite) Boolean algebra of subsets.

    Recall that the union and intersection of sets coincide with the least upper bound and greatest lower bound in the partially ordered set $P(S)$. We denote the elements of $P(S)$ by lower case letters $x$, $y$, etc.

    A segment of $P(S)$, denoted by $[x, y]$, where $x \le y$, consists of all elements $z \in P(S)$ such that $x \le z \le y$. Every segment $[x, y]$ is naturally isomorphic to the Boolean algebra $P(y - x)$.

    A chain in $P(S)$ is a linearly ordered subset; that is, a subset in which, for every pair $x, y$, either $x \le y$ or $y \le x$. An antichain is a subset $A$ of $P(S)$ in which no two distinct elements are comparable.

    For $x \in P(S)$, the rank $r(x)$ is the number of elements of the set $x$. The antichain consisting of all elements of $P(S)$ of rank $k$ shall be denoted by $P_k(S)$. The size, or number of elements, of $P_k(S)$ is the binomial coefficient

    \[ \binom{n}{k}. \]

    A flag in $P(S)$ is naturally identified with a linear order $(s_1, s_2, \ldots, s_n)$ on the one-element subsets $s_i$ of $S$. Hence, there are $n!$ flags in $P(S)$. A flag contains an element $x \in P_k(S)$ whenever $x = s_1 \cup s_2 \cup \cdots \cup s_k$. Thus, there are $k!(n-k)!$ flags containing a particular $x \in P_k(S)$. This elementary argument gives the classical expression for the binomial coefficient:

    \[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!}. \]

    Actually, the same argument can be made to yield a much stronger result. Denote by $|A|$ the size of a finite set $A$.
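    The flag-counting argument can be replayed by brute force on a small set. In the sketch below, added as an illustration, a flag is represented by the linear order it induces on the elements of S, and the function name `flags_through` is a hypothetical choice.

```python
from itertools import permutations
from math import comb, factorial


def flags_through(S, x):
    """Count the linear orders (s_1, ..., s_n) of S whose first |x|
    entries are exactly the elements of x, i.e. the flags containing x."""
    k = len(x)
    target = set(x)
    return sum(1 for order in permutations(S) if set(order[:k]) == target)


if __name__ == "__main__":
    S = range(5)
    x = {0, 2}
    through = flags_through(S, x)
    print(through)                        # flags containing x
    print(factorial(len(S)) // through)   # number of k-subsets, by symmetry
    print(comb(len(S), len(x)))
```

    Since every flag contains exactly one element of each rank and every k-element subset lies on the same number of flags, dividing n! by that number counts the k-element subsets.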

    Theorem 3.1.1 (Sperner's theorem) Let $A$ be an antichain in $P(S)$. Then

    \[ |A| \le \binom{n}{\lfloor n/2 \rfloor}. \]

    Here the expression $\lfloor n/2 \rfloor$ denotes the greatest integer smaller than or equal to $n/2$. Evidently equality is attained in Theorem 3.1.1 when $A = P_{\lfloor n/2 \rfloor}(S)$.

    Proof The proof of this theorem depends on a more precise result, known as the Lubell-Yamamoto-Meshalkin (L.Y.M.) inequality. Let $A$ be an antichain in $P(S)$, and let $A_k$ consist of all elements of $A$ of rank $k$. Then

    \[ \sum_{k=0}^{n} \frac{|A_k|}{\binom{n}{k}} \le 1. \tag{3.1} \]

    To prove (3.1), notice that every flag meets $A$ in at most one element of $P(S)$. Therefore, the number $p$ of flags meeting $A$ is

    \[ p = \sum_{k=0}^{n} k!\,(n-k)!\,|A_k|. \]

    Since there are $n!$ flags in $P(S)$, we have $p \le n!$. Dividing by $n!$ proves the L.Y.M. inequality (3.1).

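    To complete the proof of Sperner's theorem, recall that

    \[ \binom{n}{k} \le \binom{n}{\lfloor n/2 \rfloor} \]

    for all $0 \le k \le n$. The L.Y.M. inequality (3.1) now gives

    \[ \sum_{k=0}^{n} \frac{|A_k|}{\binom{n}{\lfloor n/2 \rfloor}} \le \sum_{k=0}^{n} \frac{|A_k|}{\binom{n}{k}} \le 1; \]

    that is,

    \[ |A| = \sum_{k=0}^{n} |A_k| \le \binom{n}{\lfloor n/2 \rfloor}. \]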
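    For a very small ground set, both the L.Y.M. inequality and Sperner's bound can be verified exhaustively. The following sketch is an added illustration. It enumerates every antichain of subsets of {0, ..., n-1} by backtracking, and uses exact rational arithmetic for the L.Y.M. sum; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def all_subsets(n):
    """All subsets of {0, ..., n-1}, listed by increasing size."""
    return [frozenset(c) for k in range(n + 1)
            for c in combinations(range(n), k)]


def antichains(n):
    """Yield every antichain in P({0, ..., n-1}) exactly once,
    including the empty antichain."""
    subsets = all_subsets(n)

    def extend(start, current):
        yield list(current)
        for i in range(start, len(subsets)):
            s = subsets[i]
            # s may be added only if it is incomparable to every member.
            if all(not (s < t or t < s) for t in current):
                current.append(s)
                yield from extend(i + 1, current)
                current.pop()

    yield from extend(0, [])


def lym_sum(family, n):
    """Left-hand side of the L.Y.M. inequality for a family of subsets."""
    return sum(Fraction(1, comb(n, len(a))) for a in family)


if __name__ == "__main__":
    n = 4
    families = list(antichains(n))
    print(len(families))
    print(max(lym_sum(f, n) for f in families))
    print(max(len(f) for f in families), comb(n, n // 2))
```

    The enumeration grows very quickly with n, so this check is only practical for tiny ground sets; it is meant as a sanity test of the statements, not as a proof technique.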

    then

    \[ \sum_{k=0}^{n} \frac{x_k}{c_k} \ge r. \tag{3.4} \]

    If $c_0 > \cdots > c_n > 0$, then equality holds in (3.4) if and only if $x_i = c_i$ for $0 \le i \le r - 1$ and $x_i = 0$ for $r \le i \le n$.

    Proof Suppose that $x_0, \ldots, x_n \ge 0$ minimize the sum

    \[ \sum_{k=0}^{n} \frac{x_k}{c_k} \tag{3.5} \]

    subject to the condition (3.3). If this minimum value of (3.5) is less than $r$ then $x_i < c_i$ for some $i \le r - 1$. Let $i$ be the smallest index such that $x_i < c_i$, so that $x_k / c_k = 1$ for all $0 \le k < i$ (unless $i = 0$).

    The inequality (3.3) then implies that there is an index $j \ge r$ such that $x_j > 0$. Let $j$ be the largest such index, so that $x_k = 0$ for $k > j$ (unless $j = n$). In particular, note that $j > i$.

    Suppose $c_i = c_j$. Since $c_i \ge c_{i+1} \ge \cdots \ge c_j$, we have $c_i = c_{i+1} = \cdots = c_j$. It then follows that

    \[ c_0 + \cdots + c_{i-1} + x_i + \cdots + x_j = x_0 + \cdots + x_n > c_0 + \cdots + c_{r-1} = c_0 + \cdots + c_{i-1} + (r - i)c_i, \]

    since $r - 1 < j$. Therefore,

    \[ x_i + \cdots + x_j \ge (r - i)c_i. \tag{3.6} \]

    It follows that

    \[ \sum_{k=0}^{n} \frac{x_k}{c_k} = i + \frac{x_i + \cdots + x_j}{c_i} \ge i + (r - i) \]

    by (3.6). In other words,

    \[ \sum_{k=0}^{n} \frac{x_k}{c_k} \ge r. \tag{3.7} \]

    This contradicts the assumption that $x_0, \ldots, x_n$ minimizes (3.5) at a value less than $r$. Therefore, inequality (3.4) follows.

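    Now suppose instead that $c_i > c_j$. If $x_i + x_j \le c_i$ set $y_i = x_i + x_j$ and $y_j = 0$. Otherwise, set $y_i = c_i$ and $y_j = x_j - (c_i - x_i)$. In either case set all other $y_k = x_k$.

    Since $c_i > c_j$ it follows in both cases that

    \[ \frac{y_i}{c_i} + \frac{y_j}{c_j} < \frac{x_i}{c_i} + \frac{x_j}{c_j}, \]

    so that

    \[ \sum_{k=0}^{n} \frac{y_k}{c_k} < \sum_{k=0}^{n} \frac{x_k}{c_k}, \]

    contradicting the minimality of the sum (3.5) at $x_0, \ldots, x_n$.

    If $c_0 > \cdots > c_n$, then $c_i > c_j$ is guaranteed, and so it follows that (3.5) is minimized only if $x_i = c_i$ for $0 \le i \le r - 1$ and $x_i = 0$ for $r \le i \le n$. □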

    Proof of Theorem 3.1.2 Relabel the binomial coefficients $c_0, c_1, \ldots, c_n$ in descending order, so that $c_0 \ge c_1 \ge \cdots \ge c_n > 0$; then relabel the numerators $|F_k|$ in (3.2) by $x_0, x_1, \ldots, x_n$, so that each $x_k$ is the numerator of that term of (3.2) having $c_k$ as denominator. The inequality (3.2) now becomes

    It then follows from Lemma 3.1.3 that

    $$x_0 + x_1 + \cdots + x_n \le c_0 + c_1 + \cdots + c_{r-1}.$$

    In other words,

    D

    The theory of binomial coefficients and antichains generalizes nicely to a theory of multinomial coefficients and special collections of ordered partitions, known as s-systems. Sperner's result can also be generalized from a bound on the size of an antichain to a bound on the size of an s-system.

    A map $\delta : \{1, \ldots, r\} \to P(S)$ is called an r-decomposition of S if

    (i) $\delta(i) \cap \delta(j) = \emptyset$ for $i \ne j$, and

    (ii) $\delta(1) \cup \cdots \cup \delta(r) = S$.

    Denote by Dec(S, r) the set of all r-decompositions of S. Note that for each $\delta \in$ Dec(S, r)

    $$|\delta(1)| + \cdots + |\delta(r)| = n.$$

    Given non-negative integers $a_1, a_2, \ldots, a_r$ such that $a_1 + \cdots + a_r = n$ we denote by $P_{a_1, \ldots, a_r}(S)$ the set of all r-decompositions $\delta$ such that $|\delta(i)| = a_i$ for $i = 1, \ldots, r$. In other words, $P_{a_1, \ldots, a_r}(S)$ is the set of all (ordered) partitions of S into disjoint unions of subsets having sizes $a_1, \ldots, a_r$. Evidently the set Dec(S, r) can be expressed as the finite disjoint union

    $$\mathrm{Dec}(S, r) = \biguplus_{a_1 + \cdots + a_r = n} P_{a_1, \ldots, a_r}(S).$$

    The size of $P_{a_1, \ldots, a_r}(S)$ is given by the multinomial coefficient

    $$|P_{a_1, \ldots, a_r}(S)| = \binom{n}{a_1, \ldots, a_r} = \frac{n!}{a_1! \cdots a_r!}.$$

    An s-system of order r (or an s-system in Dec(S, r)) is a subset $\sigma \subseteq$ Dec(S, r) such that the set

    $$\{\delta(i) : \delta \in \sigma\} \tag{3.8}$$

    is an antichain in P(S) for each $1 \le i \le r$.

    An obvious example of an s-system of order r is $P_{a_1, \ldots, a_r}(S)$ for some admissible selection of $a_1, \ldots, a_r$. If $\delta, \zeta \in P_{a_1, \ldots, a_r}(S)$ then $\delta(i)$ and $\zeta(i)$ both have size $a_i$, so that either $\delta(i) = \zeta(i)$ or the two sets are incomparable in the subset partial ordering on P(S). This holds for $i = 1, \ldots, r$, and so the antichain condition on (3.8) is satisfied.
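    The definitions above can be explored on a small set. The sketch below assumes S = {0, ..., n-1} and encodes an r-decomposition as a tuple of r frozensets; the helper names are hypothetical. It enumerates Dec(S, r), counts the decompositions with prescribed block sizes, and tests the antichain condition (3.8).

```python
from itertools import product
from math import factorial

def decompositions(n, r):
    """Yield every r-decomposition of S = {0..n-1} as a tuple of r disjoint blocks."""
    for labels in product(range(r), repeat=n):
        yield tuple(frozenset(i for i in range(n) if labels[i] == b) for b in range(r))

def multinomial(sizes):
    """n! / (a_1! ... a_r!) for sizes = (a_1, ..., a_r) with sum n."""
    total = factorial(sum(sizes))
    for a in sizes:
        total //= factorial(a)
    return total

def is_antichain(sets):
    distinct = list(set(sets))
    # `x < y` on frozensets tests for proper inclusion.
    return not any(x < y for x in distinct for y in distinct)

def is_s_system(sigma, r):
    """Check that {delta(i) : delta in sigma} is an antichain for each block index i."""
    return all(is_antichain([d[i] for d in sigma]) for i in range(r))

all_dec = list(decompositions(4, 3))
fixed = [d for d in all_dec if tuple(len(b) for b in d) == (2, 1, 1)]
print(len(all_dec), len(fixed), multinomial((2, 1, 1)), is_s_system(fixed, 3))
```

    The whole of Dec(S, r) is not an s-system in this example, since its first blocks include both the empty set and non-empty sets.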

    Other disguised examples with which we have already worked are the s-systems of order 2. Let A be an antichain in P(S). For each $x \in A$ we can express S as the disjoint union $x \uplus (S - x)$, so that the pair $(x, S - x)$ is a 2-decomposition in Dec(S, 2). Moreover, the set

    $$\{S - x : x \in A\}$$

    is also an antichain in P(S), so that the set

    $$\sigma = \{(x, S - x) : x \in A\}$$

    is an s-system of order 2. Thus the notion of s-system is a generalization of the notion of an antichain. Similarly, the collection $P_k(S)$ can also be viewed as $P_{k,n-k}(S)$ through the bijection $x \mapsto (x, S - x)$.

    For $\delta \in P_{a_1, \ldots, a_r}(S)$ we say that a flag $(x_0, x_1, \ldots, x_n)$ in P(S) is compatible with $\delta$ if

    (i) $x_{a_1} = \delta(1)$, and

    (ii) $x_{a_1 + \cdots + a_i} - x_{a_1 + \cdots + a_{i-1}} = \delta(i)$, for $i \ge 2$.

    Here the difference $x_{a_1 + \cdots + a_i} - x_{a_1 + \cdots + a_{i-1}}$ denotes the complement of the set $x_{a_1 + \cdots + a_{i-1}}$ inside the larger set $x_{a_1 + \cdots + a_i}$. For A


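    Proof For $a_1 + \cdots + a_r = n$ the number of flags compatible with $\sigma_{a_1, \ldots, a_r}$ is given by

    $$|\mathrm{Flag}(\sigma_{a_1, \ldots, a_r})| = |\sigma_{a_1, \ldots, a_r}|\, a_1! \cdots a_r!.$$

    Suppose that a flag $(x_0, x_1, \ldots, x_n)$ is compatible with both $\gamma, \delta \in \sigma$. Then $\gamma(1) = x_{a_1}$ and $\delta(1) = x_{b_1}$, where $a_1 = |\gamma(1)|$ and $b_1 = |\delta(1)|$. Since $(x_0, x_1, \ldots, x_n)$ is a flag, we have $x_{a_1} \subseteq x_{b_1}$ or vice versa. However, $\sigma$ is an s-system, so that either $\gamma(1) = \delta(1)$ or the two sets are incomparable. Therefore $\gamma(1) = \delta(1)$ and $a_1 = b_1$. Continuing, we have $\gamma(2) = x_{a_1 + a_2} - x_{a_1}$ and $\delta(2) = x_{b_1 + b_2} - x_{a_1}$ (since $a_1 = b_1$). A similar argument then implies that $\gamma(2) = \delta(2)$ and $a_2 = b_2$. Continuing in this manner we conclude that $\gamma(i) = \delta(i)$ for each $1 \le i \le r$, so that $\gamma = \delta$. In other words, every flag in P(S) is compatible with at most one r-decomposition $\delta \in \sigma$. It follows that

    $$\sum_{a_1 + \cdots + a_r = n} |\sigma_{a_1, \ldots, a_r}|\, a_1! \cdots a_r! = \sum_{a_1 + \cdots + a_r = n} |\mathrm{Flag}(\sigma_{a_1, \ldots, a_r})| = |\mathrm{Flag}(\sigma)| \le n!$$

    so that

    $$\sum_{a_1 + \cdots + a_r = n} \frac{|\sigma_{a_1, \ldots, a_r}|}{\binom{n}{a_1, \ldots, a_r}} \le 1. \tag{3.9}$$

    □

    We are now ready to prove a multinomial generalization of Sperner's Theorem 3.1.1.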

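    Theorem 3.1.5 (Meshalkin's theorem) Let $\sigma$ be an s-system in Dec(S, r). Then

    $$|\sigma| \le \binom{n}{\underbrace{\langle n/r \rangle, \ldots, \langle n/r \rangle}_{r-b},\ \underbrace{\langle n/r \rangle + 1, \ldots, \langle n/r \rangle + 1}_{b}} \tag{3.10}$$

    where $n \equiv b \bmod r$.

    Here $\langle n/r \rangle$ denotes the largest integer less than or equal to $n/r$.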
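    As a quick numerical illustration of the bound (3.10), the sketch below computes the multinomial coefficient with r - b parts equal to the integer part of n/r and b parts one larger. The function name is illustrative. For r = 2 the bound reduces to the central binomial coefficient of Sperner's theorem.

```python
from math import comb, factorial

def meshalkin_bound(n, r):
    """Multinomial coefficient on the right-hand side of (3.10)."""
    q, b = divmod(n, r)                      # n = q*r + b with 0 <= b < r
    sizes = [q] * (r - b) + [q + 1] * b      # r - b parts of size q, b parts of size q + 1
    bound = factorial(n)
    for a in sizes:
        bound //= factorial(a)
    return bound

print(meshalkin_bound(4, 3))                 # sizes (1, 1, 2)
print([meshalkin_bound(n, 2) for n in range(1, 8)])
```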

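    Proof It is not difficult to show that, for all compositions $a_1 + \cdots + a_r = n$ of a positive integer n, we have

    $$\binom{n}{a_1, \ldots, a_r} \le \binom{n}{\langle n/r \rangle, \ldots, \langle n/r \rangle, \langle n/r \rangle + 1, \ldots, \langle n/r \rangle + 1}.$$

    For a sketch of a proof, see the analogous Proposition 6.5.2. Let $\sigma_{a_1, \ldots, a_r} = \sigma \cap P_{a_1, \ldots, a_r}(S)$. It now follows from (3.9) that

    $$\sum_{a_1 + \cdots + a_r = n} \frac{|\sigma_{a_1, \ldots, a_r}|}{\binom{n}{\langle n/r \rangle, \ldots, \langle n/r \rangle, \langle n/r \rangle + 1, \ldots, \langle n/r \rangle + 1}} \le \sum_{a_1 + \cdots + a_r = n} \frac{|\sigma_{a_1, \ldots, a_r}|}{\binom{n}{a_1, \ldots, a_r}} \le 1,$$

    so that

    $$|\sigma| = \sum_{a_1 + \cdots + a_r = n} |\sigma_{a_1, \ldots, a_r}| \le \binom{n}{\langle n/r \rangle, \ldots, \langle n/r \rangle, \langle n/r \rangle + 1, \ldots, \langle n/r \rangle + 1}.$$

    □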

    Several other results of extremal set theory follow from the L.Y.M. inequality or its variants, but we reluctantly move on to the next topic.

    3.2 Valuations on a simplicial complex

    Define a simplicial complex to be a subset A of P(S) such that if $x \in A$ and $y \le x$ then $y \in A$. A simplicial complex is a partially ordered set in the order induced by P(S). The set of maximal elements of a simplicial complex is an antichain. A simplicial complex having exactly one maximal element is called a simplex. A simplex whose unique maximal element is a set of size k is called a k-simplex.

    The (set-theoretic) union and intersection of any number of simplicial complexes is again a simplicial complex. Thus, the set L(S) of all simplicial complexes in P(S) is a distributive lattice, and we can study valuations on L(S).

    For $x \in P(S)$, denote by $\bar{x}$ the simplex whose maximal element is x; that is, the set of all $y \in P(S)$ such that $y \le x$.

    It follows from Groemer's integral Theorem 2.2.1 and Corollary 2.2.2 that every valuation $\mu$ on L(S) extends uniquely to a valuation, again denoted by $\mu$, on the Boolean algebra P(P(S)) of all subsets of P(S), which is generated by L(S). Such a valuation is evidently determined by its value on the one-element subsets of P(S); that is, by arbitrarily assigning a value $\mu(\{x\})$ for each $x \in P(S)$.

    Let x be of rank k, and let $A_1, A_2, \ldots, A_k$ be the maximal simplices $A_i \subseteq \bar{x}$ such that $A_i \ne \bar{x}$. (These simplices are sometimes called the facets of $\bar{x}$.) Then


    The right-hand side can be computed in terms of simplices of lower rank, by the inclusion-exclusion principle. Thus, by induction on the rank, we have the following theorem.

    Theorem 3.2.1 Every valuation $\mu$ on the distributive lattice L(S) of all simplicial complexes is uniquely determined by the values $\mu(\bar{x})$, $x \in P(S)$. The values $\mu(\bar{x})$ may be arbitrarily assigned. □

    A valuation $\mu$ on L(S) is called invariant if it is invariant under the group of permutations of the set S; that is, if $\mu(A) = \mu(gA)$ for every simplicial complex A and for every permutation g of the set S (which induces a permutation on L(S), also denoted by g). We next establish the existence of the Euler characteristic. The following is an immediate consequence of Theorem 3.2.1.

    Theorem 3.2.2 (The existence of the Euler characteristic) There exists a unique invariant valuation $\mu_0$ on L(S), called the Euler characteristic, such that $\mu_0(\bar{x}) = 1$ for every simplex $\bar{x}$ with $r(x) > 0$, and such that $\mu_0(\emptyset) = 0$. □

    Next, we derive the classical alternating formula for the Euler characteristic. Define a valuation on P(P(S)), denoted $\mu_0'$, by setting

    $\mu_0'(\emptyset) = 0$, and

    $\mu_0'(\{x\}) = (-1)^{k-1}$, if $r(x) = k$. Then

    $$\mu_0'(\bar{x}) = \sum_{y \le x} \mu_0'(\{y\})$$


    so that $\mu_0'(\bar{x}) = \mu_0(\bar{x})$, for all simplices $\bar{x}$. It now follows from Theorem 3.2.1 that $\mu_0' = \mu_0$, and that the following formula holds.

    Theorem 3.2.3 (The discrete Euler formula) Let A be a simplicial complex, and let $f_k$ be the number of elements (or 'faces') of rank k. Then

    $$\mu_0(A) = f_1 - f_2 + f_3 - \cdots. \tag{3.11}$$

    □
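    The alternating formula (3.11) is easy to evaluate directly. The sketch below assumes a simplicial complex is stored as the set of its non-empty faces (frozensets), with the rank of a face taken to be its size; the helper names are illustrative.

```python
from itertools import combinations

def closure(maximal_faces):
    """Non-empty faces of the simplicial complex generated by the given faces."""
    faces = set()
    for m in maximal_faces:
        m = tuple(m)
        for k in range(1, len(m) + 1):
            faces.update(frozenset(c) for c in combinations(m, k))
    return faces

def euler_characteristic(faces):
    """f_1 - f_2 + f_3 - ..., where f_k counts the faces of rank (size) k."""
    return sum((-1) ** (len(x) - 1) for x in faces)

simplex = closure([{0, 1, 2}])                       # a single 3-simplex in the text's sense
triangle_boundary = closure([{0, 1}, {1, 2}, {0, 2}])
two_points = closure([{0}, {1}])
print(euler_characteristic(simplex),
      euler_characteristic(triangle_boundary),
      euler_characteristic(two_points))
```

    A simplex evaluates to 1, in agreement with Theorem 3.2.2, while the boundary of a triangle evaluates to 0.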

    For $i > 0$, set

    and extend $\mu_i$ to all of L(S) by Theorem 3.2.1. Clearly, for every simplicial complex A,

    The discrete Euler formula can now be rewritten

    $$\mu_0(A) = \mu_1(A) - \mu_2(A) + \mu_3(A) - \cdots, \tag{3.12}$$

    for any simplicial complex A.

    The valuations $\mu_k$ can also be expressed in terms of symmetric functions. Let $P_1(S) = \{a_1, a_2, \ldots, a_n\}$. Given the symmetric function

    let $t_i(x) = 1$ if $a_i \in x$ and $t_i(x) = 0$ if $a_i \notin x$. Evaluated at x, for $x \ne \emptyset$, we have $t_{i_1} t_{i_2} \cdots t_{i_k} = 1$ if $\{a_{i_1}, \ldots, a_{i_k}\} \in \bar{x}$, and $t_{i_1} t_{i_2} \cdots t_{i_k} = 0$ otherwise. It follows that $e_k(t_1, t_2, \ldots, t_n) = \mu_k(\bar{x})$ for every simplex $\bar{x}$ other than $\{\emptyset\}$.

    Note also that if $x \in P(S)$ and $r(x) = j$ then $\mu_i(\{x\}) = 1$ if $i = j$, while $\mu_i(\{x\}) = 0$ if $i \ne j$.

    Theorem 3.2.4 (The discrete basis theorem) The invariant valuations $\mu_0, \mu_1, \ldots, \mu_n$ span the vector space of all invariant valuations $\mu$ on L(S) such that $\mu(\{\emptyset\}) = 0$. The only linear relation among them is formula (3.12).

    Proof Suppose $\mu$ is an invariant valuation on L(S) such that $\mu(\{\emptyset\}) = 0$. Extend $\mu$ to all of P(P(S)). Note that the extended valuation, which is still denoted $\mu$, is again invariant. If x and y have the same rank in P(S), say $r(x) = r(y) = i$, then there exists a permutation g of S such that $gx = y$. Therefore, $\mu(\{x\}) = \mu(\{y\}) = c_i$, for some constant $c_i$. Thus, the valuation

    vanishes on all singleton sets $\{x\}$, for all $x \in P(S)$, and therefore vanishes on all of P(P(S)). □

    As an application of the discrete basis theorem, we shall derive a discrete analogue of the kinematic formula (whose classical geometric version appears in Chapter 10).

    One way to construct invariant valuations on L(S) is the following. Start with any valuation $\mu$ on L(S) such that $\mu(\{\emptyset\}) = 0$, and let B be any simplicial complex. For any simplicial complex A, set

    $$\mu(A; B) = \frac{1}{n!} \sum_{g} \mu(A \cap gB),$$

    where g ranges over all permutations of the set S of size n. For fixed A, the set function $\mu(A; B)$ is a valuation in the variable B; in fact, it is an invariant valuation. It can therefore be expressed as a linear combination of the valuations $\mu_i$, with coefficients $c_i(A)$ depending on A:

    $$\mu(A; B) = \sum_{i=1}^{n} c_i(A)\,\mu_i(B). \tag{3.13}$$

    Meanwhile, for fixed B, the set function $\mu(A; B)$ is a valuation in the variable A. From this it follows that each of the coefficients $c_i(A)$ is a valuation in the variable A. The coefficient $c_i(A)$ can be given an explicit expression in terms of $\mu$. This follows from the fact that if $x \in P(S)$ and $r(x) = j$ then $\mu_i(\{x\}) = 1$ if $i = j$ and is zero if $i \ne j$.

    Consider the case in which $\mu$ is an invariant valuation. If so, then

    $$\mu(A; B) = \frac{1}{n!} \sum_{g} \mu(A \cap gB) = \frac{1}{n!} \sum_{g} \mu(g^{-1}A \cap B) = \frac{1}{n!} \sum_{g} \mu(gA \cap B) = \mu(B; A).$$

    Moreover, the coefficients $c_i(A)$ are now invariant valuations in the variable A. Therefore, Theorem 3.2.4 implies that

    $$\mu(A; B) = \sum_{i,j=1}^{n} c_{ij}\,\mu_i(A)\,\mu_j(B).$$


    Since $\mu(A; B) = \mu(B; A)$, it is evident that $c_{ij} = c_{ji}$. It turns out that most of the constants $c_{ij}$ are equal to zero. In order to compute the constants $c_{ij}$ explicitly, extend the valuation $\mu$ to the Boolean algebra P(P(S)) generated by L(S), and let $\omega_i$ denote the value of $\mu$ on a singleton set in P(P(S)) whose element is a subset of S of size i.

    Theorem 3.2.5 (The discrete kinematic formula) Suppose that $\mu$ is an invariant valuation on L(S). For all $A, B \in L(S)$,

    Proof Suppose that $x_i, y_j \subseteq S$ with size i and j respectively. Let $A = \{x_i\}$ and $B = \{y_j\}$. For any permutation g of S, the set $A \cap gB = \emptyset$ if $i \ne j$. If $i = j$ then $A \cap gB = \emptyset$ if $x_i \ne g y_j$. Since there are $i!\,(n-i)!$ permutations g of S such that $x_i = g y_j$ (if $i = j$), we have

    $$\mu(A; B) = \frac{1}{n!} \sum_{g} \mu(A \cap gB) = \frac{i!\,(n-i)!}{n!}\,\mu(A) = \binom{n}{i}^{-1} \omega_i.$$

    Meanwhile, $\mu_k(A) = 1$ if $k = i$ and is equal to zero otherwise. Similarly, $\mu_k(B) = 1$ if $k = j$ and is equal to zero otherwise. Hence,

    $$\mu(A; B) = \sum_{i,j=1}^{n} c_{ij}\,\mu_i(A)\,\mu_j(B) = c_{ij}.$$

    Therefore,

    $$c_{ij} = \binom{n}{i}^{-1} \omega_i$$

    if $i = j$ and is equal to zero otherwise. □

    We shall be particularly concerned with the case $\mu = \mu_0$. The discrete Euler formula (3.11) implies that $\mu_0(\{x_i\}) = (-1)^{i+1}$, so that

    $$\frac{1}{n!} \sum_{g} \mu_0(A \cap gB) = \sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i}^{-1} \mu_i(A)\,\mu_i(B), \quad \text{for all } A, B \in L(S).$$
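    The kinematic formula for the Euler characteristic can be verified by averaging over all permutations of a small set. The sketch below assumes S = {0, ..., n-1}, stores a complex as its set of non-empty faces, and takes $\mu_i$ to be the number of faces of size i; these encodings and the function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial

def closure(maximal_faces):
    """Non-empty faces of the simplicial complex generated by the given faces."""
    faces = set()
    for m in maximal_faces:
        m = tuple(m)
        for k in range(1, len(m) + 1):
            faces.update(frozenset(c) for c in combinations(m, k))
    return faces

def euler(faces):
    return sum((-1) ** (len(x) - 1) for x in faces)

def kinematic_lhs(A, B, n):
    """(1/n!) * sum over permutations g of mu_0(A intersect gB)."""
    total = 0
    for g in permutations(range(n)):
        gB = {frozenset(g[v] for v in x) for x in B}
        total += euler(A & gB)
    return Fraction(total, factorial(n))

def kinematic_rhs(A, B, n):
    """sum over i of (-1)^(i+1) * C(n, i)^(-1) * mu_i(A) * mu_i(B)."""
    count = lambda faces, i: sum(1 for x in faces if len(x) == i)
    return sum(Fraction((-1) ** (i + 1) * count(A, i) * count(B, i), comb(n, i))
               for i in range(1, n + 1))

A = closure([{0, 1, 2}, {2, 3}])
B = closure([{0, 1}, {2}])
print(kinematic_lhs(A, B, 4), kinematic_rhs(A, B, 4))
```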

    If $\bar{x}$ and $\bar{y}$ are simplices, then either $\bar{x} \cap \bar{y}$ is a smaller (non-empty) simplex, in which case $\mu_0(\bar{x} \cap \bar{y}) = 1$, or $\bar{x} \cap \bar{y} = \emptyset$, in which case $\mu_0(\bar{x} \cap \bar{y}) = 0$. The probability that a randomly chosen k-simplex $\bar{x}_k$ shall meet a fixed l-simplex $\bar{y}_l$ can now be computed as follows:

    so that

    $$\frac{1}{n!} \sum_{g} \mu_0(\bar{y}_l \cap g\bar{x}_k) = \sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i}^{-1} \binom{l}{i} \binom{k}{i}. \tag{3.14}$$

    The equation (3.14) leads to a notable example of how the discrete kinematic formula can be used to generate identities for the binomial coefficients. To generate such an identity, we use elementary probabilistic reasoning to compute instead the probability that $\bar{y}_l \cap g\bar{x}_k = \emptyset$, for a random permutation g. Label the elements of S by $\{s_1, \ldots, s_n\}$ so that $x_k = \{s_1, \ldots, s_k\}$. In order for $\bar{y}_l \cap g\bar{x}_k = \emptyset$ to hold, we require $g s_1 \in S - y_l$, of which there are $n - l$ choices. There then remain $n - (l + 1)$ possible values for $g s_2$, etc., so that there are

    $$(n - l)(n - l - 1) \cdots (n - l - k + 1)$$

    choices of values for $g s_1, \ldots, g s_k$. Having chosen these values, there are $n - k$ possible choices remaining for $g s_{k+1}$, then $n - k - 1$ possible values for $g s_{k+2}$, and so on, up to one possible value remaining for $g s_n$. It follows that there are

    $$(n - l) \cdots (n - l - k + 1)(n - k) \cdots 1 = \frac{(n - l)!\,(n - k)!}{(n - k - l)!}$$

    permutations g of S such that $\bar{y}_l \cap g\bar{x}_k = \emptyset$. Hence the probability that $\bar{y}_l \cap g\bar{x}_k = \emptyset$ for a random permutation g is given by

    $$\frac{1}{n!} \cdot \frac{(n - l)!\,(n - k)!}{(n - k - l)!} = \frac{k!\,(n - k)!\,(n - l)!}{n!\,k!\,(n - k - l)!} = \binom{n}{k}^{-1} \binom{n - l}{k}.$$

    It now follows from (3.14) that

    $$\sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i}^{-1} \binom{l}{i} \binom{k}{i} = 1 - \binom{n}{k}^{-1} \binom{n - l}{k}. \tag{3.15}$$

    By adding the term corresponding to $i = 0$ to both sides of (3.15) and multiplying by $-1$ we obtain the following identity:

    Theorem 3.2.6

    $$\sum_{i=0}^{n} (-1)^{i} \binom{n}{i}^{-1} \binom{l}{i} \binom{k}{i} = \binom{n}{k}^{-1} \binom{n - l}{k}$$

    for all integers $0 \le k, l \le n$. □

    Note that, if $k + l > n$, then

    $$\binom{n - l}{k} = 0.$$

    In the preceding argument this corresponds to the case in which the two sets $y_l$ and $g x_k$ have non-empty overlap for any permutation g, i.e. the case in which $\bar{y}_l \cap g\bar{x}_k = \emptyset$ with probability zero.
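    The identity obtained by the steps described above, namely that the alternating sum over i of $\binom{l}{i}\binom{k}{i}/\binom{n}{i}$ equals $\binom{n-l}{k}/\binom{n}{k}$, can be confirmed numerically with exact rational arithmetic. This is a minimal sketch; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def alternating_sum(n, k, l):
    """sum_{i=0}^{n} (-1)^i * C(l, i) * C(k, i) / C(n, i)."""
    return sum(Fraction((-1) ** i * comb(l, i) * comb(k, i), comb(n, i))
               for i in range(n + 1))

def disjoint_probability(n, k, l):
    """C(n - l, k) / C(n, k): chance that a random k-set misses a fixed l-set."""
    return Fraction(comb(n - l, k), comb(n, k))

checked = all(alternating_sum(n, k, l) == disjoint_probability(n, k, l)
              for n in range(1, 9)
              for k in range(n + 1)
              for l in range(n + 1))
print(checked)
```

    When k + l > n, `math.comb(n - l, k)` returns 0, which matches the remark that the two sets must then overlap.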

    We conclude this discussion of subsets and simplices with an application of the results of Section 3.1 to a question posed by Sperner. Let $A \in L(S)$, and suppose that all of the maximal elements of A have rank k; i.e. suppose that every face of A is contained in a k-face of A. Let $[A]_l$ denote the collection of all l-faces of A, for $0 \le l \le k$. Is there a lower bound for the number of l-faces $|[A]_l|$, given the number of k-faces (maximal faces) $|[A]_k|$? The L.Y.M. inequality gives one answer to this question.

    Theorem 3.2.7 Suppose $A \in L(S)$ such that every maximal element of A has rank k. For $0 \le l \le k$,

    $$|[A]_l| \ge \frac{k!\,(n - k)!}{l!\,(n - l)!}\, |[A]_k|.$$

    Proof Let $B_l = P_l(S) - [A]_l$. If $y \in B_l$ then y cannot be contained in any $x \in A$. In other words, the set $[A]_k \cup B_l$ is an antichain. It then follows from (3.1) that

    Meanwhile,

    $$|B_l| = |P_l(S) - [A]_l| = \binom{n}{l} - |[A]_l|,$$

    so that

    It follows that

    $$|[A]_l| \ge \binom{n}{l} \binom{n}{k}^{-1} |[A]_k| = \frac{k!\,(n - k)!}{l!\,(n - l)!}\, |[A]_k|.$$

    □

    For example, in the case l = k - 1, we have

    3.3 A discrete analogue of Helly's theorem

    We turn next to a special case and discrete analogue of Helly's theorem, whose classical geometric version is given in Chapter 5.

    Theorem 3.3.1 (The discrete Helly theorem) Let S be a finite set of size $|S| = n$, and let F be a family of subsets of S. Suppose that, for any subset G

    $i \ne j_2$. Hence,

    $$s \in \bigcap_{i=1}^{m+1} A_i = \bigcap_{A \in F} A.$$

    □

    3.4 Notes

    For a general reference to the theory of lattices and partially ordered sets, see [92] (also [3, 27,42, 80]). For a graph-theoretic viewpoint, see [7]. In [84], Schanuel developed the Euler characteristic from a category-theoretic perspective. In [12], Chen extended the Euler characteristic to linear combinations of indicator functions of unbounded closed convex sets and unbounded relatively open convex sets. A thorough treatment of the combinatorial theory of the Euler characteristic appeared in [79, 80].

    The face enumerators $\mu_i$ play a role analogous to that of the intrinsic volumes on parallelotopes and polyconvex sets (see Sections 4.2 and 7.2). The discrete basis theorem and kinematic formulas can be extended to the more general setting of finite vector spaces; see [53].

    Sperner's Theorem 3.1.1 first appeared in [91], as did Theorem 3.2.7. The proof presented in this section is due to Lubell [63]. Meshalkin proved Theorem 3.1.5 in [73]. In this section we follow a more simplified approach due to Hochberg and Hirsch [45]. The generalization of Sperner's theorem to r-families (Theorem 3.1.2) was originally due to Erdős [22], but the proof given in this section is due to Harper and Rota [42].

    While Theorem 3.2.7 gives a lower bound to the number of l-faces of a simplicial complex A, all of whose maximal elements have dimension k, Katona [48] and Kruskal [57] independently obtained a much stronger result, giving the exact minimum size of $[A]_l$, a minimum that is independent of the size of the ambient set S. Katona's original paper may also be found in [27]. For a shorter proof of the Katona-Kruskal theorem, see [3, 7]. A survey of results in extremal set theory, including generalizations of Sperner's theorem and the L.Y.M. inequality, was presented in [29].

    The discrete Helly Theorem 3.3.1 is a special case of a more general theorem of convex geometry. This geometric result is treated in greater detail in Chapter 5 (see also [85, pp. 3-5]). Helly's original theorem motivated considerable developments in the field of combinatorial geometry, and 'Helly-type' theorems are now common in many branches of mathematics [7, 18, 21].

  • 4

    The intrinsic volumes for parallelotopes

    We develop next a theory of invariant valuations for the lattice of finite unions of orthogonal parallelotopes, having edges parallel to a fixed frame. Many of the central results in geometric probability can be stated and proven easily in the context of parallelotopes. For this reason the lattice of parallelotopes will serve as a model for the more difficult task of developing a lattice theory for finite unions of compact convex sets in $R^n$. The Euler characteristic, intrinsic volumes, and valuation characterization theorems for the lattice of parallelotopes will serve as prototypes in Chapters 5-9 for analogous constructions and characterization theorems in the more general context of polyconvex sets.

    4.1 The lattice of parallelotopes

    Choose a Cartesian coordinate system in $R^n$, which shall remain fixed throughout this chapter, and let Par(n) denote the family of sets that are obtained by taking finite unions and intersections of orthogonal parallelotopes (i.e. rectilinear boxes), with sides parallel to the coordinate axes. If $P \in$ Par(n), we shall say that P is of dimension n (or has full dimension) if P is not contained in a finite union of hyperplanes of $R^n$; that is, if P has a non-empty interior. Otherwise, we shall say that P is of lower dimension. (Recall that a hyperplane in $R^n$ is a plane of dimension $n - 1$, not necessarily through the origin.) In general, a set $P \in$ Par(n) has dimension k if P is contained in a finite union of k-planes in $R^n$, but is not contained in any finite union of $(k - 1)$-planes.

    Note that Par( n) is closed under finite unions and intersections. This follows from the basic fact that the intersection of two parallelotopes is a parallelotope. In other words, Par(n) is a distributive lattice.

    Denote by $T_n$ the group generated by translations and permutations of coordinates in $R^n$. For $A \subseteq R^n$ and $g \in T_n$, write

    $$gA = g(A) = \{g(a) : a \in A\}.$$

    A valuation $\mu$ defined on Par(n) is said to be invariant when

    $$\mu(gP) = \mu(P) \tag{4.1}$$

    for all $g \in T_n$ and all $P \in$ Par(n). If $\mu(gP) = \mu(P)$ is known to hold only for translations g of $R^n$, then we shall say that $\mu$ is translation invariant.

    The object of this section is to determine all invariant valuations defined on Par(n). To avoid pathological cases, we shall impose a continuity condition on the valuations to be considered.

    For $A \subseteq R^n$ and $x \in R^n$, the distance $d(x, A)$ from the point x to the set A is given by

    $$d(x, A) = \inf_{a \in A} d(x, a),$$

    where $d(x, a) = |x - a|$ is the usual distance between points in $R^n$. Note that $d(x, A) = 0$ if $x \in A$ or if x is a limit point of A.

    For $K, L \subseteq R^n$, the Hausdorff distance $\delta(K, L)$ is defined by

    $$\delta(K, L) = \max\Big(\sup_{a \in K} d(a, L),\ \sup_{b \in L} d(b, K)\Big). \tag{4.2}$$

    If K and L are compact, then $\delta(K, L) = 0$ if and only if $K = L$. A sequence of compact subsets $K_n$ of $R^n$ converges to a set K, or $K_n \to K$, if $\delta(K_n, K) \to 0$ as $n \to \infty$.
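    For finite point sets, which are compact, the suprema and infimum in (4.2) become maxima and minima, so the Hausdorff distance can be computed directly. The following is a minimal sketch under that finiteness assumption; the function names are illustrative.

```python
from math import dist, sqrt

def point_to_set(x, A):
    """d(x, A) = min over a in A of |x - a| (A is a finite, non-empty set of points)."""
    return min(dist(x, a) for a in A)

def hausdorff(K, L):
    """delta(K, L) = max( max_{a in K} d(a, L), max_{b in L} d(b, K) )."""
    return max(max(point_to_set(a, L) for a in K),
               max(point_to_set(b, K) for b in L))

K = [(0, 0), (1, 0)]
L = [(0, 0), (3, 0)]
M = [(0, 1)]
print(hausdorff(K, L))                                   # the point (3, 0) is at distance 2 from K
print(hausdorff(K, L) <= hausdorff(K, M) + hausdorff(M, L))
```

    The second line of output illustrates the triangle inequality on one example; `math.dist` requires Python 3.8 or later.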

    Let $B_n$ denote the unit ball in $R^n$. For $K \subseteq R^n$ compact and $\epsilon > 0$, define

    $$K + \epsilon B_n = \{x + \epsilon u : x \in K \text{ and } u \in B_n\}.$$

    The following lemma gives an easier and more practical way to think about the distance $\delta$.

    Lemma 4.1.1 Let $K, L \subseteq R^n$ be compact sets. Then $\delta(K, L) \le \epsilon$ if and only if $K \subseteq L + \epsilon B_n$ and $L \subseteq K + \epsilon B_n$.

    Proof Suppose that $K \subseteq L + \epsilon B_n$. For $x \in K$ there exist $y \in L$ and $u \in B_n$ such that $x = y + \epsilon u$. In other words, $|x - y| \le \epsilon$, so that $d(x, L) \le \epsilon$. Similarly, if $L \subseteq K + \epsilon B_n$, then $d(y, K) \le \epsilon$ for all $y \in L$. Therefore, $\delta(K, L) \le \epsilon$.

    Meanwhile, if there exists $x \in K$ such that $x \notin L + \epsilon B_n$ (or vice versa), then $|x - y| > \epsilon$ for all $y \in L$, so that $\delta(K, L) \ge d(x, L) > \epsilon$. □

    In view of Lemma 4.1.1, we see that a sequence of compact subsets $K_i \to K$ if, for each $\epsilon > 0$, there exists $N > 0$ such that $K \subseteq K_i + \epsilon B_n$ and $K_i \subseteq K + \epsilon B_n$ whenever $i > N$.

    Theorem 4.1.2 The distance $\delta$ defines a metric on the set of all compact subsets of $R^n$.

    Proof The distance $\delta$ is clearly symmetric and non-negative. To verify the triangle inequality, suppose that $K, L, M \subseteq R^n$ are compact. Let $\epsilon_1 = \delta(K, M)$ and $\epsilon_2 = \delta(L, M)$. By Lemma 4.1.1, $K \subseteq M + \epsilon_1 B_n$ and $M \subseteq L + \epsilon_2 B_n$, so that

    $$K \subseteq L + \epsilon_2 B_n + \epsilon_1 B_n = L + (\epsilon_2 + \epsilon_1) B_n.$$

    Similarly, $L \subseteq K + (\epsilon_2 + \epsilon_1) B_n$. Lemma 4.1.1 then implies that $\delta(K, L) \le \epsilon_1 + \epsilon_2$. □

    We now focus once again on the lattice Par(n) of finite unions of parallelotopes. A valuation $\mu$ is said to be continuous on Par(n), provided that

    whenever $P_i$, P are parallelotopes (and not just finite unions) and $P_i \to P$.

    Another condition that will prove useful is monotonicity. A valuation $\mu$ is said to be increasing on Par(n), provided that $\mu(P) \le \mu(Q)$, whenever $P, Q \in$ Par(n) and $P \subseteq Q$. Similarly one defines decreasing valuations. A valuation $\mu$ is said to be monotone on Par(n), if $\mu$ is either an increasing valuation or a decreasing valuation.

    When studying valuations on Par(n) we may restrict our attention to the generating set of parallelotopes in $R^n$ with edges parallel to the coordinate axes. Specifically, we have the following extension theorem.

    Theorem 4.1.3 (Groemer's extension theorem for Par(n)) A valuation $\mu$ defined on parallelotopes with edges parallel to the coordinate axes admits a unique extension to a valuation on the lattice Par(n).

    Proof In view of Groemer's integral Theorem 2.2.1, it is sufficient to show that $\mu$ defines an integral on the space of indicator functions of parallelotopes.

    The proposition is trivial in dimension zero. Assume that the proposition holds in dimension $n - 1$. Suppose that there exist distinct parallelotopes $P_1, \ldots, P_m$ such that

    $$\sum_{i=1}^{m} a_i I_{P_i} = 0 \tag{4.3}$$

    while

    $$\sum_{i=1}^{m} a_i\,\mu(P_i) = 1. \tag{4.4}$$

    Let k be the number of parallelotopes Pi in the expressions above of full dimension n, and suppose that k is minimal over all possible such expressions.

    If $k = 0$ then $P_1, \ldots, P_m$ are each contained inside a hyperplane. Let l denote the (finite) number of hyperplanes containing the parallelotopes $P_1, \ldots, P_m$, and suppose that l is minimal over all such expressions.

    If $l = 1$ then $P_1, \ldots, P_m$ are all contained in a single hyperplane. It then follows from the induction assumption (on the dimension of the ambient Euclidean space) that we have a contradiction. Therefore $l > 1$, and there exist hyperplanes $H_1, \ldots, H_l$, orthogonal to the coordinate axes, such that $P_i \subseteq H_1 \cup \cdots \cup H_l$ for $i = 1, \ldots, m$. Suppose, without loss of generality, that $P_1 \subseteq H_1$.

    Since $I_{P_i \cap H_1} = I_{P_i} I_{H_1}$ it follows from (4.3) that

    $$\sum_{i=1}^{m} a_i I_{P_i \cap H_1} = 0. \tag{4.5}$$

    Meanwhile,

    $$\sum_{i=1}^{m} a_i\,\mu(P_i \cap H_1) = 0, \tag{4.6}$$

    by the induction assumption on dimension, since each $P_i \cap H_1 \subseteq H_1$, a hyperplane.

    Subtracting equations (4.5) and (4.6) from (4.3) and (4.4) respectively, we have

    $$\sum_{i=1}^{m} a_i \big(I_{P_i} - I_{P_i \cap H_1}\big) = 0, \tag{4.7}$$

    and

    $$\sum_{i=1}^{m} a_i \big(\mu(P_i) - \mu(P_i \cap H_1)\big) = 1. \tag{4.8}$$

    Since $P_1 \cap H_1 = P_1$, equations (4.7) and (4.8) take the form of (4.3) and (4.4), where the nonzero terms involve parallelotopes $P_i$ in at most $l - 1$ hyperplanes, contradicting the minimality of l.

    It follows that $k \ge 1$. Suppose then that $P_1$ has dimension n. Choose a hyperplane H, with associated closed half-spaces $H^+$ and $H^-$ such that $P_1 \cap H$ is a facet of $P_1$, oriented so that $P_1 \subset H^+$. Since $I_{P_i \cap H^+} = I_{P_i} I_{H^+}$, it follows from (4.3) that

    Similarly,

    m

    L

    After repeating this argument with parallelotopes $P_2, \ldots, P_m$ we have

    $$\sum_{i=1}^{m} a_i\,\mu(P_1 \cap \cdots \cap P_m) = \Big(\sum_{i=1}^{m} a_i\Big) \mu(P_1 \cap \cdots \cap P_m) = 1.$$

    This implies that $a_1 + \cdots + a_m \ne 0$ and that $P_1 \cap \cdots \cap P_m \ne \emptyset$. Meanwhile a similar argument using (4.3) gives

    $$\sum_{i=1}^{m} a_i I_{P_1 \cap \cdots \cap P_m} = \Big(\sum_{i=1}^{m} a_i\Big) I_{P_1 \cap \cdots \cap P_m} = 0,$$

    so that either $a_1 + \cdots + a_m = 0$ or $P_1 \cap \cdots \cap P_m = \emptyset$, a contradiction in either case. □

    4.2 Invariant valuations on parallelotopes

    To begin the classification of invariant valuations on Par(n), consider the problem in $R^1$. An element of Par(1) is a finite union of closed intervals. Set

    $\mu_0^1(A)$ = number of connected components of A,

    $\mu_1^1(A)$ = length of A.

    One easily verifies that $\mu_0^1$ and $\mu_1^1$ are both continuous invariant valuations on Par(1). We shall prove that every continuous invariant valuation on Par(1) is a linear combination of $\mu_0^1$ and $\mu_1^1$.

    Suppose that $\mu$ is a continuous invariant valuation on Par(1). Let $c = \mu(A)$, where A is a set consisting of a single point in $R^1$, and let $\mu' = \mu - c\mu_0^1$. Note that the invariant valuation $\mu'$ vanishes on points. Define a continuous function $f : [0, +\infty) \to [0, +\infty)$ by the equation

    $$f(x) = \mu'([0, x]).$$

    If A is a closed interval of length x, then the invariance of $\mu'$ implies that $\mu'(A) = f(x)$. If A and B are closed intervals of length x and y, such that $A \cap B$ is a point, then

    $$f(x + y) = \mu'(A \cup B) = \mu'(A) + \mu'(B) - \mu'(A \cap B) = f(x) + f(y),$$

    so that $f(x) = rx$ for some constant r. Hence, $\mu' = r\mu_1^1$, and our assertion is proved.
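    The two basic valuations on Par(1) are simple to compute. The sketch below assumes an element of Par(1) is given as a list of closed intervals (a, b) with a <= b, and checks the valuation identity $\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B)$ on one example; the representation and function names are illustrative.

```python
def normalize(intervals):
    """Merge closed intervals that overlap or touch; return a sorted disjoint list."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def union(A, B):
    return normalize(list(A) + list(B))

def intersection(A, B):
    # Two closed intervals meet exactly when max of left ends <= min of right ends.
    pieces = [(max(a, c), min(b, d)) for a, b in A for c, d in B
              if max(a, c) <= min(b, d)]
    return normalize(pieces)

def components(A):
    """Number of connected components."""
    return len(normalize(A))

def length(A):
    """Total length."""
    return sum(b - a for a, b in normalize(A))

A = [(0, 2), (5, 6)]
B = [(1, 3), (6, 8)]
for mu in (components, length):
    print(mu.__name__, mu(union(A, B)) + mu(intersection(A, B)), mu(A) + mu(B))
```

    Note that the intersection here contains the single point 6, which counts as a component but contributes no length.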

    We now turn to $R^n$. There is one well known continuous invariant valuation defined on Par(n), namely, the volume. Denote by $\mu_n(P)$ the volume of a finite union P of parallelotopes of dimension n.

    Recall that the elementary symmetric functions of $x_1, x_2, \ldots, x_n$ are the polynomials

    $$e_0 = 1,$$

    $$e_k(x_1, x_2, \ldots, x_n) = \sum_{1 \le i_1 < \cdots < i_k \le n} x_{i_1} x_{i_2} \cdots x_{i_k}, \quad 1 \le k \le n.$$

    intersection of a collection of parallelotopes is a parallelotope.) By the inclusion-exclusion principle,

    JL~(Q) = LJL~(Pi) - LJL~(Pi n Pj ) + ... - .... i i

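    Proposition 4.2.3

    $$\mu_i(P_1 \times P_2) = \sum_{r+s=i} \mu_r(P_1)\,\mu_s(P_2). \tag{4.12}$$

    The identity (4.12) is therefore valid when $P_1$ and $P_2$ are finite unions of parallelotopes.

    Proof Suppose that $P_1$ has sides of length $x_1, \ldots, x_h$ and $P_2$ has sides of length $y_1, \ldots, y_{n-h}$. Then we have

    $$\sum_{r+s=i} \mu_r(P_1)\,\mu_s(P_2) = \sum_{r+s=i} \Big(\sum_{1 \le j_1 < \cdots < j_r \le h} x_{j_1} \cdots x_{j_r}\Big) \Big(\sum_{1 \le k_1 < \cdots < k_s \le n-h} y_{k_1} \cdots y_{k_s}\Big).$$

    Let $j_{r+1} = k_1 + h, \ldots, j_i = j_{r+s} = k_s + h$, and let $x_{h+1} = y_1, \ldots, x_n = y_{n-h}$. Then

    $$\sum_{r+s=i} \mu_r(P_1)\,\mu_s(P_2) = \sum_{1 \le j_1 < \cdots < j_i \le n} x_{j_1} \cdots x_{j_r} x_{j_{r+1}} \cdots x_{j_i} = \mu_i(P_1 \times P_2).$$

    □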
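    The product formula (4.12) can be checked numerically for boxes. The sketch below assumes, as in the proof above, that $\mu_i$ of a box is the elementary symmetric function $e_i$ of its side lengths, and represents a box by the tuple of its side lengths; the function name is illustrative.

```python
from itertools import combinations
from math import prod

def intrinsic_volume(i, sides):
    """e_i of the side lengths: sum of products over all i-element choices of sides."""
    # combinations(sides, 0) yields one empty tuple, whose product is 1, so e_0 = 1.
    return sum(prod(c) for c in combinations(sides, i))

P1 = (2, 3)          # a 2 x 3 rectangle
P2 = (5,)            # an interval of length 5
product_box = P1 + P2

for i in range(len(product_box) + 1):
    rhs = sum(intrinsic_volume(r, P1) * intrinsic_volume(i - r, P2) for r in range(i + 1))
    print(i, intrinsic_volume(i, product_box), rhs)
```

    For i = 3 both sides equal the volume 30 of the product box, and for i = 0 both sides equal 1.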

    A valuation $\mu$ on Par(n) is said to be simple if $\mu(P) = 0$ for all P of dimension less than n. The restriction of the volume $\mu_n$ to the lattice Par(n) is characterized by the following theorem.

    Theorem 4.2.4 (The volume theorem for Par(n)) Let $\mu$ be a translation invariant simple valuation defined on Par(n), and suppose that $\mu$ is either continuous or monotone. Then there exists $c \in R$ such that $\mu(P) = c\mu_n(P)$ for all $P \in$ Par(n); that is, $\mu$ is equal to the volume, up to a constant factor.

    Proof Let $[0, 1]^n$ denote the unit cube in $R^n$, and let $c = \mu([0, 1]^n)$. Recall that $\mu$ is translation invariant and vanishes on lower dimensions. Since $\mu([0, 1]^n) = c$, a simple cut-and-paste argument shows that $\mu([0, 1/k]^n) = c/k^n$ for all integers $k > 0$. Therefore, $\mu(C) = c\mu_n(C)$ for every box C of rational dimensions, with sides parallel to the coordinate axes. This follows from the fact that such a box can be built up by stacking cubes of the form $[0, 1/k]^n$ for some $k > 0$. Since $\mu$ is either continuous or monotone, it follows that $\mu(C) = c\mu_n(C)$ for every box C of positive real dimensions, with sides parallel to the coordinate axes. It then follows from the inclusion-exclusion principle that $\mu(P) = c\mu_n(P)$ for all $P \in$ Par(n). □

    The condition of either continuity or monotonicity is necessary to the characterization given by Theorem 4.2.4. If we omit these conditions then counterexamples to Theorem 4.2.4 can be found even in the case of Par(1)! To see this, recall that R is a vector space of infinite dimension over the field Q of rational numbers. Denote this vector space $R_Q$. Since the dual space $R_Q^*$ is also of infinite dimension, there exists a nontrivial map $f \in R_Q^*$; i.e., a linear map $f : R_Q \to Q$ such that $f(1) = 1$ and $f(x) \in Q$ for all $x \in R$.

    A parallelotope $P \in$ Par(1) is just a closed bounded interval of the form $[a, b]$, having length $b - a$. Define a valuation $\eta$ on parallelotopes (intervals) in Par(1) by the formula

    $$\eta([a, b]) = f(b - a).$$

    Evidently $\eta$ is invariant, depending only on the length of the closed interval. Moreover, if $a \le c \le b \le d$, then

    $$\begin{aligned}
    \eta([a, b] \cup [c, d]) + \eta([a, b] \cap [c, d]) &= \eta([a, d]) + \eta([c, b]) \\
    &= f(d - a) + f(b - c) \\
    &= \big(f(c - a) + f(b - c) + f(d - b)\big) + f(b - c) \\
    &= f(b - a) + f(d - c) \\
    &= \eta([a, b]) + \eta([c, d]),
    \end{aligned}$$

    by the linearity of f. It now follows from Groemer's extension Theorem 4.1.3 that $\eta$ extends to an invariant valuation on Par(1) that vanishes on lower dimensions. However, $\eta$ is not equal to length (one-dimensional volume), since $\eta$ takes only rational values. In other words, invariance alone is insufficient to characterize the volume: either continuity or monotonicity is also required. The reasoning that underlies this counterexample to Theorem 4.2.4 is easily extended to provide counterexamples for Par(n), where $n \ge 1$.

  • 40 4 The intrinsic volumes for parallelotopes We are now able to determine all continuous valuations on Par(n)

    that are invariant under translation and permutations of coordinates. We shall not yet prove that they are also rotation invariant.

    Theorem 4.2.5 The valuations /-Lo, /-Ll, . ,/-Ln form a basis for the vec-tor space of all continuous invariant valuations defined on Par(n). Proof Let /-L be a continuous invariant valuation on Par(n). Denote by Xl, X2, ... ,Xn the standard orthonormal basis for R n, and let Hj denote the (n -I)-hyperplane in Rn spanned by the coordinate vectors Xl, ... , X j -1 , X j +1, ... , Xn- The restriction of /-L to H j is an invariant valuation on parallelotopes in Hj . Proceeding by induction, we may assume that

    $$\mu(A) = \sum_{i=0}^{n-1} c_i \mu_i(A),$$

    for all $A \in \mathrm{Par}(n)$ such that $A \subseteq H_j$. Moreover, the coefficients 0

    if

    $\mu(\alpha P) = \alpha^k \mu(P)$ for all $P \in \mathrm{Par}(n)$ and all $\alpha \ge 0$.


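    Corollary 4.2.6 Let $\mu$ be a continuous invariant valuation defined on $\mathrm{Par}(n)$ that is homogeneous of degree $k$, for some $0 \le k \le n$. Then there exists $c \in \mathbf{R}$ such that $\mu(P) = c\,\mu_k(P)$ for all $P \in \mathrm{Par}(n)$.

    Proof By Theorem 4.2.5 there exist $c_0, \ldots, c_n \in \mathbf{R}$ such that

    $$\mu = \sum_{i=0}^{n} c_i \mu_i.$$

    If $P = [0, 1]^n$ then, for $\alpha > 0$,

    $$\mu(\alpha P) = \sum_{i=0}^{n} c_i \mu_i(\alpha P) = \sum_{i=0}^{n} c_i \alpha^i \mu_i(P) = \sum_{i=0}^{n} \binom{n}{i} c_i \alpha^i.$$

    Meanwhile,

    $$\mu(\alpha P) = \alpha^k \mu(P) = \alpha^k \sum_{i=0}^{n} c_i \mu_i(P) = \sum_{i=0}^{n} \binom{n}{i} c_i \alpha^k.$$

    Comparing the two expressions as polynomials in $\alpha$, we find that $c_i = 0$ if $i \ne k$, and $\mu = c_k \mu_k$. □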
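    The two facts used in this proof, that $\mu_i([0,1]^n) = \binom{n}{i}$ and that $\mu_i$ is homogeneous of degree $i$, can be checked numerically. The sketch below assumes the description of $\mu_i$ on a box as the $i$-th elementary symmetric function of its side lengths (recalled from Section 4.2); the function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, prod

def intrinsic_volume(i, sides):
    """mu_i of a box: the i-th elementary symmetric function of its side lengths."""
    return sum(prod(choice) for choice in combinations(sides, i))

n = 4
unit_cube = [Fraction(1)] * n
box = [Fraction(1), Fraction(2), Fraction(3, 2), Fraction(5)]
alpha = Fraction(3)

# mu_i of the unit cube, for i = 0, ..., n.
cube_values = [intrinsic_volume(i, unit_cube) for i in range(n + 1)]

# Homogeneity: mu_i(alpha * P) compared with alpha**i * mu_i(P).
scaled = [intrinsic_volume(i, [alpha * s for s in box]) for i in range(n + 1)]
expected = [alpha**i * intrinsic_volume(i, box) for i in range(n + 1)]
```

    The list `cube_values` reproduces the binomial coefficients appearing in the proof, and `scaled` equals `expected` term by term.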

    4.3 Notes


    For a more complete discussion of the Hausdorff topology on the space of compact subsets of $\mathbf{R}^n$ and the subspace of compact convex sets, see [85, pp. 47-61]. Theorem 4.1.3 is a special case of a more general extension theorem of Groemer, in which the lattice $\mathrm{Par}(n)$ is replaced with the lattice of polytopes in $\mathbf{R}^n$ (see [32]). Theorem 4.2.5 can be generalized to a classification of all continuous translation invariant valuations on the lattice $\mathrm{Par}(n)$, omitting the requirement that valuations be invariant under permutation of coordinates. The result is a $2^n$-dimensional space of valuations, with a basis indexed by the collection of all coordinate subspaces of $\mathbf{R}^n$ with respect to the fixed basis for parallelotope edges in $\mathrm{Par}(n)$; that is, by the set of all $2^n$ subsets of that $n$-element basis. For a detailed treatment, see [54].

  • 5

    The lattice of polyconvex sets

    We turn now to the lattice of polyconvex sets, which is a natural setting for the study of classical geometric probability. In Section 5.2 we define the Euler characteristic on polyconvex sets, which is an important tool for the extension in Chapter 7 of the intrinsic volumes of Section 4.2 to polyconvex sets. The Euler characteristic will also reappear in Chapter 10, in which we generalize the discrete kinematic formula of Chapter 3 to polyconvex sets. Section 5.5, while interesting in its own right, points to the correct normalization of the rotation invariant measures on Grassmannians in Section 6.1.

    5.1 Polyconvex sets

    A subset $K$ of $\mathbf{R}^n$ is said to be convex if any two points $x$ and $y$ in $K$ are the endpoints of a line segment lying inside $K$. Denote by $\mathcal{K}^n$ the collection of all compact convex subsets of $\mathbf{R}^n$. A finite union of compact convex sets will be called a polyconvex set (a term suggested by E. De Giorgi).

    If $A$ is a polyconvex set in $\mathbf{R}^n$, we shall say that $A$ is of dimension $n$ if $A$ is not contained in a finite union of hyperplanes of $\mathbf{R}^n$; that is, if $A$ has a non-empty interior. Otherwise, we shall say that $A$ is of lower dimension. The union and intersection of polyconvex sets are polyconvex. This follows from the basic fact that the intersection of two convex sets is convex. In other words, the family of polyconvex sets in $\mathbf{R}^n$ is a distributive lattice. We denote this lattice $\mathrm{Polycon}(n)$. Note that $\mathrm{Par}(n)$ is a sublattice of $\mathrm{Polycon}(n)$. The lattice $\mathrm{Polycon}(n)$ is also sometimes called the convex ring.

    A non-empty compact convex set $K \in \mathcal{K}^n$ is determined uniquely by its support function $h_K : S^{n-1} \to \mathbf{R}$, defined by $h_K(u) = \max_{x \in K} \{x \cdot u\}$, where $\cdot$ denotes the standard inner product on $\mathbf{R}^n$. For example, if $v \in \mathbf{R}^n$ and $\bar{v}$ denotes the line segment with endpoints $v$ and $-v$, then $h_{\bar{v}}(u) = |u \cdot v|$, for all $u \in S^{n-1}$.
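    As a small illustration of the definition, the support function of the convex hull of finitely many points is the maximum of the inner product over those points. The sketch below (the helper names and the sample vector are assumptions, not from the text) checks the segment example $h_{\bar{v}}(u) = |u \cdot v|$ in the plane.

```python
import math

def dot(x, y):
    """Standard inner product."""
    return sum(a * b for a, b in zip(x, y))

def support(vertices, u):
    """Support function of the convex hull of finitely many points, evaluated at u."""
    return max(dot(x, u) for x in vertices)

v = (3.0, 1.0)
segment = [v, (-v[0], -v[1])]   # endpoints of the segment from -v to v

# Compare h(u) with |u . v| for a few unit directions u.
max_deviation = 0.0
for k in range(12):
    t = 2 * math.pi * k / 12
    u = (math.cos(t), math.sin(t))
    max_deviation = max(max_deviation, abs(support(segment, u) - abs(dot(u, v))))
```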

    More generally, suppose that $h : S^{n-1} \to \mathbf{R}$, and consider the radial extension $\tilde{h} : \mathbf{R}^n \to \mathbf{R}$ given by $\tilde{h}(\alpha u) = \alpha h(u)$, for all $u \in S^{n-1}$ and $\alpha \ge 0$. The original function $h$ is a support function of a compact convex set in $\mathbf{R}^n$ if and only if the radial extension $\tilde{h}$ is sublinear; that is,

    $$\tilde{h}(x + y) \le \tilde{h}(x) + \tilde{h}(y), \quad \text{for all } x, y \in \mathbf{R}^n.$$

    Note that, for $u \in S^{n-1}$, a compact convex set $K$ lies entirely on one side (the '$-u$' side) of the hyperplane $H(K, u)$ determined by the equation $x \cdot u = h_K(u)$. The hyperplane $H(K, u)$ is called the support plane of $K$ in the direction $u$. If $H(K, u)^-$ denotes the closed half-space $x \cdot u \le h_K(u)$ bounded by $H(K, u)$, then we have

    $$K = \bigcap_{u \in S^{n-1}} H(K, u)^-.$$

    For compact convex sets $K$ and $L$ the Minkowski sum $K + L$ is defined by

    $$K + L = \{x + y : x \in K \text{ and } y \in L\}. \tag{5.1}$$

    It is not difficult to show that $h_{K+L} = h_K + h_L$.

    Recall from Lemma 4.1.1 that for compact sets $K$ and $L$ in $\mathbf{R}^n$, the Hausdorff metric satisfies $\delta(K, L) \le \epsilon$ if and only if $K \subseteq L + \epsilon B$ and $L \subseteq K + \epsilon B$. It follows easily from (5.1) that

    $$\delta(K, L) = \sup_{u \in S^{n-1}} |h_K(u) - h_L(u)|. \tag{5.2}$$

    In other words, the Hausdorff topology on $\mathcal{K}^n$ is also given by the uniform metric topology on the set of support functions of compact convex sets.
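    Formula (5.2) can be tried out on two discs in the plane, for which both sides are known in closed form: a disc with centre $p$ and radius $r$ has support function $p \cdot u + r$, and the Hausdorff distance between two discs is the distance between the centres plus the difference of the radii. The sketch below approximates the supremum by sampling directions; the sample discs and the sampling resolution are arbitrary choices.

```python
import math

def disc_support(center, radius, u):
    """Support function of a planar disc: h(u) = center . u + radius."""
    return center[0] * u[0] + center[1] * u[1] + radius

def hausdorff_via_support(disc_a, disc_b, samples=100_000):
    """Approximate sup over unit vectors u of |h_A(u) - h_B(u)|, as in (5.2)."""
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        u = (math.cos(t), math.sin(t))
        best = max(best, abs(disc_support(*disc_a, u) - disc_support(*disc_b, u)))
    return best

disc_a = ((0.0, 0.0), 1.0)
disc_b = ((3.0, 4.0), 2.5)

approx = hausdorff_via_support(disc_a, disc_b)
# Closed form for discs: |p - q| + |r - s|.
exact = math.dist(disc_a[0], disc_b[0]) + abs(disc_a[1] - disc_b[1])
```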

    Denote by $E_n$ the Euclidean group on $\mathbf{R}^n$; that is, the group generated by translations and (proper or improper) rotations. If $A \subseteq \mathbf{R}^n$ and $g \in E_n$, write

    $$gA = g(A) = \{g(a) : a \in A\}.$$

    The subgroup of translations (relative to a fixed Cartesian coordinate system) shall be denoted by $T_n$.

    A valuation $\mu$ defined on polyconvex sets in $\mathbf{R}^n$ is said to be rigid motion invariant (or simply invariant, when no confusion is possible) if

    $$\mu(A) = \mu(gA) \tag{5.3}$$

    for all $g \in E_n$ and all $A \in \mathrm{Polycon}(n)$. If the equality (5.3) holds only when $g \in T_n$, we say that $\mu$ is translation invariant.

    Our objective is to determine all invariant valuations defined on polyconvex sets in $\mathbf{R}^n$. As with $\mathrm{Par}(n)$, we impose a continuity condition on the valuations to be considered. A valuation $\mu$ is said to be convex-continuous (or simply continuous, when no confusion is possible) provided that

    $$\lim_{n \to \infty} \mu(A_n) = \mu(A)$$

    whenever $A_n, A$ are compact convex sets and $A_n \to A$ with respect to the metric (5.2). Examples of continuous invariant valuations on $\mathrm{Polycon}(n)$ include volume and surface area. The following proposition shows that we can restrict our attention to convex-continuous valuations defined on the generating set $\mathcal{K}^n$.

    Theorem 5.1.1 (Groemer's extension theorem for $\mathrm{Polycon}(n)$) A convex-continuous valuation $\mu$ on $\mathcal{K}^n$ admits a unique extension to a valuation on the lattice $\mathrm{Polycon}(n)$.

    Proof Suppose that $\mu$ is a continuous valuation. In view of Groemer's integral Theorem 2.2.1, it is sufficient to show that $\mu$ defines an integral on the space of indicator functions.

    The theorem is trivial in dimension zero. Assume that the theorem holds in dimension $n - 1$. Suppose that there exist distinct $K_1, \ldots, K_m \in \mathcal{K}^n$ such that

    $$\sum_{i=1}^{m} \alpha_i I_{K_i} = 0 \tag{5.4}$$

    while

    $$\sum_{i=1}^{m} \alpha_i \mu(K_i) = 1. \tag{5.5}$$

    Let $m$ be the least positive integer for which such expressions (5.4) and (5.5) exist.

    Choose a hyperplane $H$, with associated closed half-spaces $H^+$ and $H^-$, such that $K_1 \subset \mathrm{Int}\, H^+$. Since $I_{K_i \cap H^+} = I_{K_i} I_{H^+}$, it follows from (5.4) that

    $$\sum_{i=1}^{m} \alpha_i I_{K_i \cap H^+} = 0.$$

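    Similarly,

    $$\sum_{i=1}^{m} \alpha_i I_{K_i \cap H} = 0 \quad \text{and} \quad \sum_{i=1}^{m} \alpha_i I_{K_i \cap H^-} = 0.$$

    Meanwhile, since $\mu$ is a valuation,

    $$\sum_{i=1}^{m} \alpha_i \mu(K_i) = \sum_{i=1}^{m} \alpha_i \mu(K_i \cap H^+) + \sum_{i=1}^{m} \alpha_i \mu(K_i \cap H^-) - \sum_{i=1}^{m} \alpha_i \mu(K_i \cap H).$$

    Since the sets $K_i \cap H$ lie inside a space of dimension $n - 1$, the sum $\sum_{i=1}^{m} \alpha_i \mu(K_i \cap H) = 0$ by the induction assumption. Because $K_1 \cap H^- = \emptyset$, the sum $\sum_{i=1}^{m} \alpha_i \mu(K_i \cap H^-) = 0$ by the minimality of $m$. From (5.5) we have

    $$\sum_{i=1}^{m} \alpha_i \mu(K_i \cap H^+) = \sum_{i=1}^{m} \alpha_i \mu(K_i) = 1.$$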

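    Choose a sequence of hyperplanes $H_1, H_2, \ldots$ such that $K_1 \subset \mathrm{Int}\, H_i^+$ and

    $$K_1 = \bigcap_{i=1}^{\infty} H_i^+.$$

    By iterating the preceding argument, we have

    $$\sum_{i=1}^{m} \alpha_i \mu(K_i \cap H_1^+ \cap \cdots \cap H_q^+) = 1$$

    for all $q \ge 1$. Since $\mu$ is continuous, the limit as $q \to \infty$ gives

    $$\sum_{i=1}^{m} \alpha_i \mu(K_i \cap K_1) = 1,$$

    while a similar argument using (5.4) gives

    $$\sum_{i=1}^{m} \alpha_i I_{K_i \cap K_1} = 0.$$

    After repeating this argument with the bodies $K_2, \ldots, K_m$ we have

    $$\sum_{i=1}^{m} \alpha_i \mu(K_1 \cap \cdots \cap K_m) = \left(\sum_{i=1}^{m} \alpha_i\right) \mu(K_1 \cap \cdots \cap K_m) = 1.$$

    This implies that $\alpha_1 + \cdots + \alpha_m \ne 0$ and that $K_1 \cap \cdots \cap K_m \ne \emptyset$. Meanwhile a similar argument using (5.4) gives

    $$\sum_{i=1}^{m} \alpha_i I_{K_1 \cap \cdots \cap K_m} = \left(\sum_{i=1}^{m} \alpha_i\right) I_{K_1 \cap \cdots \cap K_m} = 0,$$

    so that either $\alpha_1 + \cdots + \alpha_m = 0$ or $K_1 \cap \cdots \cap K_m = \emptyset$, a contradiction in either case. □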

    5.2 The Euler characteristic

    Next, we shall extend the valuation $\mu_0$ to the entire distributive lattice $\mathrm{Polycon}(n)$. We have seen that $\mu_0$ is a well-defined functional on sets that are finite unions and intersections of parallelotopes, and that $\mu_0(P) = 1$ if $P$ is a non-empty parallelotope. These results motivate the following theorem.

    Theorem 5.2.1 (The existence of the Euler characteristic) There exists a unique convex-continuous invariant valuation $\mu_0$ defined on the family $\mathrm{Polycon}(n)$ of all polyconvex sets in $\mathbf{R}^n$, such that $\mu_0(K) = 1$ whenever $K$ is a non-empty compact convex set. The valuation $\mu_0$ is again called the Euler characteristic.

    Proof We proceed by induction on the dimension $n$, the case $n = 1$ having been established previously. By Theorems 2.2.1 and 5.1.1, it will suffice to establish the existence of a linear functional $L_n$ defined on $\mathcal{K}^n$-simple functions, such that $L_n(I_K) = 1$ whenever $K$ is a non-empty compact convex set.

    For $n = 1$, set

    $$L_1(f) = \sum_{x \in \mathbf{R}} (f(x) - f(x + 0)),$$

    where $f(x + 0) = \lim_{\epsilon \to 0^+} f(x + \epsilon)$. The sum on the right-hand side is finite, and, for $f = I_K$, where $K$ is an interval $[a, b]$, we have

    $$L_1(I_K) = I_K(b) - I_K(b + 0) = 1.$$

    Thus, $L_1(I_K) = \mu_0^1(K)$, so that

    $$L_1(f) = \int f \, d\mu_0^1.$$

    For arbitrary $n$, choose an orthogonal coordinate system $x_1, x_2, \ldots, x_n$. Given the first coordinate $x$, let $H_x$ be the hyperplane parallel to the coordinates $x_2, \ldots, x_n$ and passing through the point $(x, 0, \ldots, 0)$.

    Let $f = f(x_1, x_2, \ldots, x_n)$ be a simple function. The function $f_x(x_2, \ldots, x_n) = f(x, x_2, \ldots, x_n)$ is a simple function in $H_x$, and we assume that $L_{n-1}(f_x)$ has been defined in $H_x$, since $H_x$ is isomorphic to $\mathbf{R}^{n-1}$. Set $F(x) = L_{n-1}(f_x)$, and set

    $$L_n(f) = L_1(F).$$

    Note that the function $F$ is simple, so that the right-hand side is well defined.

    If $f = I_K$, where $K$ is a compact convex set, then $f_x$ is the indicator function of the slice of $K$ by the hyperplane at $x_1 = x$, and $F$ is the indicator function of the projection of $K$ onto the $x_1$-coordinate axis. It follows that $L_1(F) = 1$. Since

    $$L_n(f) = \int f \, d\mu_0,$$

    for some valuation $\mu_0$, it follows that $\mu_0$ is the desired valuation. □
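    The inductive construction in this proof translates directly into a recursive computation. The sketch below is a minimal illustration restricted, as an assumption, to finite unions of closed axis-parallel boxes: slicing by $H_x$ drops the first coordinate, $F(x)$ is the Euler characteristic of the slice, and $F(x + 0)$ is read off at the midpoint to the next coordinate at which the slice can change. The function name and box encoding are not from the text.

```python
from fractions import Fraction

def euler_characteristic(boxes):
    """Euler characteristic of a finite union of closed axis-parallel boxes.

    Each box is a tuple of (lo, hi) pairs, one pair per coordinate, with lo <= hi.
    Implements L_n(f) = sum over x of (F(x) - F(x + 0)), where F(x) is the
    Euler characteristic of the slice at first coordinate x.
    """
    boxes = list(boxes)
    if not boxes:
        return 0                      # empty set
    if len(boxes[0]) == 0:
        return 1                      # dimension zero: a non-empty set is a point

    def slice_at(t):
        # Boxes meeting the hyperplane x_1 = t, with the first coordinate dropped.
        return [box[1:] for box in boxes if box[0][0] <= t <= box[0][1]]

    # F can only jump at endpoints of the first-coordinate intervals.
    cuts = sorted({Fraction(c) for box in boxes for c in box[0]})
    total = 0
    for k, x in enumerate(cuts):
        f_at = euler_characteristic(slice_at(x))
        if k + 1 < len(cuts):
            f_right = euler_characteristic(slice_at((x + cuts[k + 1]) / 2))
        else:
            f_right = 0               # to the right of the last cut the slice is empty
        total += f_at - f_right
    return total

square = [((0, 1), (0, 1))]
two_squares = [((0, 1), (0, 1)), ((2, 3), (0, 1))]
# A square frame (annulus) built from four overlapping rectangles.
frame = [((0, 3), (0, 1)), ((0, 3), (2, 3)), ((0, 1), (0, 3)), ((2, 3), (0, 3))]
```

    On these samples the recursion gives 1 for the square, 2 for the two disjoint squares, and 0 for the frame, in agreement with the inclusion-exclusion principle.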

    Note that the Euler characteristic $\mu_0$ is normalized. In other words, if $K$ is a polyconvex set of dimension $k$ in $\mathbf{R}^n$, and if $V$ is a plane of dimension $j$ containing $K$, then $\mu_0^j(K)$ computed within $V$ is equal to $\mu_0^n(K)$ computed in $\mathbf{R}^n$. This follows from the fact that $\mu_0(K)$ can be computed via the inclusion-exclusion principle after $K$ has been expressed as a finite union of compact convex sets, whereas $\mu_0(K) = 1$ for all non-empty compact convex sets $K$ in spaces of any (finite) dimension. For this reason we write $\mu_0$ in place of $\mu_0^n$.

    The argument in the preceding proof can also be used to compute the Euler characteristic of a polytope. By Corollary 2.2.2, the valuation $\mu_0$ extends uniquely to a valuation defined on the relative Boolean algebra generated by $\mathrm{Polycon}(n)$, a valuation that is again denoted by $\mu_0$.

    Consider the (smaller) distributive sublattice of $\mathrm{Polycon}(n)$ generated by compact convex polytopes. Recall that a convex polytope is the intersection of a finite collection of closed half-spaces. A polytope is a finite union of convex polytopes. Given a polytope $P$, the boundary $\partial P$ is also a polytope (which is not the case for an arbitrary compact convex set). Therefore, $\mu_0(\partial P)$ is defined.

    Theorem 5.2.2 If $P$ is a compact convex polytope of dimension $n > 0$, then

    $$\mu_0(\partial P) = 1 - (-1)^n.$$

    Proof Using again the notation of Theorem 5.2.1, we note that $H_x \cap \partial P = \partial(H_x \cap P)$ if $x$ is not a boundary point of $\pi P$, the orthogonal projection of $P$ onto the line $H_x^{\perp}$. Let $F(x) = \mu_0(H_x \cap \partial P)$, where $\mu_0$ is taken in the space $H_x$.

    For the case $n = 1$ we have $\mu_0(\partial P) = 2 = 1 - (-1)$, since $\partial P$ consists of two distinct points. For $n > 1$ it follows from the induction hypothesis that

    $$\mu_0(H_x \cap \partial P) = 1 - (-1)^{n-1}$$

    when $x \in \pi P$ is not a boundary point of $\pi P$. Meanwhile, if $x \in \partial(\pi P)$, we have

    $$\mu_0(H_x \cap \partial P) = 1,$$

    since $H_x \cap P$ is a face of $P$ (though possibly a single point). Finally,

    $$\mu_0(H_x \cap \partial P) = 0,$$

    when $H_x \cap \partial P = \emptyset$.

    We can now compute

    $$L_1(F) = \sum_{x} (F(x) - F(x + 0)),$$

    a sum that vanishes except at the two points, call them $a$ and $b$ (with $a < b$), where $H_x$ touches the boundary of $P$. The right-hand side then reduces to

    $$F(a) - F(a + 0) + F(b) - F(b + 0).$$

    We compute

    $$F(b + 0) = 0, \quad F(b) = 1, \quad F(a) = 1, \quad F(a + 0) = 1 - (-1)^{n-1}.$$

    On adding, we find that

    $$L_1(F) = 1 - 1 + (-1)^{n-1} + 1 = 1 + (-1)^{n-1} = 1 - (-1)^n,$$

    as desired. □

    If $P$ is a compact convex polytope of dimension $k$ in $\mathbf{R}^n$, let $V$ be the $k$-dimensional plane containing $P$. We denote by $\mathrm{relint}(P)$ the interior of $P$ relative to the topology of $V$; that is, the relative interior of $P$.

    Theorem 5.2.3 Let $P$ be a compact convex polytope of dimension $k$ in $\mathbf{R}^n$. Then

    $$\mu_0(\mathrm{relint}(P)) = (-1)^k.$$

    Proof Since $\mu_0$ is normalized independently of the ambient space, we compute within the $k$-dimensional plane in $\mathbf{R}^n$ containing $P$. From Theorem 5.2.2 we have

    $$\mu_0(\mathrm{relint}(P)) = \mu_0(P) - \mu_0(\partial P) = (-1)^k.$$ □

    We can now generalize Euler's formula to arbitrary (nonconvex) polytopes. To this end, we define the notion of a system of faces $F$ of a polytope $P$. This will be a family with the following properties:

    (i) Every element of $F$ is a convex polytope.

    (ii) $\bigcup_{Q \in F} \mathrm{relint}(Q) = P$.

    (iii) If $Q, Q' \in F$ and $Q \ne Q'$, then $\mathrm{relint}(Q) \cap \mathrm{relint}(Q') = \emptyset$.

    We can now prove the following result.

    Theorem 5.2.4 (The Euler-Schläfli-Poincaré formula) Let $F$ be a system of faces of a polytope $P$, and let $f_i$ be the number of elements of $F$ of dimension $i$. Then

    $$\mu_0(P) = f_0 - f_1 + f_2 - \cdots.$$

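    First proof Place a linear ordering on the elements of $F$, or 'faces', such that, if $Q < Q'$ then $\dim(Q) \le \dim(Q')$. Evidently,

    $$I_P = \sum_{Q \in F} (I_Q - I_{Q_1}),$$

    where $Q_1 = Q \cap \left(\bigcup_{Q' < Q} Q'\right)$. Denote by $F_i$ the set of elements of $F$ of dimension $i$. For $Q \in F_0$, we have $I_{Q_1} = 0$, so that

    $$\int (I_Q - I_{Q_1}) \, d\mu_0 = \mu_0(Q) = 1.$$

    For $Q \in F_i$, where $i > 0$, we have

    $$Q_1 = Q \cap \left(\bigcup_{Q' < Q} Q'\right) = \partial Q,$$

    and hence, by Theorem 5.2.2,

    $$\int (I_Q - I_{Q_1}) \, d\mu_0 = \mu_0(Q) - \mu_0(\partial Q) = (-1)^i.$$

    Therefore,

    $$\mu_0(P) = \int I_P \, d\mu_0 = \sum_{i \ge 0} \sum_{Q \in F_i} (-1)^i,$$

    so that

    $$\mu_0(P) = f_0 - f_1 + f_2 - \cdots.$$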
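    As a quick sanity check of the formula, consider the $n$-dimensional cube, whose faces of all dimensions (including the cube itself) form a system of faces. The sketch below uses the standard count of $2^{n-i}\binom{n}{i}$ faces of dimension $i$; the function names are illustrative. The full alternating sum should equal $\mu_0(P) = 1$, and the sum over proper faces should equal $\mu_0(\partial P) = 1 - (-1)^n$ from Theorem 5.2.2.

```python
from math import comb

def cube_face_counts(n):
    """Number of i-dimensional faces of the n-cube, for i = 0, ..., n."""
    return [2 ** (n - i) * comb(n, i) for i in range(n + 1)]

def alternating_sum(counts):
    """f_0 - f_1 + f_2 - ... for a list of face counts."""
    return sum((-1) ** i * f_i for i, f_i in enumerate(counts))

# Euler characteristic of the cube and of its boundary, for n = 1, ..., 8.
cube_values = [alternating_sum(cube_face_counts(n)) for n in range(1, 9)]
boundary_values = [alternating_sum(cube_face_counts(n)[:-1]) for n in range(1, 9)]
```

    For the ordinary cube ($n = 3$) the counts are 8, 12, 6, 1, giving $8 - 12 + 6 - 1 = 1$ for the solid cube and $8 - 12 + 6 = 2$ for its surface.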

    5.3 Helly's theorem

    Theorem 5.3.1 (Klee's theorem) Let $F$ be a finite family of compact convex sets such that

    $$\bigcup_{K \in F} K$$

    is convex. Let $i < |F|$, and suppose that, for any subset $G \subseteq F$ such that $|G| = i$ (that is, every subset of cardinality $i$ of $F$),

    $$\bigcap_{K \in G} K \ne \emptyset.$$

    Then there exists a subset $H$ of $F$ with cardinality $i + 1$, such that

    $$\bigcap_{K \in H} K \ne \emptyset.$$

    Proof Let $n > 1$ be a positive integer. Recall that

    $$1 - \binom{n}{1} + \binom{n}{2} - \binom{n}{3} + \cdots + (-1)^j \binom{n}{j} \ne 0, \tag{5.6}$$

    for all positive integers $j < n$. To see why (5.6) holds, suppose that $j \le n/2$. In this case the left-hand side of (5.6) is an alternating sum of strictly increasing terms, and is therefore unequal to zero. Since

    $$\binom{n}{j} = \binom{n}{n - j}$$

    and

    $$\sum_{i=0}^{n} (-1)^i \binom{n}{i} = 0,$$

    it also follows that (5.6) holds if we replace $j \le n/2$ by $j \ge n/2$.

    Now let $n = |F|$. From Theorem 5.2.1 and the inclusion-exclusion formula we have

    $$1 = \mu_0\left(\bigcup_{K \in F} K\right) = \sum_{K \in F} \mu_0(K) - \sum_{K \ne L \in F} \mu_0(K \cap L) + \cdots = \binom{n}{1} - \binom{n}{2} + \cdots + (-1)^{i-1} \binom{n}{i}$$

    whenever $\bigcap_{K \in G} K = \emptyset$ for all $G \subseteq F$ such that $|G| = i + 1$. However, this is impossible, by virtue of the inequality (5.6). □
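    The inequality (5.6) is easy to confirm by direct computation. The sketch below evaluates the partial alternating sums for small $n$ and compares them with the standard closed form $(-1)^j \binom{n-1}{j}$, an identity that is not stated in the text but gives another way to see that the sums cannot vanish for $j < n$.

```python
from math import comb

def partial_alternating_sum(n, j):
    """Left-hand side of (5.6): sum of (-1)^i * C(n, i) for i = 0, ..., j."""
    return sum((-1) ** i * comb(n, i) for i in range(j + 1))

# Check (5.6) for every 0 < j < n with n up to 30.
all_nonzero = all(
    partial_alternating_sum(n, j) != 0
    for n in range(2, 31)
    for j in range(1, n)
)

# Compare with the closed form (-1)^j * C(n - 1, j).
matches_closed_form = all(
    partial_alternating_sum(n, j) == (-1) ** j * comb(n - 1, j)
    for n in range(2, 31)
    for j in range(1, n)
)
```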

    Given a set $A \subseteq \mathbf{R}^n$, the convex hull of $A$ is the smallest convex set in $\mathbf{R}^n$ that contains $A$; that is, the intersection of all convex sets containing $A$. The following is a simple and yet fundamental property of convex hulls in $\mathbf{R}^n$.

    Theorem 5.3.2 (Carathéodory's theorem) Let $T$ be the convex hull of a family of compact convex sets $K_1, K_2, \ldots, K_m$ in $\mathbf{R}^n$. For each $x \in T$, there exists a subfamily $K_{j_1}, \ldots, K_{j_p}$, with convex hull $T_x$, such that $p \le n + 1$ and $x \in T_x$.

    Proof If $n = 0$ the result is trivial. Assuming that the theorem holds for dimension $n - 1$, we prove the theorem for dimension $n$.

    Let $x \in T$. If $x$ lies on the boundary of the compact convex set $T$, let $H$ denote a support plane of $T$ at $x$. Because $H$ supports the convex set $T$, all of $T$ lies inside one of the closed half-spaces bounded by $H$. Since $x \in T \cap H$, it follows that $x$ lies in the convex hull of the convex sets $K_j \cap H$. By the induction assumption on dimension, there exists a subfamily $K_{j_1} \cap H, \ldots, K_{j_p} \cap H$ with convex hull $T^*$ such that $p \le n$ and $x \in T^*$. It follows that $x$ lies in the convex hull of the subfamily $K_{j_1}, \ldots, K_{j_p}$.

    If $x$ lies in the interior of $T$, suppose that $x \notin K_j$ for any $j$ (otherwise the proof is finished). Let $\ell$ denote a line through $x$ that also meets $K_m$, and let $x'$ be the point of intersection of $\ell$ with the boundary of $T$. Since $\ell$ meets the boundary of $T$ at two points, choose $x'$ so that $x$ lies between $x'$ and $\ell \cap K_m$. It follows from the argument in the previous paragraph that $x'$ lies in the convex hull of a subfamily $K_{j_1}, \ldots, K_{j_p}$ of the family, where $p \le n$. It then follows that $x$ must lie in the convex hull of the sets $K_{j_1}, \ldots, K_{j_p}, K_m$, a collection of at most $n + 1$ sets in the family. □
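    In the special case where each $K_j$ is a single point of the plane, the theorem says that every point of the convex hull lies in a triangle spanned by three of the given points. The sketch below finds such a triangle by brute force rather than by the inductive argument of the proof; it assumes integer coordinates and that the given points are not all collinear, and the function names are illustrative.

```python
from itertools import combinations

def cross(o, a, b):
    """Twice the signed area of the triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_triangle(p, a, b, c):
    """True if p lies in the closed, non-degenerate triangle (a, b, c)."""
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_negative = min(d1, d2, d3) < 0
    has_positive = max(d1, d2, d3) > 0
    return not (has_negative and has_positive)

def caratheodory_triple(points, x):
    """Return three of the points whose triangle contains x, or None.

    Assumes the points are not all collinear, so that non-degenerate
    triangles suffice (here n = 2, so n + 1 = 3 points are enough).
    """
    for a, b, c in combinations(points, 3):
        if cross(a, b, c) != 0 and in_triangle(x, a, b, c):
            return (a, b, c)
    return None

points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 5)]
inside = caratheodory_triple(points, (1, 1))
outside = caratheodory_triple(points, (10, 10))
```

    For the sample data, `inside` is a triple of the given points whose triangle contains $(1, 1)$, while `outside` is `None` because $(10, 10)$ is not in the convex hull.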

    Combining Carathéodory's Theorem 5.3.2 with Klee's Theorem 5.3.1 we obtain the following celebrated theorem of Helly.

    Theorem 5.3.3 (Helly's theorem) Let $F$ be a finite family of compact convex sets in $\mathbf{R}^n$. Suppose that, for any subset $G \subseteq F$ such that $|G| \le n + 1$ (that is, every subset of cardinality at most $n + 1$ of $F$),

    $$\bigcap_{K \in G} K \ne \emptyset.$$

    Then

    $$\bigcap_{K \in F} K \ne \emptyset.$$

    In other words, if every $n + 1$ elements of $F$ have non-empty intersection, then the entire family $F$ of convex sets has non-empty intersection.

    Proof If $|F| \le n + 1$ the result is trivial. Suppose that Theorem 5.3.3 holds for $|F| = m$, for some $m \ge n + 1$. We show that the theorem also holds for $|F| = m + 1$.

    Let $F = \{K_1, \ldots, K_{m+1}\}$. For each $1 \le j \le m + 1$ denote by $L_j$ the intersection

    $$L_j = \bigcap_{i \ne j} K_i.$$

    Our induction assumption for the case $|F| = m$ implies that each $L_j$ is non-empty. Let $M$ denote the convex hull of the union $L_1 \cup \cdots \cup L_{m+1}$.

    If $x \in M$ then Carathéodory's Theorem 5.3.2 implies that $x$ lies in the convex hull of the union $L_{i_1} \cup L_{i_2} \cup \cdots \cup L_{i_{n+1}}$ for some $1 \le i_1 \le \cdots \le i_{n+1} \le m + 1$. For each $i \notin \{i_1, \ldots, i_{n+1}\}$,

    $$L_{i_1} \cup L_{i_2} \cup \cdots \cup L_{i_{n+1}} \subseteq K_i,$$

    so that $x \in K_i$. In other words,

    $$M \subseteq \bigcup_{i=1}^{m+1} K_i.$$

    Let $M_i = K_i \cap M$ for $1 \le i \le m + 1$. Note that, for each $j$,

    $$\bigcap_{i \ne j} M_i = \bigcap_{i \ne j} K_i \cap M = L_j \cap M = L_j \ne \emptyset.$$

    Since $M_1 \cup \cdots \cup M_{m+1} = M$ is convex, it follows from Klee's Theorem 5.3.1 that

    $$\bigcap_{i=1}^{m+1} M_i \ne \emptyset.$$

    Hence,

    $$\bigcap_{i=1}^{m+1} K_i \supseteq \bigcap_{i=1}^{m+1} M_i \ne \emptyset.$$ □
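    In dimension $n = 1$ the compact convex sets are closed intervals and Helly's theorem says that pairwise intersecting intervals have a common point. The sketch below checks this on sample data; the helper names and the intervals are illustrative choices.

```python
from itertools import combinations

def common_intersection(intervals):
    """Intersection of closed intervals (lo, hi), or None if it is empty."""
    lo = max(a for a, b in intervals)
    hi = min(b for a, b in intervals)
    return (lo, hi) if lo <= hi else None

def every_pair_meets(intervals):
    """Helly's hypothesis for n = 1: every n + 1 = 2 intervals intersect."""
    return all(
        common_intersection(pair) is not None
        for pair in combinations(intervals, 2)
    )

family = [(0, 5), (2, 7), (4, 9), (3, 4.5)]
pairwise_ok = every_pair_meets(family)
shared = common_intersection(family)

# A family violating the hypothesis has no common point.
bad_family = [(0, 1), (2, 3), (0, 3)]
bad_pairwise = every_pair_meets(bad_family)
bad_shared = common_intersection(bad_family)
```

    For `family` every pair of intervals meets and the whole family shares the interval $[4, 4.5]$; for `bad_family` the hypothesis fails and there is no common point.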

    Theorem 5.3.3 is in fact a generalization of Theorem 3.3.1. To see this, suppose that $S = \{s_1, \ldots, s_n\}$ is a finite set, and associate to each $s_i$ a distinct point $x_i \in \mathbf{R}^{n-1}$, chosen so that the collection $\{x_1, \ldots, x_n\}$ is affinely independent. Let $\Delta$ denote the geometric simplex in $\mathbf{R}^{n-1}$ having vertices $\{x_1, \ldots, x_n\}$. Subsets of $S$ now correspond to faces of the simplex $\Delta$, and Theorem 3.3.1 is now a specialization of Theorem 5.3.3 to families of faces of the simplex $\Delta$.

    5.4 Lutwak's containment theorem

    We now turn to a beautiful application of Helly's theorem to the question of containment of convex bodies. Given compact convex sets $K$ and $L$ with non-empty interiors, is there a simple condition that guarantees that some translate of $K$ is a subset of $L$? It turns out that the answer to this question is determined by the relationship of $K$ to the simplices in $\mathbf{R}^n$ that contain $L$.

    For $K \in \mathcal{K}^n$ and $v \in \mathbf{R}^n$, denote by $K + v$ the set

    $$K + v = \{x + v : x \in K\}.$$

    In other words, $K + v$ denotes the translation of the set $K$ by the vector $v$.

    Theorem 5.4.1 (Lutwak's containment theorem) Let $K, L \in \mathcal{K}^n$ with non-empty interiors. The following are equivalent.

    (i) For every simplex $\Delta$ such that $L \subseteq \Delta$, there exists $v \in \mathbf{R}^n$ such that $K + v \subseteq \Delta$.

    (ii) There exists $v_0 \in \mathbf{R}^n$ such that $K + v_0 \subseteq L$.

    In other words, if every simplex containing $L$ also contains a translate of $K$, then $L$ itself contains a translate of $K$.

    Proof The implication (ii) $\Rightarrow$ (i) is obvious. We show that (i) $\Rightarrow$ (ii).

    To begin, suppose first that $L$ is a polytope, with facets $L_1, L_2, \ldots, L_m$ and corresponding facet (outward) unit normal vectors $u_1, u_2, \ldots, u_m$. Assume also that every selection of $n$ distinct unit normals $u_j$ is a linearly independent set. (Were this not the case, a small perturbation of $L$ would make it so.)

    For each facet $L_i$, let $H_i$ denote the $(n-1)$-dimensional hyperplane in $\mathbf{R}^n$ containing $L_i$, and let $H_i^+$ denote the closed half-space bounded by $H_i$ and containing the polytope $L$. Finally, denote by $T_i$ the set of vectors $v \in \mathbf{R}^n$ such that $K + v \subseteq H_i^+$. Since each $H_i^+$ is a (convex) closed half-space and $K$ is compact, it is clear that each $T_i$ is a non-empty closed convex set.

    The independence condition on the unit normals $\{u_i\}$ implies that, for each distinct selection $u_{i_1}, u_{i_2}, \ldots, u_{i_{n+1}}$ of $n + 1$ unit normals, either the corresponding intersection

    $$H^+_{i_1, i_2, \ldots, i_{n+1}} = \bigcap_{s=1}^{n+1} H^+_{i_s} \tag{5.7}$$

    contains a simplex $\Delta_{i_1, i_2, \ldots, i_{n+1}}$ such that $L \subseteq \Delta_{i_1, i_2, \ldots, i_{n+1}}$, or this intersection contains a translate of the ball $\alpha B$ of radius $\alpha$, for all $\alpha > 0$. (This is the case in which the intersection (5.7) is unbounded.) In the first case, the hypothesis of the theorem implies the existence of a vector $v \in \mathbf{R}^n$ such that $K + v \subseteq \Delta_{i_1, i_2, \ldots, i_{n+1}}$. In the second case there also exists $v$ such that $K + v$ lies in the intersection (5.7). In either case, there exists $v \in T_{i_1} \cap \cdots \cap T_{i_{n+1}}$. In other words, each collection of $n + 1$ sets $T_i$ has a non-empty intersection. Helly's Theorem 5.3.3 then implies the existence of a vector $v$ such that

    $$v \in \bigcap_{i=1}^{m} T_i.$$

    In other words, $K + v \subseteq H_i^+$ for $i = 1, \ldots, m$. Since $L = H_1^+ \cap \cdots \cap H_m^+$, it follows that $K + v \subseteq L$.

    Now suppose that $L$ is an arbitrary compact convex set. Let $\{P_i\}_{i=1}^{\infty}$ be a decreasing collection of polytopes such that $P_i \to L$ as $i \to \infty$, and such that each $n$ of the facet normals to $P_i$ are linearly independent. If $\Delta$ is a simplex containing $P_i$, then $L \subseteq P_i \subseteq \Delta$, so there exists a vector $w$ such that $K + w \subseteq \Delta$. Since the theorem holds for the polytopes $P_i$, it then follows that there exists a vector $v_i$ for each $i$, such that $K + v_i \subseteq P_i$. Since the $P_i$ are decreasing (with respect to the relation of subset containment), the sequence $\{v_i\}$ is bounded and must contain a convergent subsequence. Assume then without loss of generality that $v_i \to v$. Since $P_i \to L$, it follows that $K + v \subseteq L$. □
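    The sets $T_i$ in the polytope case can be written down explicitly when $K$ is itself a polytope. If $H_i^+ = \{x : u_i \cdot x \le b_i\}$, then $K + v \subseteq H_i^+$ exactly when $h_K(u_i) + u_i \cdot v \le b_i$, so each $T_i$ is a half-space in $v$. The sketch below is a hypothetical illustration of this membership test (it does not search for $v$); the names, the half-space encoding of $L$, and the sample data are assumptions.

```python
def dot(x, y):
    """Standard inner product."""
    return sum(a * b for a, b in zip(x, y))

def support(vertices, u):
    """Support function of the convex hull of finitely many points."""
    return max(dot(x, u) for x in vertices)

def in_T(v, K_vertices, u, b):
    """True if K + v lies in the half-space {x : u . x <= b}."""
    return support(K_vertices, u) + dot(u, v) <= b

def translate_fits(v, K_vertices, facets):
    """True if v lies in every T_i, that is, if K + v is contained in L."""
    return all(in_T(v, K_vertices, u, b) for u, b in facets)

# L is the square [0, 4] x [0, 4], given by outward unit normals and offsets.
L_facets = [((1, 0), 4), ((-1, 0), 0), ((0, 1), 4), ((0, -1), 0)]
# K is a right triangle with legs of length 2.
K_vertices = [(0, 0), (2, 0), (0, 2)]

fits = translate_fits((1, 1), K_vertices, L_facets)
does_not_fit = translate_fits((3, 3), K_vertices, L_facets)
```

    Translating the triangle by $(1, 1)$ keeps it inside the square, while translating by $(3, 3)$ pushes the vertex $(2, 0)$ to $(5, 3)$, outside the facet $x_1 \le 4$.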

    5.5 Cauchy's surface area formula

    We conclude this chapter with the following interpretation of the surface area $S(K)$ of a convex body in $\mathbf{R}^n$, which will be of use to us in the sequel. If $K \in \mathcal{K}^n$ and $V$ is an $(n-1)$-dimensional subspace of $\mathbf{R}^n$, denote by $K|V$ the orthogonal projection of $K$ onto $V$. Let $\omega_{n-1}$ denote the $(n-1)$-dimensional volume of the unit ball $B_{n-1}$ in $\mathbf{R}^{n-1}$.

    Lemma 5.5.1 For $v \in S^{n-1}$,

    $$\int_{S^{n-1}} |u \cdot v| \, du = 2\omega_{n-1}.$$

    Proof Recall from elementary calculus that

    $$\int_{S^{n-1}} |u \cdot v| \, du \approx \sum_i |u_i \cdot v| \, S(A_i),$$

    where $S^{n-1}$ is partitioned into many small regions $A_i$, having area $S(A_i)$, and where $u_i \in A_i$ for each $i$. Let $\tilde{A}_i$ denote the orthogonal projection of $A_i$ onto the tangent hyperplane to $S^{n-1}$ at the point $u_i$. Denote by $\tilde{A}_i | v^{\perp}$ the orthogonal projection of $\tilde{A}_i$ onto the hyperplane $v^{\perp}$; that is, into the disk $B_{n-1}$ in $v^{\perp}$. Because $\tilde{A}_i$ is a flat region lying inside a hyperplane with unit normal $u_i$, we have $S(\tilde{A}_i | v^{\perp}) = |u_i \cdot v| \, S(\tilde{A}_i)$.

    Meanwhile, for $A_i$ sufficiently small, we have $S(\tilde{A}_i) \approx S(A_i)$ and $S(\tilde{A}_i | v^{\perp}) \approx S(A_i | v^{\perp})$. Therefore,

    $$\int_{S^{n-1}} |u \cdot v| \, du \approx \sum_i S(A_i | v^{\perp}).$$

    Since the collection of sets $\{A_i | v^{\perp}\}$ covers the disk $B_{n-1}$ twice, projecting from both of the directions $v$ and $-v$ (that is, from both hemispheres of $S^{n-1}$), we have

    $$\int_{S^{n-1}} |u \cdot v| \, du \approx 2\omega_{n-1},$$

    where the approximations converge to equalities in the limit, as the mesh of the partition $\{A_i\}$ goes to zero. □
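    The lemma can be checked numerically in low dimensions, where $\omega_1 = 2$ (the length of $[-1, 1]$) and $\omega_2 = \pi$ (the area of the unit disk). The sketch below uses a plain midpoint rule; the parametrizations and the step count are choices made for illustration, with $v$ taken as the first basis vector on the circle and as the north pole on the sphere.

```python
import math

def integral_on_circle(steps=20_000):
    """n = 2: integral over S^1 of |u . v| du, i.e. of |cos t| over [0, 2*pi]."""
    h = 2 * math.pi / steps
    return h * sum(abs(math.cos((k + 0.5) * h)) for k in range(steps))

def integral_on_sphere(steps=20_000):
    """n = 3: integral over S^2 of |u . v| du in spherical coordinates.

    With v the north pole, u . v = cos(phi) and du = sin(phi) dphi dtheta,
    so the integral is 2*pi times the integral of |cos phi| sin phi over [0, pi].
    """
    h = math.pi / steps
    inner = h * sum(
        abs(math.cos((k + 0.5) * h)) * math.sin((k + 0.5) * h)
        for k in range(steps)
    )
    return 2 * math.pi * inner

circle_value = integral_on_circle()   # compare with 2 * omega_1 = 4
sphere_value = integral_on_sphere()   # compare with 2 * omega_2 = 2 * pi
```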

    Theorem 5.5.2 (Cauchy's surface area formula) For all $K \in \mathcal{K}^n$,

    $$S(K) = \frac{1}{\omega_{n-1}} \int_{S^{n-1}} \mu_{n-1}(K | u^{\perp}) \, du.$$

    (5.