measure theory_2 (2).pdf

download measure theory_2 (2).pdf

of 54

Transcript of measure theory_2 (2).pdf

  • 7/25/2019 measure theory_2 (2).pdf

    1/54

    Chapter 1

    Measure theory and Probability

    1.1 Set sequences

    In this section is a set and P () is the class of all subsets of .

    Denition 1.1 (Set sequence)A set sequence is a map

    IN P ()n An

    We represent it by {An}nIN P ().

    Theorem 1.1 (The De Morgan laws)It holds that

    (i)

    n=1

    An

    c

    =

    n=1

    Acn .

    (ii)

    n=1

    An

    c

    =

    n=1

    Acn .

    Denition 1.2 (Monotone set sequence)A set sequence{An}nIN P () is said to be monotone increas-ing if and only if An An+1 , n IN . We represent it by{An} .

    1

  • 7/25/2019 measure theory_2 (2).pdf

    2/54

    1.1. SET SEQUENCES

    When An An+1 , n IN , the sequence is said to be monotone decreasing , and we represent it by {An} .

    Example 1.1 Consider the sequences dened by:

    (i) An = ( n, n ), n IN . This sequence is monotone increas-

    ing, since n IN ,An = ( n, n ) ( (n + 1) , n + 1) = An+1 .

    (ii) Bn = ( 1/n, 1 + 1/n ), n IN . This sequence is monotonedecreasing, since n IN ,

    Bn = ( 1/n, 1 + 1/n ) ( 1/ (n + 1) , 1 + 1/ (n + 1)) = Bn+1 .

    Denition 1.3 (Limit of a set sequence)

    (i) We call lower limit of {An}, and we denote it limAn , to theset of points of that belong to all Ans except for a nitenumber of them.

    (ii) We call upper limit of {An}, and we denote it limAn , to theset of points of that belong to innite number of Ans. Itis also said that An occurs innitely often (i.o.), and it isdenoted also limAn = An i.o.

    Example 1.2 If A2n , n IN , then limAn but /limAn since there is an innite number of Ans to which does notbelong, {A2n 1}nIN .

    Proposition 1.1 (Another characterization of limit of a set sequence)

    (i) limAn =

    k=1

    n= k

    An

    ISABEL MOLINA 2

  • 7/25/2019 measure theory_2 (2).pdf

    3/54

    1.1. SET SEQUENCES

    (ii) limAn =

    k=1

    n= k

    An

    Proposition 1.2 (Relation between lower and upper lim-its)

    The lower and upper limits of a set sequence {An} satisfylimAn limAn

    Denition 1.4 (Convergence)A set sequence {An} converges if and only if

    limAn = lim An .

    Then, we call limit of {An} tolim

    nAn = lim An = lim An .

    Denition 1.5 (Inferior/Superior limit of a sequence of real numbers)Let {an}nIN IR be a sequence. We dene:

    (i) lim inf n

    an = supk

    inf n k

    an;

    (ii) lim supn

    an = inf k

    supn k

    an .

    Proposition 1.3 (Convergence of monotone set sequences)Any monotone (increasing of decreasing) set sequence converges,and it holds:

    (i) If {An} , then limn An =

    n=1 An .

    (ii) If {An} , then limn An =

    n=1

    An .

    ISABEL MOLINA 3

  • 7/25/2019 measure theory_2 (2).pdf

    4/54

    1.1. SET SEQUENCES

    Example 1.3 Obtain the limits of the following set sequences:

    (i) {An}, where An = ( n, n ), n IN .

    (ii) {Bn}, where Bn = ( 1/n, 1 + 1/n ), n IN .

    (i) By the previous proposition, since {An} , then

    limn

    An =

    n=1

    An =

    n=1

    ( n, n ) = IR.

    (ii) Again, using the previous proposition, since {An} , then

    limn

    Bn =

    n=1

    Bn =

    n=1

    1n

    , 1 + 1n

    = [0, 1].

    Problems

    1. Prove Proposition 1.1.

    2. Dene sets of real numbers as follows. Let An = ( 1/n, 1] if n is odd, and An = ( 1, 1/n ] if n is even. Find limAn and

    limAn .3. Prove Proposition 1.2.

    4. Prove Proposition 1.3.

    5. Let = IR 2 and An the interior of the circle with center atthe point (( 1)n/n, 0) and radius 1. Find limAn and limAn .

    6. Prove that (limAn)c

    = lim Acn and (limAn)

    c

    = lim Acn .

    7. Using the De Morgan laws and Proposition 1.3, prove that if {An} A, then {Acn} Ac while if {An} A, then {Acn} Ac.

    ISABEL MOLINA 4

  • 7/25/2019 measure theory_2 (2).pdf

    5/54

    1.2. STRUCTURES OF SUBSETS

    8. Let{xn} be a sequence of real numbers and letAn = ( , xn).What is the connection between lim inf

    nxn and limAn? Simi-

    larly between lim supn

    xn and limAn .

    1.2 Structures of subsetsA probability function will be a function dened over events or sub-sets of a sample space . It is convenient to provide a good struc-ture to these subsets, which in turn will provide good propertiesto the probability function. In this section we study collections of subsets of a set with a good structure.

    Denition 1.6 (Algebra)An algebra (also called eld ) A over a set is a collection of subsets of that has the following properties:

    (i) A;

    (ii) If A A, then Ac A;

    (iii) If A1, A2, . . . , An A, thenn

    i=1Ai A.

    An algebra over contains both and . It also contains allnite unions and intersections of sets from A. We say that A isclosed under complementation, nite union and nite intersection.Extending property (iii) to an innite sequence of elements of Awe obtain a -algebra.

    Denition 1.7 ( -algebra)A -algebra (or -eld) A over a set is a collection of subsets of that has the following properties:

    ISABEL MOLINA 5

  • 7/25/2019 measure theory_2 (2).pdf

    6/54

    1.2. STRUCTURES OF SUBSETS

    (i) A;

    (ii) If A A, then Ac A;

    (iii) If A1, A2 . . . is a sequence of elements of A, then n=1 An A.

    Thus, a -algebra is closed under countable union. It is alsoclosed under countable intersection. Moreover, if A is an algebra,a countable union of sets in A can be expressed as the limit of

    an increasing sequence of sets, the nite unionsn

    i=1

    Ai. Thus, a

    -algebra is an algebra that is closed under limits of increasingsequences.

    Example 1.4 The smallest -algebra is {, }. The smallest-algebra that contains a subset A is {,A ,Ac, }. It iscontained in any other -algebra containing A. The collection of all subsets of , P (), is a well known algebra called the algebra of the parts of .

    Denition 1.8 ( -algebra spanned by a collection C of events)Given a collection of sets C P (), we dene the -algebraspanned by C , and we denote it by (C ), as the smallest -algebrathat contains C .

    Remark 1.1 For each C , the -algebra spanned by C , (C ), al-ways exists, since AC is the intersection of all -algebras that con-tain C and at least P () C is a -algebra.

    When is nite or countable, it is common to work with the-algebra P (), so we will use this one unless otherwise stated. Inthe case = IR , later we will consider probability measures andwe will want to obtain probabilities of intervals. Thus, we need a

    ISABEL MOLINA 6

  • 7/25/2019 measure theory_2 (2).pdf

    7/54

    1.2. STRUCTURES OF SUBSETS

    -algebra containing all intervals. The Borel -algebra is based onthis idea, and it will be used by default when = IR .

    Denition 1.9 (Borel -algebra)Consider the sample space = IR and the collection of intervalsof the form

    I = {( , a] : a IR }.We dene the Borel -algebra over IR , represented by B , as the-algebra spanned by I .

    The Borel -algebra B contains all complements, countable in-tersections and unions of elements of I . In particular, B containsall types of intervals and isolated points of IR , although B is notequal to P (IR ). For example,

    (a, ) B , since (a, ) = ( , a]c, and ( , a] IR .

    (a, b] IR , a < b, since this interval can be expressed as(a, b] = ( , b] (a, ), where ( , b] B and (a, ) B .

    {a} B , a IR , since{a} =

    n=1a

    1n , a , which belongs

    to B .

    When the sample space is continuous but is a subset of IR ,we need a -algebra restricted to subsets of .

    Denition 1.10 (Restricted Borel -algebra )Let A IR . We dene the Borel -algebra restricted to A as thecollection

    B A = {B A : B B}.

    ISABEL MOLINA 7

  • 7/25/2019 measure theory_2 (2).pdf

    8/54

    1.3. SET FUNCTIONS

    In the following we dene the space over which measures, in-cluding probability measures, will be dened. This space will bethe one whose elements will be suitable to measure.

    Denition 1.11 (Measure space)The pair (, A), where is a sample space and A is an algebraover , is called measurable space .

    Problems

    1. Let A, B with A B = . Construct the smallest -algebra that contains A and B.

    2. Prove that an algebra over contains all nite intersectionsof sets from A.

    3. Prove that a -algebra over contains all countable intersec-tions of sets from A.

    4. Prove that a -algebra is an algebra that is closed under limitsof increasing sequences.

    5. Let = IR . Let A be the class of all nite unions of disjointelements from the set

    C = {(a, b], ( , a], (b, ); a b}

    Prove that A is an algebra.

    1.3 Set functionsIn the following A is an algebra and we consider the extended realline given by IR = IR { , + } .

    ISABEL MOLINA 8

  • 7/25/2019 measure theory_2 (2).pdf

    9/54

    1.3. SET FUNCTIONS

    Denition 1.12 (Additive set function)A set function : A IR is additive if it satises:

    For all{Ai}ni=1 Awith AiA j = , i = j, n

    i=1

    Ai =n

    i=1

    (Ai).

    We will assume that + and cannot both belong to therange of . We will exclude the cases (A) = + for all A Aand (A) = for all A A. Extending the denition to aninnite sequence, we obtain a -additive set function.

    Denition 1.13 ( -additive set function)A set function : A IR is -additive if it satises:

    For all {An}nIN A with Ai A j = , i = j,

    n=1

    An =

    n=1

    (An).

    Observe that a -additive set function is well dened, since theinnite union of sets of A belongs to A because A is a -algebra.It is easy to see that an additive function satises (

    ) = 0. More-over, countable additivity implies nite additivity.

    Denition 1.14 (Measure)A set function : A IR is a measure if

    (a) is -additive;

    (b) (A) 0, A A.

    Denition 1.15 (Probability measure)A measure with () = 1 is called a probability measure .

    ISABEL MOLINA 9

  • 7/25/2019 measure theory_2 (2).pdf

    10/54

    1.3. SET FUNCTIONS

    Example 1.5 (Counting measure)Let be any set and consider the -algebra of the parts of ,P (). Dene (A) as the number of points of A. The set function is a measure known as the counting measure .

    Example 1.6 (Probability measure)Let = {x1, x2, . . .} be a nite or countably innite set, and let p1, p2, . . . , be nonnegative numbers. Consider the -algebra of theparts of , P (), and dene

    (A) =xiA

    pi.

    The set function is a probability measure if and only if i=1 pi =1.

    Example 1.7 (Lebesgue measure)A well known measure dened over (IR, B ), which assigns to eachelement of B its length, is the Lebesgue measure , denoted here as. For an interval, either open, close or semiclosed, the Lebesguemeasure is the length of the interval. For a single point, the

    Lebesgue measure is zero.Denition 1.16 ( -nite set function)A set function : A IR is - nite if A A, there exists asequence {An}nIN of disjoint elements of A with (An) < n,whose union covers A, that is,

    A

    n=1

    An .

    Denition 1.17 (Measure space)The triplet (, A, ), where is a sample space, A is an algebraand is a measure dened over (, A), is called measure space .

    ISABEL MOLINA 10

  • 7/25/2019 measure theory_2 (2).pdf

    11/54

    1.3. SET FUNCTIONS

    Denition 1.18 (Absolutely continuous measure withrespect to another)A measure on Borel subsets of the real line is absolutely contin-uous with respect to another measure if (A) = 0 implies that(A) = 0. It is also said that is dominated by , and written as

  • 7/25/2019 measure theory_2 (2).pdf

    12/54

    1.4. PROBABILITY MEASURES

    dened on an algebra A, then for all A1, . . . , An A,

    n

    i=1

    Ai n

    i=1

    (Ai).

    5. Prove that for any measure dened on an algebra A, thenfor all A1, . . . , An A such that n=1 An A,

    n=1

    An

    n=1

    (An).

    1.4 Probability measures

    Denition 1.19 (Random experiment)A random experiment is a process for which:

    the set of possible results is known;

    its result cannot be predicted without error;

    if we repeat it in identical conditions, the result can be differ-ent.

    Denition 1.20 (Elementary event, sample space, event)The possible results of the random experiment that are indivisibleare called elementary events . The set of elementary events isknown as sample space , and it will be denoted . An event A is asubset of , such that once the random experiment is carried out,we can say that A has occurred if the result of the experimentis contained in A.

    Example 1.8 Examples of random experiments are:

    (a) Tossing a coin. The sample space is = {head , tail }.Events are: , {head }, {tail }, .

    ISABEL MOLINA 12

  • 7/25/2019 measure theory_2 (2).pdf

    13/54

    1.4. PROBABILITY MEASURES

    (b) Observing the number of traffic accidents in a minute in Spain.The sample space is = IN {0}.

    (c) Drawing a Spanish woman aged between 20 and 40 and mea-suring her weight (in kgs.). The sample space is = [m, ),were m is the minimum possible weight.

    We will require that the collection of events has a structure of -algebra. This will make possible to obtain the probability of allcomplements, unions and intersections of events. The probabilitieswill be set functions dened over a measurable space composed bythe sample space and a -algebra of subsets of .

    Example 1.9 For the experiment (a) described in Example 1.8,a measurable space is:

    = {head , tail }, A = {, {head }, {tail }, }.For the experiment (b), the sample space is = IN {0}. If wetake the -algebra P (), then (, P ()) is a measurable space.Finally, for the experiment (c), with sample space = [m, ) IR , a suitable -algebra is the Borel -algebra restricted to ,

    B [m, ).Denition 1.21 (Axiomatic denition of probability byKolmogoroff)Let (, A) be a measurable space, where is a sample space andA is a -algebra over . A probability function is a set functionP : A [0, 1] that satises the following axioms:(i) P () = 1;

    (ii) For any sequence A1, A2, . . . of disjoint elements of A, it holds

    P

    n=1

    An =

    n=1

    P (An).

    ISABEL MOLINA 13

  • 7/25/2019 measure theory_2 (2).pdf

    14/54

    1.4. PROBABILITY MEASURES

    By axiom (ii), a probability function is a measure for which themeasure of the sample space is 1. The triplet (, A, P ), whereP is a probability P , is called probability space .

    Example 1.10 For the experiment (a) described in Example 1.8,with the measurable space (, A), where

    = {head , tail }, A = {, {head }, {tail }, },

    dene

    P 1() = 0, P 1({head }) = p, P 1({tail }) = 1 p, P 1() = 1,

    where p [0, 1]. This function veries the axioms of Kolmogoroff.

    Example 1.11 For the experiment (b) described is Example 1.8,with the measurable space (, P ()), dene: For the elementary events, the probability is

    P ({0}) = 0 .131, P ({1}) = 0 .272, P ({2}) = 0 .27, P ({3}) = 0 .183,P ({4}) = 0 .09, P ({5}) = 0 .012, P ({6}) = 0 .00095, P ({7}) = 0 .00005,

    P () = 0 , P ({i}) = 0 , i 8.

    For other events, the probability is dened as the sum of theprobabilities of the elementary events contains in that event,that is, if A = {a1, . . . , a n}, where ai are the elementaryevents, the probability of A is

    P (A) =n

    i=1

    P ({a i}).

    This function veries the axioms of Kolmogoroff.Proposition 1.4 (Properties of the probability)The following properties are consequence of the axioms of Kol-mogoroff:

    ISABEL MOLINA 14

  • 7/25/2019 measure theory_2 (2).pdf

    15/54

    1.4. PROBABILITY MEASURES

    (i) P () = 0;

    (ii) Let A1, A2, . . . , An A with Ai A j = , i = j . Then,

    P n

    i=1

    Ai =n

    i=1

    P (Ai).

    (iii) A A, P (A) 1.

    (iv) A A, P (Ac) = 1 P (A).

    (v) For A, B A with A B, it holds P (A) P (B ).

    (vi) Let A, B A be two events. Then

    P (A B ) = P (A) + P (B) P (A B ).

    (vii) Let A1, A2, . . . , An A be events. Then

    P n

    i=1

    Ai =n

    i=1

    P (Ai) n

    i1 ,i2 =1i1

  • 7/25/2019 measure theory_2 (2).pdf

    16/54

    1.4. PROBABILITY MEASURES

    Proposition 1.6 (Sequential continuity of the probabil-ity)Let (, A, P ) be a probability space. Then, for any sequence{An}of events from A it holds

    P limn An = limn P (An).Example 1.12 Consider the random experiment of selecting ran-domly a number from [0, 1]. Then the sample space is = [0, 1].Consider also the Borel -algebra restricted to [0, 1] and dene thefunction

    g(x) = P ([0, x]), x (0, 1).

    Proof that g is always right continuous, for each probability mea-sure P that we choose.

    Example 1.13 (Construction of a probability measurefor countable )If is nite or countable, the -algebra that is typically chosenis P (). In this case, in order to dene a probability function, itsuffices to dene the probabilities of the elementary events {a i} asP ({a i}) = pi, a i with the condition that i pi = 1, pi 0,i. Then, A ,

    P (A) =aiA

    P ({a i}) =aiA

    pi.

    Example 1.14 (Construction of a probability measurein (IR, B ))

    How can we construct a probability measure in (IR, B )? In gen-eral, it is not possible to dene a probability measure by assigningdirectly a numerical value to each A B , since then probably theaxioms of Kolmogoroff will not be satised.

    ISABEL MOLINA 16

  • 7/25/2019 measure theory_2 (2).pdf

    17/54

    1.4. PROBABILITY MEASURES

    For this, we will rst consider the collection of intervals

    C = {(a, b], ( , b], (a, + ) : a < b}. (1.1)

    We start by assigning values of P to intervals from C , by ensuringthat P is -additive on C . Then, we consider the algebra F ob-

    tained by doing nite unions of disjoint intervals from C and weextend P to F . The extended function will be a probability mea-sure on (IR, F ). Finally, there is a unique extension of a probabilitymeasure from F to (F ) = B , see the following propositions.

    Proposition 1.7 Consider the collection of all nite unions of disjoint intervals from C in (1.1),

    F = n

    i=1

    Ai : Ai C , Ai disjoint .

    Then F is an algebra.

    Next we extend P from C to F as follows.

    Proposition 1.8 (Extension of the probability function)

    (a) For all A F , since A = ni=1 (a i, bi], with ai, bi IR { , + } , let us dene

    P 1(A) =n

    i=1

    P (a i, bi].

    Then, P 1 is a probability measure over (IR, F ).

    (b) For all A C , it holds that P (A) = P 1(A).Observe that B = (F ). Finally, can we extend P from F to

    B = (F )? If the answer is positive, is the extension unique? Thenext theorem gives the answer to these two questions.

    ISABEL MOLINA 17

  • 7/25/2019 measure theory_2 (2).pdf

    18/54

    1.4. PROBABILITY MEASURES

    Theorem 1.2 (Caratheodorys Extension Theorem)Let (, A, P ) be a probability space, whereA is an algebra. Then,P can be extended from A to (A), and the extension is unique(i.e., there exists a unique probability measure P over (A) withP (A) = P (A), A A).

    The extension of P fromF to (F ) = B is done by steps. First,P is extended to the collection of the limits of increasing sequencesof events inF , denoted C . It holds that C F and C (F ) = B (Monotone Class Theorem). The probability of each event A fromC is dened as the limit of the probabilities of the sequences of events from F that converge to A. Afterwards, P is extendedto the -algebra of the parts of IR . For each subset A P (IR ),the probability is dened as the inmum of the probabilities of the events in C that contain A. This extension is not countablyadditive on P (IR ), only on a smaller -algebra, so P it is not aprobability measure on (IR, P (IR )). Finally, a -algebra in whichP is a probability measure is dened as the collectionH of subsetsH , for which P (H ) + P (H c) = 1. This collection is indeed a-algebra that contains C and P is a probability measure on it. Itholds that (F ) = B H and P restricted to (F ) = B is also aprobability measure on (IR, B ).

    Problems

    1. Prove the properties of the probability measures in Proposition1.4.

    2. Prove Bools inequality in Proposition 1.4.

    3. Prove the Sequential Continuity of the probability in Proposi-tion 1.6.

    ISABEL MOLINA 18

  • 7/25/2019 measure theory_2 (2).pdf

    19/54

    1.5. OTHER DEFINITIONS OF PROBABILITY

    1.5 Other denitions of probability

    When is nite, say = {a1, . . . , a k}, many times the elementaryevents are equiprobable, that is, P ({a1}) = = P ({ak}) = 1/k .Then, for A , say A = {a i1 , . . . , a im }, then

    P (A) =m

    j =1

    P ({a i j }) = mk

    .

    This is the denition of probability given by Laplace, which isuseful only for experiments with a nite number of possible resultsand whose results are, a priori, equally frequent.

    Denition 1.22 (Laplace rule of probability)The Laplace probability of an event A is the proportion of results favorable to A; that is, if k is the number of possible resultsor cardinal of and k(A) is the number of results contained in Aor cardinal of A, then

    P (A) = k(A)

    k .

    In order to apply the Laplace rule, we need to learn to count.The counting techniques are comprised in the area of Combinato-rial Analysis.

    The following examples show intuitively the frequentist deni-tion of probability.

    Example 1.15 (Frequentist denition of probability)The following tables report the relative frequencies of the resultsof the experiments described in Example 1.8, when each of theseexperiments are repeated n times.(a) Tossing a coin n times. The table shows that both frequencies

    of head and tail converge to 0.5.

    ISABEL MOLINA 19

  • 7/25/2019 measure theory_2 (2).pdf

    20/54

    1.5. OTHER DEFINITIONS OF PROBABILITY

    Resultn head tail10 0.700 0.30020 0.550 0.45030 0.467 0.533100 0.470 0.530

    1000 0.491 0.50910000 0.503 0.497100000 0.500 0.500

    Lanzamiento de una moneda

    0.000

    0.100

    0.200

    0.300

    0.400

    0.500

    0.600

    0.700

    0.800

    0.900

    1.000

    0 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000

    n

    F r e c .

    r e l .

    cara

    cruz

    (b) Observation of the number of traffic accidents in n minutes.We can observe in the table below that the frequencies of thepossible results of the experiment seem to converge.

    ISABEL MOLINA 20

  • 7/25/2019 measure theory_2 (2).pdf

    21/54

    1.5. OTHER DEFINITIONS OF PROBABILITY

    Resultn 0 1 2 3 4 5 6 7 810 0.1 0.2 0.2 0.2 0.1 0.1 0.1 0 020 0.2 0.4 0.15 0.05 0.2 0 0 0 030 0.13 0.17 0.33 0.23 0.03 0 0 0 0100 0.12 0.22 0.29 0.24 0.09 0 0 0 0

    1000 0.151 0.259 0.237 0.202 0.091 0.012 0.002 0 010000 0.138 0.271 0.271 0.178 0.086 0.012 0.0008 0.0001 020000 0.131 0.272 0.270 0.183 0.090 0.012 0.00095 0.00005 0

    Nmero de accidentes de trfico

    -0.05

    0

    0.05

    0.1

    0.15

    0.2

    0.25

    0.3

    0.35

    0.4

    0.45

    0 2000 4000 6000 8000 10000 12000 14000 16000 18000 20000

    n

    F r e c .

    r e l .

    X=0

    X=1

    X=2

    X=3

    X=4

    X=5

    X=6

    X=7

    X=8

    X=9

    X=10

    X=11

    Observacin del nmero de accidentes

    0.000

    0.050

    0.100

    0.150

    0.200

    0.250

    0.300

    0 2 4 6 8 10 12 14

    Nmero de accidentes

    F r e c .

    r e l . l m

    i t e

    ISABEL MOLINA 21

  • 7/25/2019 measure theory_2 (2).pdf

    22/54

    1.5. OTHER DEFINITIONS OF PROBABILITY

    (c) Independient drawing of n women aged between 20 and 40and measuring their weight (in kgs.). Again, we observe thatthe relative frequencies of the given weight intervals seem toconverge.

    Weight intervals

    n (0, 35] (35, 45] (45, 55] (55, 65] (65, )10 0 0 0.9 0.1 020 0 0.2 0.6 0.2 030 0 0.17 0.7 0.13 0100 0 0.19 0.66 0.15 01000 0.005 0.219 0.678 0.098 05000 0.0012 0.197 0.681 0.121 0.0004

    Seleccin de mujeres y anotacin de su peso

    0

    0.1

    0.2

    0.3

    0.4

    0.5

    0.6

    0.7

    0.8

    0.9

    1

    0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000

    n

    F r e c

    . r e l .

    (0,35](35,45](45,55](55,65]>65

    ISABEL MOLINA 22

  • 7/25/2019 measure theory_2 (2).pdf

    23/54

    1.5. OTHER DEFINITIONS OF PROBABILITY

    Seleccin de mujeres y anotacin de su peso

    0

    0.1

    0.2

    0.3

    0.4

    0.5

    0.6

    0.7

    0.8

    1 2 3 4 5

    Intervalo de pesos

    F r e c . r

    e l . l m

    i t e

    Denition 1.23 (Frequentist probability)The frequentist denition of probability of an event A is the limitof the relative frequency of this event, when we let the number of repetitions of the random experiment grow to innity.

    If the experiment is repeated n times, and nA is the number of repetitions in which A occurs, then the probability of A is

    P (A) = limn

    nAn

    .

    Problems

    1. Check if the Laplace denition of probability satises the ax-ioms of Kolmogoroff.

    2. Check if the frequentist denition of probability satises theaxioms of Kolmogoroff.

    ISABEL MOLINA 23

  • 7/25/2019 measure theory_2 (2).pdf

    24/54

    1.6. MEASURABILITY AND LEBESGUE INTEGRAL

    1.6 Measurability and Lebesgue integral

    A measurable function relates two measurable spaces, preservingthe structure of the events.

    Denition 1.24 Let (1, A1) and (2, A2) be two measurablespaces. A function f : 1 2 is said to be measurable if andonly if B A2, f 1(B) A1, where f 1(B ) = { 1 :f () B}.

    The sum, product, quotient (when the function in the denomi-nator is different from zero), maximum, minimum and compositionof two measurable functions is a measurable function. Moreover,

    if {f n}nIN is a sequence of measurable functions, thensupnIN

    {f n}, inf nIN

    {f n}, lim inf nIN

    f n , lim supnIN

    f n , limn f n ,

    assuming that they exist, are also measurable. If they are innite,we can consider IR instead of IR .

    The following result will gives us a tool useful to check if afunction f from (1, A1) into (2, A2) is measurable.

    Theorem 1.3 Let (1, A1) and (2, A2) be measure spaces andlet f : 1 2. Let C 2 P (2) be a collection of subsets thatgenerates A2, i.e, such that (C 2) = A2. Then f is measurable if and only of f 1(A) A1, A C 2.

    Corolary 1.1 Let (, A) be a measure space and f : IR afunction. Then f is measurable if and only if f 1( , a] A,a IR .

    The Lebesgue integral is restricted to measurable functions. Weare going to dene the integral by steps. We consider measurable

    ISABEL MOLINA 24

  • 7/25/2019 measure theory_2 (2).pdf

    25/54

    1.6. MEASURABILITY AND LEBESGUE INTEGRAL

    functions dened from a measurable space (, A) on the measur-able space (IR, B ), where B is the Borel -algebra. We consideralso a -nite measure .

    Denition 1.25 (Indicator function)Given S A, an indicator function, 1S : IR , gives value 1 toelements of S and 0 to the rest of elements:

    1S () =1, S ;0, / S.

    Next we dene simple functions, which are linear combinationsof indicator functions.

    Denition 1.26 (Simple function)Let (, A, ) be a measure space. Let ai be real numbers and{S i}ni=1 disjoint elements of A. A simple function has the form

    =n

    i=1

    a i1S i .

    Proposition 1.9 Indicators and simple functions are measur-able.

    Denition 1.27 (Lebesgue integral for simple functions)(i) The Lebesgue integral of a simple function with respect to

    a -nite measure is dened as

    d =n

    i=1

    a i (S i).

    (ii) The Lebesgue integral

    of with respect to over a subsetA A is

    A d = A 1A d =n

    i=1

    a i (A S i).

    ISABEL MOLINA 25

  • 7/25/2019 measure theory_2 (2).pdf

    26/54

    1.6. MEASURABILITY AND LEBESGUE INTEGRAL

    The next theorem says that for any measurable function f onIR , we can always nd a sequence of measurable functions thatconverge to f . This will allow the denition of the Lebesgue inte-gral.

    Theorem 1.4 Let f : IR . It holds:

    (a) f is a positive measurable function if and only if f = limn f n ,where {f n}nIN is an increasing sequence of non negative sim-ple functions.

    (b) f is a measurable function if and only if f = limn f n , where{f n}nIN is an increasing sequence of simple functions.

    Denition 1.28 (Lebesgue integral for non-negative func-tions)Let f be a non-negative measurable function dened over (, A, )and {f n}nIN be an increasing sequence of simple functions thatconverge pointwise to f (this sequence can be always constructed).The Lebesgue integral of f with respect to the -nite measure is dened as

    f d = limn f n d.The previous denition is correct due to the following uniquenesstheorem.

    Theorem 1.5 (Uniqueness of Lebesgue integral)Let f be a non-negative measurable function. Let {f n}nIN and{gn}nIN be two increasing sequences of non-negative simple func-tions that converge pointwise to f . Then

    limn f n d = limn gn d.

    ISABEL MOLINA 26

  • 7/25/2019 measure theory_2 (2).pdf

    27/54

    1.6. MEASURABILITY AND LEBESGUE INTEGRAL

    Denition 1.29 (Lebesgue integral for general functions)For a measurable function f that can take negative values, we canwrite it as the sum of two non-negative functions in the form:

    f = f + f ,

    wheref + () = max{f (), 0} is the positive part of f and f () =max{ f (), 0} is the negative part of f . If the integrals of f +

    and f are nite, then the Lebesgue integral of f is

    f d = f + d f d,assuming that at least one of the integrals on the right is nite.

    Denition 1.30 The Lebesgue integral of a measurable functionf over a subset A A is dened as

    A f d = f 1Ad .Denition 1.31 A function is said to be Lebesgue integrable if and only if

    f d < .Moreover, if instead of doing the decomposition f = f + f we do another decomposition, the result is the same.

    Theorem 1.6 Let f 1, f 2, g1, g2 be non-negative measurable func-tions and let f = f 1 f 2 = g1 g2. Then,

    f 1 d f 2 d = g1 d g2 d.Proposition 1.10 A measurable function f is Lebesgue inte-grable if and only if |f | is Lebesgue integrable.

    ISABEL MOLINA 27

  • 7/25/2019 measure theory_2 (2).pdf

    28/54

    1.6. MEASURABILITY AND LEBESGUE INTEGRAL

    Remark 1.2 The Lebesgue integral of a measurable function f dened from a measurable space (, A) on (IR, B ), over a Borelset I = (a, b) B , will also be expressed as

    I

    f d =

    b

    a

    f (x) d(x).

    Proposition 1.11 If a function f : IR IR + = [0, ) is Rie-mann integrable, then it is also Lebesgue integrable (with respectto the Lebesgue measure ) and the two integrals coincide, i.e.,

    A f (x) d(x) = A f (x) dx.Example 1.16

    The Dirichlet function 1lQ

    is not continuous inany point of its domain.

    This function is not Riemann integrable in [0, 1] because eachsubinterval will contain at least a rational number and andirrational number, since both sets are dense in IR . Then, eachsuperior sum is 1 and also the inmum of this superior sums,whereas each lowersum is 0, the same as the suppremum of all

    lowersums. Since the suppresum and the inmum are different,then the Riemann integral does not exist.

    However, it is Lebesgue integrable on [0, 1] with respect to theLebesgue measure , since by denition

    [0,1] 1lQ d = (lQ [0, 1]) = 0,since lQ is numerable.

    Proposition 1.12 (Properties of Lebesgue integral)

    (a) If P (A) = 0, then A f d = 0.ISABEL MOLINA 28

  • 7/25/2019 measure theory_2 (2).pdf

    29/54

    1.6. MEASURABILITY AND LEBESGUE INTEGRAL

    (b) If {An}nIN is a sequence of disjoint sets with A =

    n=1

    An ,

    then

    A f d =

    n=1 An f d .(c) If two measurable functions f and g are equal in all parts of

    their domain except for a subset with measure zero and f is Lebesgue integrable, then g is also Lebesgue integrable andtheir Lebesgue integral is the same, that is,

    If ({ : f () = g()}) = 0, then f d = g d.(d) Linearity: If f and g are Lebesgue integrable functions and a

    and b are real numbers, then

    A(a f + b g) d = a A f d + b A g d.(e) Monotonocity: If f and g are Lebesgue integrable and f < g ,

    then

    f d g d.Theorem 1.7 (Monotone convergence theorem)Consider a point-wise non-decreasing sequence of [0, ]-valuedmeasurable functions {f n}nIN (i.e., 0 f n(x) f n+1 (x), x IR , n > 1) with limn f n = f . Then,

    limn

    f n d =

    fd.

    Theorem 1.8 (Dominated convergence theorem)Consider a sequence of real-valued measurable functions {f n}nIN with limn f n = f . Assume that the sequence is dominated

    ISABEL MOLINA 29

  • 7/25/2019 measure theory_2 (2).pdf

    30/54

    1.6. MEASURABILITY AND LEBESGUE INTEGRAL

    by an integrable function g (i.e., |f n(x)| g(x), x IR , with

    g(x) d < ). Then,limn f n d = fd.

    Theorem 1.9 (H olders inequality)Let (, A, ) be a measure space. Let p, q IR such that p > 1and 1/p + 1/q = 1. Let f and g be measurable functions with|f | pand |g|q -integrable (i.e., |f |

    pd < and |g|q d < .).

    Then, |fg | is also -integrable (i.e., |fg |d < ) and |fg |d |f | p d1/p

    |g|q d1/q

    .

    The particular case with p = q = 2 is known as Schwartzs in-equality .

    Theorem 1.10 (Minkowskis inequality)Let (, A, ) be a measure space. Let p 1. Let f and g bemeasurable functions with|f | p and |g| p -integrable. Then, |f + g| p

    is also -integrable and

    |f + g| pd

    1/p

    |f | p d

    1/p

    + |g| p d

    1/p

    .

    Denition 1.32 ( L p space)Let (, A, ) be a measure space. Let p = 0. We dene the L p()space as the set of measurable functions f with |f | p -integrable,that is,

    L p

    () = L p

    (, A, ) = f : f measurable and |f | p

    d < .By the Minkowskis inequality, theL p() space with 1 p <

    is a vector space in IR , in which we can dene a norm and thecorresponding metric associated with that norm.

    ISABEL MOLINA 30

  • 7/25/2019 measure theory_2 (2).pdf

    31/54

    1.7. DISTRIBUTION FUNCTION

    Proposition 1.13 (Norm in L p space)The function : L p() IR that assigns to each function f L p() the value (f ) = |f |

    p d 1/p is a norm in the vectorspace L p() and it is denoted as

    f p = |f | p d

    1/p

    .

    Now we can introduce a metric in L p() as

    d(f, g ) = f g p.A vector space with a metric obtained from a norm is called ametric space .

    Problems

    1. Proof that if f and g are measurable, then max{f, g } andmin{f, g } are measurable.

    1.7 Distribution function

    We will consider the probability space (IR, B , P ). The distributionfunction will be a very important tool since it will summarize theprobabilities over Borel subsets.

    Denition 1.33 (Distribucion function)Let (IR, B , P ) a probability space. The distribution function (d.f.) associated with the probability function P is dened as

    F : IR [0, 1]x F (x) = P ( , x].We can also deneF ( ) = lim

    xF (x) and F (+ ) = lim

    x+ F (x).

    Then, the distribution function is F : [ , + ] [0, 1].

    ISABEL MOLINA 31

  • 7/25/2019 measure theory_2 (2).pdf

    32/54

    1.7. DISTRIBUTION FUNCTION

    Proposition 1.14 (Properties of the d.f.)

    (i) The d.f. is monotone increasing, that is,

    x < y F (x) F (y).

    (ii) F ( ) = 0 and F (+ ) = 1.

    (iii) F is right continuous for all x IR .

    Remark 1.3 If the d.f. was dened as F (x) = P ( , x), thenit would be left continuous.

    We can speak about a d.f. without reference to the probabilitymeasure P that is used to dene the d.f.

    Denition 1.34 A function F : [ , + ] [0, 1] is a d.f. if and only if satises Properties (i)-(iii).

    Now, given a d.f. F verifying (i)-(iii), is there a unique proba-bility function over (IR, B ) whose d.f. is exactly F ?

    Proposition 1.15 Let F : [ , + ] [0, 1] be a functionthat satises properties (i)-(iii). Then, there is a unique probability

    function P F dened over (IR, B ) such that the distribution functionassociated with P F is exactly F .

    Remark 1.4 Let a, b be real numbers witha < b. Then P (a, b] =F (b) F (a).

    Theorem 1.11 The set D(F ) of discontinuity points of F is -nite or countable.

    Denition 1.35 (Discrete d.f.)A d.f. F is discrete if there exist a nite or countable set{a1, . . . , a n , . . .} IR such that P F ({a i}) > 0, i and i=1 P F ({a i}) = 1, where P F is the probability function associated with F .

    ISABEL MOLINA 32

  • 7/25/2019 measure theory_2 (2).pdf

    33/54

    1.7. DISTRIBUTION FUNCTION

    Denition 1.36 (Probability mass function)The collection of numbers P F ({a1}), . . . , P F ({an}), . . ., such thatP F ({a i}) > 0, i and i=1 P F ({a i}) = 1, is called probability mass function .

    Remark 1.5 Observe thatF (x) = P F ( , x] =

    a i xP F ({a i}).

    Thus, F (x) is a step function and the length of the step at an isexactly the probability of an , that is,

    P F ({a i}) = P ( , an] P ( , an) = F (an) limxan

    F (x)

    = F (an) F (an ).

    Theorem 1.12 (Radon-Nykodym Theorem)Given a measurable space (, A), if a -nite measure on (, A)is absolutely continuous with respect to a -nite measure on(, A), then there is a measurable function f : [0, ), suchthat for any measurable set A,

    (A) = A f d.The function f satisfying the above equality is uniquely dened

    up to a set with measure zero, that is, if g is another functionwhich satises the same property, then f = g except in a set withmeasure zero. f is commonly written d/d and is called the

    RadonNikodym derivative . The choice of notation and the nameof the function reects the fact that the function is analogous toa derivative in calculus in the sense that it describes the rate of change of density of one measure with respect to another.

    ISABEL MOLINA 33

  • 7/25/2019 measure theory_2 (2).pdf

    34/54

    1.7. DISTRIBUTION FUNCTION

    Theorem 1.13 A nite measure on Borel subsets of the realline is absolutely continuous with respect to Lebesgue measure if and only if the point function

    F (x) = (( , x])

    is a locally and absolutely continuous real function.If is absolutely continuous, then the Radon-Nikodym deriva-

    tive of is equal almost everywhere to the derivative of F . Thus,the absolutely continuous measures on IR n are precisely those thathave densities; as a special case, the absolutely continuous d.f.sare precisely the ones that have probability density functions.

    Denition 1.37 (Absolutely continuous d.f.)A d.f. is absolutely continuous if and only if there is a non-negative Lebesgue integrable function f such that

    x IR, F (x) = ( ,x] fd,where is the Lebesgue measure. The function f is called proba-

    bility density function , p.d.f.Proposition 1.16 Let f : IR IR + = [0, ) be a Riemannintegrable function such that

    + f (t)dt = 1. Then, F (x) =

    x

    f (t)dt is an absolutely continuous d.f. whose associated p.d.f is f .

    All the p.d.f.s that we are going to see are Riemann integrable.

    Proposition 1.17 Let F be an absolutely continuous d.f. Thenit holds:

    (a) F is continuous.

    ISABEL MOLINA 34

  • 7/25/2019 measure theory_2 (2).pdf

    35/54

    1.8. RANDOM VARIABLES

    (b) If f is continuous in the point x, then F is differentiable in xand F (x) = f (x).

    (c) P F ({x}) = 0, x IR .

    (d) P F (a, b) = P F (a, b] = P F [a, b) = P F [a, b] =

    b

    a f (t)dt, a, b

    with a < b.(e) P F (B ) = B f (t)dt, B B .Remark 1.6 Note that:(1) Not all continuous d.f.s are absolutely continuous.

    (2) Another type of d.fs are those called singular d.f.s, which are

    continuous. We will not study them.Proposition 1.18 Let F 1, F 2 be d.f.s and [0, 1]. Then,F = F 1 + (1 )F 2 is a d.f.

    Denition 1.38 (Mixed d.f.)A d.f. is said to be mixed if and only if there is a discrete d.f.F 1, an absolutely continuous d.f. F 2 and [0, 1] such that

    F = F 1 + (1 )F 2.

    1.8 Random variables

    A random variable transforms the elements of the sample space into real numbers (elements from IR ), preserving the -algebrastructure of the initial events.

    Denition 1.39 Let (, A) be a measurable space. Consideralso the measurable space (IR, B ), where B is the Borel -algebra

    ISABEL MOLINA 35

  • 7/25/2019 measure theory_2 (2).pdf

    36/54

    1.8. RANDOM VARIABLES

    over IR . A random variable (r.v.) is a function X : IR thatis measurable, that is,

    B B , X 1(B) A,

    where X 1(B ) := { : X () B}.

    Remark 1.7 Observe that:

    (a) a r.v. X is simply a measurable function in IR . The namerandom variable stems from the fact the result of the randomexperiment is random, and then the observed value of the r.v., X (), is also random.

    (b) the measurability property of the r.v. will allow transferringprobabilities of events A A to probabilities of Borel setsI B , where I is the image of A through X .

    Example 1.17 For the experiments introduced in Example 1.8,the following are random variables:

    (a) For the measurable space (, A) with sample space = {head , tailand -algebra A = {, {head }, {tail }, }, a randomvariable is:

    X () = 1 if = head ,0 if = tail ;This variable counts the number of heads when tossing a coin.In fact, it is a random variable, since for any event from thenal space B B , we have

    If 0, 1 B, then X 1

    (B ) = A. If 0 B but 1 / B , then X 1(B ) = {tail} A. If 1 B but 0 / B , then X 1(B ) = {head} A. If 0, 1 / B, then X 1(B ) = A.

    ISABEL MOLINA 36

  • 7/25/2019 measure theory_2 (2).pdf

    37/54

    1.8. RANDOM VARIABLES

    (b) For the measurable space (, P ()), where = IN {0},since IR , a trivial r.v. is X 1() = . It is a r.v. since forany B B ,

    X 11 (B ) = { IN {0} : X 1() = B}

    is the set of natural numbers (including zero) that are con-tained in B. But any countable set of natural numbers belongsto P (), since this -algebra contains all subsets of IN {0}.Therefore, X 1=Number of traffic accidents in a minute inSpain is a r.v.Another r.v. could be

    X 2() = 1 if IN ;0 if = 0.

    Again, X 2 is a r.v. since for each B B ,

    If 0, 1 B, then X 12 (B ) = P (). If 1 B but 0 / B , then X 12 (B ) = IN P (). If 0 B but 1 / B , then X 12 (B ) = {0} P ().

    If 0, 1 / B, then X 12 (B ) = P ().

    (c) As in previous example, for the measurable space (, B ),where = [m, ), a possible r.v. is X 1() = , since foreach B B , we have

    X 11 (B ) = { [a, ) : X 1() = B} = [a, )B B .

    Another r.v. would be the indicator of less than 65 kgs., givenby

    X 2() =1 if 65,0 if < 65.

    ISABEL MOLINA 37

  • 7/25/2019 measure theory_2 (2).pdf

    38/54

    1.8. RANDOM VARIABLES

    Theorem 1.14 Any function X from (IR, B ) in (IR, B ) that iscontinuous is a r.v.

    The probability of an event from IR induced by a r.v. is goingto be dened as the probability of the original events from ,that is, the probability of a r.v. preserves the probabilities of theoriginal measurable space. This denition requires the measura-bility property, since the original events must be in the initial-algebra so that they have a probability.

    Denition 1.40 (Probability induced by a r.v.)Let (, A, P ) be a measure space and letB be the Borel -algebraover IR . The probability induced by the r.v. X is a function

    P X : B IR , dened asP X (B ) = P (X 1(B )), B B .

    Theorem 1.15 The probability induced by a r.v. X is a proba-bility function in (IR, B ).

    Example 1.18 For the probability function P 1 dened in Exam-

    ple 1.10 and the r.v. dened in Example 1.17 (a), the probabilityinduced by a r.v. X is described as follows. Let B B .

    If 0, 1 B , then P 1X (B ) = P 1(X 1(B )) = P 1() = 1.

    If 0 B but 1 / B , then P 1X (B ) = P 1(X 1(B)) = P 1({tail}) =1/ 2.

    If 1 B but 0 / B , then P 1X (B ) = P 1(X 1(B)) = P 1({head}) =

    1/ 2. If 0, 1 / B , then P 1X (B ) = P 1(X 1(B )) = P 1() = 0.

    ISABEL MOLINA 38

  • 7/25/2019 measure theory_2 (2).pdf

    39/54

    1.8. RANDOM VARIABLES

    Summarizing, the probability induced by X is

    P 1X (B ) =0, if 0, 1 / B;1/ 2, if 0 or 1 are in B;1, if 0, 1 B.

    In particular, we obtain the following probabilities P 1X ({0}) = P 1(X = 0) = 1/ 2.

    P 1X (( , 0]) = P 1(X 0) = 1/ 2.

    P 1X ((0, 1]) = P 1(0 < X 1) = 1/ 2.

    Example 1.19 For the probability function P introduced in Ex-

    ample 1.11 and the r.v. X 1 dened in Example 1.17 (b), the prob-ability induced by the r.v. X 1 is described as follows. Let B B such that IN B = {a1, a2, . . . , a p}.

    P X 1 (B ) = P (X 11 (B )) = P 1((IN {0})B ) = P ({a1, a2, . . . , a p}) = p

    i=1

    P

    Denition 1.41 (Degenerate r.v.)A r.v. X is said to be degenerate at a point c IR if and only if

    P (X = c) = P X ({c}) = 1.

    Since P X is a probability function, there is a distribution func-tion that summarizes its values.

    Denition 1.42 (Distribucion function)

    The distribucion function (d.f.) of a r.v. X is dened as thefunction F X : IR [0, 1] with

    F X (x) = P X ( , x], x IR.

    ISABEL MOLINA 39

  • 7/25/2019 measure theory_2 (2).pdf

    40/54

    1.8. RANDOM VARIABLES

    Denition 1.43 A r.v. X is said to be discrete (absolutely con-tinuous ) if and only if its d.f. F X is discrete (absolutely continu-ous).

    Remark 1.8 It holds that:

    (a) a discrete r.v. takes a nite or countable number of values.(b) a continuous r.v. takes innite number of values, and the

    probability of single values are zero.

    Denition 1.44 (Support of a r.v.)

    (a) If X is a discrete r.v., we dene the support of X as

    DX := {x IR : P X {x} > 0}.

    (b) If X continuous, the support is dened as

    DX := {x IR : f X (x) > 0}.

    Observe that for a discrete r.v.,xDX

    P X {x} = 1 and DX is nite

    or countable.Example 1.20 From the random variables introduced in Exam-ple 1.17, those dened in (a) and (b) are discrete, along with X 2form (c).

    For a discrete r.v. X , we can dene a function the gives theprobabilities of single points.

    Denition 1.45 The probability mass function (p.m.f) of a dis-crete r.v. X is the function pX : IR IR such that

    pX (x) = P X ({x}), x IR.

    ISABEL MOLINA 40

  • 7/25/2019 measure theory_2 (2).pdf

    41/54

    1.8. RANDOM VARIABLES

    We will also use the notation p(X = x) = pX (x).The probability function induced by a discrete r.v. X , P X , is

    completely determined by the distribution function F X or by themass function pX . Thus, in the following, when we speak aboutthe distribution of a discrete r.v. X , we could be referring either

    to the probability function induced by X , P X , the distributionfunction F X , or the mass function pX .

    Example 1.21 The r.v. X 1 : Weight of a randomly selectedSpanish woman aged within 20 and 40, dened in Example 1.17(c), is continuous.

    Denition 1.46 The probability density function (p.d.f.) of X is a function f X : IR IR dened as

    f X (x) = 0 if x S ;

    F X (x) if x / S.

    It is named probability density function of x because it gives thedensity of probability of an innitesimal interval centered in x.

    The same as in the discrete case, the probabilities of a contin-

    uous r.v. X are determined either by P X , F X or the p.d.f. f X .Again, the distribution of a r.v., could be referring to any of these functions.

    Random variables, as measurable functions, inherit all the prop-erties of measurable functions. Furthermore, we will be able to cal-culate Lebesgue integrals of measurable functions of r.v.s using asmeasure their induced probability functions. This will be possible

    due to the following theorem.

    Theorem 1.16 (Theorem of change of integration space)Let X be a r.v. from (, A, P ) in (IR, B ) an g another r.v. from

    ISABEL MOLINA 41

  • 7/25/2019 measure theory_2 (2).pdf

    42/54

    1.8. RANDOM VARIABLES

    (IR, B ) in (IR, B ). Then,

    (g X ) dP = IR g dP X .Remark 1.9 Let F X be the d.f. associated with the probabilitymeasure P X . The integral

    IR g dP X = +

    g(x) dP X (x)

    will also be denoted as

    IR g dF x = +

    g(x) dF X (x).

    Proposition 1.19 If X is an absolutely continuous r.v. with d.f.F X and p.d.f. with respect to the Lebesgue measuref X = dF X /dand if g is any function for which IR |g| dP X < , then IR g dP X = IR g f X d.

    In the following we will see how to calculate these integrals forthe most interesting cases of d.f. F X .(a) F X discrete: The probability is concentrated in a nite or nu-

    merable setDX = {a1, . . . , a n , . . .}, with probabilitiesP {a1}, . . . , P {Then, using properties (a) and (b) of the Lebesgue integral,

    IR g dP X = DX g dP X + DcX g(x) dP X = DX g dP X =

    n=1 {an } g dP X

    =

    n=1

    g(an)P {an}.

    ISABEL MOLINA 42

  • 7/25/2019 measure theory_2 (2).pdf

    43/54

    1.8. RANDOM VARIABLES

    (b) F X absolutely continuous: In this case,

    IR g dP X = IR g f X dand if g f X is Riemann integrable, then it is also Lebesgue

    integrable and the two integrals coincide, i.e.,

    IR g dP X = IR g f X d = +

    g(x)f X (x)dx.

    Denition 1.47 The expectation of the r.v. X is dened as

    = E (X ) =

    IR

    X dP.

    Corolary 1.2 The expectation of X can be calculated as

    E (X ) = IR x dF X (x),and provided that X is absolutely continuous with p.d.f. f X (x),then

    E (X ) =

    IR

    xf X (x) dx.

    Denition 1.48 The k-th moment of X with respect to a IRis dened as

    k,a = E (gk,a X ),where gk,a (x) = (x a)k, provided that the expectation exists.

    Remark 1.10 It holds that

    k,a = IR gk,a X dP = IR gk,a (x) dF X (x) = IR (x a)k dF X (x).

    Observe that for the calculation of the moments of a r.v. X weonly require is its d.f.

    ISABEL MOLINA 43

  • 7/25/2019 measure theory_2 (2).pdf

    44/54

    1.8. RANDOM VARIABLES

    Denition 1.49 The k-th moment of X with respect to themean is

    k := k, = IR (x )k dF X (x).In particular, second moment with respect to the mean is called

    variance ,

    2X = V (X ) = 2 = IR (x )2 dF X (x).The standard deviation is X = V (X ).Denition 1.50 The k-th moment of X with respect to theorigin is

    k := k,0 = E (X n) = IR xk dF X (x).Proposition 1.20 It holds

    k =k

    i=0

    ( 1)k i ki k i i.

    Lemma 1.1 If k = E (X k) exists and is nite, then there existsm and is nite, m k.

    One way of obtaining information about the distribution of arandom variable is to calculate the probability of intervals of thetype (E (X ) , E (X ) + ). If we do not know the theoreticaldistribution of the random variable but we do know its expectation

    and variance, the Tchebychevs inequality gives a lower bound of this probability. This inequality is a straightforward consequenceof the following one.

    ISABEL MOLINA 44

  • 7/25/2019 measure theory_2 (2).pdf

    45/54

    1.9. THE CHARACTERISTIC FUNCTION

    Theorem 1.17 (Markovs inequality)Let X be a r.v. from (, A, P ) in (IR, B ) and g be a non-negativer.v. from (IR, B , P X ) in (IR, B ) and let k > 0. Then, it holds

    P ({ : g(X ()) k}) E [g(X )]

    k .

    Theorem 1.18 (Tchebychevs inequality)Let X be a r.v. with nite mean and nite standard deviation. Then

    P ({ : |X () | k}) 1k2

    .

    Corolary 1.3 Let X be a r.v. with mean and standard devia-tion = 0. Then, P (X = ) = 1.

    Problems

    1. Prove Markovs inequality.

    2. Prove Tchebychevs inequality.

    3. Prove Corollary 1.3.

    1.9 The characteristic function

    We are going to dene the characteristic function associated with adistribution function (or with a random variable). This function ispretty useful due to its close relation with the d.f. and the momentsof a r.v.

    Denition 1.51 (Characteristic function)Let X be a r.v. dened from the measure space (, A, P ) into

    ISABEL MOLINA 45

  • 7/25/2019 measure theory_2 (2).pdf

    46/54

    1.9. THE CHARACTERISTIC FUNCTION

    (IR, B ). The characteristic function (c.f.) of X is

    (t) = E eitX = eitX dP, t IR.Remark 1.11 (The c.f. is determined by the d.f.)

    The functionY t = gt(X ) = eitX

    is a composition of two measurablefunctions, X () and gt(x), that is, Y t = gt X . Then, Y t ismeasurable and by the Theorem of change of integration space,the c.f. is calculated as

    (t) = eitX dP = (gt X )dP =

    gtdP X =

    IR

    gt(x)dF (x) =

    IR

    eitx dF (x),

    where P X is the probability induced by X and F is the d.f. associ-ated with P X . Observe that the only thing that we need to obtain is the d.f., F , that is, is uniquely determined by F .

    Remark 1.12 Observe that:

    IR e

    itx dF (x) =

    IR cos(tx )dF (x) + i

    IR sin(tx )dF (x).

    Since | cos(tx )| 1 and | sin(tx )| 1, then it holds:

    IR | cos(tx )| 1, IR | sin(tx )| 1and therefore, | cos(tx )| and | sin(tx )| are integrable. Thismeans that (t) exists t IR .

    Many properties of the integral of real functions can be trans-lated to the integral of the complex functioneitx . In practicallyall cases, the result is a straightforward consequence of the factthat to integrate a complex values function is equivalent to in-tegrate separately the real and imaginary parts.

    ISABEL MOLINA 46

  • 7/25/2019 measure theory_2 (2).pdf

    47/54

    1.9. THE CHARACTERISTIC FUNCTION

    Proposition 1.21 (Properties of the c.f.)Let (t) be the characteristic function associated with the d.f. F .

    Then

    (a) (0) = 1 ( is non-vanishing at t = 0);

    (b) |(t)| 1 ( is bounded);(c) ( t) = (t), t IR , where (t) denotes the conjugate

    complex of (t);

    (d) (t) is uniformly continuous in IR , that is,

    limh0

    |(t + h) (t)| = 0, t IR.

    Theorem 1.19 (c.f. of a linear transformation)Let X be a r.v. with c.f. X (t). Then, the c.f. of Y = aX + b,where a, b IR , is Y (t) = eitbX (at ).

    Example 1.22 (c.f. for some r.v.s)Here we give the c.f. of some well known random variables:

    (i) For the Binomial distribution, Bin(n, p), the c.f. is given by

    (t) = (q + peit )n .

    (ii) For the Poisson distribution, Pois(), the c.f. is given by

    (t) = exp (eit 1) .

    (iii) For the Normal distribution, N (, 2), the c.f. is given by

    (t) = exp it 2t2

    2.

    Lemma 1.2 x IR , |eix 1| | x|.

    ISABEL MOLINA 47

  • 7/25/2019 measure theory_2 (2).pdf

    48/54

    1.9. THE CHARACTERISTIC FUNCTION

    Remark 1.13 (Proposition 1.21 does not determine ac.f.)If (t) is a c.f., then Properties (a)-(d) in Proposition 1.21 holdbut the reciprocal is not true, see Example 1.23.

    Theorem 1.20 (Moments are determined by the c.f.)If the n-th moment of F , n = IR xndF (x), is nite, then

    (a) The n-th derivative of (t) at t = 0 exists and satisesn)(0) =in n .

    (b) n)(t) = in IR eitx xndF (x).

    Corolary 1.4 (Series expansion of the c.f.)

    If n = E (X n) exists n IN , then it holds that

    X (t) =

    n=0

    n(it )n

    n! , t ( r, r ),

    where ( r, r ) is the radius of convergence of the series.

    Example 1.23 (Proposition 1.21 does not determine a

    c.f.)Consider the function (t) = 11+ t4 , t IR . This function veriesproperties (a)-(d) in Proposition 1.21. However, observe that therst derivative evaluated at zero is

    (0) =

    4t3

    (1 + t4)2 t=0= 0.

    The second derivative at zero is

    (0) = 12t2(1 + t4)2 + 4t3 2(1 + t4) 4t3

    (1 + t4)4 t=0= 0.

    ISABEL MOLINA 48

  • 7/25/2019 measure theory_2 (2).pdf

    49/54

    1.9. THE CHARACTERISTIC FUNCTION

    Then, if (t) is the c.f. of a r.v. X , the mean and variance areequal to

    E (X ) = 1 = (0)

    i = 0, V (X ) = 2 (E (X ))2 = 2 =

    (0)i2

    = 0.

    But a random variable with mean and variance equal to zero is adegenerate variable at zero, that is, P (X = 0) = 1, and then itsc.f. is

    (t) = E eit 0 = 1, t IR,which is a contradiction.

    We have seen already that the d.f. determines the c.f. Thefollowing theorem gives an expression of the d.f. in terms of thec.f. for an interval. This result will imply that the c.f. determinesa unique d.f.

    Theorem 1.21 (Inversion Theorem)Let (t) be the c.f. corresponding to the d.f. F (x). Let a, b betwo points of continuity of F , that is, a, b C (F ). Then,

    F (b) F (a) = 12 limT

    T

    T

    e ita e itb

    it (t)dt.

    As a consequence of the Inversion Theorem, we obtain the fol-lowing result.

    Theorem 1.22 (The c.f. determines a unique d.f.)If (t) is the c.f. of a d.f. F , then it is not the c.f. of any other d.f.

    Remark 1.14 (c.f. for an absolutely continuous r.v.)If F is absolutely continuous with p.d.f. f , then the c.f. is

    (t) = E eitX = IR eitx dF (x) = IR eitx f (x)dx.ISABEL MOLINA 49

  • 7/25/2019 measure theory_2 (2).pdf

    50/54

    1.9. THE CHARACTERISTIC FUNCTION

    We have seen that for absolutely continuous F , the c.f. (t) canbe expressed in terms of the p.d.f. f . However, it is possible toexpress the p.d.f. f in terms of the c.f. (t)? The next theorem isthe answer.

    Theorem 1.23 (Fourier transform of the c.f.)If F is absolutely continuous and (t) is Riemann integrable in IR ,that is,

    |(t)| dt < , then (t) is the c.f. corresponding to

    an absolutely continuous r.v. with p.d.f. given by

    f (x) = F (x) = 12 IR e itx(t)dt,

    where the last term is called the Fourier transform of .

    In the following, we are going to study the c.f. of random vari-ables that share the probability symmetrically in IR + and IR .

    Denition 1.52 (Symmetric r.v.)A r.v. X is symmetric if and only if X d= X , that is, iff F X (x) =F X (x), x IR .

    Remark 1.15 (Symmetric r.v.)Since is determined by F , X is symmetric iff X (t) = X (t),t IR .

    Corolary 1.5 (Another characterization of a symmet-ric r.v.)X is symmetric iff F X ( x) = 1 F X (x ), where F X (x ) =P (X < x ).

    Corolary 1.6 (Another characterization of a symmet-ric r.v.)X is symmetric iff X (t) = X ( t), t IR .

    ISABEL MOLINA 50

  • 7/25/2019 measure theory_2 (2).pdf

    51/54

    1.9. THE CHARACTERISTIC FUNCTION

    Corolary 1.7 (Properties of a symmetric r.v.)Let X be a symmetric r.v. Then,

    (a) If F X is absolutely continuous, then f X (x) = f X ( x), x IR .

    (b) If F X is discrete, then P X (x) = P X ( x), x IR .Theorem 1.24 (c.f. of a symmetric r.v.)The c.f. of the d.f. F is real iff F is symmetric.

    Remark 1.16 We know that (t), t IR determines com-pletely F (x), x IR . However, if we only know (t) for t ina nite interval, then do we know completely F (x), x IR ? Theanswer is no, since we can nd two different d.f.s with the same c.f.in a nite interval, see Example 1.24.

    Example 1.24 (The c.f. in a nite interval does notdetermine the d.f.)Consider the r.v. X taking the values (2n + 1), n = 0, 1, 2, ,with probabilities

    P (X = 2n+1) = P (X = (2n+1)) = 42(2n + 1) 2

    , n = 0, 1, 2,

    Consider also the r.v. Y taking values 0,(4n +2), n = 0, 1, 2, . . .,with probabilities

    P (Y = 0) = 1/ 2

    P (Y = 4n + 2) = P (X = (4n + 2)) = 2

    2(2n + 1) 2, n = 0, 1, 2,

    ISABEL MOLINA 51

  • 7/25/2019 measure theory_2 (2).pdf

    52/54

    1.9. THE CHARACTERISTIC FUNCTION

    Using the formulas

    n=0

    1(2n + 1) 2

    = 2

    8 ;

    8

    2

    n=0

    cos(2n + 1) t

    (2n + 1) 2 = 1

    2|t |

    , |t | ,

    prove:

    (a) P X is a probability function;

    (b) P Y is a probability function;

    (c) X (t) = Y (t), for |t | / 2.

    Remark 1.17 From the series expansion of the c.f. in Corollary1.4, one is tempted to conclude that the c.f., and therefore alsothe d.f., of a r.v. is completely determined by all of its moments,provided that they exist. This is false, see Example 1.25.

    Example 1.25 (Moments do not always determine thec.f.)

    For a [0, 1], consider the p.d.f. denes by

    f a(x) = 124

    e x1 / 4

    (1 a sin(x1/ 4), x 0.

    Using the following formulas:

    0xne x

    1 / 4sin(x1/ 4)dx = 0, n IN {0};

    0xne x

    1/

    4

    = 4(4n + 3)!n IN {0},

    prove:

    (a) f a is a p.d.f. a [0, 1].

    ISABEL MOLINA 52

  • 7/25/2019 measure theory_2 (2).pdf

    53/54

    1.9. THE CHARACTERISTIC FUNCTION

    (b) The moments n are the same a [0, 1].

    (c) The series n=0 n(it )n

    n! diverges for all t = 0.

    Denition 1.53 (Moment generating function)We dene the moment generating function (m.g.f.) of a r.v. X as

    M (t) = E etX , t ( r, r ) IR, r > 0,

    assuming that there exists r > 0 such that the integral exists forall t ( r, r ). If such r > 0 does not exist, then we say that them.g.f. of X does not exist.

    Remark 1.18 Remember that the c.f. always exists unlike them.g.f.

    Proposition 1.22 (Properties of the m.g.f.)If there exists r > 0 such that the series

    n=0

    (tx )n

    n!

    is uniformly convergent in ( r, r ), where r is called the radius of

    convergence of the series, then it holds that(a) The n-th moment n exists and if nite, n IN ;

    (b) The n-th derivative of M (t), evaluated at t = 0, exists and itsatises M n)(0) = n , n IN ;

    (c) M (t) can be expressed as

    M (t) =

    n=0

    nn! tn , t ( r, r ).

    Remark 1.19 Under the assumption of Proposition 1.22, the mo-ments {n}n=0 determine the d.f. F .

    ISABEL MOLINA 53

  • 7/25/2019 measure theory_2 (2).pdf

    54/54

    1.9. THE CHARACTERISTIC FUNCTION

    Remark 1.20 It might happen that M (t) exists for t outside( r, r ), but that it cannot be expressed as the series n=0

    nn! t

    n .