Applied Probability and Stochastic Process

ECE6111: APPLIED PROBABILITY AND STOCHASTIC PROCESSES Yaakov Bar-Shalom University of Connecticut Dept. of Elec. & Computer Engineering Storrs, CT 06269-2157 These viewgraphs are based on the text A. Papoulis and S. U. Pillai Probability, Random Variables and Stochastic Processes, McGraw-Hill, 2002 and the viewgraphs of S. U. Pillai (www.mhhe.com/papoulis), with permission 110801 ECE6111L01 [Ch.1; Ch2] 1/ 29


  • ECE6111: APPLIED PROBABILITY AND STOCHASTIC PROCESSES

    Yaakov Bar-Shalom

    University of Connecticut, Dept. of Elec. & Computer Engineering

    Storrs, CT 06269-2157

    These viewgraphs are based on the text

    A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes,

    McGraw-Hill, 2002, and the viewgraphs of S. U. Pillai (www.mhhe.com/papoulis), with permission

    110801 ECE6111L01 [Ch.1; Ch2] 1/ 29

  • Basics

    Probability theory deals with the study of random phenomena for modeling uncertainty.
    Probability demonstrates both our knowledge as well as our lack of knowledge.

    Goal: to present methods that deal with uncertainty in engineering systems (some are inherently random, some we assume as random because of modeling limitations).

    Probabilistic Experiment

    The notion of an experiment assumes a set of repeatable conditions that allow any number of identical repetitions.
    When a probabilistic experiment is performed under repeatable conditions, certain elementary outcomes $\xi_i$ occur in different but completely uncertain (probabilistic) ways.
    We can assign, in various ways, a nonnegative number $P(\xi_i)$ as the probability of the event that outcome $\xi_i$ occurs.

  • Laplace's Classical Definition:
    The probability of an event A is defined a priori, without actual experimentation, as

    $$P(A) = \frac{\text{Number of outcomes favorable to } A}{\text{Total number of possible outcomes}} \qquad (1)$$

    provided all these outcomes are equally likely.

    Example: Consider a box with n white and m red balls. In this case, there are two elementary outcomes: white ball or red ball. The probability of selecting a white ball is $\frac{n}{n+m}$.

    Relative Frequency Definition:
    The probability of an event A is defined as

    $$P(A) = \lim_{n \to \infty} \frac{n_A}{n} \qquad (2)$$

    where $n_A$ is the number of occurrences of A and n is the total number of trials.
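
The two definitions can be compared numerically. The following is a minimal sketch, assuming a box with 3 white and 7 red balls and draws with replacement; the function name `relative_frequency` and the seed are illustrative choices, not part of the viewgraphs. It estimates the relative frequency in (2) for a growing number of trials and prints it next to the classical value from (1).

```python
import random

def relative_frequency(n_white, m_red, trials, seed=0):
    """Estimate P(white) as n_A / n, drawing one ball per trial with replacement."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    box = ["white"] * n_white + ["red"] * m_red
    n_a = sum(1 for _ in range(trials) if rng.choice(box) == "white")
    return n_a / trials

# Classical value from equation (1) with n = 3 white and m = 7 red balls.
classical = 3 / (3 + 7)

for trials in (100, 10_000, 100_000):
    print(trials, relative_frequency(3, 7, trials), classical)
```

The estimate fluctuates for small n and settles near the classical value as n grows, which is the behavior the limit in (2) describes.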

  • Axiomatic approach to probability (Kolmogorov):
    developed through a set of axioms (given below), it is generally recognized as superior to the above definitions.

    The totality of all outcomes $\xi_i$, known a priori, constitutes a set $\Omega$, the set of all experimental outcomes (space of outcomes)

    $$\Omega = \{\xi_1, \xi_2, \ldots, \xi_i, \ldots\} \qquad (3)$$

    $\Omega$ has subsets A, B, C, etc., called events.
    If A is a subset of $\Omega$, then $\xi \in A$ implies $\xi \in \Omega$.
    From A and B, we can generate other related subsets (events).

    $$A \cup B = \{\xi \mid \xi \in A \text{ or } \xi \in B\}$$

    $$A \cap B = \{\xi \mid \xi \in A \text{ and } \xi \in B\}$$

    and

    $$\bar{A} = \{\xi \mid \xi \notin A\} \qquad (4)$$

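  • [Fig. 1.1: Venn diagrams showing $A \cup B$, $A \cap B$, and $\bar{A}$.]

    If $A \cap B = \emptyset$, the empty set, then A and B are said to be mutually exclusive (m.e.).
    A partition of $\Omega$ is a collection of mutually exclusive and exhaustive (m.e.e.) subsets of $\Omega$ such that

    $$A_i \cap A_j = \emptyset \ (i \neq j) \quad \text{and} \quad \bigcup_{i=1}^{n} A_i = \Omega \qquad (5)$$

    [Fig. 1.2: The concepts of m.e. ($A \cap B = \emptyset$) and m.e.e. (a partition $A_1, A_2, \ldots, A_n$ with members $A_i$, $A_j$).]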

  • De Morgan's theorem

    $$\overline{A \cup B} = \bar{A} \cap \bar{B} \quad \text{and} \quad \overline{A \cap B} = \bar{A} \cup \bar{B} \qquad (6)$$

    [Fig. 1.3: Venn diagrams for De Morgan's theorem.]

    It is meaningful to talk about the subsets of $\Omega$ as events, for which we want a mechanism to compute their probabilities.

    Example 1.1:
    Consider the experiment where two coins are simultaneously tossed. The various elementary events are

    $$\xi_1 = \{H, H\}, \quad \xi_2 = \{H, T\}, \quad \xi_3 = \{T, H\}, \quad \xi_4 = \{T, T\}$$

    and $\Omega = \{\xi_1, \xi_2, \xi_3, \xi_4\}$.
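
The set operations in (4) and De Morgan's theorem (6) can be checked directly on the two-coin sample space of Example 1.1. This is a small sketch using Python's built-in sets; the events A and B chosen here (first coin heads, second coin heads) are hypothetical and serve only as an illustration.

```python
# Sample space of Example 1.1; each outcome is written as a string such as "HT".
omega = frozenset({"HH", "HT", "TH", "TT"})
A = frozenset({"HH", "HT"})  # hypothetical event: first coin shows heads
B = frozenset({"HH", "TH"})  # hypothetical event: second coin shows heads

def complement(event):
    """Complement relative to the sample space omega, as in equation (4)."""
    return omega - event

union = A | B          # A or B
intersection = A & B   # A and B

# De Morgan's theorem, equation (6), checked on this example.
de_morgan_1 = complement(A | B) == complement(A) & complement(B)
de_morgan_2 = complement(A & B) == complement(A) | complement(B)

print(sorted(union), sorted(intersection), sorted(complement(A)))
print(de_morgan_1, de_morgan_2)
```

A check on one example is of course not a proof; it only illustrates what the identities say.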

  • The subset $A = \{\xi_1, \xi_2, \xi_3\}$ is the same as "Heads has occurred at least once" and qualifies as an event.
    Suppose two subsets A and B are both events, then consider the questions

    "Does an outcome belong to A or B, i.e., to $A \cup B = A + B$?"
    "Does an outcome belong to A and B, i.e., to $A \cap B = AB$?"
    "Does an outcome fall outside A?"

    Thus the sets $A \cup B$, $A \cap B$, $\bar{A}$, $\bar{B}$, etc., also qualify as events. We shall formalize this using the notion of a Field.

    Field: A collection of subsets of a nonempty set $\Omega$ forms a field F if

    i) $\Omega \in F$
    ii) If $A \in F$, then $\bar{A} \in F$ (7)
    iii) If $A \in F$ and $B \in F$, then $A \cup B \in F$.

    i.e., it is closed under set operations.

  • Using (i) - (iii), it is easy to show that $A \cap B$, $\bar{A} \cap B$, etc., also belong to F.

    For example, from (ii) we have $\bar{A} \in F$, $\bar{B} \in F$ and using (iii) this gives $\bar{A} \cup \bar{B} \in F$.
    Applying (ii) again we get $\overline{\bar{A} \cup \bar{B}} = A \cap B \in F$, where we have used De Morgan's theorem in (6).

    Thus if $A \in F$ and $B \in F$ then

    $$F = \{\Omega, A, B, \bar{A}, \bar{B}, A \cup B, A \cap B, \bar{A} \cup B, \ldots\} \qquad (8)$$

    From here onwards, we shall reserve the term event only to members of F.
    Assuming that the probabilities (measures) $p_i = P(\xi_i)$ of the elementary outcomes $\xi_i$ of $\Omega$ are a priori defined, how does one obtain probabilities (measures) of more complicated events such as A, B, $AB = A \cap B$, etc., above?
    The three axioms of probability defined next can be used to achieve this goal.
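
To make the closure idea in (7) and (8) concrete, here is a minimal sketch that starts from two generating events and repeatedly applies complement and union until nothing new appears. The function name `generate_field` and the choice of A and B on the two-coin sample space are assumptions made for illustration only.

```python
from itertools import combinations

def generate_field(omega, generators):
    """Close a collection of subsets of omega under complement and pairwise union."""
    omega = frozenset(omega)
    field = {omega} | {frozenset(g) for g in generators}  # condition (i) and the generators
    changed = True
    while changed:
        changed = False
        new_sets = {omega - s for s in field}                   # condition (ii)
        new_sets |= {s | t for s, t in combinations(field, 2)}  # condition (iii)
        if not new_sets <= field:
            field |= new_sets
            changed = True
    return field

omega = {"HH", "HT", "TH", "TT"}
A = {"HH", "HT"}  # hypothetical event: first coin shows heads
B = {"HH", "TH"}  # hypothetical event: second coin shows heads

F = generate_field(omega, [A, B])
print(len(F))
for event in sorted(F, key=lambda s: (len(s), sorted(s))):
    print(sorted(event))
```

For this particular choice of A and B the four intersections of A or its complement with B or its complement are the four single outcomes, so the generated field contains all 16 subsets of the sample space, including the empty set and the intersection of A and B.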

  • The Axioms of Probability
    For any event A, we assign a number P(A), called the probability of the event A. This number satisfies the following three conditions that are the axioms of probability.

    Axiom 1: $P(A) \ge 0$ (Probability is a nonnegative number)
    Axiom 2: $P(\Omega) = 1$ (Probability of the set of outcomes is unity) (9)
    Axiom 3: If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$.

    The event $\Omega$ is the sure event: it occurs in every experiment.
    Note that Axiom 3, cited as (iii) in what follows, states that if A and B are mutually exclusive (m.e.) events, the probability of their union is the sum of their probabilities.

    Q: What is missing in Axiom 3?

  • The following conclusions follow from these axioms:

    a. Since $A \cup \bar{A} = \Omega$, we have using (ii)

    $$P(A \cup \bar{A}) = P(\Omega) = 1$$

    But $A \cap \bar{A} = \emptyset$; then, using (iii),

    $$P(A \cup \bar{A}) = P(A) + P(\bar{A}) = 1, \quad \text{i.e.,} \quad P(\bar{A}) = 1 - P(A) \qquad (10)$$

    b. Similarly, for any A, $A \cap \emptyset = \emptyset$ (m.e.).
    Hence it follows that $P(A \cup \emptyset) = P(A) + P(\emptyset)$.
    But $A \cup \emptyset = A$, and thus $P(\emptyset) = 0$.

    c. Suppose A and B are not mutually exclusive (m.e.)?

    Q: How does one compute $P(A \cup B)$?

  • [Fig. 1.4: Venn diagram of $A \cup B$ split into A and $\bar{A}B$.]

    To compute the above probability, we should re-express $A \cup B$ in terms of m.e. sets so that we can make use of the probability axioms.
    From Fig. 1.4 we have

    $$A \cup B = A \cup \bar{A}B \qquad (11)$$

    where A and $\bar{A}B$ are clearly m.e. events. Thus using axiom (iii)

    $$P(A \cup B) = P(A \cup \bar{A}B) = P(A) + P(\bar{A}B) \qquad (12)$$

    To compute $P(\bar{A}B)$ we can express B as

    $$B = B \cap \Omega = B \cap (A \cup \bar{A}) = (B \cap A) \cup (B \cap \bar{A}) = BA \cup B\bar{A} \qquad (13)$$

    Thus

    $$P(B) = P(BA) + P(B\bar{A}) \qquad (14)$$

  • From (14),

    $$P(\bar{A}B) = P(B) - P(AB) \qquad (15)$$

    and using (15) in (12)

    $$P(A \cup B) = P(A) + P(B) - P(AB) \qquad (16)$$

    (This result really follows by inspection of Fig. 1.4 and compensating for double counting.)

    Question:
    Suppose every member of an infinite collection $A_i$ of disjoint sets is an event, then what can we say about their union

    $$A = \bigcup_{i=1}^{\infty} A_i$$

    i.e., suppose all $A_i \in F$, what about A? Does it belong to F?
    Further, if A also belongs to F (it is a measurable set), what about P(A)?
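
Equations (15) and (16) can be verified on a small example with exact arithmetic. The sketch below assumes the two-coin sample space with equally likely outcomes and two hypothetical events A and B; `prob` is an illustrative helper, not notation from the viewgraphs.

```python
from fractions import Fraction

omega = {"HH", "HT", "TH", "TT"}

def prob(event):
    """Classical probability, assuming all outcomes in omega are equally likely."""
    return Fraction(len(event), len(omega))

A = {"HH", "HT"}  # hypothetical event: first coin shows heads
B = {"HH", "TH"}  # hypothetical event: second coin shows heads

lhs = prob(A | B)                      # P(A U B) computed directly
rhs = prob(A) + prob(B) - prob(A & B)  # right-hand side of equation (16)
not_a_and_b = (omega - A) & B          # the m.e. piece used in equations (11) and (15)

print(lhs, rhs)
print(prob(not_a_and_b), prob(B) - prob(A & B))
```

Both lines print matching values, the first pair confirming (16) and the second pair confirming (15) for this example.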

  • The above questions involving an infinite number of sets can be understood using our intuitive experience from plausible experiments.
    For example, in a coin tossing experiment, where the same coin is tossed indefinitely, define

    A = "heads eventually appears". (17)

    Is A an event?
    Let

    $$A_n = \{\text{heads appears for the 1st time on the } n\text{-th toss}\} = \{\underbrace{t, t, t, \ldots, t}_{n-1}, h\} \qquad (18)$$

    Clearly $A_i \cap A_j = \emptyset$ for $i \neq j$. Thus, the above A is the infinite union

    $$A = A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_i \cup \cdots \qquad (19)$$

  • We cannot use probability axiom (iii) to compute P(A), since the axiom only deals with two (or a finite number) of m.e. events.
    To answer the questions whether A is an event and how to compute P(A), extension of these notions must be done as new axioms.

    $\sigma$-Field (Borel Field) Definition
    A field F is a $\sigma$-field if in addition to the three conditions in (7), we have the following:
    For every sequence $A_i$, $i = 1, \ldots, \infty$, of pairwise disjoint events belonging to F, their union also belongs to F, i.e.,

    $$A = \bigcup_{i=1}^{\infty} A_i \in F \qquad (20)$$

    This property is called being closed under a countable number of unions. (countable = countably infinite)

  • Thus, in view of (20), we can add another axiom to the set of probability axioms in (9).
    Axiom 4: If $A_i$, $i = 1, 2, \ldots$ are mutually exclusive, then

    $$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) \qquad (21)$$

    Returning to the infinite coin tossing experiment, from experience we know that if we keep tossing a coin, eventually, a heads must show up, i.e.,

    $$P(A) = 1 \qquad (22)$$

    But $A = \bigcup_{i=1}^{\infty} A_i$ and using the fourth probability axiom (21),

    $$P(A) = P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) \qquad (23)$$

    Note: A is called a Borel set.

  • From (18) $A_n$ is exactly one in $2^n$ outcomes. Thus, for a fair coin, we have

    $$P(A_n) = \frac{1}{2^n} \quad \text{and} \quad P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \frac{1}{2^n} = 1 \qquad (24)$$

    which agrees with (22), thus justifying the intuitive reasonableness of Axiom 4 in (21).
    In summary, the triplet $(\Omega, F, P)$, consisting of

    - a nonempty set $\Omega$ of elementary events
    - a $\sigma$-field F of subsets of $\Omega$
    - a probability measure P on the sets in F subject to the four axioms ((9) and (21)),

    forms a probability model (probability space).
    The probability of more complicated events must follow from this framework by deduction.

    Q: What other events or variables might be of interest?
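
The sum in (24) can be watched converging. The sketch below, assuming a fair coin, adds up the probabilities of the first n mutually exclusive events (heads first appears on toss 1, 2, ..., n) exactly; the helper name `prob_heads_by` is an illustrative choice.

```python
from fractions import Fraction

def prob_heads_by(n):
    """P(A_1 U ... U A_n): sum of 1/2^k for k = 1..n, by additivity over m.e. events."""
    return sum(Fraction(1, 2**k) for k in range(1, n + 1))

for n in (1, 2, 5, 10, 20):
    partial = prob_heads_by(n)
    # The gap 1 - partial is the probability of n tails in a row, 1/2^n.
    print(n, partial, float(1 - partial))
```

Each partial sum falls short of 1 by exactly the probability of n tails in a row, and that gap shrinks to zero, consistent with P(A) = 1 in (22).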

  • Conditional Probability and Independence

    In N independent trials of the same experiment, suppose $N_A$, $N_B$, $N_{AB}$ denote the number of times events A, B and AB occur, respectively. According to the frequency interpretation of probability, for large N

    $$P(A) \approx \frac{N_A}{N}, \quad P(B) \approx \frac{N_B}{N}, \quad P(AB) \approx \frac{N_{AB}}{N} \qquad (25)$$

    Among the $N_A$ occurrences of A, only $N_{AB}$ of them are also found among the $N_B$ occurrences of B. Thus the ratio

    $$\frac{N_{AB}}{N_B} = \frac{N_{AB}/N}{N_B/N} \approx \frac{P(AB)}{P(B)} \qquad (26)$$

    is a measure of the event A occurrences given (conditioned) that B has already occurred.
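
The counting argument in (25) and (26) can be imitated with a short simulation. This sketch assumes a fair die and two hypothetical events, A = {1, 2} and B = {outcome is even}; the seed and the number of trials are arbitrary choices.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
N = 100_000
n_b = 0    # number of trials in which B occurs
n_ab = 0   # number of trials in which both A and B occur

for _ in range(N):
    roll = rng.randint(1, 6)
    in_a = roll <= 2      # hypothetical event A: outcome is 1 or 2
    in_b = roll % 2 == 0  # hypothetical event B: outcome is even
    n_b += in_b
    n_ab += in_a and in_b

ratio = n_ab / n_b  # the ratio N_AB / N_B of equation (26)
print(ratio, (1 / 6) / (1 / 2))
```

Here AB is the single outcome 2, so P(AB)/P(B) = (1/6)/(1/2) = 1/3, and the simulated ratio should land close to that value.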

  • We denote this conditional probability by

    P(A|B) = Probability of the event A given that B has occurred. (27)

    Formal definition of the conditional probability

    $$P(A|B) = \frac{P(AB)}{P(B)} \qquad (28)$$

    provided $P(B) \neq 0$.

    As we will show next, the above definition satisfies all the probability axioms discussed earlier.

  • We have

    (i) $P(A|B) = \dfrac{P(AB)\ (\ge 0)}{P(B)\ (> 0)} \ge 0$

    (ii) $P(\Omega|B) = \dfrac{P(\Omega B)}{P(B)} = \dfrac{P(B)}{P(B)} = 1$

    (iii) Suppose $A \cap C = \emptyset$. Then

    $$P(A \cup C|B) = \frac{P((A \cup C) \cap B)}{P(B)} = \frac{P(AB \cup CB)}{P(B)} \qquad (29)$$

    But $AB \cap BC = \emptyset$, hence $P(AB \cup BC) = P(AB) + P(BC)$.

    $$P(A \cup C|B) = \frac{P(AB)}{P(B)} + \frac{P(BC)}{P(B)} = P(A|B) + P(C|B) \qquad (30)$$

    satisfying all probability axioms in (9).

    Thus the conditional probability defined in (28) is a legitimate probability measure.

  • Properties of the Conditional Probability

    a. If $B \subset A$, then AB = B, and

    $$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(B)}{P(B)} = 1 \qquad (31)$$

    since if $B \subset A$, then occurrence of B implies automatic occurrence of the event A. As an example, A = {outcome is even}, B = {outcome is 2}, in a die rolling experiment. Then $B \subset A$ and P(A|B) = 1.

    b. If $A \subset B$, then AB = A, and (if P(B) < 1)

    $$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(A)}{P(B)} > P(A) \qquad (32)$$

    For example, in a die rolling experiment, A = {outcome is 2}, B = {outcome is even}, so that $A \subset B$.
    The statement that B has occurred (outcome is even) makes the odds for "outcome is 2" greater than without that information.
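
Both properties can be reproduced for the die examples with exact fractions. This is a minimal sketch assuming a fair six-sided die; `prob` and `cond_prob` are illustrative helper names implementing the definition in (28).

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # fair six-sided die, outcomes assumed equally likely

def prob(event):
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    """P(A|B) = P(AB) / P(B), equation (28); requires P(B) != 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be nonzero")
    return prob(a & b) / prob(b)

even = frozenset({2, 4, 6})
two = frozenset({2})

# Property a: B = {2} is a subset of A = {even}, so P(A|B) = 1.
print(cond_prob(even, two))

# Property b: A = {2} is a subset of B = {even}, so P(A|B) = P(A)/P(B) > P(A).
print(cond_prob(two, even), prob(two))
```

Knowing the outcome is even raises the probability of "outcome is 2" from 1/6 to 1/3, as (32) predicts.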

  • Total Probability Theorem (TPT)

    We can use the TPT to express the probability of a complicated event in terms of "simpler" related events.

    Let $A_1, A_2, \ldots, A_n$ be disjoint with union $\Omega$, i.e., they are m.e.e. Thus $A_i A_j = \emptyset$ for $i \neq j$, and

    $$\bigcup_{i=1}^{n} A_i = \Omega \qquad (33)$$

    Thus

    $$B = B(A_1 \cup A_2 \cup \cdots \cup A_n) = BA_1 \cup BA_2 \cup \cdots \cup BA_n \qquad (34)$$

    But $A_i \cap A_j = \emptyset \Rightarrow BA_i \cap BA_j = \emptyset$, so that from (34)

    $$P(B) = \sum_{i=1}^{n} P(BA_i) = \sum_{i=1}^{n} P(B|A_i)P(A_i) \qquad (35)$$

    The TPT is an extremely useful result; the key is in finding the simpler conditional events.

  • With the notion of conditional probability, next we introduce the notion of independence of events.

    Independence:
    A and B are said to be independent events, if

    $$P(AB) = P(A)P(B) \qquad (36)$$

    Notice that the above definition is a probabilistic statement, not a set theoretic notion such as mutual exclusiveness.

    If A and B are independent, then

    $$P(A|B) = \frac{P(AB)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A) \qquad (37)$$

    Thus if A and B are independent, the fact that B has occurred does not shed any more light on the event A. It makes no difference to A whether B has occurred or not.

  • An example will illustrate the concept of independence.

    Example 1.2: A box contains 6 white and 4 black balls. Remove two balls at random without replacement. What is the probability that the first one is white and the second one is black?

    Let $W_1$ = "first ball removed is white"
    $B_2$ = "second ball removed is black"

    We need $P(W_1 \cap B_2)$.
    We have $W_1 \cap B_2 = W_1 B_2$. Using the conditional probability definition,

    $$P(W_1 B_2) = P(B_2|W_1)P(W_1) = \frac{4}{5+4} \cdot \frac{6}{6+4} = \frac{4}{15} \qquad (38)$$

    Q: Are the events $W_1$ and $B_2$ independent?

  • Are the events $W_1$ and $B_2$ independent?
    To verify this we need to compute the unconditional probability $P(B_2)$.

    The first ball has two options: $W_1$ = "first ball is white" or $B_1$ = "first ball is black".
    Note that $W_1 \cap B_1 = \emptyset$ and $W_1 \cup B_1 = \Omega_1$. Hence $W_1$ and $B_1$ are m.e.e., i.e., they form a partition of $\Omega_1$. Thus, with the TPT,

    $$P(B_2) = P(B_2|W_1)P(W_1) + P(B_2|B_1)P(B_1) = \frac{4}{5+4} \cdot \frac{6}{6+4} + \frac{3}{6+3} \cdot \frac{4}{6+4} = \frac{2}{5}$$

    and

    $$P(B_2)P(W_1) = \frac{2}{5} \cdot \frac{3}{5} \neq P(B_2 W_1) = \frac{4}{15}$$

    As expected, the events $W_1$ and $B_2$ are dependent because the removal of the first ball, which is not replaced, affects the probability of the second removal outcome.
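
The numbers in Example 1.2 can be confirmed by brute-force enumeration rather than by the TPT. The sketch below lists every ordered pair of distinct balls, treats the pairs as equally likely (the assumption behind "at random without replacement"), and counts; the helper `prob` and the label strings are illustrative.

```python
from fractions import Fraction
from itertools import permutations

balls = ["W"] * 6 + ["B"] * 4  # 6 white and 4 black balls

# All ordered draws of two distinct balls (by index); each is equally likely.
draws = list(permutations(range(len(balls)), 2))

def prob(predicate):
    """Fraction of ordered draws whose (first color, second color) satisfies predicate."""
    favorable = sum(1 for i, j in draws if predicate(balls[i], balls[j]))
    return Fraction(favorable, len(draws))

p_w1 = prob(lambda first, second: first == "W")
p_b2 = prob(lambda first, second: second == "B")
p_w1_b2 = prob(lambda first, second: first == "W" and second == "B")

print(p_w1, p_b2, p_w1_b2, p_w1 * p_b2)
print("independent" if p_w1_b2 == p_w1 * p_b2 else "dependent")
```

The enumeration gives 3/5, 2/5 and 4/15, matching the example, while the product of the first two is 6/25, so the two events are dependent.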

  • Bayes' Theorem

    From (28),

    $$P(AB) = P(A|B)P(B) \qquad (39)$$

    Similarly, from (28)

    $$P(B|A) = \frac{P(AB)}{P(A)} \quad \text{or} \quad P(AB) = P(B|A)P(A) \qquad (40)$$

    From (39) and (40), we get

    $$P(A|B)P(B) = P(B|A)P(A)$$

    or

    $$P(A|B) = \frac{P(B|A)P(A)}{P(B)} \qquad (41)$$

    Equation (41) is known as Bayes' theorem or formula (or rule).

  • Interpretation of Bayes' theorem

    P(A) represents the a priori (prior) probability of the event A.
    Suppose B has occurred, and assume that A and B are not independent.
    Bayes' formula in (41) takes into account the new information ("B has occurred") and yields the a posteriori (posterior) probability of A given B.

    A more general version of Bayes' theorem:
    Let $A_i$, $i = 1, \ldots, n$ represent a set of m.e.e. events with associated a priori probabilities $P(A_i)$, $i = 1, \ldots, n$. With the new information "B has occurred", the information about $A_i$ can be updated as follows

    $$P(A_i|B) = \frac{P(B|A_i)P(A_i)}{P(B)} = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B|A_j)P(A_j)} \qquad (42)$$

  • Example 1.3:
    Two boxes $B_1$ and $B_2$ contain 100 and 200 light bulbs, respectively. The first box ($B_1$) has 15 defective bulbs and the second 5. Suppose
    1. a box is selected at random and
    2. one bulb is picked out.
    (a) What is the probability that it is defective?

    Solution: Note that box $B_1$ has 85 good and 15 defective bulbs. Similarly box $B_2$ has 195 good and 5 defective bulbs.
    Let D = "Defective bulb is picked out". Then

    $$P(D|B_1) = \frac{15}{100} = 0.15, \quad P(D|B_2) = \frac{5}{200} = 0.025$$

    Since a box is selected at random, they are equally likely, i.e.,

    $$P(B_1) = P(B_2) = \frac{1}{2}$$

  • Thus $B_1$ and $B_2$ form a partition of the sample space (part 1 of the experiment), and using the TPT (35) we obtain

    $$P(D) = P(D|B_1)P(B_1) + P(D|B_2)P(B_2) = 0.15 \cdot \frac{1}{2} + 0.025 \cdot \frac{1}{2} = 0.0875$$

    Thus, there is about a 9% probability that a bulb picked at random is defective.

    (b) Suppose we test the bulb and it is found to be defective. What is the probability that it came from box 1, i.e., $P(B_1|D)$?

    $$P(B_1|D) = \frac{P(D|B_1)P(B_1)}{P(D)} = \frac{0.15 \cdot 1/2}{0.0875} = 0.8571$$

    Notice that initially $P(B_1) = 0.5$; then we picked out a box at random and picked a bulb that turned out to be defective. Can this information shed some light about the fact that we might have picked box 1?

    From the answer, $P(B_1|D) = 0.857 > 0.5$, and indeed it is more likely at this point that we must have chosen box 1 as opposed to box 2.
    Q: WHY?
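
The arithmetic of Example 1.3 follows the general Bayes formula (42), with the denominator supplied by the TPT (35). Below is a small sketch of that update; the function name `posterior` and the dictionary layout are illustrative assumptions, and the numbers are the ones given in the example.

```python
def posterior(priors, likelihoods):
    """Return ({A_i: P(A_i|B)}, P(B)) using equation (42), with P(B) from the TPT (35)."""
    p_b = sum(priors[k] * likelihoods[k] for k in priors)
    post = {k: priors[k] * likelihoods[k] / p_b for k in priors}
    return post, p_b

priors = {"B1": 0.5, "B2": 0.5}                     # a box is selected at random
likelihoods = {"B1": 15 / 100, "B2": 5 / 200}       # P(D|B1), P(D|B2)

post, p_d = posterior(priors, likelihoods)
print(round(p_d, 4), round(post["B1"], 4), round(post["B2"], 4))
```

The posterior for box 1 is about 0.857 and for box 2 about 0.143; the two sum to 1, as they must for a partition.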

  • Independence of n events $A_1, \ldots, A_n$:
    $P(A_{i_1} \cdots A_{i_k}) = P(A_{i_1}) \cdots P(A_{i_k})$ for all sets of integers $i_1, \ldots, i_k \le n$.

    Pairwise independence of n events $\Rightarrow$ complete mutual independence?
    $P(A_i A_j) = P(A_i)P(A_j) \not\Rightarrow A_1, \ldots, A_n$ independent.

    Example:

    [Figure: Venn diagram of three events A, B, C.]

    $P(A) = P(B) = P(C) = 1/5$
    $P(AB) = P(AC) = P(BC) = P(ABC) = 1/25$

    $P(ABC) \neq P(A)P(B)P(C) = 1/125 \Rightarrow$ NOT independent!!
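
One concrete sample space that realizes the probabilities in this example can be built from 25 equally likely outcomes. This construction is an assumption made for illustration (the viewgraph only gives the probabilities): each event holds 5 outcomes, and all three share exactly one outcome.

```python
from fractions import Fraction

omega = set(range(25))   # 25 equally likely outcomes (assumed construction)
A = {0, 1, 2, 3, 4}
B = {0, 5, 6, 7, 8}
C = {0, 9, 10, 11, 12}   # outcome 0 is the only one shared by A, B and C

def prob(event):
    return Fraction(len(event), len(omega))

# Pairwise independence: P(XY) = P(X)P(Y) for every pair.
pairwise = all(prob(x & y) == prob(x) * prob(y) for x, y in ((A, B), (A, C), (B, C)))

# Mutual independence additionally needs P(ABC) = P(A)P(B)P(C).
triple = prob(A & B & C) == prob(A) * prob(B) * prob(C)

print(prob(A & B), prob(A & B & C), prob(A) * prob(B) * prob(C))
print(pairwise, triple)
```

Every pair satisfies the product rule, yet P(ABC) = 1/25 differs from 1/125, so the three events are pairwise independent but not mutually independent.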