Applied Probability and Stochastic Process

ECE6111: APPLIED PROBABILITYAND STOCHASTIC PROCESSES

Yaakov Bar-Shalom

University of ConnecticutDept. of Elec. & Computer Engineering

Storrs, CT 06269-2157These viewgraphs are based on the text

A. Papoulis and S. U. PillaiProbability, Random Variables and Stochastic Processes,

McGraw-Hill, 2002and the viewgraphs of S. U. Pillai (www.mhhe.com/papoulis), with permission

110801 ECE6111L01 [Ch.1; Ch2] 1/ 29

Basics

Probability theory deals with the study of random phenomena formodeling uncertainty.Probability demonstrates both our knowledge as well as our lackof knowledge.

Goal: to present methods that deal with uncertainty in engineeringsystems (some are inherently random, some we assume as randombecause of modeling limitations).Probabilistic Experiment

The notion of an experiment assumes a set of repeatableconditions that allow any number of identical repetitions.When a probabilistic experiment is performed under repeatableconditions, certain elementary outcomes i occur in different butcompletely uncertain (probabilistic) ways.We can assign a nonnegative number P (i) the probability ofthe event that outcome i occurs in various ways.

110801 ECE6111L01 [Ch.1; Ch2] 2/ 29

Laplaces Classical Definition:The Probability of an event A is defined a priori without actualexperimentation as

P (A) =Number of outcomes favorable to ATotal number of possible outcomes (1)

provided all these outcomes are equally likely.

Example: Consider a box with n white and m red balls. In thiscase, there are two elementary outcomes: white ball or red ball.Probability of selecting a white ball is n

n+m .

Relative Frequency Definition:The probability of an event A is defined as

P (A) = limn

nAn

(2)

where nA is the number of occurrences of A and n is the totalnumber of trials.

110801 ECE6111L01 [Ch.1; Ch2] 3/ 29

Axiomatic approach to probability (Kolmogorov):developed through a set of axioms (given below) is generallyrecognized as superior to the above definitions.

The totality of all outcomes i, known a priori, constitutes a set , theset of all experimental outcomes space of outcomes

= {1, 2, , i, } (3) has subsets A, B, C, etc., called events.If A is a subset of , then A implies .From A and B, we can generate other related subsets (events).

A B = {| A or B}

A B = {| A and B}

andA = {| / A} (4)

110801 ECE6111L01 [Ch.1; Ch2] 4/ 29

A B A B A

A

A B A B AFig.1.1 Venn diagrams.

If A B = , the empty set, then A and B are said to bemutually exclusive (m.e.).A partition of is a collection of mutually exclusive andexhaustive (m.e.e.) subsets of such that

Ai Aj = andn

i=1

Ai = (5)

BA

1A

2A

nA

iA

jA

A B = Fig.1.2 The concepts of m.e. and m.e.e.

110801 ECE6111L01 [Ch.1; Ch2] 5/ 29

De Morgans theoremA B = A B and A B = A B (6)

A B A B A BA B

A B A B A B A BFig.1.3

It is meaningful to talk about the subsets of as events, forwhich we want a mechanism to compute their probabilities.

Example 1.1:Consider the experiment where two coins are simultaneously tossed.The various elementary events are

1 = {H,H} , 2 = {H,T } , 3 = {T,H} , 4 = {T, T }

and = {1, 2, 3, 4}

110801 ECE6111L01 [Ch.1; Ch2] 6/ 29

The subset A = {1, 2, 3} is the same as Heads has occurred atmost once" and qualifies as an event.Suppose two subsets A and B are both events, then consider thequestions

Does an outcome belong to A or B = AB = A+B?"Does an outcome belong to A and B = A B = AB?"Does an outcome fall outside A?"

Thus the sets A B, A B, A, B, etc., also qualify as events. Weshall formalize this using the notion of a Field.

Field: A collection of subsets of a nonempty set forms a field F if

i) Fii) If A F, then A F (7)iii) If A F, and B F, then A B F.

i.e., it is closed under set operations.

110801 ECE6111L01 [Ch.1; Ch2] 7/ 29

Using (i) - (iii), it is easy to show that A B, A B, etc., alsobelong to F .

For example, from (ii) we have A F , B F and using (iii)this gives A B F ;Applying (ii) again we get A B = A B F , where wehave used De Morgans theorem in (6).

Thus if A F and B F then

F ={, A, B, A, B, A B, A B, A B,

} (8)From here onwards, we shall reserve the term event only to membersof F .Assuming that the probability (measure) pi = P (i) of elementaryoutcomes of are a priori defined, how does one obtain probabilities(measures) of more complicated events such as A, B, AB = A B,etc., above?The three axioms of probability defined next can be used to achievethis goal.

110801 ECE6111L01 [Ch.1; Ch2] 8/ 29

The Axioms of ProbabilityFor any event A, we assign a number P (A), called the probability ofthe event A. This number satisfies the following three conditions thatare the axioms of probability.

Axiom 1 P (A) 0 (Probability is a nonnegative number)Axiom 2 P () = 1 (Probability of the set of outcomes is unity)(9)Axiom 3 If A B = , then P (A B) = P (A) + P (B).

The event is the sure event it occurs in every experiment.Note that (iii) states that if A and B are mutually exclusive (m.e.)events,

the probability of their union is the sum of their probabilities.

Q: What is missing in Axiom 3?

110801 ECE6111L01 [Ch.1; Ch2] 9/ 29

The following conclusions follow from these axioms:a. Since A A = , we have using (ii)

P (A A) = P () = 1

But A A = , then, using (iii),P (A A) = P (A) + P (A) = 1 i.e., P (A) = 1 P (A) (10)

b. Similarly, for any A, A {} = {} (m.e.)Hence it follows that P (A {}) = P (A) + P ()But A {} = A, and thus P () = 0.

c. Suppose A and B are not mutually exclusive (m.e.)?

Q: How does one compute P (A B)?

110801 ECE6111L01 [Ch.1; Ch2] 10/ 29

A BA A B

A B A BFig.1.4

To compute the above probability, we should re-express A B interms of m.e. sets so that we can make use of the probability axioms.From Fig.1.4 we have

A B = A AB (11)where A and AB are clearly m.e. events. Thus using axiom (iii)

P (A B) = P (A AB) = P (A) + P (AB) (12)To compute P (AB) we can express B as

B = B = B (A A)

= (B A) (B A) = BA BA (13)Thus

P (B) = P (BA) + P (BA) (14)110801 ECE6111L01 [Ch.1; Ch2] 11/ 29

From (14),P (AB) = P (B) P (AB) (15)

and using (15) in (12)P (A B) = P (A) + P (B) P (AB) (16)

(This result really follows by inspection of Fig. 1.4 and compensatingfor double counting).

Question:Suppose every member of an infinite collection Ai of disjointsets is an event, then what can we say about their union

A =

i=1

Ai

i.e., suppose all Ai F , what about A? Does it belong to F?Further, if A also belongs to F (it is a measurable set), whatabout P (A)?

110801 ECE6111L01 [Ch.1; Ch2] 12/ 29

The above questions involving an infinite number of sets can beunderstood using our intuitive experience from plausible experiments.For example, in a coin tossing experiment, where the same coin istossed indefinitely, define

A = heads eventually appears". (17)Is A an event?Let

An = {heads appears for the 1st time on the n-th toss}

=

t, t, t, , t

n1

, h

(18)

Clearly Ai Aj = . Thus, the above A is the infinite union

A = A1 A2 A3 Ai (19)

110801 ECE6111L01 [Ch.1; Ch2] 13/ 29

We cannot use probability axiom (iii) to compute P(A), since theaxiom only deals with two (or a finite number) of m.e. events.To answer the questions whether A is an event and how tocompute P (A), extension of these notions must be done as newaxioms.

-Field (Borel Field) DefinitionA field F is a -field if in addition to the three conditions in (7), wehave the following:For every sequence Ai, i = 1, . . . ,, of pairwise disjoint eventsbelonging to F , their union also belongs to F , i.e.,

A =

i=1

Ai F (20)

This property is called being closed under a countable number ofunions. (countable = countably infinite)

110801 ECE6111L01 [Ch.1; Ch2] 14/ 29

Thus, in view of (20), we can add another axiom to the set ofprobability axioms in (9).Axiom 4: If Ai, i = 1, 2, . . . are mutually exclusive, then

P

(i=1

Ai

)=i=1

P (Ai) (21)

Returning to the infinite coin tossing experiment, from experience weknow that if we keep tossing a coin, eventually, a heads must showup, i.e.,

P (A) = 1 (22)But A =

i=1 Ai and using the fourth probability axiom (21),

P (A) = P

(i=1

Ai

)=

i=1

P (Ai) (23)

Note: A is called a Borel set.

110801 ECE6111L01 [Ch.1; Ch2] 15/ 29

From (18) An is exactly one in 2n outcomes. Thus, for a fair coin, wehave

P (An) =1

2nand P

(n=1

An

)=

n=1

1

2n= 1 (24)

which agrees with (22), thus justifying the intuitive reasonableness ofthe Axiom 4 in (21).In summary, the triplet (, F , P ), consisting of

a nonempty set of elementary eventsa -field F of subsets of a probability measure P on the sets in F subject to the fouraxioms ((9) and (21)),

form a probability model (probability space).The probability of more complicated events must follow from thisframework by deduction.

Q: What other events or variables might be of interest?

110801 ECE6111L01 [Ch.1; Ch2] 16/ 29

Conditional Probability and Independence

In N independent trials of the same experiment, suppose NA, NB,NAB denote the number of times events A, B and AB occur,respectively. According to the frequency interpretation of probability,for large N

P (A) NAN

, P (B) NBN

, P (AB) NABN

(25)

Among the NA occurrences of A, only NAB of them are also foundamong the NB occurrences of B. Thus the ratio

NABNB

=NAB/N

NB/NP (AB)

P (B)(26)

is a measure of the event A occurrences given (conditioned) that Bhas already occurred.

110801 ECE6111L01 [Ch.1; Ch2] 17/ 29

We denote this conditional probability by

P (A|B) = Probability of the event A given that B has occurred. (27)

Formal definition of the conditional probability

P (A|B) =P (AB)

P (B)(28)

provided P (B) 6= 0.

As we will show next, the above definition satisfies all the probabilityaxioms discussed earlier.

110801 ECE6111L01 [Ch.1; Ch2] 18/ 29

We have

(i) P (A|B) = P (AB)(0)P (B)(>0) 0

(ii) P (|B) = P (B)P (B) =

P (B)P (B) = 1

(iii) Suppose A C = . Then

P (A C|B) =P ((A C) B)

P (B)=P (AB CB)

P (B)(29)

But AB BC = , hence P (AB BC) = P (AB) + P (BC).

P (A C|B) =P (AB)

P (B)+P (BC)

P (B)= P (A|B) + P (C|B) (30)

satisfying all probability axioms in (9).

Thus the conditional probability defined in (28) is a legitimateprobability measure.

110801 ECE6111L01 [Ch.1; Ch2] 19/ 29

Properties of the Conditional Probability

a. If B A, then AB = B, and

P (A|B) =P (AB)

P (B)=P (B)

P (B)= 1 (31)

since if B A, then occurrence of B implies automaticoccurrence of the event A. As an example,A = {outcome is even} , B = {outcome is 2} , in a die rollingexperiment. Then B A and P (A|B) = 1.

b. If A B, then AB = A, and (if P (B) < 1)

P (A|B) =P (AB)

P (B)=P (A)

P (B)> P (A) (32)

For example, in a die rolling experiment, A = {outcome is 2} ,B = {outcome is even} , so that A B.The statement that B has occurred (outcome is even) makesthe odds for outcome is 2" greater than without that information.

110801 ECE6111L01 [Ch.1; Ch2] 20/ 29

Total Probability Theorem (TPT)

We can use the TPT to express the probability of a complicated eventin terms of simpler" related events.

Let A1, A2, . . ., An be disjoint and their union is they are m.e.e..Thus AiAj = , and

ni=1

Ai = (33)

Thus

B = B(A1 A2 An) = BA1 BA2 BAn (34)But Ai Aj = BAi BAj = , so that from (34)

P (B) =

ni=1

P (BAi) =

ni=1

P (B|Ai)P (Ai) (35)

The TPT is an extremely useful result the key is in finding thesimpler conditional events.

110801 ECE6111L01 [Ch.1; Ch2] 21/ 29

With the notion of conditional probability, next we introduce the notionof independence of events.

Independence:A and B are said to be independent events, if

P (AB) = P (A)P (B) (36)Notice that the above definition is a probabilistic statement, not aset theoretic notion such as mutual exclusiveness.

If A and B are independent, then

P (A|B) =P (AB)

P (B)=P (A)P (B)

P (B)= P (A) (37)

Thus if A and B are independent, the fact that B has occurreddoes not shed any more light on the event A. It makes nodifference to A whether B has occurred or not.

110801 ECE6111L01 [Ch.1; Ch2] 22/ 29

An example will illustrate the concept of independence.

Example 1.2: A box contains 6 white and 4 black balls. Remove twoballs at random without replacement. What is the probability that thefirst one is white and the second one is black?

Let W1 = first ball removed is white"B2 = second ball removed is black"

We need P (W1 B2).We have W1 B2 =W1B2. Using the conditional probabilitydefinition,

P (W1B2) = P (B2|W1)P (W1) (38)=

4

5 + 4

6

6 + 4=

4

15

Q: Are the events W1 and B2 independent?

110801 ECE6111L01 [Ch.1; Ch2] 23/ 29

Are the events W1 and B2 independent?To verify this we need to compute the unconditional probability P (B2).

The first ball has two options: W1 = first ball is white or B1= first ball isblack.Note that W1 B1 = and W1 B1 = 1. Hence W1 and B1 arem.e.e., i.e., they form a partition of 1. Thus, with the TPT,

P (B2) = P (B2|W1)P (W1) + P (B2|B1)P (B1)

=4

5 + 4

6

6 + 4+

3

6 + 3

4

6 + 4=

2

5

and

P (B2)P (W1) =2

53

56= P (B2W1) =

4

15

As expected, the events W1 and B2 are dependent because theremoval of the first ball, which is not replaced, affects the probabilityof the second removal outcome.

110801 ECE6111L01 [Ch.1; Ch2] 24/ 29

Interpretation of Bayes theorem

P (A) represents the a priori (prior) probability of the event A.Suppose B has occurred, and assume that A and B are notindependent.Bayes formula in (41) takes into account the new information (B hasoccurred) and yields the a posteriori (posterior) probability of A givenB.

A more general version of Bayes theorem:Let Ai, i = 1, . . . , n represent a set of m.e.e. events with associated apriori probabilities P (Ai), i = 1, . . . , n. With the new information B hasoccurred, the information about Ai can be updated as follows

P (Ai|B) =P (B|Ai)P (Ai)

P (B)=

P (B|Ai)P (Ai)nj=1 P (B|Aj)P (Aj)

(42)

110801 ECE6111L01 [Ch.1; Ch2] 26/ 29

Example 1.3:Two boxes B1 and B2 contain 100 and 200 light bulbs, respectively.The first box (B1) has 15 defective bulbs and the second 5. Suppose1. a box is selected at random and 2. one bulb is picked out.(a) What is the probability that it is defective?

Solution: Note that box B1 has 85 good and 15 defective bulbs.Similarly box B2 has 195 good and 5 defective bulbs.Let D = Defective bulb is picked out. Then

P (D|B1) =15

100= 0.15, P (D|B2) =

5

200= 0.025

Since a box is selected at random, they are equally likely, i.e.,

P (B1) = P (B2) =1

2

110801 ECE6111L01 [Ch.1; Ch2] 27/ 29

Thus B1 and B2 form a partition of (part 1 of the experiment), andusing the TPT (35) we obtain

P (D) = P (D|B1)P (B1) + P (D|B2)P (B2)

= 0.151

2+ 0.025

1

2= 0.0875

Thus, there is about 9% probability that a bulb picked at random isdefective.(b) Suppose we test the bulb and it is found to be defective. What is

the probability that it came from box 1, i.e., P (B1|D)?

P (B1|D) =P (D|B1)P (B1)

P (D)=

0.15 1/2

0.0875= 0.8571

Notice that initially P (B1) = 0.5; then we picked out a box atrandom and picked a bulb that turned out to be defective. Canthis information shed some light about the fact that we mighthave picked box 1?

From the answer, P (B1|D) = 0.857 > 0.5, and indeed it is more likelyat this point that we must have chosen box 1 as opposed to box 2.Q: WHY?

110801 ECE6111L01 [Ch.1; Ch2] 28/ 29

Independence of n events A1, . . . , AnP (Ai1 . . . Aik) = P (Ai1 ) P (Aik) for all sets of integersi1, . . . , ik n.Pairwise independence of n events complete mutualindependence?P (AiAj) = P (Ai)P (Aj) / A1, . . . , An independent.Example:

A

B

C

P (A) = P (B) = P (C) = 1/5P (AB) = P (AC) = P (BC) = P (ABC) = 1/25

P (ABC) 6= P (A)P (B)P (C) = 1/125 = NOT independent !!

110801 ECE6111L01 [Ch.1; Ch2] 29/ 29

Applied Probability and Stochastic Process

Documents

Transcript of Applied Probability and Stochastic Process