Joint Probability Distributions · Bayes’ Rule and Applications · Conditional Expectations, Variances Etc.
Risk Assessment and Management: Module 2
Introduction to Probability and Statistics
Lecture 2: Multiple Random Variables
M. Vidyasagar
Cecil & Ida Green Professor
The University of Texas at Dallas
Email: [email protected]
August 27, 2010
M. Vidyasagar Multiple Random Variables
Outline
1 Joint Probability Distributions
Joint and Marginal Probability Distributions
Independence, Conditional Probability Distributions
Covariance and Correlation Coefficients
2 Bayes’ Rule and Applications
A Motivating Example
Bayes’ Rule
3 Conditional Expectations, Variances Etc.
Conditional Expected Value: Definition
Conditional Expected Value: Example
Conditioning on an Event, Independent Events
Joint Probability Distributions: Motivating Examples
Joint probability distributions arise naturally when one conducts multiple experiments with random outcomes.
Example 1: An urn has 7 white balls and 3 black balls. We draw two balls in succession, replacing the first ball drawn before drawing a second time. What is the probability of drawing a white ball followed by a black ball? What is the probability of drawing one white ball and one black ball (in either order)?

Example 2: An urn has 7 white balls and 3 black balls. We draw two balls in succession, without replacing the first ball drawn before drawing a second time. What is the probability of drawing a white ball followed by a black ball? What is the probability of drawing one white ball and one black ball (in either order)?
Cartesian Products of Sets
Suppose we have two random variables X and Y, which may or may not influence each other. X takes values in A = {x1, . . . , xn} while Y takes values in B = {y1, . . . , ym}. Note that the two sets A and B could be different.

The joint random variable (X,Y) takes values in the so-called (Cartesian) product set A × B, which consists of all pairs of the form (xi, yj). Thus
A× B := {(xi, yj) : xi ∈ A, yj ∈ B}.
Note that A × B has nm elements.

The product set A × B is the sample space for the joint random variable (X,Y). The event space, as before, consists of all possible subsets of A × B.
Joint Distributions of Joint Random Variables
Suppose (X,Y) takes values in A × B. Its (joint) probability distribution is a vector φ with nm components, where
φij = Pr{(X,Y ) = (xi, yj)} = Pr{X = xi&Y = yj}.
As always, φij ≥ 0 for all i, j, and

∑_{i=1}^n ∑_{j=1}^m φij = 1.
What is the difference between the ‘joint’ r.v. (X,Y) and a plain old r.v. Z taking values in a set of cardinality nm?

So long as we always talk about X and Y together – nothing! But there are special notions for joint r.v.’s, such as independence, marginal and conditional distributions, etc.
Marginal Distributions
Suppose X, Y are r.v.’s assuming values in A, B respectively, with joint distribution φ. So

φij = Pr{X = xi & Y = yj}, ∀i, j.

Now let us ask: What is the probability that X = xi, and we don’t care what Y is?
Answer: The event {X = xi} is the union of the m disjoint events (X,Y) = (xi, y1) through (X,Y) = (xi, ym). In symbols,

{X = xi} = ⋃_{j=1}^m {(X,Y) = (xi, yj)}.
So it follows from earlier discussion that

Pr{X = xi} = ∑_{j=1}^m φij.
Marginal Distributions (Cont’d)
So the r.v. X by itself has the distribution denoted by φX, defined by

(φX)i := ∑_{j=1}^m φij, ∀i.
The distribution φX is referred to as the marginal distribution of X corresponding to the joint distribution φ. It is a probability distribution on the set A.

The marginal distribution of Y corresponding to the joint distribution φ is defined analogously by

(φY)j := ∑_{i=1}^n φij, ∀j.
Matrix Interpretation of Marginal Distributions
If X, Y take values in A, B with joint distribution φ, arrange the entries in a matrix as shown below.

X\Y   y1   ...  ym
x1    φ11  ...  φ1m
...   ...  ...  ...
xn    φn1  ...  φnm
If we call the above matrix Φ, then the vector φX is obtained by multiplying Φ on the right by a column vector of all ones, and φY is obtained by multiplying Φ on the left by a row vector of all ones.
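These slides contain no code, but the matrix picture is easy to check numerically. A minimal NumPy sketch (the matrix Φ below is the with-replacement urn distribution worked out later in this lecture; the variable names are ours):

```python
import numpy as np

# Joint distribution of the with-replacement urn draws, rows/cols = (W, B)
Phi = np.array([[0.49, 0.21],
                [0.21, 0.09]])

ones = np.ones(2)
phi_X = Phi @ ones      # right-multiply by a column of ones: row sums
phi_Y = ones @ Phi      # left-multiply by a row of ones: column sums

print(phi_X, phi_Y)     # both marginals are [0.7, 0.3]
```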
Independence of Two Random Variables
Suppose X, Y are r.v.’s assuming values in A, B respectively, with joint distribution φ. The random variables X, Y are said to be independent if

φij = (φX)i · (φY)j, ∀i, j.

In words, X and Y are independent if their joint probability distribution is just the product of the two individual marginal distributions.
An equivalent definition is: X and Y are independent if their jointprobability matrix Φ has rank one.
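Both characterizations (outer product of the marginals, and rank one) can be illustrated numerically. A sketch using the two urn distributions from this lecture; the helper name `is_independent` is ours:

```python
import numpy as np

def is_independent(Phi, tol=1e-12):
    """Check whether a joint probability matrix factors as the outer
    product of its marginals (equivalently, has rank one)."""
    phi_X = Phi.sum(axis=1)   # marginal of X (row sums)
    phi_Y = Phi.sum(axis=0)   # marginal of Y (column sums)
    return np.allclose(Phi, np.outer(phi_X, phi_Y), atol=tol)

with_repl    = np.array([[0.49, 0.21], [0.21, 0.09]])
without_repl = np.array([[42, 21], [21, 6]]) / 90

print(is_independent(with_repl))     # True
print(is_independent(without_repl))  # False
```

Note that `np.linalg.matrix_rank(with_repl)` is indeed 1, matching the rank-one characterization.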
Ball-Drawing from an Urn with Replacement
Suppose an urn has 7 white balls and 3 black balls. We draw a ball from this urn, and then replace the ball before drawing a second time.

Define X to be the color of the first ball drawn, and Y to be the color of the second ball. Since there are 7 white balls and 3 black balls each time, it is clear that

Pr{X = W} = 0.7, Pr{X = B} = 0.3,

and similarly for Y. Since we replace the ball, the outcome of X does not affect the outcome of Y. In other words, X and Y are independent in this case. So

Pr{(X,Y) = (W,W)} = Pr{X = W} × Pr{Y = W} = 0.49,

and similarly for the other three combinations.
Joint Distribution for Drawing with Replacement
If we draw with replacement, the joint probability distribution of X and Y is as shown below:

X\Y    W     B
W    0.49  0.21
B    0.21  0.09
With this information we can answer the questions raised earlier.

The probability of drawing a white ball followed by a black ball is

Pr{(X,Y) = (W,B)} = φ(W,B) = 0.21.

The event ‘drawing one white ball and one black ball (in either order)’ is the subset {(W,B), (B,W)}. So

Pφ({(W,B), (B,W)}) = φ(W,B) + φ(B,W) = 0.42.
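The two answers can also be sanity-checked by simulation. A small NumPy sketch (the seed and sample size are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Draw two balls with replacement from an urn with 7 white, 3 black.
first  = rng.random(n) < 0.7   # True = white
second = rng.random(n) < 0.7

wb = np.mean(first & ~second)      # white then black
mixed = np.mean(first != second)   # one of each, in either order

print(round(wb, 3), round(mixed, 3))   # close to 0.21 and 0.42
```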
Conditional Probabilities
Let φ be the joint distribution of (X,Y) as before. The conditional probability of X given the outcome Y = yj is defined as

Pr{X = xi|Y = yj} := Pr{X = xi & Y = yj} / Pr{Y = yj} = φij / ∑_{i'=1}^n φi'j.
Use φ{xi|yj} as a shorthand for Pr{X = xi|Y = yj}. Then the vector

φ{X|Y=yj} := [φ{x1|yj} . . . φ{xn|yj}]

is called the conditional probability distribution of X given the outcome Y = yj. Note that φ{X|Y=yj} is a distribution on the set A.
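Numerically, conditioning amounts to normalizing one column of the joint matrix. A sketch, using the without-replacement urn distribution derived a few slides ahead:

```python
import numpy as np

# Joint matrix of the without-replacement urn example (rows: X, cols: Y)
Phi = np.array([[42, 21],
                [21, 6]]) / 90

# Conditional distribution of X given Y = W: normalise the first column
col = Phi[:, 0]
phi_X_given_W = col / col.sum()

print(phi_X_given_W)   # [2/3, 1/3], a distribution on {W, B}
```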
Ball-Drawing from an Urn Without Replacement
Repeat the same experiment, except that we don’t replace the first ball we draw. Now things are trickier.

Suppose X = W, i.e., we draw a white ball the first time. Now there are only 6 white and 3 black balls left, so we can say that

Pr{Y = W|X = W} = 6/9, Pr{Y = B|X = W} = 3/9.

If we draw a black ball the first time, then there are 7 white balls and 2 black balls for the second draw. So

Pr{Y = W|X = B} = 7/9, Pr{Y = B|X = B} = 2/9.
Drawing Without Replacement (Cont’d)
We can construct the joint distribution φ by using the definition of conditional probabilities cleverly:

Pr{(X,Y) = (W,W)} = Pr{X = W} · Pr{Y = W|X = W} = 0.7 × 6/9 = 42/90,

and similarly for the other three combinations. This leads to the table below.

X\Y     W      B
W    42/90  21/90
B    21/90   6/90

Since Pr{(X,Y) = (W,W)} ≠ Pr{X = W} · Pr{Y = W}, it follows that X and Y are not independent!
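The chain-rule construction can be reproduced in a few lines of NumPy (variable names are ours):

```python
import numpy as np

p_X = np.array([0.7, 0.3])            # first draw: [W, B]
# Conditional distribution of the second draw given the first, Pr{Y|X}
p_Y_given_X = np.array([[6/9, 3/9],   # given X = W
                        [7/9, 2/9]])  # given X = B

# Chain rule: phi_ij = Pr{X = x_i} * Pr{Y = y_j | X = x_i}
Phi = p_X[:, None] * p_Y_given_X

print(Phi * 90)            # recovers the table [[42, 21], [21, 6]] / 90
print(Phi.sum(axis=0))     # marginal of Y: [0.7, 0.3]
```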
Probabilities of Some Events
With the probability table

X\Y     W      B
W    42/90  21/90
B    21/90   6/90

we can answer the questions raised in the beginning. The probability of drawing a white ball followed by a black ball is

Pr{(X,Y) = (W,B)} = φ(W,B) = 21/90 ≈ 0.2333.

The event ‘drawing one white ball and one black ball (in either order)’ is the subset {(W,B), (B,W)}. So

Pφ({(W,B), (B,W)}) = φ(W,B) + φ(B,W) = 42/90 ≈ 0.4667.
Two Counter-Intuitive Calculations
Let Φ denote the matrix of joint probabilities of X and Y, namely

Φ = [42/90  21/90]
    [21/90   6/90].

Note: (W,B) and (B,W) have the same probability even without replacement! This is not a coincidence. (See exercises)

Let us calculate the marginal distribution of Y, the second draw. Summing down the columns of the matrix gives

[Pr{Y = W}  Pr{Y = B}] = [1 1]Φ = [0.7  0.3]!

But there are only 9 balls when we draw Y. So how can Y have the same distribution as X? (See exercises)
Another Counter-Intuitive Calculation
Thus far we have studied Pr{Y|X}, which is natural because X is the outcome of the first draw.

As per the definition, we can also compute Pr{X|Y}, the distribution of the first outcome conditioned on the second outcome. Does this even make sense, or is it just mathematical jugglery?

The question makes sense. Suppose the urn had 1 white ball and N ≥ 2 black balls. We draw a ball (the outcome of X) and put it aside without looking at it. Then we draw another ball (the outcome of Y) without replacement.

Suppose the second ball is white (Y = W). Then we know for sure that X ≠ W. So Pr{X = W|Y = W} = 0 in this instance.
Another Counter-Intuitive Calculation (Cont’d)
Using the joint distribution matrix

[42/90  21/90]
[21/90   6/90]

and the definition, we can compute Pr{X|Y} for all outcomes of X and Y.

It turns out that Pr{X|Y} is the same as Pr{Y|X}! How is this possible? (See exercises)
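One way to see this numerically: normalize Φ by columns to get Pr{X|Y} and by rows to get Pr{Y|X}, and compare. A sketch:

```python
import numpy as np

Phi = np.array([[42, 21],
                [21, 6]]) / 90

# Pr{X = x_i | Y = y_j}: normalise each column
P_X_given_Y = Phi / Phi.sum(axis=0)
# Pr{Y = y_j | X = x_i}: normalise each row
P_Y_given_X = Phi / Phi.sum(axis=1, keepdims=True)

# Because Phi is symmetric and both marginals equal [0.7, 0.3],
# Pr{X = a | Y = b} coincides with Pr{Y = a | X = b}.
print(np.allclose(P_X_given_Y, P_Y_given_X.T))   # True
```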
More Than Two Random Variables
There is nothing magic about having just two r.v.s!
Suppose X, Y, Z are r.v.’s taking values in A, B, C = {z1, . . . , zl} respectively. So they have a joint distribution φ where

φijk = Pr{X = xi & Y = yj & Z = zk}.

We can define marginals of single and double r.v.’s, as well as conditional distributions, just as before.
Definitions of Marginal Distributions
Marginal distribution of a single r.v.:

(φX)i := ∑_{j=1}^m ∑_{k=1}^l φijk, ∀i,

and similarly for Y, Z.

Marginal (joint) distribution of two r.v.’s:

(φX,Y)ij := ∑_{k=1}^l φijk, ∀i, j,

and similarly for (Y,Z) and (X,Z).
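In code, these marginals are just sums of a 3-dimensional array over the appropriate axes. A sketch with a randomly generated joint distribution (shapes and seed are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(1)
phi = rng.random((2, 3, 4))
phi /= phi.sum()                # a generic joint distribution phi_ijk

phi_X  = phi.sum(axis=(1, 2))   # marginal of X: sum over j and k
phi_XY = phi.sum(axis=2)        # joint marginal of (X, Y): sum over k

# Marginalising phi_XY over Y recovers phi_X (consistency check)
print(np.allclose(phi_XY.sum(axis=1), phi_X))   # True
```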
Definitions of Conditional Distributions
Conditional distribution of a single r.v.:

φ{xi|(yj,zk)} := φijk / ∑_{i'=1}^n φi'jk, i = 1, . . . , n,

and similarly for the other two cases.

Conditional distribution of a joint r.v.:

φ{(xi,yj)|zk} := φijk / ∑_{i'=1}^n ∑_{j'=1}^m φi'j'k, ∀i, j,

and similarly for the other two cases.
Law of Iterated Conditioning
Suppose X, Y, Z are three r.v.’s. We observe an outcome of Z, and it is zk. This gives a conditional distribution φ{(X,Y)|zk}. Then we observe an outcome of Y, and it is yj. So now we can further condition X on this new observation to compute φ{{(X,Y)|zk}|yj}.

Do we really need to do this in stages, or is this the same as conditioning X ‘in one go’, namely φ{X|(yj,zk)}?

Fortunately, both answers are the same. This may be called the ‘law of iterated conditioning’. (See exercises)
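The law can be checked numerically on a random joint distribution. A sketch (the indices j, k are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
phi = rng.random((3, 4, 5))
phi /= phi.sum()        # a generic joint distribution phi_ijk

j, k = 1, 2             # observed outcomes y_j and z_k

# One-shot conditioning: phi_{X | (y_j, z_k)}
one_shot = phi[:, j, k] / phi[:, j, k].sum()

# Two-stage: first condition (X, Y) on z_k, then condition on y_j
XY_given_z = phi[:, :, k] / phi[:, :, k].sum()
two_stage = XY_given_z[:, j] / XY_given_z[:, j].sum()

print(np.allclose(one_shot, two_stage))   # True
```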
Conditional Independence
Suppose X, Y, Z are r.v.’s. We say that Y, Z are conditionally independent given X if

φ{(yj,zk)|xi} = φ{yj|xi} · φ{zk|xi}, ∀i, j, k.

Baby (contrived) example: Suppose we draw a ball from an urn; call the outcome X. Then, without replacing it, we draw another ball; call the outcome Y. Then, after replacing the second ball drawn, we draw again; call the outcome Z.

Then Y, Z are not independent. However, Y and Z are conditionally independent given X. (See exercises)
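The contrived example can be worked out exactly for the 7-white, 3-black urn (variable names are ours). Given X, both Y and Z are draws from the same depleted urn, so the joint distribution factors as φijk = Pr{X=xi} · Pr{Y=yj|X=xi} · Pr{Z=zk|X=xi}:

```python
import numpy as np

p_X = np.array([0.7, 0.3])      # first draw: [W, B]
q = np.array([[6/9, 3/9],       # urn composition after removing a W
              [7/9, 2/9]])      # urn composition after removing a B

# phi_ijk = Pr{X=i} Pr{Y=j|X=i} Pr{Z=k|X=i}: conditional independence by design
phi = p_X[:, None, None] * q[:, :, None] * q[:, None, :]

phi_YZ = phi.sum(axis=0)
phi_Y, phi_Z = phi_YZ.sum(axis=1), phi_YZ.sum(axis=0)

print(np.allclose(phi_YZ, np.outer(phi_Y, phi_Z)))   # False: Y, Z dependent
# But given X = W, the joint of (Y, Z) factors exactly:
cond = phi[0] / phi[0].sum()
print(np.allclose(cond, np.outer(q[0], q[0])))        # True
```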
Covariance for Real-Valued Random Variables
Until now we have talked about general r.v.’s. Now suppose X, Y are r.v.’s assuming values in finite sets A, B which are subsets of the set R of real numbers. Let φ, φX, φY denote, as before, their joint and marginal distributions.

By viewing X as an r.v. with distribution φX, we can compute its mean and variance; similarly for Y. The focus of this discussion is on what happens for ‘cross’ terms of the form XY.
Recall that

V(X) = E[(X − E(X))^2], V(Y) = E[(Y − E(Y))^2]

denote the variances of X and Y respectively. Also, σ(X) = (V(X))^{1/2} and σ(Y) = (V(Y))^{1/2} denote the standard deviations of X and Y respectively.
Covariance for Real-Valued Random Variables (Cont’d)
Definition: The quantity C(X,Y) := E[(X − E(X))(Y − E(Y))] is called the covariance between X and Y, and the quantity

ρ(X,Y) := E[(X − E(X))(Y − E(Y))] / ({E[(X − E(X))^2]}^{1/2} · {E[(Y − E(Y))^2]}^{1/2}) = C(X,Y) / (σ(X)σ(Y))

is called the correlation coefficient between X and Y.

The correlation coefficient ρ(X,Y) always lies between −1 and +1. If ρ(X,Y) > 0, we say that X and Y are positively correlated; similarly, if ρ(X,Y) < 0, they are negatively correlated. If ρ(X,Y) = 0, we say that X and Y are uncorrelated.
Consequences of Definitions
Theorem: An equivalent expression for the covariance is

C(X,Y) = E(XY) − E(X)E(Y).

Theorem: If X, Y are independent, then

E[XY, φ] = E[X, φX] · E[Y, φY].

Theorem: If X, Y are independent, then they are uncorrelated. (Explain why)

Exercises: Prove the above theorems, and also give an example where X, Y are uncorrelated but not independent.

Fact: We have that

V(X + Y) = V(X) + 2C(X,Y) + V(Y).
Example
Recall the ball-in-an-urn example. There are 7 white balls and 3 black balls in an urn. We draw one ball and call the outcome X. Then, without replacing it, we draw a second ball and call the outcome Y. Now X, Y are abstract r.v.’s, so we define an associated ‘pay-off’ function f : {W,B} → R by f(W) = 0, f(B) = 1.

The probability table is

X\Y     W      B
W    42/90  21/90
B    21/90   6/90
So it is easy to verify that

E[f(X)] = E[f(Y)] = 0.3, E[f(X)f(Y)] = 6/90 ≈ 0.0667.

Since E[f(X)f(Y)] < E[f(X)] · E[f(Y)], f(X) and f(Y) are negatively correlated.
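The example’s numbers can be verified in a few lines (here f is the pay-off vector [f(W), f(B)]):

```python
import numpy as np

Phi = np.array([[42, 21],
                [21, 6]]) / 90
f = np.array([0.0, 1.0])      # pay-off: f(W) = 0, f(B) = 1

Ex  = f @ Phi.sum(axis=1)     # E[f(X)] via the marginal of X
Ey  = Phi.sum(axis=0) @ f     # E[f(Y)] via the marginal of Y
Exy = f @ Phi @ f             # E[f(X) f(Y)] = sum_ij phi_ij f_i f_j

cov = Exy - Ex * Ey           # negative, so negatively correlated
print(Ex, Ey, Exy, cov)
```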
Motivating Example for Bayes’ Rule
Bayes’ rule is very useful in distinguishing prior probabilities from posterior probabilities. This is best illustrated via an example.

Example: Someone has developed an HIV diagnostic test that has a 2% false negative rate and a 1% false positive rate. In other words, if the patient really has HIV, then the test comes out positive 98% of the time. If the patient does not have HIV, the test comes out negative 99% of the time.

Both of these are ‘prior’ probabilities. What we really want to know is the ‘posterior’ probability, namely: If the patient tests positive, what is the probability that he/she has HIV?
Motivating Example (Cont’d)
Suppose that 1% of the population has HIV. Define two r.v.’s: X ∈ {H,F} indicates whether the patient has HIV or is free from HIV, and Y ∈ {P,N} indicates whether the test comes out positive or negative.
The data we are given can be summarized as follows:
Pr{X = H} = 0.01.
Pr{Y = P |X = H} = 0.98,Pr{Y = N |X = F} = 0.99.
From this data we can easily infer the following additional information:
Pr{X = F} = 0.99.
Pr{Y = N |X = H} = 0.02,Pr{Y = P |X = F} = 0.01.
Motivating Example (Cont’d)
Now we can construct the joint distribution of X and Y .
X\Y     P       N
H    0.0098  0.0002
F    0.0099  0.9801
From this it follows that

Pr{Y = P} = 0.0098 + 0.0099 = 0.0197,

Pr{X = H|Y = P} = Pr{X = H & Y = P} / Pr{Y = P} = 0.0098/0.0197 ≈ 0.5.
So the test is totally useless: we might as well flip a coin as administer it!
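The whole calculation, in a short plain-Python sketch (variable names are ours):

```python
# Posterior probability of HIV given a positive test
# (prevalence 1%, 2% false negatives, 1% false positives)
p_H = 0.01              # Pr{X = H}
p_P_given_H = 0.98      # sensitivity: Pr{Y = P | X = H}
p_P_given_F = 0.01      # false positive rate: Pr{Y = P | X = F}

p_P = p_H * p_P_given_H + (1 - p_H) * p_P_given_F   # Pr{Y = P}
posterior = p_H * p_P_given_H / p_P                 # Pr{X = H | Y = P}

print(round(p_P, 4), round(posterior, 3))   # 0.0197 0.497
```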
Bayes’ Rule
Suppose X, Y are r.v.’s assuming values in finite sets A, B, and xi ∈ A, yj ∈ B. By the definition of conditional probability, we know that

Pr{X = xi|Y = yj} = Pr{X = xi & Y = yj} / Pr{Y = yj}.
Bayes’ rule consists of rewriting this formula in an equivalent form.
Bayes’ Rule – Version 1: Suppose X, Y are r.v.’s assuming values in finite sets A, B. Then

Pr{X = xi|Y = yj} = Pr{Y = yj|X = xi} · Pr{X = xi} / Pr{Y = yj}.
Bayes’ Rule (Cont’d)
Bayes’ Rule – Version 2: Suppose X, Y are r.v.’s assuming values in finite sets A, B. Then

Pr{X = xi|Y = yj} = Pr{Y = yj|X = xi} · Pr{X = xi} / ∑_{i'=1}^n Pr{Y = yj|X = xi'} · Pr{X = xi'}.

It can be recognized that the numerator is just Pr{X = xi & Y = yj}, while the denominator is just Pr{Y = yj}.
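Version 2 translates directly into code. A hedged sketch (the function name is ours), applied below to the variant of the HIV example with 0.1% prevalence discussed later in this lecture:

```python
def bayes_posterior(prior, likelihood_j):
    """Bayes' rule, Version 2: posterior over the x_i given the outcome y_j.

    prior[i]        = Pr{X = x_i}
    likelihood_j[i] = Pr{Y = y_j | X = x_i}
    """
    joint = [p * l for p, l in zip(prior, likelihood_j)]
    total = sum(joint)                  # the denominator, Pr{Y = y_j}
    return [v / total for v in joint]

# HIV example with 0.1% prevalence (states [H, F], observed outcome Y = P)
post = bayes_posterior([0.001, 0.999], [0.98, 0.01])
print(round(post[0], 3))    # roughly 0.089: even worse than the 1% case
```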
Motivating Example (Cont’d)
By Bayes’ rule, Version 1, we have

Pr{X = xi|Y = yj} = Pr{Y = yj|X = xi} · Pr{X = xi} / Pr{Y = yj}.

In our example Pr{Y = P|X = H} ≈ 1, because the diagnostic has a very low false negative rate.

However, Pr{X = H} = 0.01 (1% of the population has HIV) while Pr{Y = P} ≈ 0.02 (the test will be positive for about 2% of the population). So

Pr{X = H|Y = P} ≈ 0.01/0.02 = 0.5.
Motivating Example (Cont’d)
Now suppose that the fraction of the population that has HIV is not 1% but 0.1%. Then Pr{X = H} = 0.001. It can be computed that Pr{Y = P} ≈ 0.01, so

Pr{X = H|Y = P} ≈ 0.1.

So the test is even worse in this situation!

This is because most of the positive readings will be false positives from non-afflicted persons, and only a few positive readings will be from afflicted persons.
Conditional Expected Value: Motivation
Suppose as always that X, Y are r.v.’s assuming values in A, B with joint distribution φ. Suppose f : A → R is a real-valued function of the r.v. X. We can talk about the ‘unconditional’ as well as ‘conditional’ expected value of f.

If we know nothing about either X or Y, then the probability distribution of X is the ‘marginal’ distribution φX. If we denote fi = f(xi), then the ‘unconditional’ expected value of f is
E[f] = ∑_{i=1}^n fi (φX)i = ∑_{i=1}^n fi ∑_{j=1}^m φij.
Now suppose we measure Y and the outcome is Y = yj. We can ‘update’ the distribution of X to φ{X|Y=yj} and recompute the expected value.
Conditional Expected Value: Definition
With the outcome Y = yj, the conditional distribution of X now becomes

φ{X|Y=yj} = [φ{x1|yj} . . . φ{xn|yj}].
So the conditional expected value of f, given the outcome Y = yj, is defined as

E[f|Y = yj] = ∑_{i=1}^n fi φ{xi|yj}.
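A sketch of both the unconditional and conditional expected values, using the without-replacement urn distribution and the pay-off f(W) = 0, f(B) = 1 from earlier:

```python
import numpy as np

# Without-replacement urn distribution (rows: X, cols: Y)
Phi = np.array([[42, 21],
                [21, 6]]) / 90
f = np.array([0.0, 1.0])

E_f = f @ Phi.sum(axis=1)             # unconditional: E[f] = 0.3

j = 1                                 # observe Y = B
cond = Phi[:, j] / Phi[:, j].sum()    # phi_{X | Y = B} = [7/9, 2/9]
E_f_given_B = f @ cond                # conditional expected value = 2/9

print(E_f, E_f_given_B)
```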
Conditional Expected Value: Example
You have been given a speeding ticket in the amount of $50. You can contest it in court; if found guilty you pay $100, but if you are found not guilty you pay nothing. There are two judges who try such cases. Judge T is a ‘toughie’ who finds ‘guilty’ 70% of the time, while Judge S is a ‘softie’ who finds ‘guilty’ only 20% of the time. Judge T is more senior, so he tries 60% of the cases. You have the option of just paying the ticket right up until the moment your case comes up for trial.
Conditional Expected Value: Example (Cont’d)
The joint probability distribution is shown below, where J stands for the judge and D stands for the decision:

J\D     G     N
T    0.42  0.18
S    0.08  0.32

The fine ‘function’ is real-valued, with f(G) = 100 and f(N) = 0. Since the prior probability of being found guilty is 50%, the prior expected fine is $50, the same as the ticket. So you decide to ‘go for it’.
Conditional Expected Value: Example (Cont’d)
When you arrive at the courtroom you do not know which judge issitting, but he finds the defendant just before you ‘guilty’. Whatnow is your expected fine?
You need to compute the posterior probability Pr{J = T |D = G}.By Bayes’ rule,
Pr{J = T |D = G} =Pr{D = G|J = T} · Pr{J = T}
Pr{D = G}=
0.42
0.50= 0.84.
So now you are 84% sure that this is the ‘tough’ judge. The posterior probability distribution of J (the identity of the judge) is

[P(T)  P(S)] = [0.84  0.16].
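The Bayes' rule step can be sketched in Python (again, the encoding is ours):

```python
# Joint distribution Pr{J, D}: judge T or S, decision G (guilty) or N (not guilty)
joint = {("T", "G"): 0.42, ("T", "N"): 0.18,
         ("S", "G"): 0.08, ("S", "N"): 0.32}

# Pr{D = G}: marginalize the joint distribution over the judge
p_g = joint[("T", "G")] + joint[("S", "G")]

# Bayes' rule: Pr{J | D = G} = Pr{J, D = G} / Pr{D = G}
post_T = joint[("T", "G")] / p_g
post_S = joint[("S", "G")] / p_g
```

This reproduces the posterior distribution [0.84 0.16] over the two judges.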
Conditional Expected Value: Example (Cont’d)
Using P(T) = 0.84 and P(S) = 0.16 updates the probability table, as shown below:

J\D    G       N
T      0.588   0.252
S      0.032   0.128
The conditional expected value of the ‘fine function’ f is now 100 × (0.588 + 0.032) = $62, more than the value of the ticket.

So you should perhaps decide not to contest!
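The conditional expected fine can be verified with a few lines of Python (variable names are ours):

```python
post = {"T": 0.84, "S": 0.16}        # posterior distribution of the judge
p_g_given_j = {"T": 0.7, "S": 0.2}   # Pr{D = G | J}: each judge's 'guilty' rate
fine = {"G": 100, "N": 0}            # the fine 'function'

# Posterior probability of a guilty verdict: 0.84*0.7 + 0.16*0.2
p_g = sum(post[j] * p_g_given_j[j] for j in post)

# Conditional expected fine under the posterior distribution
exp_fine = fine["G"] * p_g + fine["N"] * (1 - p_g)
```

The result is an expected fine of $62, which exceeds the $50 ticket.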
Conditional Moments, Variance Etc.
The notions of moments, variance, mean, mode etc. all carry over from the ‘unconditional’ case to the ‘conditional’ case.

Whenever an observation is made, just replace the marginal distribution by the conditional distribution.
Conditioning on an Event: Definition
Until now we have been conditioning on an ‘outcome’: thus we compute the conditional distribution of X given an observation (or outcome) Y = yj. But it is also possible to condition on an ‘event’ – and this can be done for just a single r.v.; we do not need a ‘joint’ r.v.
Suppose X is a r.v. taking values in a finite set A with distribution φ, and suppose S, T ⊆ A. Thus S, T are ‘events’. We define the conditional probability of T given S by

Pr(T | S) := Pr(S ∩ T) / Pr(S).

In terms of φ we can also write this as

Pφ(T | S) = Pφ(S ∩ T) / Pφ(S).
Conditioning on an Event: Example
Suppose we draw one card from a standard deck of cards. So X is a r.v. assuming one of 52 values. Assume that each card is equally likely, so that φ is the uniform distribution.

Now suppose S is the event that X is a red card, while T is the event that X is an honor, that is, 10 through ace. What is the conditional probability that X is an honor given that it is a red card?
Since there are 26 red cards, Pφ(S) = 26/52 = 0.5. Similarly, there are 20 honors, so Pφ(T) = 20/52 = 5/13. Finally, there are 10 red honor cards, so Pφ(S ∩ T) = 10/52 = 5/26. So

Pφ(T | S) = (5/26) / 0.5 = 5/13.
Conditioning on an Event: Formula
Suppose φ is the distribution of X, and S ⊆ A. Then the conditional distribution φ|S on A is defined by

φ|S(xi) := 0 if xi ∉ S, and φi / Pφ(S) if xi ∈ S.
So the conditional distribution is defined for every element of the original sample space A, but is concentrated on the set S. The corresponding probability measure is precisely what was defined earlier.
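This definition translates directly into a small Python function (a sketch, with a fair-die example of our own choosing):

```python
def condition(phi, S):
    """Conditional distribution phi|S: zero off S, renormalized on S."""
    p_S = sum(p for x, p in phi.items() if x in S)
    return {x: (p / p_S if x in S else 0.0) for x, p in phi.items()}

# Example: a fair die conditioned on the event 'the outcome is even'
phi = {i: 1 / 6 for i in range(1, 7)}
evens = {2, 4, 6}
phi_S = condition(phi, evens)   # each even face now has probability 1/3
```

Note that φ|S is still a distribution on all of A, as stated above: odd faces get probability 0, and the whole thing sums to 1.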
Independent Events
Two events S, T ⊆ A are said to be independent if
Pφ(S ∩ T ) = Pφ(S)Pφ(T ).
For instance, in the card example above, S was the event that the card was red, and T was the event that the card was an honor. We found that

P(S) = 0.5, P(T) = 5/13, P(S ∩ T) = 5/26.

Since the above relationship is satisfied, we can say that whether a card is an honor or not is independent of whether it is red or not.
This example is ‘obvious’ – but there can also be some ‘non-obvious’ examples, as we shall see next!
(We leave off the subscript φ for clarity.)
A Counter-Intuitive Example
Suppose N is some number, say 20. Define A = {1, . . . , N} and let φ be the uniform distribution on A. Pick two prime numbers, say 2 and 3. Define S to be the event that X is divisible by 2, and T to be the event that X is divisible by 3. So S ∩ T is the event that X is divisible by both 2 and 3.
An easy calculation shows that
P(S) = 10/20 = 0.5, P(T) = 6/20 = 0.3, P(S ∩ T) = 3/20 = 0.15.
So once again S and T are independent.
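This ‘non-obvious’ independence can be verified by direct enumeration (a sketch; the helper names are ours):

```python
N = 20
A = range(1, N + 1)
S = {x for x in A if x % 2 == 0}   # divisible by 2
T = {x for x in A if x % 3 == 0}   # divisible by 3

def prob(E):
    # Uniform distribution on A = {1, ..., N}
    return len(E) / N

# Independence requires P(S ∩ T) = P(S) * P(T)
lhs = prob(S & T)          # multiples of 6 in A: {6, 12, 18}, so 3/20
rhs = prob(S) * prob(T)    # (10/20) * (6/20)
```

Both sides come out to 0.15, confirming independence for this choice of N.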
The exercises bring out some more facets of this problem.