14: Conditional Distributions
David Varodayan
February 7, 2020
Adapted from slides by Lisa Yan
Sum of independent random variables (Review)

X ~ Bin(n₁, p), Y ~ Bin(n₂, p), X, Y independent  ⇒  X + Y ~ Bin(n₁ + n₂, p)
X ~ Poi(λ₁), Y ~ Poi(λ₂), X, Y independent  ⇒  X + Y ~ Poi(λ₁ + λ₂)
X ~ N(μ₁, σ₁²), Y ~ N(μ₂, σ₂²), X, Y independent  ⇒  X + Y ~ N(μ₁ + μ₂, σ₁² + σ₂²)

Note: these also hold in the general case (≥ 2 variables).

Quick question: Let A ~ Bin(30, 0.01) and B ~ Bin(50, 0.02) be independent RVs. Can we use the sum of independent Poisson RVs to approximate P(A + B = 2)?
Sol 1: Approximate as a sum of Poissons
A ≈ X ~ Poi(λ₁ = 30 · 0.01 = 0.3),  B ≈ Y ~ Poi(λ₂ = 50 · 0.02 = 1)
X + Y ~ Poi(λ = λ₁ + λ₂ = 1.3)
P(A + B = 2) ≈ P(X + Y = 2) = (λ² / 2!) e^(−λ) ≈ 0.2302

Sol 2: No approximation
P(A + B = 2) = Σ_{k=0}^{2} P(A = k) P(B = 2 − k)
             = Σ_{k=0}^{2} C(30, k) 0.01^k 0.99^(30−k) · C(50, 2−k) 0.02^(2−k) 0.98^(48+k)
             ≈ 0.2327
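Both computations above are easy to check numerically; a quick sketch using only the standard library (the helper function names are my own):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    # P(X = k) for X ~ Bin(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    # P(X = k) for X ~ Poi(lam)
    return lam**k * exp(-lam) / factorial(k)

# Sol 1: Poisson approximation, X + Y ~ Poi(0.3 + 1.0)
approx = poisson_pmf(2, 1.3)

# Sol 2: exact convolution of the two binomials
exact = sum(binom_pmf(k, 30, 0.01) * binom_pmf(2 - k, 50, 0.02)
            for k in range(3))

print(approx, exact)  # ≈ 0.2303 and ≈ 0.2327
```

The two answers agree to about two decimal places, which is why the Poisson approximation is useful for rare-event binomials.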
Today's plan
• Sum of two Uniform independent RVs
• Expectation of sum of two RVs
• Discrete conditional distributions
• Ratio of probabilities interlude
• Continuous conditional distributions
Convolution

Recall for independent discrete random variables X and Y:
P(X + Y = n) = Σ_k P(X = k) P(Y = n − k)        ← the convolution of p_X and p_Y

For independent continuous random variables X and Y:
f_{X+Y}(a) = ∫_{−∞}^{∞} f_X(x) f_Y(a − x) dx    ← the convolution of f_X and f_Y
Sum of independent Uniforms

Let X ~ Uni(0, 1) and Y ~ Uni(0, 1) be independent random variables.
What is the distribution of X + Y, i.e., f_{X+Y}?

X and Y independent + continuous:
f_{X+Y}(a) = ∫_{−∞}^{∞} f_X(x) f_Y(a − x) dx

The integrand f_X(x) f_Y(a − x) = 1 when:
0 ≤ x ≤ 1  and  0 ≤ a − x ≤ 1  ⟺  a − 1 ≤ x ≤ a
The precise integration bounds on x depend on a.

f_{X+Y}(a) = a       if 0 ≤ a ≤ 1
           = 2 − a   if 1 ≤ a ≤ 2
           = 0       otherwise

[Figure: triangular density f_{X+Y}(a) on 0 ≤ a ≤ 2, peaking at height 1 at a = 1.]
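The triangular density can be sanity-checked by simulating the sum of two independent uniforms; a minimal seeded sketch (names are my own):

```python
import random

random.seed(0)
n = 200_000
samples = [random.random() + random.random() for _ in range(n)]

def f_sum(a):
    # Triangular density of the sum of two independent Uni(0, 1) RVs
    if 0 <= a <= 1:
        return a
    if 1 <= a <= 2:
        return 2 - a
    return 0.0

# Empirical density near a = 1: P(0.95 <= S <= 1.05) / 0.1 should be
# close to the average of f_sum over that window, which is 0.975
frac = sum(0.95 <= s <= 1.05 for s in samples) / n
print(frac / 0.1)  # ≈ 0.975
```

The empirical estimate lands within Monte Carlo noise of the theoretical value, matching the convolution result.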
Properties of Expectation, extended to two RVs

1. Linearity:
   E[aX + bY + c] = aE[X] + bE[Y] + c

2. Expectation of a sum = sum of expectations (we've seen this; we'll prove it next):
   E[X + Y] = E[X] + E[Y]

3. Unconscious statistician (LOTUS):
   E[g(X, Y)] = Σ_x Σ_y g(x, y) p_{X,Y}(x, y)                         (discrete)
   E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dx dy    (continuous)
Proof of expectation of a sum of RVs

E[X + Y] = Σ_x Σ_y (x + y) p_{X,Y}(x, y)                      LOTUS, g(X, Y) = X + Y
         = Σ_x Σ_y x p_{X,Y}(x, y) + Σ_x Σ_y y p_{X,Y}(x, y)  linearity of summations
                                                              (cont. case: linearity of integrals)
         = Σ_x x Σ_y p_{X,Y}(x, y) + Σ_y y Σ_x p_{X,Y}(x, y)
         = Σ_x x p_X(x) + Σ_y y p_Y(y)                        marginal PMFs for X and Y
         = E[X] + E[Y]

Even if the joint distribution is unknown, you can calculate the expectation of a sum as the sum of expectations.
Example: E[Σ_{i=1}^n X_i] = Σ_{i=1}^n E[X_i] despite dependent trials X_i.
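The point that dependence doesn't matter is easy to verify on a small joint PMF where X and Y are strongly dependent; a toy example (the joint table below is made up for illustration):

```python
# Joint PMF of (X, Y) with strong dependence: Y tends to equal X
joint = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.1, (1, 1): 0.4,
}

# E[X + Y] computed directly via LOTUS with g(x, y) = x + y
e_sum = sum((x + y) * p for (x, y), p in joint.items())

# E[X] and E[Y] from the marginal PMFs
e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())

print(round(e_sum, 10), round(e_x + e_y, 10))  # 1.0 1.0
```

X and Y here are far from independent, yet E[X + Y] = E[X] + E[Y] holds exactly, just as the proof (which never uses independence) says it must.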
Expectations of common RVs (Review)

X ~ Bin(n, p):  E[X] = np

Let X_i = indicator that the i-th trial is heads, X_i ~ Ber(p), E[X_i] = p.
X = Σ_{i=1}^n X_i
E[X] = E[Σ_{i=1}^n X_i] = Σ_{i=1}^n E[X_i] = Σ_{i=1}^n p = np
Expectations of common RVs

X ~ Bin(n, p):     E[X] = np    (via X = Σ_{i=1}^n X_i, X_i ~ Ber(p), E[X_i] = p)
X ~ NegBin(r, p):  E[X] = r/p

Suppose X = Σ_{i=1}^{?} X_i. How should we define the X_i?
A. X_i = indicator that the i-th trial is heads; X_i ~ Ber(p), i = 1, ..., n
B. X_i = # trials to get the i-th success (after the (i−1)-th success);
   X_i ~ Geo(p), i = 1, ..., r
C. X_i = # successes in n trials; X_i ~ Bin(n, p), i = 1, ..., r, and we look for P(X_i = 1)
Expectations of common RVs

X ~ NegBin(r, p):  E[X] = r/p

Let X_i = # trials to get the i-th success (after the (i−1)-th success),
X_i ~ Geo(p), E[X_i] = 1/p.
X = Σ_{i=1}^r X_i
E[X] = E[Σ_{i=1}^r X_i] = Σ_{i=1}^r E[X_i] = Σ_{i=1}^r 1/p = r/p
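This decomposition can be checked by simulating a negative binomial as a sum of r geometrics; a seeded sketch (the helper names and parameter choices are my own):

```python
import random

random.seed(1)

def geometric(p):
    # Number of Bernoulli(p) trials up to and including the first success
    trials = 1
    while random.random() >= p:
        trials += 1
    return trials

def neg_bin(r, p):
    # X = X_1 + ... + X_r with independent X_i ~ Geo(p)
    return sum(geometric(p) for _ in range(r))

r, p, n = 5, 0.3, 50_000
mean = sum(neg_bin(r, p) for _ in range(n)) / n
print(mean)  # ≈ r/p = 16.67
```

The sample mean lands close to r/p = 16.67, as the indicator-style decomposition predicts.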
Announcements: Midterm exam

When: Monday, February 10, 7:00pm-9:00pm
Where: Cubberley Auditorium
Not permitted: book/computer/calculator
Permitted: three 8.5"x11" double-sided sheets of notes
Covers: up to (and including) week 4 + Lecture Notes 11
Practice: http://web.stanford.edu/class/cs109/exams/midterm.html
Review session: tomorrow (Saturday), 3-5pm, STLC 111
(not recorded; materials will be posted, though)
CS109 roadmap

Multiple events:
• conditional probability:  P(E | F) = P(EF) / P(F)
• independence:             P(EF) = P(E) P(F)
• intersection:             P(E ∩ F) = P(EF)

Joint (multivariate) distributions:
• joint PMF/PDF:  p_{X,Y}(x, y),  f_{X,Y}(x, y)
• independent RVs
• sum of independent RVs
• conditional distributions?  ← today
Discrete conditional distributions

Recall the definition of the conditional probability of event E given event F:
P(E | F) = P(EF) / P(F)

For discrete random variables X and Y, the conditional PMF of X given Y is
P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)

Different notation, same idea:
p_{X|Y}(x | y) = p_{X,Y}(x, y) / p_Y(y)
Quick check

Number or function?
1. P(X = 2 | Y = 5)
2. P(X = x | Y = 5)
3. P(X = 2 | Y = y)
4. P(X = x | Y = y)

True or false?
5. Σ_x P(X = x | Y = 5) = 1
6. Σ_y P(X = 2 | Y = y) = 1
7. Σ_x Σ_y P(X = x | Y = y) P(Y = y) = 1

Recall: P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)
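These sums are easy to probe on a small joint PMF; a quick sketch (the joint table and helper names below are made up for illustration):

```python
# A made-up joint PMF of (X, Y) for illustration
joint = {(2, 5): 0.2, (2, 6): 0.1, (3, 5): 0.3, (3, 6): 0.4}
xs = {x for x, _ in joint}
ys = {y for _, y in joint}

def p_y(y):
    # Marginal PMF of Y
    return sum(p for (xx, yy), p in joint.items() if yy == y)

def cond(x, y):
    # Conditional PMF P(X = x | Y = y)
    return joint.get((x, y), 0.0) / p_y(y)

# Sum over x of P(X = x | Y = 5): a conditional PMF is a PMF, so this is 1
s5 = sum(cond(x, 5) for x in xs)
# Sum over y of P(X = 2 | Y = y): this need NOT be 1
s6 = sum(cond(2, y) for y in ys)
# Law of total probability: sum_x sum_y P(X = x | Y = y) P(Y = y) = 1
s7 = sum(cond(x, y) * p_y(y) for x in xs for y in ys)
print(round(s5, 6), round(s6, 6), round(s7, 6))  # 1.0 0.6 1.0
```

Only the sums over the conditioned variable's support (items 5 and 7) are guaranteed to equal 1; summing a fixed conditional probability over the conditioning variable (item 6) is not.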
Discrete probabilities of CS109

Each student responds with (major M, year Y, bool excited about midterm E).
Majors: CS (m=1), SymSys/MCS/EE (m=2), Other Eng/Sci/Math (m=3),
Hum/SocSci/Ling (m=4), Double major (m=5), Undeclared (m=6).

Joint PMF of M, Y, E:

        m=1   m=2   m=3   m=4   m=5   m=6
E = 0
y=1    .006  .000  .000  .000  .000  .000
y=2    .155  .069  .034  .006  .023  .029
y=3    .092  .063  .023  .006  .006  .000
y=4    .017  .029  .011  .006  .000  .000
y≥5    .029  .006  .011  .006  .000  .000
E = 1
y=1    .000  .000  .000  .000  .000  .000
y=2    .126  .040  .017  .017  .000  .017
y=3    .046  .040  .006  .011  .000  .006
y=4    .006  .006  .000  .000  .000  .000
y≥5    .006  .000  .017  .011  .000  .000

e.g., P(Y ≥ 5, M = 1, E = 1) = .006

Joint PMF of M, E (marginalizing out Y):

        m=1   m=2   m=3   m=4   m=5   m=6
E = 0  .299  .167  .080  .023  .029  .029
E = 1  .184  .086  .040  .040  .000  .023

e.g., P(M = 1, E = 1) = .184
Discrete probabilities of CS109

The tables below are probability tables for the conditional PMFs
P(E = e | M = m) and P(M = m | E = e).
1. Which table is which?
2. Fill in the missing probability.

Recall: P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y).
Be clear about which probabilities should sum to one.

Joint PMF of M, E (for reference):

        m=1   m=2   m=3   m=4   m=5   m=6
E = 0  .299  .167  .080  .023  .029  .029
E = 1  .184  .086  .040  .040  .000  .023

Table 1:

        m=1   m=2   m=3   m=4   m=5   m=6
E = 0  .619  .659  .667  .364  1.000 .556
E = 1  .381  .341  .333  .636  .000  .444

Table 2:

        m=1   m=2   m=3   m=4   m=5   m=6
E = 0  .477  .266  .128  .037  .046  .046
E = 1   ?    .231  .108  .108  .000  .062
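One way to answer both questions is to compute the two conditional PMFs directly from the joint table of (M, E); a sketch (the variable names are my own):

```python
# Joint PMF of (M, E) from the survey, columns indexed by m = 1..6
joint = {0: [.299, .167, .080, .023, .029, .029],   # E = 0 row
         1: [.184, .086, .040, .040, .000, .023]}   # E = 1 row

p_m = [joint[0][i] + joint[1][i] for i in range(6)]   # marginal of M
p_e = {e: sum(row) for e, row in joint.items()}       # marginal of E

# P(E = e | M = m): each COLUMN sums to 1
cond_e_given_m = {e: [joint[e][i] / p_m[i] for i in range(6)] for e in (0, 1)}
# P(M = m | E = e): each ROW sums to 1
cond_m_given_e = {e: [joint[e][i] / p_e[e] for i in range(6)] for e in (0, 1)}

# The missing entry, P(M = 1 | E = 1):
print(round(cond_m_given_e[1][0], 3))  # 0.493
```

Checking which direction of sums equals one identifies which printed table is which, and the missing probability is P(M = 1 | E = 1) = .184 / .373 ≈ .493.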
Ratio of probabilities interlude
Relative probabilities of continuous random variables

Let X = time to finish problem set 2. Suppose X ~ N(10, 2).
How much more likely are you to complete in 10 hours than 5 hours?

P(X = 10) / P(X = 5) = ?
A. 0/0 = undefined
B. f(10) / f(5)
Relative probabilities of continuous random variables

Let X = time to finish problem set 2. Suppose X ~ N(10, 2).
How much more likely are you to complete in 10 hours than 5 hours?

Ratios of PDFs are meaningful:
P(X ≈ a) = P(a − ε/2 ≤ X ≤ a + ε/2) = ∫_{a−ε/2}^{a+ε/2} f(x) dx ≈ ε f(a)

Therefore
P(X ≈ a) / P(X ≈ b) = ε f(a) / (ε f(b)) = f(a) / f(b)

f(10) / f(5) = [1/(σ√(2π)) e^(−(10−μ)²/(2σ²))] / [1/(σ√(2π)) e^(−(5−μ)²/(2σ²))]
             = e^(−(10−10)²/(2·2)) / e^(−(5−10)²/(2·2))
             = e⁰ / e^(−25/4)
             = e^(25/4) ≈ 518
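The density-ratio calculation can be reproduced directly from the normal PDF; a short sketch (the helper name is my own):

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, var):
    # Density of N(mu, var) at x
    return exp(-(x - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)

# X ~ N(10, 2): ratio of densities at 10 vs 5
ratio = normal_pdf(10, 10, 2) / normal_pdf(5, 10, 2)
print(round(ratio))  # 518, i.e. e^(25/4)
```

The normalizing constants cancel, so the ratio depends only on the exponents, exactly as in the derivation above.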
Continuous conditional distributions

For continuous RVs X and Y, the conditional PDF of X given Y is
f_{X|Y}(x | y) = f_{X,Y}(x, y) / f_Y(y)

Intuition (compare P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) in the discrete case):
f_{X|Y}(x | y) dx = f_{X,Y}(x, y) dx dy / (f_Y(y) dy)

Note that the conditional PDF f_{X|Y} is a true density:
∫_{−∞}^{∞} f_{X|Y}(x | y) dx = ∫_{−∞}^{∞} f_{X,Y}(x, y) / f_Y(y) dx
                             = (∫_{−∞}^{∞} f_{X,Y}(x, y) dx) / f_Y(y)
                             = f_Y(y) / f_Y(y)
                             = 1
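A numerical sanity check of this "true density" property on a concrete joint density, f(x, y) = x + y on the unit square (a standard textbook example, not from the slides):

```python
# Verify numerically that f_{X|Y}(x|y) integrates to 1 over x,
# for the joint density f(x, y) = x + y on [0, 1] x [0, 1].
def f_joint(x, y):
    return x + y if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def f_y(y):
    # Marginal of Y: integral over x of (x + y) dx on [0, 1] = y + 1/2
    return y + 0.5

n = 10_000
y = 0.3
# Midpoint-rule integral of f_{X|Y}(x | y) = f_joint(x, y) / f_y(y) over x
total = sum(f_joint((i + 0.5) / n, y) / f_y(y) for i in range(n)) / n
print(round(total, 6))  # 1.0
```

Dividing the joint slice at Y = y by the marginal f_Y(y) rescales it into a proper density over x, which is exactly the point of the algebra above.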
Bayes' Theorem with Continuous RVs

For continuous RVs X and Y:
f_{Y|X}(y | x) = f_{X|Y}(x | y) f_Y(y) / f_X(x)

Intuition (compare P(Y = y | X = x) = P(X = x | Y = y) P(Y = y) / P(X = x)):
f_{Y|X}(y | x) dy = [f_{X|Y}(x | y) dx] [f_Y(y) dy] / [f_X(x) dx]
Tracking in 2-D space?

You want to know the 2-D location of an object.
Your satellite ping gives you a noisy 1-D measurement of the distance of the object from the satellite at (0, 0).

Tracking in 2-D space
• You have a prior belief about the 2-D location of an object, (X, Y).
• You observe a noisy distance measurement, D = 4.
• What is your updated (posterior) belief of the 2-D location of the object after observing the measurement?

Recall Bayes terminology:
f_{X,Y|D}(x, y | d) = f_{D|X,Y}(d | x, y) f_{X,Y}(x, y) / f_D(d)
  posterior belief  =  likelihood (of evidence) × prior belief / normalization constant

[Figure: top-down view of the satellite at (0, 0) and the object's possible locations.]
Let (X, Y) = the object's 2-D location (your satellite is at (0, 0)).

Prior belief: suppose the prior distribution is a symmetric bivariate normal
centered at (3, 3) with σ = 2:

f_{X,Y}(x, y) = 1/(2π·2²) e^(−[(x−3)² + (y−3)²]/(2·2²))
              = K₁ · e^(−[(x−3)² + (y−3)²]/8)        (K₁ = normalizing constant)

[Figure: 3-D view of the prior density bump over the (x, y) plane.]

Measurement model: let D = distance from the satellite (radially).
Suppose you knew your actual position (x, y). Then:
• D is still noisy! Suppose the noise has unit variance: σ² = 1
• On average, D is your actual distance: μ = √(x² + y²)

If you knew your actual location (x, y), you could say how likely a measurement D = 4 is!
The distance measurement of a ping is normal with respect to the true location.
If the noise is normal:

D | X, Y ~ N(μ = √(x² + y²), σ² = 1)

[Figure: 1-D normal density of D, centered at μ = √(x² + y²) with σ² = 1.]

f_{D|X,Y}(D = d | X = x, Y = y) = 1/(σ√(2π)) e^(−(d−μ)²/(2σ²))
                                = 1/√(2π) e^(−(d − √(x² + y²))²/2)
                                = K₂ · e^(−(d − √(x² + y²))²/2)    (K₂ = normalizing constant)
Putting the pieces together:

Observation likelihood:  f_{D|X,Y}(d | x, y) = K₂ · e^(−(d − √(x² + y²))²/2)
Prior belief:            f_{X,Y}(x, y) = K₁ · e^(−[(x−3)² + (y−3)²]/8)
Posterior belief:        f_{X,Y|D}(x, y | 4) = f_{X,Y|D}(X = x, Y = y | D = 4)
Tracking in 2-D space

What is your updated (posterior) belief of the 2-D location of the object after observing the measurement?

f_{X,Y|D}(X = x, Y = y | D = 4)
  = f_{D|X,Y}(D = 4 | X = x, Y = y) · f_{X,Y}(x, y) / P(D = 4)     (Bayes' Theorem:
                                                                    likelihood of D = 4 × prior belief)
  = K₂ e^(−(4 − √(x² + y²))²/2) · K₁ e^(−[(x−3)² + (y−3)²]/8) / P(D = 4)
  = K₃ e^(−[(4 − √(x² + y²))²/2 + ((x−3)² + (y−3)²)/8]) / P(D = 4)
  = K₄ · e^(−[(4 − √(x² + y²))²/2 + ((x−3)² + (y−3)²)/8])
Tracking in 2-D space: Posterior belief

Prior belief:
f_{X,Y}(x, y) = K₁ · e^(−[(x−3)² + (y−3)²]/8)

Posterior belief:
f_{X,Y|D}(x, y | 4) = K₄ · e^(−[(4 − √(x² + y²))²/2 + ((x−3)² + (y−3)²)/8])

[Figures: top-down and 3-D views of the prior (a round bump centered at (3, 3)) and the posterior (the bump pulled toward the circle of radius 4 around the satellite at (0, 0)).]
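The posterior above has no standard closed form, but it can be evaluated on a grid and normalized numerically; a sketch of that computation (the grid choices and names are my own):

```python
from math import exp, sqrt

def unnormalized_posterior(x, y, d=4.0):
    # Likelihood of D = d times the bivariate normal prior, up to K4
    r = sqrt(x * x + y * y)
    return exp(-((d - r) ** 2 / 2 + ((x - 3) ** 2 + (y - 3) ** 2) / 8))

# Evaluate on a grid over [0, 6] x [0, 6] and normalize the grid weights
pts = [i * 0.05 for i in range(121)]
weights = {(x, y): unnormalized_posterior(x, y) for x in pts for y in pts}
total = sum(weights.values())
posterior = {xy: w / total for xy, w in weights.items()}

# The most probable (MAP) location sits between the prior mean (3, 3)
# and the circle of radius 4 around the satellite at (0, 0)
x_map, y_map = max(posterior, key=posterior.get)
print(x_map, y_map)  # ≈ (2.86, 2.86)
```

Normalizing by the grid total plays the role of the unknown constant K₄ (and of f_D(4)), which is exactly why Bayes' rule with an intractable denominator is often handled this way in practice.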