Naïve Bayes & Expectation Maximization, CSE 573 (© Daniel S. Weld)
Logistics
• Team Meetings
• Midterm
  Open book, notes
• Studying
  See AIMA exercises
573 Schedule
Agency
Problem Spaces
Search
Knowledge Representation & Inference
Planning
Supervised Learning
Logic-Based
Probabilistic
Reinforcement Learning
Selected Topics
Coming Soon
Artificial Life
Selected Topics
Crossword Puzzles
Intelligent Internet Systems
Topics
• Test & Mini Projects
• Review
• Naive Bayes
  Maximum Likelihood Estimates
  Working with Probabilities
• Expectation Maximization
• Challenge
Estimation Models
Prior      Hypothesis             Estimate
Uniform    The most likely        Maximum Likelihood Estimate
Any        The most likely        Maximum A Posteriori Estimate
Any        Weighted combination   Bayesian Estimate
Continuous Case
[Plot: relative likelihood vs. probability of heads, over 0 to 1]
Continuous Case: Prior
[Plots: the prior (uniform vs. with background knowledge), and the posteriors after Exp 1: Heads and Exp 2: Tails]
Continuous Case
Posterior after 2 experiments:
[Plots: Bayesian Estimate, MAP Estimate, and ML Estimate marked on the posterior, w/ uniform prior and with background knowledge]
After 10 Experiments...
Posterior:
[Plots: Bayesian Estimate, MAP Estimate, and ML Estimate on the posterior, w/ uniform prior and with background knowledge]
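The coin-flip story above (a prior updated by Heads/Tails experiments, then summarized three ways) can be sketched numerically. Representing the prior as a Beta(a, b) distribution is my assumption, a standard way to encode the slides' "background knowledge"; the example calls are illustrative.

```python
# Sketch: ML, MAP, and Bayesian estimates of P(heads), assuming a
# Beta(a, b) prior. a = b = 1 is the uniform prior; larger a, b encode
# background knowledge. Posterior after data is Beta(a+heads, b+tails).

def estimates(heads, tails, a=1.0, b=1.0):
    """Return (ML, MAP, Bayesian posterior-mean) estimates of P(heads)."""
    ml = heads / (heads + tails) if heads + tails else None
    # MAP = mode of the Beta posterior (defined when both parameters > 1)
    map_ = (a + heads - 1) / (a + b + heads + tails - 2)
    # Bayesian estimate = posterior mean (weighted combination of hypotheses)
    bayes = (a + heads) / (a + b + heads + tails)
    return ml, map_, bayes

# After Exp 1: Heads and Exp 2: Tails, with a uniform prior:
print(estimates(1, 1))
# With background knowledge favoring a fair coin, e.g. Beta(10, 10):
print(estimates(1, 1, 10, 10))
```

Note how the three estimates agree after balanced data but diverge when the data are skewed and the prior is informative.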
Topics
• Test & Mini Projects
• Review
• Naive Bayes
  Maximum Likelihood Estimates
  Working with Probabilities
• Expectation Maximization
• Challenge
Naive Bayes
• A Bayes Net where all nodes are children of a single root node
• Why?
  Expressive and accurate?
  Easy to learn?
(e.g., a node = ‘Is “apple” in message?’)
Naive Bayes
• All nodes are children of a single root node
• Why?
  Expressive and accurate? No
  Easy to learn? Yes
  Useful? Sometimes
Inference In Naive Bayes
Goal: given evidence (the words in an email), decide whether the email is spam.
P(S) = 0.6
P(A|S) = 0.01    P(A|¬S) = 0.1
P(B|S) = 0.001   P(B|¬S) = 0.01
P(F|S) = 0.8     P(F|¬S) = 0.05
P(A|S) = ?       P(A|¬S) = ?
Independence to the rescue!
P(S|E) = P(S) ∏i P(Ei|S) / P(E)
P(¬S|E) = P(¬S) ∏i P(Ei|¬S) / P(E)
Spam if P(S|E) > P(¬S|E)
But... P(E) appears in both expressions, so for the comparison it never needs to be computed.
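The comparison above can be sketched directly with the CPT numbers from the slides (P(S) = 0.6 and the per-word conditionals); the particular evidence vector, which words appear in the message, is a made-up example.

```python
# Minimal naive Bayes spam decision. Returns the unnormalized joint
# probabilities P(S, E) and P(¬S, E); the shared P(E) cancels when comparing.

def spam_score(prior_s, cpt, evidence):
    p_s, p_not = prior_s, 1.0 - prior_s
    for word, present in evidence.items():
        ps, pns = cpt[word]                 # (P(word|S), P(word|¬S))
        p_s *= ps if present else (1.0 - ps)
        p_not *= pns if present else (1.0 - pns)
    return p_s, p_not

cpt = {"A": (0.01, 0.1), "B": (0.001, 0.01), "F": (0.8, 0.05)}
s, not_s = spam_score(0.6, cpt, {"A": False, "B": False, "F": True})
print("spam" if s > not_s else "ham")       # prints "spam"
```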
Parameter Estimation Revisited
Can we calculate the Maximum Likelihood estimate of θ easily?
P(S) = θ
[Plots: prior + data ⇒ likelihood as a function of θ, with its maximum marked]
Looking for the maximum of a function:
- find the derivative
- set it to zero
⇒ Max Likelihood estimate
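For a binary variable, the recipe above gives the familiar count ratio; a sketch of the derivation (the h/t notation for the counts of the two outcomes is mine, not the slides'):

```latex
L(\theta) = \theta^{h}(1-\theta)^{t}, \qquad
\log L(\theta) = h\log\theta + t\log(1-\theta)
\\[4pt]
\frac{d}{d\theta}\log L = \frac{h}{\theta} - \frac{t}{1-\theta} = 0
\quad\Longrightarrow\quad
\theta_{ML} = \frac{h}{h+t}
```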
Topics
• Test & Mini Projects
• Review
• Naive Bayes
  Maximum Likelihood Estimates
  Working with Probabilities
  • Smoothing
  • Computational Details
  • Continuous Quantities
• Expectation Maximization
• Challenge
Evidence is Easy?
• Or… are there problems?
P(Xi | S) = #(Xi ∧ S) / (#(Xi ∧ S) + #(¬Xi ∧ S))
Smooth with a Prior
p = prior probability; m = weight
Note that if m = 10, it means “I’ve seen 10 samples that make me believe P(Xi | S) = p”
Hence, m is referred to as the equivalent sample size
P(Xi | S) = (#(Xi ∧ S) + m·p) / (#(Xi ∧ S) + #(¬Xi ∧ S) + m)
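The m-estimate formula above is a one-liner; the counts and the choices of p and m in the example are illustrative, not from the slides.

```python
# m-estimate smoothing: blend the observed count ratio with a prior p,
# weighted as if we had already seen m samples agreeing with p.

def m_estimate(n_match, n_total, p=0.5, m=1.0):
    """P(Xi|S) ≈ (#(Xi ∧ S) + m·p) / (#S + m). m = 0 gives the raw ML ratio."""
    return (n_match + m * p) / (n_total + m)

# A word never seen in spam no longer gets probability exactly 0:
print(m_estimate(0, 100))        # small but nonzero
print(m_estimate(0, 100, m=0))   # unsmoothed ML estimate: 0.0
```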
Probabilities: Important Detail!
Any more potential problems here?
• P(spam | X1 … Xn) ∝ P(spam) ∏i P(Xi | spam)
• We are multiplying lots of small numbers: danger of underflow!  0.5^57 ≈ 7 × 10^-18
• Solution? Use logs and add!  p1 · p2 = e^(log(p1) + log(p2))
  Always keep probabilities in log form
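The "use logs and add" trick above can be sketched as follows; the probabilities are illustrative.

```python
# Sum logs instead of multiplying probabilities, avoiding underflow.
import math

def log_score(prior, conds):
    """log( prior · ∏ conds ), computed as a sum of logs."""
    return math.log(prior) + sum(math.log(p) for p in conds)

# 57 factors of 0.5 shrink toward the slide's 7e-18 in direct
# multiplication, but the log-sum stays comfortably representable:
logp = log_score(1.0, [0.5] * 57)
print(logp)             # 57 * log(0.5), about -39.5
print(math.exp(logp))   # back to ~7e-18, matching the slide
```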
P(S | X)
• Easy to compute from data if X discrete
Instance  X  Spam?
1         T  F
2         T  F
3         F  T
4         T  T
5         T  F
• P(S | X=T) = ¼, ignoring smoothing…
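The counting estimate above can be reproduced on the slide's five instances:

```python
# Raw count ratio P(Spam | X = x), no smoothing, over the slide's table.
data = [("T", False), ("T", False), ("F", True), ("T", True), ("T", False)]

def p_spam_given(x, data):
    matching = [spam for xi, spam in data if xi == x]
    return sum(matching) / len(matching)

print(p_spam_given("T", data))   # 0.25, the slide's ¼
```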
P(S | X)
• What if X is real valued?

Instance  X     Spam?
1         0.01  False
2         0.01  False
3         0.02  False
4         0.03  True
5         0.05  True

• What now?
[One idea: pick a threshold T and discretize, instances with X < T vs. X > T]
Anything Else?

#  X     S?
1  0.01  F
2  0.01  F
3  0.02  F
4  0.03  T
5  0.05  T

[Histogram of the instances over .01 – .05]
P(S | .04)?
Fit Gaussians
[Same five instances, with one Gaussian fit to the S=F values and one to the S=T values]
Smooth with Gaussian, then sum
“Kernel Density Estimation”
[Each instance is replaced by a small Gaussian bump; summing the bumps per class gives a smooth density over X]
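The "smooth with Gaussian, then sum" idea above is a few lines of code; the bandwidth value (0.01) is my assumption, not from the slides.

```python
# Kernel density estimation: place a Gaussian bump on each training
# value and sum, giving a smooth class-conditional density.
import math

def kde(x, points, bandwidth=0.01):
    """Average of Gaussian kernels centered on the training points."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
                      for p in points)

spam_x = [0.03, 0.05]            # X values of the S=T instances
ham_x = [0.01, 0.01, 0.02]       # X values of the S=F instances

# Class-conditional densities at the slide's query point X = .04:
print(kde(0.04, spam_x), kde(0.04, ham_x))
```

Combined with the class prior, these densities answer the slide's P(S | .04) question.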
Topics
• Test & Mini Projects
• Review
• Naive Bayes
• Expectation Maximization
  Review: Learning Bayesian Networks
  • Parameter Estimation
  • Structure Learning
  Hidden Nodes
• Challenge
An Example Bayes Net
Nodes: Earthquake, Burglary, Radio, Alarm, Nbr1Calls, Nbr2Calls

Pr(B=t) = 0.05    Pr(B=f) = 0.95

Pr(A | E, B):
  e,  b :  0.9   (0.1)
  e, ¬b :  0.2   (0.8)
 ¬e,  b :  0.85  (0.15)
 ¬e, ¬b :  0.01  (0.99)
Parameter Estimation and Bayesian Networks
E B R A J M
T F T T F T
F F F F F T
F T F T T T
F F F T T T
F T F F F F
...
We have: Bayes Net structure and observations
We need: Bayes Net parameters
Parameter Estimation and Bayesian Networks
P(A|E,B) = ?
P(A|E,¬B) = ?
P(A|¬E,B) = ?
P(A|¬E,¬B) = ?
[Prior over each parameter, + data ⇒ posterior]
Now compute either MAP or Bayesian estimate
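With complete data, each CPT entry is just a count ratio; a sketch using the five observed rows from the table above:

```python
# ML estimate of P(A | E, B) by counting in complete data.
rows = [  # (E, B, R, A, J, M)
    ("T", "F", "T", "T", "F", "T"),
    ("F", "F", "F", "F", "F", "T"),
    ("F", "T", "F", "T", "T", "T"),
    ("F", "F", "F", "T", "T", "T"),
    ("F", "T", "F", "F", "F", "F"),
]

def p_a_given(e, b, rows):
    """ML estimate of P(A=T | E=e, B=b); None if the (e, b) case is unseen."""
    matches = [r for r in rows if r[0] == e and r[1] == b]
    if not matches:
        return None   # zero observations: exactly where priors/smoothing help
    return sum(r[3] == "T" for r in matches) / len(matches)

print(p_a_given("F", "T", rows))   # 2 matching rows, 1 with an alarm: 0.5
```

The `None` case (no row with E=T, B=T in these five) shows why the MAP/Bayesian estimates with a prior are preferable to raw counts.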
Recap
• Given a BN structure (with discrete or continuous variables), we can learn the parameters of the conditional probability tables.
[Example nets: the spam net (Spam → Nigeria, Sex, Nude) and the alarm net (Earthqk, Burgl → Alarm → N1, N2)]
What if we don’t know structure?
Learning The Structure of Bayesian Networks
• Search thru the space… of possible network structures! (for now, assume we observe all variables)
• For each structure, learn parameters
• Pick the one that fits observed data best
Problem!?!? Caveat: won’t we end up fully connected????
• When scoring, add a penalty for model complexity
Learning The Structure of Bayesian Networks
• Search thru the space
• For each structure, learn parameters
• Pick the one that fits observed data best
• Problem? Exponential number of networks! And we need to learn parameters for each! Exhaustive search out of the question!
• So what now?
Learning The Structure of Bayesian Networks
Local search!
  Start with some network structure
  Try to make a change (add, delete, or reverse an edge)
  See if the new network is any better
What should be the initial state?
Initial Network Structure?
• Uniform prior over random networks?
• Network which reflects expert knowledge?
Learning BN Structure
The Big Picture
• We described how to do MAP (and ML) learning of a Bayes net (including structure)
• How would Bayesian learning (of BNs) differ?
  • Find all possible networks
  • Calculate their posteriors
  • When doing inference, return a weighted combination of predictions from all networks!
Hidden Variables
• But we can’t observe the disease variable
• Can’t we learn without it?
We could…
• But we’d get a fully-connected network
• With 708 parameters (vs. 78): much harder to learn!
Chicken & Egg Problem
• If we knew that a training instance (patient) had the disease, it would be easy to learn P(symptom | disease). But we can’t observe disease, so we don’t.
• If we knew the params, e.g. P(symptom | disease), then it’d be easy to estimate whether the patient had the disease. But we don’t know these parameters.
Expectation Maximization (EM) (high-level version)
• Pretend we do know the parameters
  Initialize randomly
• [E step] Compute the probability of each instance having each possible value of the hidden variable
• [M step] Treating each instance as fractionally having both values, compute the new parameter values
• Iterate until convergence!
Simplest Version
• Mixture of two distributions
• Know: form of distribution & variance
• Just need the mean of each distribution
[Data points along .01 – .09]
Input Looks Like
[Unlabeled data points along .01 – .09]
We Want to Predict
[The same points, each labeled with which of the two distributions generated it: ?]
Expectation Maximization (EM)
• Pretend we do know the parameters
  Initialize randomly: set μ1 = ?, μ2 = ?
[Data points along .01 – .09 with the two initial means marked]
Expectation Maximization (EM)
• [E step] Compute the probability of each instance having each possible value of the hidden variable
[Each point gets a soft assignment to the two distributions]
• [M step] Treating each instance as fractionally having both values, compute the new parameter values
ML Mean of Single Gaussian
μML = argminμ Σi (xi − μ)²
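Setting the derivative to zero, as in the ML recipe earlier, shows this argmin is just the sample mean:

```latex
\frac{d}{d\mu}\sum_i (x_i-\mu)^2 = -2\sum_i (x_i-\mu) = 0
\quad\Longrightarrow\quad
\mu_{ML} = \frac{1}{n}\sum_i x_i
```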
Iterate
Until Convergence
• Problems:
  Need to assume the form of the distribution
  Local maxima
• But it really works in practice!
  Can easily extend to multiple variables
  • E.g. mean & variance
  • Or much more complex models…
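The full loop walked through in the slides, for the "simplest version" (two Gaussians, known shared variance, only the means unknown), can be sketched compactly. The data points and the variance value are illustrative assumptions:

```python
# EM for a 50/50 mixture of N(mu1, sigma^2) and N(mu2, sigma^2),
# estimating only the two means.
import math

def em_two_means(xs, mu1, mu2, sigma=0.01, iters=50):
    def pdf(x, mu):  # unnormalized Gaussian; the constant cancels in the ratio
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2)
    for _ in range(iters):
        # [E step] probability each instance came from component 1
        r = [pdf(x, mu1) / (pdf(x, mu1) + pdf(x, mu2)) for x in xs]
        # [M step] each instance fractionally updates both means
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / sum(r)
        mu2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / sum(1 - ri for ri in r)
    return mu1, mu2

# Points clustered near .02 and near .07 (cf. the .01 – .09 axis):
xs = [0.01, 0.02, 0.03, 0.06, 0.07, 0.08]
print(em_two_means(xs, mu1=0.04, mu2=0.05))   # converges near (0.02, 0.07)
```

Even from nearly identical initial means, the soft assignments pull the two means apart toward the two clusters; a bad initialization can still land in a local maximum, as the slide notes.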
Crossword Puzzles