Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We...
Transcript of Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We...
![Page 1: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/1.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Learning Bayesian Network Structures
Brandon Malone
Much of this material is adapted from Heckerman 1998
Many of the images were taken from the Internet
February 14, 2014
Brandon Malone Learning Bayesian Network Structures
![Page 2: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/2.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Learning Bayesian Network Structures
Suppose we have a dataset D.
H3K27me3 H2AK126su H4AK5ac H2AS1ph H3K27ac Transcription CountPresent Absent Present Absent Absent Inactive 25Absent Absent Present Present Absent Active 10Present Absent Absent Present Absent Inactive 15Present Absent Present Present Absent Active 5Absent Present Absent Absent Present Active 15Absent Present Present Absent Absent Inactive 5Absent Absent Absent Absent Present Active 20Present Present Present Absent Absent Inactive 10
What structure N best explains a dataset D?
We will use scoring functions to rate each network.The one with the best score is “better.”
Brandon Malone Learning Bayesian Network Structures
![Page 3: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/3.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Learning Bayesian Network Structures
Suppose we have a dataset D.
H3K27me3 H2AK126su H4AK5ac H2AS1ph H3K27ac Transcription CountPresent Absent Present Absent Absent Inactive 25Absent Absent Present Present Absent Active 10Present Absent Absent Present Absent Inactive 15Present Absent Present Present Absent Active 5Absent Present Absent Absent Present Active 15Absent Present Present Absent Absent Inactive 5Absent Absent Absent Absent Present Active 20Present Present Present Absent Absent Inactive 10
What structure N best explains a dataset D?
We will use scoring functions to rate each network.The one with the best score is “better.”
Brandon Malone Learning Bayesian Network Structures
![Page 4: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/4.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
1 Greedy Hill Climbing
2 Dynamic Programming
3 Wrap-up
Brandon Malone Learning Bayesian Network Structures
![Page 5: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/5.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Greedy Hill Climbing
Suppose we have some network structure, which is a DAG.
We can define its neighborhood in DAG-space as all networks wecan reach by applying an operator.
Operators
Add an edge
Delete an edge
Reverse an edge
At each search step, we find the best neighbor and move to it.
Brandon Malone Learning Bayesian Network Structures
![Page 6: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/6.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Operator: Add an edge
We add one edge which is missing from the structure.
H2AK126su
H3K27me3
K3K27ac
H2AS1phH4AK5ac
H4AK5ac
⇒H2AK126su
H3K27me3
K3K27ac
H2AS1phH4AK5ac
H4AK5ac
The resulting structure must still be a DAG.
Brandon Malone Learning Bayesian Network Structures
![Page 7: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/7.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Operator: Delete an edge
We remove one existing edge from the structure.
H2AK126su
H3K27me3
K3K27ac
H2AS1phH4AK5ac
H4AK5ac
⇒H2AK126su
H3K27me3
K3K27ac
H2AS1phH4AK5ac
H4AK5ac
The resulting structure is guaranteed to still be a DAG.
Brandon Malone Learning Bayesian Network Structures
![Page 8: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/8.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Operator: Reverse an edge
We flip one existing edge in the structure.
H2AK126su
H3K27me3
K3K27ac
H2AS1phH4AK5ac
H4AK5ac
⇒H2AK126su
H3K27me3
K3K27ac
H2AS1phH4AK5ac
H4AK5ac
The resulting structure must still be a DAG.
Why do we need this operator?
Brandon Malone Learning Bayesian Network Structures
![Page 9: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/9.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Escaping local optima
Greedy hill climbing can “get stuck” in local optima.
Two simple strategies can be effective for escaping.
TABU list. Do not revisit recently seen structures.
Random restarts. Apply some operators at random when ata local optimum.
Can we guarantee anything?Brandon Malone Learning Bayesian Network Structures
![Page 10: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/10.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Efficient implementations
During each search step, we have O(n2)
operators.
How many values change from one step to the next?
Not many, due to decomposability.So we can effectively cache family scores.
Which counts do we need?
Only those relevant for the scores we need.These can be efficiently calculated using an AD-tree.
Brandon Malone Learning Bayesian Network Structures
![Page 11: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/11.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Efficient implementations
During each search step, we have O(n2)
operators.
How many values change from one step to the next?
Not many, due to decomposability.So we can effectively cache family scores.
Which counts do we need?
Only those relevant for the scores we need.These can be efficiently calculated using an AD-tree.
Brandon Malone Learning Bayesian Network Structures
![Page 12: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/12.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Efficient implementations
During each search step, we have O(n2)
operators.
How many values change from one step to the next?
Not many, due to decomposability.So we can effectively cache family scores.
Which counts do we need?
Only those relevant for the scores we need.These can be efficiently calculated using an AD-tree.
Brandon Malone Learning Bayesian Network Structures
![Page 13: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/13.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Greedy hill climbing algorithm
procedure GreedyHillClimbing(initial structure, Ninit , dataset D, scoring function s, stopping
criteria C )N ∗ ← Ninit , N ′ ← N ∗, tabu ← {N ∗}while C is not satisfied doN ′′ ← arg maxN∈neighborhood(N ′)andN /∈tabu s(N )if s(N ′) > s(N ′′) then . Check for local optimumN ′′ ← random(N ′) . Apply random operators
end ifif s(N ′′) > s(N ∗) then . Check for new bestN ∗ ← N ′′
end iftabu ← tabu ∪N ′N ′ ← N ′′ . Move to the neighbor
end whilereturn N ∗
end procedure
Brandon Malone Learning Bayesian Network Structures
![Page 14: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/14.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic Programming
Greedy hill climbing can be efficiently implemented, but it offers noguarantees about the quality of the network it finds.
Score-based structure learning is NP-complete (reduction fromfeedback arc set).
In practice, dynamic programming can effectively learn provablyoptimal networks with up to about 30 variables.
Brandon Malone Learning Bayesian Network Structures
![Page 15: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/15.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Exploiting scoring function decomposability
Observation: All BNs are DAGs, and all DAGs have a topologicalorder.
Observation: Most scoring functions are decomposable.
Score(N : D) =n∑i
Score(Xi ,PAi : D)
The relationships among Xi ’s parents does not matter.We will typically omit D and refer to Score(X ,PAX ) as local scores.
Brandon Malone Learning Bayesian Network Structures
![Page 16: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/16.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming decomposition
Suppose we are given an ordering: X3, X2, X1, X4.
X1 X2
X4X3
We can select optimal parents as follows.
BestScore(X ,U) = minPA∈U
Score(X ,PA)
PAX = arg minPA∈U
Score(X ,PA)
U is the set of variables which precede X in the order.
Brandon Malone Learning Bayesian Network Structures
![Page 17: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/17.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming decomposition
Suppose we are given an ordering: X3, X2, X1, X4.
X1 X2
X4X3
X1 X2
X4X3
We can select optimal parents as follows.
BestScore(X ,U) = minPA∈U
Score(X ,PA)
PAX = arg minPA∈U
Score(X ,PA)
U is the set of variables which precede X in the order.
Brandon Malone Learning Bayesian Network Structures
![Page 18: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/18.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming decomposition
Suppose we are given an ordering: X3, X2, X1, X4.
X1 X2
X4X3
X1 X2
X4X3
X1 X2
X4X3
We can select optimal parents as follows.
BestScore(X ,U) = minPA∈U
Score(X ,PA)
PAX = arg minPA∈U
Score(X ,PA)
U is the set of variables which precede X in the order.
Brandon Malone Learning Bayesian Network Structures
![Page 19: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/19.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming decomposition
Suppose we are given an ordering: X3, X2, X1, X4.
X1 X2
X4X3
X1 X2
X4X3
X1 X2
X4X3
X1 X2
X4X3
We can select optimal parents as follows.
BestScore(X ,U) = minPA∈U
Score(X ,PA)
PAX = arg minPA∈U
Score(X ,PA)
U is the set of variables which precede X in the order.
Brandon Malone Learning Bayesian Network Structures
![Page 20: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/20.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming decomposition
Suppose we are given an ordering: X3, X2, X1, X4.
X1 X2
X4X3
X1 X2
X4X3
X1 X2
X4X3
X1 X2
X4X3
X1 X2
X4X3
We can select optimal parents as follows.
BestScore(X ,U) = minPA∈U
Score(X ,PA)
PAX = arg minPA∈U
Score(X ,PA)
U is the set of variables which precede X in the order.
Brandon Malone Learning Bayesian Network Structures
![Page 21: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/21.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming graph
{X1, X2, X3}
{X1, X2} {X2, X3} {X2, X4} {X3, X4}{X1, X3} {X1, X4}
{X2, X3, X4}{X1, X2, X4} {X1, X3, X4}
{X1, X2, X3, X4}
{X2} {X3} {X4}{X1}
∅
Each node U contains an optimal subnetwork and Score(U).
Following an edge means adding a leaf to the subnetwork.
A path from ∅ to {X1,
X2, X3, X4} induces an
order on the variables.
Based on the order, we
select optimal parents.
This associates a cost
with each edge.
Observation: The path
taken to reach a node
does not affect the later
choices.
Brandon Malone Learning Bayesian Network Structures
![Page 22: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/22.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming graph
{X1, X2, X3}
{X1, X2} {X2, X3} {X2, X4} {X3, X4}{X1, X3} {X1, X4}
{X2, X3, X4}{X1, X2, X4} {X1, X3, X4}
{X1, X2, X3, X4}
{X2} {X3} {X4}{X1}
∅
Each node U contains an optimal subnetwork and Score(U).
Following an edge means adding a leaf to the subnetwork.
A path from ∅ to {X1,
X2, X3, X4} induces an
order on the variables.
Based on the order, we
select optimal parents.
This associates a cost
with each edge.
Observation: The path
taken to reach a node
does not affect the later
choices.
Brandon Malone Learning Bayesian Network Structures
![Page 23: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/23.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming graph
{X1, X2, X3}
{X2, X3} {X2, X4} {X3, X4}{X1, X4}
{X2, X3, X4}{X1, X2, X4} {X1, X3, X4}
{X1, X2, X3, X4}
{X2} {X3} {X4}{X1}
∅
{X1, X2} {X1, X3}
Each node U contains an optimal subnetwork and Score(U).
Following an edge means adding a leaf to the subnetwork.
A path from ∅ to {X1,
X2, X3, X4} induces an
order on the variables.
Based on the order, we
select optimal parents.
This associates a cost
with each edge.
Observation: The path
taken to reach a node
does not affect the later
choices.
Brandon Malone Learning Bayesian Network Structures
![Page 24: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/24.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming graph
{X2, X3} {X2, X4} {X3, X4}{X1, X4}
{X2, X3, X4}{X1, X2, X4} {X1, X3, X4}
{X1, X2, X3, X4}
{X2} {X3} {X4}{X1}
∅
{X1, X2} {X1, X3}
{X1, X2, X3}
Each node U contains an optimal subnetwork and Score(U).
Following an edge means adding a leaf to the subnetwork.
A path from ∅ to {X1,
X2, X3, X4} induces an
order on the variables.
Based on the order, we
select optimal parents.
This associates a cost
with each edge.
Observation: The path
taken to reach a node
does not affect the later
choices.
Brandon Malone Learning Bayesian Network Structures
![Page 25: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/25.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming graph
{X2, X3} {X2, X4} {X3, X4}{X1, X4}
{X2, X3, X4}{X1, X2, X4} {X1, X3, X4}
{X1, X2, X3, X4}
{X2} {X3} {X4}{X1}
∅
{X1, X2} {X1, X3}
{X1, X2, X3}
Each node U contains an optimal subnetwork and Score(U).
Following an edge means adding a leaf to the subnetwork.
A path from ∅ to {X1,
X2, X3, X4} induces an
order on the variables.
Based on the order, we
select optimal parents.
This associates a cost
with each edge.
Observation: The path
taken to reach a node
does not affect the later
choices.
Brandon Malone Learning Bayesian Network Structures
![Page 26: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/26.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming graph
{X2, X3} {X2, X4} {X3, X4}{X1, X4}
{X2, X3, X4}{X1, X2, X4} {X1, X3, X4}
{X1, X2, X3, X4}
{X2} {X3} {X4}{X1}
∅
{X1, X2} {X1, X3}
{X1, X2, X3}
Each node U contains an optimal subnetwork and Score(U).
Following an edge means adding a leaf to the subnetwork.
The cost of a path is
equal to the sum of the
costs of its edges.
The cost of the shortest
path corresponds to the
optimal network.
We can explore the
graph, e.g., in breadth-
first order to find the
shortest path to each
node.
Brandon Malone Learning Bayesian Network Structures
![Page 27: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/27.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dealing with local scores
{X1, X3}
{X1, X2, X3}
The cost of an edge from U to U ∪ X isdefined as follows.
Cost(U,U ∪ X ) ..= BestScore(X ,U)
BestScore(X ,U) = minPA∈U
Score(X ,PA)
How can we efficiently calculate this value?
Precompute “all” local scores for X and sort them.
Linearly scan when given a new query parent set U.(Fancier tricks tend to be much slower.)
Pruning TheoremSay PA ⊂ PA′ and Score(X ,PA) < Score(X ,PA′). Then PA′ is not optimal.
Brandon Malone Learning Bayesian Network Structures
![Page 28: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/28.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dealing with local scores
{X1, X3}
{X1, X2, X3}
The cost of an edge from U to U ∪ X isdefined as follows.
Cost(U,U ∪ X ) ..= BestScore(X ,U)
BestScore(X ,U) = minPA∈U
Score(X ,PA)
How can we efficiently calculate this value?
Precompute “all” local scores for X and sort them.
Linearly scan when given a new query parent set U.(Fancier tricks tend to be much slower.)
Pruning TheoremSay PA ⊂ PA′ and Score(X ,PA) < Score(X ,PA′). Then PA′ is not optimal.
Brandon Malone Learning Bayesian Network Structures
![Page 29: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/29.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Dynamic programming algorithm
procedure Expand(node U, sorted family scores BestScore)for each leaf in V \ U do
newScore ← Score(U) + BestScore(leaf ,U)if newScore < Score(U ∪ leaf ) then
Score(U ∪ leaf )← newScoreend if
end forend procedure
procedure Main(variables V, sorted family scores BestScore)Score(∅)← 0for layer l = 0 to |V| do
for each node U such that |U| = l doexpand(U, BestScore)
end forend forreturn Score(V)
end procedure
Brandon Malone Learning Bayesian Network Structures
![Page 30: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/30.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Recap
During this part of the course, we have discussed:
A greedy hill climbing algorithm for structure learning
A dynamic programming algorithm which guarantees to findthe optimal structure
Brandon Malone Learning Bayesian Network Structures
![Page 31: Learning Bayesian Network Structures · Suppose we have some network structure, which is a DAG. We can de ne its neighborhood in DAG-space as all networks we can reach by applying](https://reader034.fdocuments.in/reader034/viewer/2022052017/60305a76253a9519423db89e/html5/thumbnails/31.jpg)
Greedy Hill Climbing Dynamic Programming Wrap-up
Next in probabilistic models
We will return to parameter learning.
Expectation-maximization for learning parameters in mixturemodels
(If time permits) Estimating parameters in topic models
Brandon Malone Learning Bayesian Network Structures