L11: Uses of Bayesian Networks
Transcript of L11: Uses of Bayesian Networks
L11: Uses of Bayesian Networks
Nevin L. Zhang
Department of Computer Science & Engineering
The Hong Kong University of Science & Technology
http://www.cse.ust.hk/~lzhang/
Page 2
Outline
Traditional Uses
Structure Discovery
Density Estimation
Classification
Clustering
An HKUST Project
Page 3
Traditional Uses
Probabilistic expert systems for diagnosis and prediction.
Example: a BN for diagnosing "blue baby" syndrome over the phone in a London hospital.
Its performance was comparable to a specialist's, and better than that of other clinicians.
Page 4
Traditional Uses
Language for describing probabilistic models in science and engineering.
Example: a BN for turbo codes.
Page 5
Traditional Uses
Language for describing probabilistic models in science and engineering.
Example: a BN from bioinformatics.
Page 6
Outline
Traditional Uses
Structure Discovery
Density Estimation
Classification
Clustering
An HKUST Project
Page 7
BN for Structure Discovery
Given: Data set D on variables X1, X2, …, Xn
Discover dependence, independence, and even causal relationships among the variables.
Example: Evolution trees
Page 8
Phylogenetic Trees
Assumption: all organisms on Earth have a common ancestor. This implies that any set of species is related.
Phylogeny: the evolutionary relationship among a set of species.
Phylogenetic tree: usually the relationship can be represented by a tree, called a phylogenetic (evolution) tree, although this is not always true.
Page 9
Phylogenetic Trees
[Figure: a phylogenetic tree over giant panda, lesser panda, moose, goshawk, vulture, duck, and alligator; time runs downward, with the current-day species at the bottom.]
Page 10
Phylogenetic Trees
Taxa (sequences) identify species.
Edge lengths represent evolution time.
Assumption: bifurcating tree topology.
[Figure: a bifurcating tree with current-day sequences AGGGCAT, TAGCCCA, TAGACTT, AGCACAA, AGCGCTT at the leaves and ancestral sequences (AAGACTT, AGCACTT, AAGGCCT, AAGGCAT) at the internal nodes; time runs downward.]
Page 11
Probabilistic Models of Evolution
Characterize the relationship between taxa using the substitution probability P(x | y, t): the probability that ancestral sequence y evolves into sequence x along an edge of length t.
The tree defines a Bayesian network with factors P(X7), P(X5|X7, t5), P(X6|X7, t6), P(S1|X5, t1), P(S2|X5, t2), ….
[Figure: a tree with root x7, internal nodes x5 and x6 (edge lengths t5 and t6), and leaves s1-s4 (edge lengths t1-t4).]
Page 12
What should P(x|y, t) be? Two assumptions of commonly used models
There are only substitutions, no insertions/deletions (aligned) One-to-one correspondence between sites in different sequences
Each site evolves independently and identically
P(x|y, t) = ∏i=1 to m P(x(i) | y(i), t)
m is sequence length
Probabilistic Models of Evolution
AGGGCAT
TAGCCCA
TAGACTTAGCACAA
AGCGCTT
AAGACTT
AGCACTT
AAGGCCT
AAGGCAT
Page 13
Probabilistic Models of Evolution
What should P(x(i)|y(i), t) be? The Jukes-Cantor (character evolution) model [1969]:
Rate of substitution a (constant or parameter?)
The substitution matrix over {A, C, G, T} has rt on the diagonal and st everywhere else:
rt = 1/4 (1 + 3e^(-4t))
st = 1/4 (1 - e^(-4t))
(What are the limit values when t = 0 or t = infinity?)
Multiplicativity (lack of memory):
P(c | a, t1 + t2) = Σb P(c | b, t2) P(b | a, t1)
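The Jukes-Cantor probabilities above can be sketched in a few lines; the numeric checks confirm the two limits (t = 0: no change; t → infinity: all four bases equally likely) and multiplicativity. The rate constant is fixed at 1, as in the formulas above:

```python
import math

def jukes_cantor(a, b, t):
    """P(b | a, t) under the Jukes-Cantor model: r_t when b == a,
    s_t otherwise (rate constant fixed at 1, as on the slide)."""
    rt = 0.25 * (1 + 3 * math.exp(-4 * t))
    st = 0.25 * (1 - math.exp(-4 * t))
    return rt if a == b else st

bases = "ACGT"
row0 = [jukes_cantor("A", b, 0.0) for b in bases]    # t = 0: nothing changes
rowInf = [jukes_cantor("A", b, 50.0) for b in bases] # large t: all ~0.25
```

Multiplicativity holds exactly: summing P(b|a, t1) P(c|b, t2) over the intermediate base b recovers P(c|a, t1+t2).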
Page 14
Tree Reconstruction
Given: a collection of current-day taxa: AGGGCAT, TAGCCCA, TAGACTT, AGCACAA, AGCGCTT.
Find: a tree, i.e. a tree topology T and edge lengths t.
Maximum likelihood: find the tree that maximizes P(data | tree).
[Figure: the five current-day taxa at the leaves of the tree to be reconstructed.]
Page 15
Tree Reconstruction
When restricted to one particular site, a phylogenetic tree is a latent tree (LT) model where:
The structure is a binary tree and all variables share the same state space.
The conditional probabilities come from the character evolution model, parameterized by edge lengths instead of the usual parameterization.
The model is the same for different sites.
[Figure: the example tree with its sequences.]
Page 16
Tree Reconstruction
Current-day taxa: AGGGCAT, TAGCCCA, TAGACTT, AGCACAA, AGCGCTT
These give samples for the LT model, one sample per site; the samples are i.i.d.:
1st site: (A, T, T, A, A)
2nd site: (G, A, A, G, G)
3rd site: (G, G, G, C, C)
…
[Figure: the example tree with its sequences.]
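The "one sample per site" construction amounts to transposing the aligned sequences; a minimal sketch:

```python
def site_samples(taxa):
    """One i.i.d. sample per site: the i-th sample is the tuple of the
    i-th characters of all current-day sequences."""
    return list(zip(*taxa))

taxa = ["AGGGCAT", "TAGCCCA", "TAGACTT", "AGCACAA", "AGCGCTT"]
samples = site_samples(taxa)
# samples[0] is the 1st site: ('A', 'T', 'T', 'A', 'A')
```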
Page 17
Tree Reconstruction
Finding the ML phylogenetic tree == finding the ML LT model.
Model space:
Model structures: binary trees where all variables share the same state space, which is known.
Parameterization: one parameter for each edge. (In general, P(x|y) has |y|(|x| - 1) free parameters.)
The objective is to find relationships among variables.
Can new LTM algorithms be applied to phylogenetic tree reconstruction?
Page 18
Outline
Traditional Uses
Structure Discovery
Density Estimation
Classification
Clustering
An HKUST Project
Page 19
BN for Density Estimation
Given: Data set D on variables X1, X2, …, Xn
Estimate: P(X1, X2, …, Xn) under some constraints.
Uses of the estimate:
Inference
Classification
Page 20
BN Methods for Density Estimation
Chow-Liu tree with X1, X2, …, Xn as nodes:
Easy to compute
Easy to use
Might not be a good estimate of the "true" distribution
General BN with X1, X2, …, Xn as nodes:
Can be a good estimate of the "true" distribution
Might be difficult to find
Might be complex to use
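For concreteness, here is a generic sketch of the Chow-Liu construction (not the lecture's code): weight each pair of variables by its empirical mutual information, then take a maximum-weight spanning tree (Kruskal with union-find):

```python
from collections import Counter
from itertools import combinations
from math import log

def mutual_information(data, i, j):
    """Empirical mutual information between columns i and j of the data."""
    n = len(data)
    pi = Counter(row[i] for row in data)
    pj = Counter(row[j] for row in data)
    pij = Counter((row[i], row[j]) for row in data)
    return sum((c / n) * log(c * n / (pi[x] * pj[y]))
               for (x, y), c in pij.items())

def chow_liu_edges(data, n_vars):
    """Maximum-weight spanning tree over the variables (Kruskal)."""
    parent = list(range(n_vars))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u
    edges = sorted(combinations(range(n_vars), 2),
                   key=lambda e: -mutual_information(data, *e))
    tree = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree

# toy data: X2 always copies X0, X1 is only weakly related,
# so the strongest edge is (0, 2)
data = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1), (0, 1, 0), (1, 0, 1)]
tree = chow_liu_edges(data, 3)
```

The tree has n - 1 edges, and each conditional P(child | parent) can then be estimated by counting, which is why the method is easy to compute and to use.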
Page 21
BN Methods for Density Estimation
Latent class (LC) model with X1, X2, …, Xn as manifest variables (Lowd and Domingos 2005):
Determine the cardinality of the latent variable using hold-out validation
Optimize the parameters using EM
Easy to compute
Can be a good estimate of the "true" distribution
Might be complex to use (the cardinality of the latent variable might be very large)
Page 22
BN Methods for Density Estimation
LT models for density estimation (Pearl 1988): as models over manifest variables, LTMs
Are computationally very simple to work with
Can represent complex relationships among manifest variables
Page 23
BN Methods for Density Estimation
New approximate inference algorithm for Bayesian networks (Wang, Zhang and Chen, AAAI 08; JAIR 32:879-900, 2008).
[Figure: schematic: sample from the original (dense) BN, then learn a (sparse) LT model from the samples.]
Page 24
Outline
Traditional Uses
Structure Discovery
Density Estimation
Classification
Clustering
An HKUST Project
Page 25
Bayesian Networks for Classification
The problem: given data of the form
A1 A2 … An | C
0  1  …  0 | T
1  0  …  1 | F
…  …  …  … | …
find a mapping (A1, A2, …, An) |-> C.
Possible solutions:
ANN
Decision trees (Quinlan)
…
(SVM: for continuous data)
Page 26
Bayesian Networks for Classification
Naïve Bayes model: from data, learn P(C) and P(Ai|C).
Classification: arg max_c P(C=c | A1=a1, …, An=an)
Very good in practice.
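A minimal Naïve Bayes sketch with relative-frequency estimates and the arg max rule; no smoothing, purely illustrative:

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Estimate P(C) and P(Ai|C) by relative frequencies (no smoothing)."""
    pc = Counter(labels)
    pa = defaultdict(Counter)                 # (i, c) -> counts over values of Ai
    for row, c in zip(rows, labels):
        for i, a in enumerate(row):
            pa[(i, c)][a] += 1
    n = len(labels)
    prior = {c: k / n for c, k in pc.items()}
    cond = {k: {a: v / sum(cnt.values()) for a, v in cnt.items()}
            for k, cnt in pa.items()}
    return prior, cond

def classify_nb(prior, cond, row):
    """arg max_c P(C=c) * prod_i P(Ai=ai | C=c)."""
    def score(c):
        p = prior[c]
        for i, a in enumerate(row):
            p *= cond.get((i, c), {}).get(a, 0.0)
        return p
    return max(prior, key=score)

rows = [(0, 1), (0, 1), (1, 0), (1, 0)]
labels = ["T", "T", "F", "F"]
prior, cond = train_nb(rows, labels)
```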
Page 27
Bayesian Networks for Classification
Drawback of NB: it assumes the attributes are mutually independent given the class variable.
This assumption is often violated, leading to double counting of evidence.
Fixes:
General BN classifiers
Tree augmented Naïve Bayes (TAN) models
Hierarchical NB
Bayes rule + density estimation
…
Page 28
Bayesian Networks for Classification
General BN classifier: treat the class variable just as another variable.
Learn a BN.
Classify the next instance based on the values of the variables in the Markov blanket of the class variable.
Performs rather poorly: restricting attention to the Markov blanket means not all available information is utilized.
Page 29
Bayesian Networks for Classification
TAN models (Friedman, N., Geiger, D., and Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29:131-163.)
Capture dependence among attributes using a tree structure.
During learning:
First learn a tree among the attributes using the Chow-Liu algorithm
Then add the class variable and estimate the parameters
Classification: arg max_c P(C=c | A1=a1, …, An=an)
Page 30
Bayesian Networks for Classification
Hierarchical Naïve Bayes models (N. L. Zhang, T. D. Nielsen, and F. V. Jensen (2002). Latent variable discovery in classification models. Artificial Intelligence in Medicine, to appear.)
Capture dependence among attributes using latent variables
Detect interesting latent structures besides classification
Algorithm in the style of DHC
…
Page 31
Bayesian Networks for Classification
Bayes rule + density estimation, where the density is estimated with a Chow-Liu tree, an LC model, or an LT model.
Wang Yi: Bayes rule + LT model is far superior.
Page 32
Outline
Traditional Uses
Structure Discovery
Density Estimation
Classification
Clustering
An HKUST Project
Page 33
BN for Clustering
Latent class (LC) model: one latent variable Y and a set of manifest variables X1, …, Xn.
Conditional independence assumption: the Xi's are mutually independent given Y. Also known as the local independence assumption.
Used for cluster analysis of categorical data:
Determine the cardinality of Y: the number of clusters
Determine P(Xi|Y): the characteristics of the clusters
Page 34
BN for Clustering
Clustering criteria:
Distance-based clustering: minimize intra-cluster variation and/or maximize inter-cluster variation.
LC model-based clustering: the criterion follows from the conditional independence assumption. Divide the data into clusters such that, in each cluster, the manifest variables are mutually independent under the empirical distribution.
Page 35
BN for Clustering
The local independence assumption is often not true.
LT models generalize LC models:
They relax the independence assumption.
Each latent variable gives a way to partition the data: multidimensional clustering.
Page 36
ICAC Data
// 31 variables, 1200 samples
C_City: s0 s1 s2 s3 // very common, quite common, uncommon, ...
C_Gov: s0 s1 s2 s3
C_Bus: s0 s1 s2 s3
Tolerance_C_Gov: s0 s1 s2 s3 // totally intolerable, intolerable, tolerable, ...
Tolerance_C_Bus: s0 s1 s2 s3
WillingReport_C: s0 s1 s2 // yes, no, depends
LeaveContactInfo: s0 s1 // yes, no
I_EncourageReport: s0 s1 s2 s3 s4 // very sufficient, sufficient, average, ...
I_Effectiveness: s0 s1 s2 s3 s4 // very effective, effective, average, ineffective, very ineffective
I_Deterrence: s0 s1 s2 s3 s4 // very sufficient, sufficient, average, ...
…..
-1 -1 -1 0 0 -1 -1 -1 -1 -1 -1 0 -1 -1 -1 0 1 1 -1 -1 2 0 2 2 1 3 1 1 4 1 0 1.0
-1 -1 -1 0 0 -1 -1 1 1 -1 -1 0 0 -1 1 -1 1 3 2 2 0 0 0 2 1 2 0 0 2 1 0 1.0
-1 -1 -1 0 0 -1 -1 2 1 2 0 0 0 2 -1 -1 1 1 1 0 2 0 1 2 -1 2 0 1 2 1 0 1.0
….
Page 37
Latent Structure Discovery
Y2: Demographic info; Y3: Tolerance toward corruption
Y4: ICAC performance; Y7: ICAC accountability
Y5: Change in level of corruption; Y6: Level of corruption
(Zhang, Poon, Wang and Chen 2008)
Page 38
Interpreting Partition
Y2 partitions the population into 4 clusters.
What is the partition about? What is the "criterion"?
On what manifest variables do the clusters differ the most?
Mutual information: the larger I(Y2; X), the more the 4 clusters differ on X.
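The ranking idea can be illustrated by computing I(Y; X) from a joint distribution; the two toy distributions below are hypothetical, chosen to show the two extremes:

```python
from math import log

def mutual_info(joint):
    """I(Y; X) from a joint distribution given as {(y, x): p}."""
    py, px = {}, {}
    for (y, x), p in joint.items():
        py[y] = py.get(y, 0.0) + p
        px[x] = px.get(x, 0.0) + p
    return sum(p * log(p / (py[y] * px[x]))
               for (y, x), p in joint.items() if p > 0)

# Y fully determines X -> I(Y; X) = H(X) = log 2;
# Y independent of X   -> I(Y; X) = 0
dependent   = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
```

Variables with large I(Y2; X) are the ones on which the clusters differ most, which is what the information curves on the next slide plot.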
Page 39
Interpreting Partition
Information curves: the partition given by Y2 is based on Income, Age, Education, and Sex.
Interpretation:
Y2 represents a partition of the population based on demographic information.
Y3 represents a partition based on tolerance toward corruption.
Page 40
Interpreting Clusters
Y2=s0: low-income youngsters; Y2=s1: women with no/low income;
Y2=s2: people with good education and good income;
Y2=s3: people with poor education and average income.
Page 41
Interpreting Clustering
Y3=s0: people who find corruption totally intolerable; 57%
Y3=s1: people who find corruption intolerable; 27%
Y3=s2: people who find corruption tolerable; 15%
Interesting finding:
Y3=s2: 29+19=48% find C-Gov totally intolerable or intolerable; 5% for C-Bus
Y3=s1: 54% find C-Gov totally intolerable; 2% for C-Bus
Y3=s0: same attitude toward C-Gov and C-Bus
People who are tough on corruption are equally tough toward C-Gov and C-Bus.
People who are relaxed about corruption are more relaxed toward C-Bus than C-Gov.
Page 42
Relationship Between Dimensions
Interesting finding: the relationship between background and tolerance toward corruption.
Y2=s2 (good education and good income): the least tolerant; 4% find corruption tolerable.
Y2=s3 (poor education and average income): the most tolerant; 32% find corruption tolerable.
The other two classes are in between.
Page 43
Result of LCA
The partition is not meaningful.
Reason: local independence does not hold.
Another way to look at it: LCA assumes that all the manifest variables jointly define a meaningful way to cluster the data. This is obviously not true for the ICAC data. Instead, one should look for subsets of variables that do define meaningful partitions and perform cluster analysis on them. This is what we do with LTA.
Page 44
Finite Mixture Models
Y: a discrete latent variable; Xi: continuous variables.
P(X1, X2, …, Xn|Y): usually multivariate Gaussian; no independence assumption.
Assume the states of Y are 1, 2, …, k. Then
P(X1, X2, …, Xn) = Σi=1 to k P(Y=i) P(X1, X2, …, Xn|Y=i):
a mixture of k Gaussian components.
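The mixture density is a weighted sum of component densities. A sketch in one dimension for brevity (the slide's case is multivariate); the weights and component parameters below are hypothetical:

```python
from math import exp, pi, sqrt

def gaussian(x, mu, sigma):
    """1-D Gaussian density."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def mixture_density(x, weights, params):
    """P(x) = sum_i P(Y=i) * P(x | Y=i): a mixture of k Gaussian components.
    `weights` are P(Y=i); `params` are (mean, std) pairs."""
    return sum(w * gaussian(x, mu, s) for w, (mu, s) in zip(weights, params))

# hypothetical two-component mixture
weights = [0.3, 0.7]
params = [(0.0, 1.0), (5.0, 2.0)]
p0 = mixture_density(0.0, weights, params)
```

Learning fills in the weights and the component parameters (typically by EM), with k chosen by model selection.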
Page 45
Finite Mixture Models
Used to cluster continuous data.
Learning: determine
k: the number of clusters
P(Y)
P(X1, …, Xn|Y)
Also assumes that all attributes define a coherent partition, which is not realistic.
LT models are a natural framework for clustering high-dimensional data.
Page 46
Outline
Traditional Uses
Structure Discovery
Density Estimation
Classification
Clustering
An HKUST Project
Page 47
Observation on How Human Brain Does Thinking
Human beings often invoke latent variables to explain regularities that we observe.
Example 1:
Observe a regularity: beer and diapers are often bought together in the early evening.
Hypothesize a cause: there must be a common (latent) cause.
Identify the cause and explain the regularity: shopping by fathers of babies on their way home from work.
Based on our understanding of the world.
Page 48
Observation on How Human Brain Does Thinking
Example 2:
Background: at night, watch the lighting through the windows of apartments in big buildings.
Observe a regularity: the lighting from several apartments changes in brightness and color at the same times and in perfect synchrony.
Hypothesize a common (latent) cause: there must be a common latent cause.
Identify the cause and explain the phenomenon: people watching the same TV channel.
Based on our understanding of the world.
Page 49
Back to Ancient Time
Observe a regularity: several symptoms often occur together:
'intolerance to cold', 'cold limbs', and 'cold lumbus and back'.
Hypothesize a common latent cause: there must be a common latent cause.
Identify the cause: the answer is based on the (primitive) understanding of the world at that time.
Conclusion: Yang deficiency (阳虚).
Explanation: Yang is like the sun; it warms your body. If you don't have enough of it, you feel cold.
Page 50
Back to Ancient Time
Regularity observed: several symptoms often occur together:
Tidal fever (潮热), heat sensation in palms and feet (手足心热), palpitation (心慌心跳), thready and rapid pulse (脉细数).
Hypothesize a common latent cause: there must be a common latent cause.
Identify the cause and explain the regularity: Yin deficiency causing internal heat (阴虚内热).
Yin and Yang should be in balance. If Yin is deficient, Yang will be in relative excess, and hence causes heat.
Page 51
Traditional Chinese Medicine (TCM)
Claim: TCM theories = statistical regularities + subjective interpretations.
How can this claim be justified?
Page 52
A Case Study
We collected a data set about kidney deficiency (肾虚 )
35 symptom variables, 2600 records
Page 53
Result of Data Analysis
Y0-Y34: manifest variables from the data.
X0-X13: latent variables introduced by the data analysis.
The structure is interesting and supports TCM's theories about various symptoms.
Page 54
Other TCM Data Sets
From Beijing University of TCM (973 project): depression, hepatitis B, chronic renal failure, COPD, menopause.
From the China Academy of TCM: subhealth, diabetes.
In all cases, the results of LT analysis match the relevant TCM theories.
Page 55
Result on the Depression Data
Page 56
Significance
Conclusion: TCM theories = statistical regularities + subjective interpretations.
Significance:
TCM theories are partially based on objective facts.
This boosts user confidence.
It can help to lay a modern statistical foundation for TCM:
Systematically identify statistical regularities in the occurrence of symptoms, and find natural partitions.
Establish objective and quantitative diagnosis standards.
Assist in double-blind experiments to evaluate and improve the efficacy of TCM treatments.
Page 57
Outline
Traditional Uses
Structure Discovery
Density Estimation
Classification
Clustering
An HKUST Project