Data Mining in Medicine
8/8/2019 Data Mining in Medicine
http://slidepdf.com/reader/full/data-mining-in-medicine 1/42
Data Mining in Medicine: Selected Techniques and Applications
Copyright © 2002 webAI Group, www.datamining.ro
Author: Adrian Giurca
Overview
Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.
The nature of Medical Data
The rapidly emerging globality of data requires standards in terminology, vocabularies and formats to support data sharing; standards for interfaces between different sources of data and for the integration of heterogeneous data (including images); and standards in the design of electronic patient records.
The nature of Medical Data
Many environments still lack such standards, which hinders the use of data analysis tools on large global databases, limiting their applications to datasets collected for specific diagnostic, screening, prognostic, monitoring, therapy support or other patient management purposes.
The nature of Medical Data
Patient records collected for diagnosis and prognosis typically encompass values of anamnestic, clinical and laboratory parameters, as well as results of particular investigations specific to the given task.
The nature of Medical Data
Such datasets are characterized by:
- incompleteness (missing parameter values),
- incorrectness (systematic or random noise in the data),
- sparseness (few and/or non-representative patient records available),
- inexactness (inappropriate selection of parameters for the given task).
The nature of Medical Data
Datasets collected in monitoring (either acute monitoring of a particular patient in an intensive care unit, or discrete monitoring over long periods of time in the case of patients with chronic diseases) have additional characteristics: they involve measurements of a set of parameters at different times, which requires the temporal component to be taken into account in data analysis.
Selected Medical Data Mining Techniques
Current trends in medical decision making show awareness of the need to introduce formal reasoning, as well as intelligent data analysis techniques, in the extraction of knowledge, regularities, trends and representative cases from patient data stored in medical records.
Selected Medical Data Mining Techniques
Formal techniques include:
- decision theory
- symbolic reasoning technology
- methods at their intersection, such as probabilistic belief networks
Selected Medical Data Mining Techniques
Intelligent data analysis techniques include:
- machine learning
- clustering
- data visualization
- interpretation of time-ordered data (derivation and revision of temporal trends and other forms of temporal data abstraction)
Selected Medical Data Mining Techniques
Machine learning methods can be classified into three major groups:
- inductive learning of symbolic rules (such as induction of rules, decision trees and logic programs)
- statistical or pattern-recognition methods (such as k-nearest neighbors or instance-based learning, discriminant analysis and Bayesian classifiers)
- artificial neural networks (such as networks with backpropagation learning, Kohonen's self-organizing network and Hopfield's associative memory)
Selected Medical Data Mining Techniques
Machine learning methods have been applied to a variety of medical domains in order to improve medical decision making.
These include diagnostic and prognostic problems in oncology, liver pathology, neuropsychology and gynaecology.
Improved medical diagnosis and prognosis may be achieved through automatic analysis of patient data stored in medical records, i.e. by learning from past experience.
Selected Medical Data Mining Techniques
Given patient records with corresponding diagnoses, machine learning methods are able to diagnose new cases. More specifically, suppose $E$ is a set of examples with known classifications.
An example is described by the values of a fixed collection of features (attributes) $A_i$, $i = 1, \dots, N_{at}$.
Each attribute can either have a finite set of values (discrete) or take real numbers as values (continuous).
An individual example $e_j$, $j = 1, \dots, N_{ex}$, is an n-tuple of values $v_{ik}$ of attributes $A_i$. Each example is assigned one of $N_{cl}$ possible values of the class variable $C$ (classifications): $c_i$, $i = 1, \dots, N_{cl}$.
Selected Medical Data Mining Techniques
For example, in the domain of early diagnosis of rheumatic diseases, the patient records comprise 16 anamnestic attributes. Some of these are continuous (e.g. age, duration of morning stiffness) and some are discrete (e.g. joint pain, which can be arthrotic, arthritic, or not present at all). There are eight possible diagnoses, including:
- degenerative spine diseases
- inflammatory spine diseases
- other inflammatory diseases
- extraarticular rheumatism
- crystal-induced synovitis
- non-specific rheumatic manifestations
- non-rheumatic diseases
Selected Medical Data Mining Techniques
To classify (diagnose) new cases, machine learning methods can take different approaches:
- They can construct explicit symbolic rules that generalize the training cases (rule induction and decision tree induction). The induced rules or decision trees can then be used to classify new cases.
- They can store (some of) the training cases for reference (instance-based learning). New cases can then be classified by comparing them to the reference cases.
- They can compute, for a given case to be classified, the conditional probability of each class according to the Bayesian formula and assign the most probable class to the case.
How does data mining work?
While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two. Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries. Several types of analytical software are available: statistical, machine learning, and neural networks. Generally, any of four types of relationships are sought:
- Classes: Stored data is used to locate data in predetermined groups. For example, a restaurant chain could mine customer purchase data to determine when customers visit and what they typically order. This information could be used to increase traffic by having daily specials.
- Clusters: Data items are grouped according to logical relationships or consumer preferences. For example, data can be mined to identify market segments or consumer affinities.
- Associations: Data can be mined to identify associations. The beer-diaper example is an example of associative mining.
- Sequential patterns: Data is mined to anticipate behavior patterns and trends. For example, an outdoor equipment retailer could predict the likelihood of a backpack being purchased based on a consumer's purchase of sleeping bags and hiking shoes.
Five major elements:
- Extract, transform, and load transaction data onto the data warehouse system.
- Store and manage the data in a multidimensional database system.
- Provide data access to business analysts and information technology professionals.
- Analyze the data by application software.
- Present the data in a useful format, such as a graph or table.
Different levels of analysis
- Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.
- Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution.
- Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi Square Automatic Interaction Detection (CHAID). CART and CHAID are decision tree techniques used for classification of a dataset. They provide a set of rules that you can apply to a new (unclassified) dataset to predict which records will have a given outcome. CART segments a dataset by creating 2-way splits, while CHAID segments using chi square tests to create multi-way splits. CART typically requires less data preparation than CHAID.
- Nearest neighbor method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k ≥ 1). Sometimes called the k-nearest neighbor technique.
- Rule induction: The extraction of useful if-then rules from data based on statistical significance.
- Data visualization: The visual interpretation of complex relationships in multidimensional data. Graphics tools are used to illustrate data relationships.
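
The nearest neighbor method listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any particular product: the function name `knn_classify`, the use of Euclidean distance, the simple majority vote and the toy records are all hypothetical choices made for this example.

```python
import math
from collections import Counter

def knn_classify(history, query, k=3):
    """Classify `query` by majority vote among the k most similar historical records.

    `history` is a list of (feature_vector, class_label) pairs. Similarity is
    measured by Euclidean distance, which is an assumption of this sketch.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort historical records by distance to the query and keep the k closest.
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    # Combine the classes of the k neighbours by a simple majority vote.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical historical dataset: two numeric features per record.
history = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

print(knn_classify(history, (1.1, 1.0), k=3))
print(knn_classify(history, (5.0, 5.2), k=3))
```

With k = 1 the rule reduces to copying the class of the single closest record; larger k trades sensitivity to noise against blurring of small classes.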
What technological infrastructure is required?
Today, data mining applications are available on systems of all sizes for mainframe, client/server, and PC platforms. System prices range from several thousand dollars for the smallest applications up to $1 million a terabyte for the largest. Enterprise-wide applications generally range in size from 10 gigabytes to over 11 terabytes. There are two critical technological drivers:
- Size of the database: the more data being processed and maintained, the more powerful the system required.
- Query complexity: the more complex the queries and the greater the number of queries being processed, the more powerful the system required.
Relational database storage and management technology is adequate for many data mining applications of less than 50 gigabytes. However, this infrastructure needs to be significantly enhanced to support larger applications. Some vendors have added extensive indexing capabilities to improve query performance. Others use new hardware architectures such as Massively Parallel Processors (MPP) to achieve order-of-magnitude improvements in query time. For example, MPP systems from NCR link hundreds of high-speed Pentium processors to achieve performance levels exceeding those of the largest supercomputers.
Software Design
Algorithms
Decision Tree (I)
The Decision Tree exploration engine helps solve the task of classifying cases into multiple categories. Decision Tree is the fastest algorithm when dealing with large numbers of attributes. The Decision Tree report provides an easily interpreted decision tree diagram and a predicted versus real table.
Problems to Solve:
- Classification of cases into multiple categories
Target Attributes:
- Categorical or Boolean (Yes/No) attribute
Output Format:
- Classification statistics
- Predicted versus Real table (confusion matrix)
- Decision Tree diagram
Optimal Number of Records:
- Minimum of 100 records
- Maximum of 5,000,000 records
Decision Tree (II)
Preprocessing Suggested: Summary Statistics, to deselect attributes that contain too many values to provide any useful insight to the exploration engine.
Underlying Algorithms: Information Gain splitting criteria, Shannon information theory and statistical significance tests.
The Data Used: Decision Tree works on data of any type. The DT algorithm is well suited to analyzing very large databases because it does not require loading all the data into main memory simultaneously. The software takes full advantage of this feature by implementing incremental DT learning with the help of the OLE DB for Data Mining mechanism. The DT algorithm calculation time scales very well (grows only linearly) with an increasing number of data columns. At the same time, it grows more than linearly with the number of data records: as N*log(N), where N is the number of records.
Problems to Solve: The Decision Tree algorithm helps solve the task of classifying cases into multiple categories. In many cases, this is the fastest, as well as the most easily interpreted, machine learning algorithm. The DT algorithm provides intuitive rules for solving a great variety of classification tasks, ranging from predicting buyers/non-buyers in database marketing, to automatically diagnosing patients in medicine, to determining customer attrition causes in banking and insurance.
Target Attribute: The target attribute of a Decision Tree exploration must be of a Boolean (yes/no) or categorical data type.
When to Use This Algorithm: The Decision Tree exploration engine is used for tasks such as classifying records or predicting outcomes. You should use decision trees when your goal is to assign your records to a few broad categories. Decision Trees provide easily understood rules that can help you identify the best fields for further exploration.
The Output: The Decision Tree report starts off by giving measures resulting from the decision tree. These measures are the number of non-terminal nodes, the number of leaves, and the depth of the constructed tree. Next, the report provides classification statistics on the decision tree. After these measures, the predicted versus real table is shown.
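
The Information Gain splitting criterion named above can be illustrated with a short Python sketch. It is a generic textbook formulation based on Shannon entropy, not the engine's actual code; the function names `entropy` and `information_gain` and the toy attribute values are assumptions made for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(attribute_values, labels):
    """Entropy reduction obtained by splitting `labels` on a categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical records: one class label and two candidate attributes.
labels = ["y", "y", "y", "n", "n", "n", "n", "y"]
weak = ["a", "a", "a", "a", "b", "b", "b", "b"]      # each branch stays mixed 3:1
perfect = ["p", "p", "p", "q", "q", "q", "q", "p"]   # each branch is pure

print(round(information_gain(weak, labels), 3))
print(round(information_gain(perfect, labels), 3))
```

A tree builder would evaluate this quantity for every candidate attribute at a node and split on the one with the highest gain, which is why attributes with very many distinct values are worth deselecting beforehand: they can look informative simply by fragmenting the data.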
Cluster Analysis
The Cluster engine is used for the automated detection of clusters of records that lie close to each other, in a certain sense, in the space of all variables. Such clusters may represent different situations or target groups, which one might find beneficial to study separately. The Cluster engine places records corresponding to different clusters in separate datasets for further analysis. Cluster analysis proves useful for applications ranging from database marketing to quality control.
The use of all attributes makes the Cluster algorithm very useful for beginning data mining: it is an undirected method and does not require the selection of a target attribute.
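
The slide does not name the clustering algorithm behind the Cluster engine. Purely as an illustration of undirected grouping of records that lie close together, the sketch below uses k-means, which is a substituted technique and not necessarily what the engine does; the function name `kmeans`, the explicit initial centroids and the toy points are assumptions.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids.

    `centroids` are the caller-supplied starting positions (an assumption of this
    sketch, chosen to keep the result deterministic).
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical records with two numeric variables each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
```

No target attribute appears anywhere in the call, which is the sense in which the method is undirected; the records of each cluster could then be written to separate datasets for further study.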
Fuzzy Logic Classification
The algorithm is used for assigning cases to different classes. On output, this exploration engine not only produces the prediction of the class to which the case belongs, but also provides the obtained symbolic classification rule, generalized automatically from the training examples. The classifier engine furnishes simpler and more reliable results than systems based on a pure decision tree ideology. The prediction accuracy obtained for the testing cases is comparable to the accuracy obtained for the training cases. And again, the statistical significance of the generalized rule is determined rigorously by the classifier engine. Note that the classifier engine can utilize either the SKAT, MLR or neural network prediction method as its driving mechanism.
Linear Regression
The Stepwise Linear Regression algorithm is, to our knowledge, the only system capable of including categorical variables, in addition to numerical and logical variables, in the regression analysis.
MLR discovers linear relations in data, automatically selecting only those independent variables which influence the target variable most. It also pinpoints redundant, mutually correlating independent variables, and includes only their minimal subset in the results.
Linear Regression is based on a very quick and robust calculation algorithm. As with all the other exploration engines, a rigorous determination of the significance of the obtained results is performed for each model considered. MLR is the fastest exploration engine and thus can be used as a complementary preprocessing module for the SKAT exploration engine.
Symbolic Knowledge Acquisition Technology (SKAT)
Data mining is one of the most promising modern information technologies. The corporate world has learned to derive new value from data by utilizing various intelligent tools and algorithms designed for the automated discovery of non-trivial, useful, and previously unknown knowledge in raw data.
Which factors influence the future variation of the price of some security shares?
What characteristics of a potential customer of some service make him/her the most probable buyer?
These and numerous other business questions can be successfully addressed by data mining.
The majority of available data mining tools are based on a few well-established technologies for data analysis. Different knowledge discovery methods are best suited to different applications. Among the useful knowledge presentation tasks one can name dependency detection, numerical prediction, explicit relation modeling, and classification rules.
Despite the usefulness of traditional data mining methods in various situations, we choose to concentrate here first on the problems that plague these methods. Then we discuss the solutions to these problems, which become available with the advent of SKAT, a next-generation data mining technology. We outline the reasons, foundations, and commercial implementations of this emerging approach.
Symbolic Knowledge Acquisition Technology (SKAT)
Among the various tasks a data mining system is asked to perform, two questions are encountered most frequently:
- Which database fields influence the selected target field?
- Precisely how does the target field depend on other fields in the database?
While there are many successful methods designed to answer the first question, it is far more difficult to answer the second. Why does this happen? Simply put, the observation that, across a number of cases with close values of all parameters except some parameter X, the target parameter Y varies considerably implies that Y depends on X. For multi-dimensional dependencies the issue becomes less straightforward, but the basic idea for solving the problem is the same. By contrast, the task of automatically determining the explicit form of the dependence between several variables is significantly more difficult. The solution to this problem cannot be based on similar simple-minded considerations.
Symbolic Knowledge Acquisition Technology (SKAT)
Traditional methods for finding the precise form of a sought relation search for an expression representing the dependence among the possible expressions of some fixed class. This idea is exploited in many existing data mining applications. For example, one of the most straightforward and popular methods of searching for simple numerical dependencies, linear regression, selects a solution out of a set of linear formulae approximating the sought dependence. Systems from another popular class of data mining algorithms, decision trees, search for classification rules represented as trees involving simple equalities and inequalities in the nodes, connected by Boolean AND and OR operations.
Symbolic Knowledge Acquisition Technology (SKAT)
However, beyond the limits of the narrow classes of dependencies that can be found by these systems there is an endless sea of dependencies which cannot even be represented in the language used by these systems. For example, assume you are using a decision tree system to analyze data holding the following simple rule: "most frequent buyers of Post cereal are homemakers of age smaller than the inverse square of their family income multiplied by a certain constant". A traditional system has no means to discover such a rule. Only if one explicitly furnishes to the system the parameter "inverse square of the family income" can the stated rule be found by traditional systems. In other words, one has to guess an approximate form of the solution first, and then the machine does the rest of the job efficiently. While guessing a general form of the solution prior to automated modeling might be a challenging brain twister, it certainly does not make the life of a corporate data analyst much easier.
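
The point about having to guess the form of the solution can be made concrete with a small experiment. The sketch below is an illustration under assumptions, not a model of any real system: the records form a synthetic grid, the constant 400 is hypothetical, and a single-threshold test stands in for one node of a decision tree. The hand-built parameter here is `age * income**2`, i.e. age divided by the inverse square of income, which turns the stated rule into one comparison with a constant.

```python
def best_single_threshold_accuracy(values, labels):
    """Best accuracy achievable by one test of the form 'value < t' (or its negation).

    `labels` are 0/1. Thresholds are only tried between distinct sorted values.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_pos = sum(lab for _, lab in pairs)
    best = max(total_pos, n - total_pos) / n  # no split: always predict the majority
    pos_left = 0
    for i in range(1, n):
        pos_left += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot place a threshold between equal values
        neg_right = (n - i) - (total_pos - pos_left)
        correct = pos_left + neg_right  # rule: value < t  =>  positive
        best = max(best, correct / n, (n - correct) / n)
    return best

# Synthetic grid of (age, income) records; income is in arbitrary hypothetical units.
records = [(age, income) for age in range(20, 81, 10) for income in range(1, 7)]
C = 400  # hypothetical constant in the rule "age < C / income^2"
labels = [1 if age * income ** 2 < C else 0 for age, income in records]

acc_age = best_single_threshold_accuracy([a for a, _ in records], labels)
acc_income = best_single_threshold_accuracy([i for _, i in records], labels)
acc_derived = best_single_threshold_accuracy([a * i ** 2 for a, i in records], labels)

print(acc_age, acc_income, acc_derived)
```

On the raw attributes no single threshold separates buyers from non-buyers, whereas the derived parameter does so exactly; the catch, as the slide notes, is that someone had to know which parameter to construct.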
Case Study: Bayesian Classification
Bayesian Classification: Why?
- Probabilistic learning: Calculates explicit probabilities for hypotheses; among the most practical approaches to certain types of learning problems.
- Incremental: Each training example can incrementally increase or decrease the probability that a hypothesis is correct. Prior knowledge can be combined with observed data.
- Probabilistic prediction: Predicts multiple hypotheses, weighted by their probabilities.
- Standard: Even when Bayesian methods are computationally intractable, they can provide a standard of optimal decision making against which other methods can be measured.
Bayesian Theorem: Basics
- Let X be a data sample whose class label is unknown.
- Let H be the hypothesis that X belongs to class C.
- For classification problems, determine P(H|X): the probability that the hypothesis holds given the observed data sample X.
- P(H): prior probability of hypothesis H (i.e. the initial probability before we observe any data; it reflects the background knowledge).
- P(X): probability that the sample data is observed.
- P(X|H): probability of observing the sample X, given that the hypothesis holds.
Bayesian Theorem
Given training data X, the posterior probability of a hypothesis H, P(H|X), follows Bayes' theorem:
$$P(H \mid X) = \frac{P(X \mid H)\,P(H)}{P(X)}$$
Informally, this can be written as: posterior = likelihood x prior / evidence
MAP (maximum a posteriori) hypothesis:
$$h_{MAP} = \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} P(D \mid h)\,P(h)$$
Practical difficulty: requires initial knowledge of many probabilities and incurs significant computational cost.
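
A short numeric sketch may help fix the roles of prior, likelihood and evidence. All numbers below are hypothetical and chosen only for illustration (a condition with a 1% prior, a finding seen in 90% of affected and 5% of unaffected cases); the function names `posterior` and `map_hypothesis` are likewise assumptions of this example.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """P(H|X) from P(H), P(X|H) and P(X|not H), using Bayes' theorem."""
    # Evidence P(X) by the law of total probability over H and not-H.
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    return likelihood * prior / evidence

def map_hypothesis(priors, likelihoods):
    """Return the hypothesis h maximizing P(D|h) * P(h)."""
    return max(priors, key=lambda h: likelihoods[h] * priors[h])

# Hypothetical illustration values, not taken from any dataset.
p = posterior(prior=0.01, likelihood=0.9, likelihood_if_not=0.05)
print(round(p, 3))

priors = {"disease": 0.01, "healthy": 0.99}
likelihoods = {"disease": 0.9, "healthy": 0.05}  # P(finding | h)
print(map_hypothesis(priors, likelihoods))
```

Note that the MAP computation never needs P(X): the evidence is the same for every hypothesis, so it drops out of the argmax.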
Naïve Bayesian Classifier
Each data sample X is represented as a vector $\{x_1, x_2, \dots, x_n\}$.
There are m classes $C_1, C_2, \dots, C_m$.
Given an unknown data sample X, the classifier will predict that X belongs to class $C_i$ iff
$P(C_i \mid X) > P(C_j \mid X)$ for $1 \le j \le m$, $j \ne i$.
By Bayes' theorem, $P(C_i \mid X) = P(X \mid C_i)\,P(C_i) / P(X)$.
Naïve Bayes Classifier
A simplifying assumption: attributes are conditionally independent given the class:
$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$
The joint probability of, say, two elements $x_1$ and $x_2$, given that the current class is C, is the product of the probabilities of each element taken separately, given the same class: $P([x_1, x_2] \mid C) = P(x_1 \mid C) \cdot P(x_2 \mid C)$.
No dependence relation between attributes.
Greatly reduces the computation cost: only the class distribution needs to be counted.
Once the probability $P(X \mid C_i)$ is known, assign X to the class with maximum $P(X \mid C_i) \cdot P(C_i)$.
Training dataset
age      income  student  credit_rating  buys_computer
<=30     high    no       fair           no
<=30     high    no       excellent      no
31...40  high    no       fair           yes
>40      medium  no       fair           yes
>40      low     yes      fair           yes
>40      low     yes      excellent      no
31...40  low     yes      excellent      yes
<=30     medium  no       fair           no
<=30     low     yes      fair           yes
>40      medium  yes      fair           yes
<=30     medium  yes      excellent      yes
31...40  medium  no       excellent      yes
31...40  high    yes      fair           yes
>40      medium  no       excellent      no
Classes:
C1: buys_computer = yes
C2: buys_computer = no
Data sample:
X = (age<=30, income=medium, student=yes, credit_rating=fair)
Naïve Bayesian Classifier: Example
Compute P(X|C_i) for each class:
P(age="<=30" | buys_computer="yes") = 2/9 = 0.222
P(age="<=30" | buys_computer="no") = 3/5 = 0.6
P(income="medium" | buys_computer="yes") = 4/9 = 0.444
P(income="medium" | buys_computer="no") = 2/5 = 0.4
P(student="yes" | buys_computer="yes") = 6/9 = 0.667
P(student="yes" | buys_computer="no") = 1/5 = 0.2
P(credit_rating="fair" | buys_computer="yes") = 6/9 = 0.667
P(credit_rating="fair" | buys_computer="no") = 2/5 = 0.4
X = (age<=30, income=medium, student=yes, credit_rating=fair)
P(X|C_i):
P(X | buys_computer="yes") = 0.222 x 0.444 x 0.667 x 0.667 = 0.044
P(X | buys_computer="no") = 0.6 x 0.4 x 0.2 x 0.4 = 0.019
P(X|C_i) * P(C_i), with P(buys_computer="yes") = 9/14 = 0.643 and P(buys_computer="no") = 5/14 = 0.357:
P(X | buys_computer="yes") * P(buys_computer="yes") = 0.044 x 0.643 = 0.028
P(X | buys_computer="no") * P(buys_computer="no") = 0.019 x 0.357 = 0.007
X belongs to class "buys_computer=yes"
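
The hand calculation above can be reproduced by counting directly over the 14 training records. The following is a minimal sketch of a count-based naïve Bayes scorer; the function names `train` and `score` are assumptions of this example, and no smoothing is applied, so an attribute value never seen with a class would force that class's score to zero.

```python
from collections import Counter, defaultdict

# Training records: (age, income, student, credit_rating, buys_computer).
ROWS = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31...40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31...40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31...40", "medium", "no", "excellent", "yes"),
    ("31...40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]

def train(rows):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for row in rows:
        for idx, value in enumerate(row[:-1]):
            value_counts[(idx, row[-1])][value] += 1
    return class_counts, value_counts

def score(sample, class_counts, value_counts):
    """Return P(X|C) * P(C) for every class C, assuming conditional independence."""
    total = sum(class_counts.values())
    scores = {}
    for cls, n_cls in class_counts.items():
        s = n_cls / total  # prior P(C)
        for idx, value in enumerate(sample):
            s *= value_counts[(idx, cls)][value] / n_cls  # P(x_k | C)
        scores[cls] = s
    return scores

class_counts, value_counts = train(ROWS)
sample = ("<=30", "medium", "yes", "fair")
scores = score(sample, class_counts, value_counts)
prediction = max(scores, key=scores.get)
print({c: round(s, 3) for c, s in scores.items()}, prediction)
```

Training is a single pass of counting, which is the reduction in computation cost that the independence assumption buys.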
Naïve Bayesian Classifier: Comments
Advantages:
- Easy to implement
- Good results obtained in most of the cases
Disadvantages:
- Assumption: class conditional independence, therefore loss of accuracy
- Practically, dependencies exist among variables
- E.g., hospital patients: Profile: age, family history, etc.; Symptoms: fever, cough, etc.; Disease: lung cancer, diabetes, etc.
- Dependencies among these cannot be modeled by a Naïve Bayesian Classifier; use a Bayesian network
How to deal with these dependencies?
- Bayesian Belief Networks
Naïve Bayesian Classifier: Example II
Given a training set, we can compute the probabilities (P and N are the two classes):
Attribute    Value     P    N
Outlook      sunny     2/9  3/5
Outlook      overcast  4/9  0
Outlook      rain      3/9  2/5
Temperature  hot       2/9  2/5
Temperature  mild      4/9  2/5
Temperature  cool      3/9  1/5
Humidity     high      3/9  4/5
Humidity     normal    6/9  1/5
Windy        true      3/9  3/5
Windy        false     6/9  2/5
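
To show how such a table is used, the sketch below classifies one sample with exact fractions. Two things are assumptions rather than content of the slide: the class priors 9/14 and 5/14, inferred from the denominators in the table (9 records of class P, 5 of class N), and the sample itself (rain, hot, high humidity, not windy), which is chosen only for illustration.

```python
from fractions import Fraction as F

# Conditional probabilities P(value | class), transcribed from the table.
CONDITIONAL = {
    "P": {
        "outlook": {"sunny": F(2, 9), "overcast": F(4, 9), "rain": F(3, 9)},
        "temperature": {"hot": F(2, 9), "mild": F(4, 9), "cool": F(3, 9)},
        "humidity": {"high": F(3, 9), "normal": F(6, 9)},
        "windy": {"true": F(3, 9), "false": F(6, 9)},
    },
    "N": {
        "outlook": {"sunny": F(3, 5), "overcast": F(0), "rain": F(2, 5)},
        "temperature": {"hot": F(2, 5), "mild": F(2, 5), "cool": F(1, 5)},
        "humidity": {"high": F(4, 5), "normal": F(1, 5)},
        "windy": {"true": F(3, 5), "false": F(2, 5)},
    },
}

# Assumed class priors, inferred from the table's denominators (9 and 5 of 14).
PRIOR = {"P": F(9, 14), "N": F(5, 14)}

def classify(sample):
    """Return (best class, scores) where each score is P(X|C) * P(C)."""
    scores = {}
    for cls, prior in PRIOR.items():
        s = prior
        for attribute, value in sample.items():
            s *= CONDITIONAL[cls][attribute][value]
        scores[cls] = s
    return max(scores, key=scores.get), scores

# Hypothetical unseen sample.
sample = {"outlook": "rain", "temperature": "hot", "humidity": "high", "windy": "false"}
label, scores = classify(sample)
print(label, {c: float(s) for c, s in scores.items()})
```

The zero entry for (overcast, N) illustrates a known weakness of raw frequency estimates: any overcast sample gets score 0 for class N regardless of its other attributes.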
Bayesian Networks
A Bayesian belief network allows a subset of the variables to be conditionally independent.
A graphical model of causal relationships:
- Represents dependency among the variables
- Gives a specification of the joint probability distribution
[Figure: directed graph with nodes X, Y, Z, P]
- Nodes: random variables
- Links: dependency
- X and Y are the parents of Z, and Y is the parent of P
- No dependency between Z and P
- Has no loops or cycles
Bayesian Belief Network: An Example
[Figure: Bayesian belief network with nodes FamilyHistory, Smoker, LungCancer, Emphysema, PositiveXRay, Dyspnea]
The conditional probability table for the variable LungCancer (LC) shows the conditional probability for each possible combination of its parents, FamilyHistory (FH) and Smoker (S):
      (FH, S)  (FH, ~S)  (~FH, S)  (~FH, ~S)
LC    0.8      0.5       0.7       0.1
~LC   0.2      0.5       0.3       0.9
The joint probability of a tuple of variable values is the product of each variable's probability given its parents:
$$P(z_1, \dots, z_n) = \prod_{i=1}^{n} P(z_i \mid Parents(Z_i))$$
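
The factorization can be tried out on the three variables for which the slide gives numbers. The conditional probability table for LungCancer is taken from the slide; the priors `P_FH` and `P_S` are hypothetical values invented for this sketch, and treating FamilyHistory and Smoker as parentless root nodes is an assumption, since the figure's edges are not recoverable from the transcript.

```python
from itertools import product

# Hypothetical priors for the two parent variables (not given in the slides).
P_FH = 0.2  # assumed P(FamilyHistory = true)
P_S = 0.3   # assumed P(Smoker = true)

# P(LungCancer = true | FamilyHistory, Smoker), from the conditional probability table.
P_LC_GIVEN = {
    (True, True): 0.8,
    (True, False): 0.5,
    (False, True): 0.7,
    (False, False): 0.1,
}

def joint(fh, s, lc):
    """P(FH=fh, S=s, LC=lc) as the product of each node's probability given its parents."""
    p_fh = P_FH if fh else 1.0 - P_FH
    p_s = P_S if s else 1.0 - P_S
    p_lc_true = P_LC_GIVEN[(fh, s)]
    p_lc = p_lc_true if lc else 1.0 - p_lc_true
    return p_fh * p_s * p_lc

# One entry of the joint distribution.
p_all_true = joint(True, True, True)

# Marginal P(LungCancer = true): sum the joint over all parent combinations.
p_lc = sum(joint(fh, s, True) for fh, s in product([True, False], repeat=2))

print(round(p_all_true, 3), round(p_lc, 3))
```

Only four numbers are needed for LungCancer here, one per parent combination, rather than a full table over every variable in the network; that locality is what makes the joint distribution tractable to specify.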