Classification: Decision Trees · Basic Algorithm for Top-Down Learning of Decision Trees [ID3, C4.5 by Quinlan]
Classification: Decision Trees

Robot Image Credit: Viktoriya Sukhanova © 123RF.com

These slides were assembled by Byron Boots, with grateful acknowledgement to Eric Eaton and the many others who made their course materials freely available online. Feel free to reuse or adapt these slides for your own academic purposes, provided that you include proper attribution.
Last Time
• Common decomposition of machine learning, based on differences in inputs and outputs:
– Supervised Learning: Learn a function mapping inputs to outputs using labeled training data (you get instances/examples with both inputs and ground-truth outputs)
– Unsupervised Learning: Learn something about just the data, without any labels (harder!), for example clustering instances that are "similar"
– Reinforcement Learning: Learn how to make decisions given a sparse reward
• ML is an interaction between:
– the input data
– the input to the function (features/attributes of the data)
– the function (model) you choose, and
– the optimization algorithm you use to explore the space of functions
• We are new at this, so let's start by learning if/then rules for classification!
Function Approximation

Problem Setting:
• Set of possible instances X
• Set of possible labels Y
• Unknown target function f : X → Y
• Set of function hypotheses H = {h | h : X → Y}

Input: training examples {⟨x_1, y_1⟩, …, ⟨x_n, y_n⟩} = {⟨x_i, y_i⟩}_{i=1}^{n} of the unknown target function f

Output: hypothesis h ∈ H that best approximates f

Based on slide by Tom Mitchell
Sample Dataset (was Tennis played?)
• Columns denote features X_i
• Rows denote labeled instances ⟨x_i, y_i⟩
• Class label denotes whether a tennis game was played
Decision Tree
• A possible decision tree for the data:
• Each internal node: test one attribute X_i
• Each branch from a node: selects one value for X_i
• Each leaf node: predict Y

Based on slide by Tom Mitchell
Decision Tree
• A possible decision tree for the data:
• What prediction would we make for ⟨outlook=sunny, temperature=hot, humidity=high, wind=weak⟩?

Based on slide by Tom Mitchell
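The tree on this slide is only shown as an image; a minimal sketch of walking such a tree is below, assuming the standard PlayTennis tree from Mitchell's textbook (the exact tree in the figure may differ). Internal nodes test one attribute, branches select a value, and leaves predict the label.

```python
# Assumed PlayTennis tree (standard Mitchell example, not read from the figure).
tennis_tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"attribute": "humidity",
                  "branches": {"high": "No", "normal": "Yes"}},
        "overcast": "Yes",
        "rain": {"attribute": "wind",
                 "branches": {"strong": "No", "weak": "Yes"}},
    },
}

def predict(node, instance):
    """Follow one root-to-leaf path, branching on each tested attribute."""
    while isinstance(node, dict):
        node = node["branches"][instance[node["attribute"]]]
    return node

x = {"outlook": "sunny", "temperature": "hot", "humidity": "high", "wind": "weak"}
print(predict(tennis_tree, x))  # the sunny / high-humidity path ends in "No"
```

Note that `temperature` is never consulted: the tree only tests the attributes on the path it follows.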
Decision Tree – Decision Boundary
• Decision trees divide the feature space into axis-parallel (hyper-)rectangles
• Each rectangular region is labeled with one label – or a probability distribution over labels
Expressiveness
• Given a particular space of functions, you may not be able to represent everything
• What functions can decision trees represent?
• Decision trees can represent any function of the input attributes!
– Boolean operations (and, or, xor, etc.)? – Yes!
– All boolean functions? – Yes!
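As a sketch of the worst case, a depth-2 tree can compute XOR by simply enumerating both input attributes, which is how a tree can represent any boolean function (at the cost of exponential size):

```python
# A depth-2 decision tree representing XOR(x1, x2): one branch per input value.
xor_tree = {
    "attribute": "x1",
    "branches": {
        0: {"attribute": "x2", "branches": {0: 0, 1: 1}},
        1: {"attribute": "x2", "branches": {0: 1, 1: 0}},
    },
}

def predict(node, instance):
    while isinstance(node, dict):
        node = node["branches"][instance[node["attribute"]]]
    return node

# The tree reproduces the XOR truth table exactly:
for x1 in (0, 1):
    for x2 in (0, 1):
        assert predict(xor_tree, {"x1": x1, "x2": x2}) == x1 ^ x2
```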
Stages of (Batch) Machine Learning

Given: labeled training data X, Y = {⟨x_i, y_i⟩}_{i=1}^{n}
• Assumes each instance x_i ~ D(X) with label y_i = f_target(x_i)

Train the model:
model ← classifier.train(X, Y)

Apply the model to new data:
• Given: a new unlabeled instance x ~ D(X)
y_prediction ← model.predict(x)
Basic Algorithm for Top-Down Learning of Decision Trees
[ID3, C4.5 by Quinlan]

node = root of decision tree
Main loop:
1. A ← the "best" decision attribute for the next node.
2. Assign A as decision attribute for node.
3. For each value of A, create a new descendant of node.
4. Sort training examples to leaf nodes.
5. If training examples are perfectly classified, stop. Else, recurse over new leaf nodes.

How do we choose which attribute is best?
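The main loop above can be sketched as a short recursive function. This is a minimal illustration, assuming examples are dicts of categorical attribute values; it uses the Max-Gain attribute choice discussed on the next slides and omits practical concerns such as unseen attribute values.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels: -sum_i p_i log2 p_i."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(examples, labels, attributes):
    """Max-Gain choice: minimize the weighted entropy of the children."""
    def remainder(a):
        total = 0.0
        for v in {ex[a] for ex in examples}:
            sub = [y for ex, y in zip(examples, labels) if ex[a] == v]
            total += len(sub) / len(labels) * entropy(sub)
        return total
    return min(attributes, key=remainder)

def id3(examples, labels, attributes):
    if len(set(labels)) == 1:                  # 5. perfectly classified: stop
        return labels[0]
    if not attributes:                         # no tests left: majority label
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(examples, labels, attributes)  # 1. pick "best" attribute
    node = {"attribute": a, "branches": {}}           # 2. assign A to this node
    for v in {ex[a] for ex in examples}:              # 3. one descendant per value
        sub_x = [ex for ex in examples if ex[a] == v]  # 4. sort examples to leaves
        sub_y = [y for ex, y in zip(examples, labels) if ex[a] == v]
        rest = [b for b in attributes if b != a]
        node["branches"][v] = id3(sub_x, sub_y, rest)  # 5. else recurse
    return node

def predict(node, instance):
    while isinstance(node, dict):
        node = node["branches"][instance[node["attribute"]]]
    return node
```

For example, `id3` trained on the four XOR rows with attributes `["a", "b"]` builds a tree that classifies all four training examples correctly.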
Choosing the Best Attribute
Key problem: choosing which attribute to split a given set of examples
• Some possibilities are:
– Random: Select any attribute at random
– Least-Values: Choose the attribute with the smallest number of possible values
– Most-Values: Choose the attribute with the largest number of possible values
– Max-Gain: Choose the attribute that has the largest expected information gain
• i.e., the attribute that results in the smallest expected size of the subtrees rooted at its children
• The ID3 algorithm uses the Max-Gain method of selecting the best attribute
Information Gain

Which test is more informative?
• Split over whether Balance exceeds 50K (branches: Less or equal 50K / Over 50K)
• Split over whether applicant is Employed (branches: Unemployed / Employed)

Based on slide by Pedro Domingos
Information Gain
• Impurity/Entropy (informal): measures the level of impurity in a group of examples

Based on slide by Pedro Domingos
Impurity

(Figure: three example groups, ranging from a very impure group, to a less impure group, to minimum impurity)

Based on slide by Pedro Domingos
Entropy: a common way to measure impurity

• Entropy = −Σ_i p_i log₂ p_i
where p_i is the probability of class i. Compute it as the proportion of class i in the set.
• Entropy comes from information theory. The higher the entropy, the more the information content.

What does that mean for learning from examples?

Based on slide by Pedro Domingos
2-Class Cases:

Entropy: H(X) = −Σ_{i=1}^{n} P(x = i) log₂ P(x = i)

• What is the entropy of a group in which all examples belong to the same class?
– entropy = −1 log₂ 1 = 0
– Minimum impurity: not a good training set for learning
• What is the entropy of a group with 50% in either class?
– entropy = −0.5 log₂ 0.5 − 0.5 log₂ 0.5 = 1
– Maximum impurity: a good training set for learning

Based on slide by Pedro Domingos
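The two cases above can be checked with a few lines of code; this is a minimal sketch that takes class proportions directly and uses the usual convention that 0 · log₂ 0 = 0:

```python
import math

def entropy(probs):
    """H = -sum_i p_i log2 p_i, with the convention 0 * log2(0) = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

pure = entropy([1.0])        # all examples in one class: minimum impurity, 0
even = entropy([0.5, 0.5])   # 50% in either class: maximum impurity, 1
```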
Sample Entropy

• Entropy H(X) of a random variable X:
H(X) = −Σ_i P(X = i) log₂ P(X = i)
• Specific conditional entropy H(X|Y=v) of X given Y=v:
H(X|Y=v) = −Σ_i P(X = i | Y = v) log₂ P(X = i | Y = v)
• Conditional entropy H(X|Y) of X given Y:
H(X|Y) = Σ_v P(Y = v) H(X|Y = v)
• Mutual information (aka Information Gain) of X and Y:
I(X, Y) = H(X) − H(X|Y)

Slide by Tom Mitchell
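These definitions can be sketched as sample estimates from paired lists of values; a minimal illustration, assuming empirical proportions as the probabilities:

```python
import math
from collections import Counter

def H(xs):
    """Sample entropy H(X) of a list of values."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def H_given_v(xs, ys, v):
    """Specific conditional entropy H(X | Y = v)."""
    return H([x for x, y in zip(xs, ys) if y == v])

def H_given(xs, ys):
    """Conditional entropy H(X | Y) = sum_v P(Y = v) H(X | Y = v)."""
    n = len(ys)
    return sum((c / n) * H_given_v(xs, ys, v) for v, c in Counter(ys).items())

def info_gain(xs, ys):
    """Mutual information I(X, Y) = H(X) - H(X | Y)."""
    return H(xs) - H_given(xs, ys)

# When Y determines X completely the gain is all of H(X);
# when X and Y are independent the gain is 0.
assert info_gain([0, 0, 1, 1], [0, 0, 1, 1]) == 1.0
assert info_gain([0, 1, 0, 1], [0, 0, 1, 1]) == 0.0
```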
Information Gain
• We want to determine which attribute in a given set of training feature vectors is most useful for discriminating between the classes to be learned.
• Information gain tells us how important a given attribute of the feature vectors is.
• We will use it to decide the ordering of attributes in the nodes of a decision tree.

Based on slide by Pedro Domingos
From Entropy to Information Gain

• Entropy H(X) of a random variable X
• Specific conditional entropy H(X|Y=v) of X given Y=v
• Conditional entropy H(X|Y) of X given Y
• Mutual information (aka Information Gain) of X and Y
SlidebyTomMitchell
Information Gain

Information Gain is the mutual information between input attribute A and target variable Y.

Information Gain is the expected reduction in entropy of target variable Y for data sample S, due to sorting on variable A.

Slide by Tom Mitchell
Calculating Information Gain

Information Gain = entropy(parent) − [weighted average entropy(children)]

Parent entropy, entire population (30 instances, 14 of one class and 16 of the other):
impurity = −(14/30) · log₂(14/30) − (16/30) · log₂(16/30) = 0.996

Child entropy, 17 instances (13 and 4):
impurity = −(13/17) · log₂(13/17) − (4/17) · log₂(4/17) = 0.787

Child entropy, 13 instances (1 and 12):
impurity = −(1/13) · log₂(1/13) − (12/13) · log₂(12/13) = 0.391

(Weighted) Average Entropy of Children = (17/30) · 0.787 + (13/30) · 0.391 = 0.615

Information Gain = 0.996 − 0.615 = 0.38

Based on slide by Pedro Domingos
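The arithmetic in this worked example can be reproduced in a few lines (the slide rounds to three decimal places, so the computed values agree up to rounding):

```python
import math

def entropy2(a, b):
    """Binary entropy of a group with a examples of one class and b of the other."""
    h = 0.0
    for c in (a, b):
        if c:
            p = c / (a + b)
            h -= p * math.log2(p)
    return h

parent = entropy2(14, 16)                        # ~0.996 (30 instances)
left = entropy2(13, 4)                           # ~0.787 (17-instance child)
right = entropy2(1, 12)                          # ~0.391 (13-instance child)
children = (17 / 30) * left + (13 / 30) * right  # ~0.615 weighted average
gain = parent - children                         # ~0.996 - 0.615 = 0.38
```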
Decision Tree Learning Applet
• http://www.cs.ualberta.ca/%7Eaixplore/learning/DecisionTrees/Applet/DecisionTreeApplet.html

Which Tree Should We Output?
• ID3 performs a heuristic search through the space of decision trees
• It stops at the smallest acceptable tree. Why?

Occam's razor: prefer the simplest hypothesis that fits the data

Slide by Tom Mitchell
Preference bias: Ockham's Razor
• Principle stated by William of Ockham (1285-1347)
– "non sunt multiplicanda entia praeter necessitatem"
– entities are not to be multiplied beyond necessity
– AKA Occam's Razor, Law of Economy, or Law of Parsimony

Idea: The simplest consistent explanation is the best

• Therefore, the smallest decision tree that correctly classifies all of the training examples is best
• Finding the provably smallest decision tree is NP-hard
• ...So instead of constructing the absolute smallest tree consistent with the training examples, construct one that is pretty small
Overfitting in Decision Trees
• Many kinds of "noise" can occur in the examples:
– Two examples have the same attribute/value pairs, but different classifications
– Some values of attributes are incorrect because of errors in the data acquisition process or the preprocessing phase
– The instance was labeled incorrectly (+ instead of −)
• Also, some attributes are irrelevant to the decision-making process
– e.g., the color of a die is irrelevant to its outcome

Based on slide from M. desJardins & T. Finin
Overfitting in Decision Trees
• Irrelevant attributes can result in overfitting the training example data
– If the hypothesis space has many dimensions (a large number of attributes), we may find meaningless regularity in the data that is irrelevant to the true, important, distinguishing features
• If we have too little training data, even a reasonable hypothesis space will 'overfit'

Based on slide from M. desJardins & T. Finin
Overfitting in Decision Trees
Consider adding a noisy training example to the following tree:
What would be the effect of adding ⟨outlook=sunny, temperature=hot, humidity=normal, wind=strong, playTennis=No⟩?

Based on slide by Pedro Domingos
Avoiding Overfitting in Decision Trees
How can we avoid overfitting?
• Stop growing when the data split is not statistically significant
• Acquire more training data
• Remove irrelevant attributes (manual process – not always possible)
• Grow the full tree, then post-prune

How to select the "best" tree:
• Measure performance over the training data
• Measure performance over a separate validation data set
• Add a complexity penalty to the performance measure (heuristic: simpler is better)

Based on slide by Pedro Domingos
Reduced-Error Pruning

Split training data further into training and validation sets

Grow tree based on the training set

Do until further pruning is harmful:
1. Evaluate the impact on the validation set of pruning each possible node (plus those below it)
2. Greedily remove the node that most improves validation set accuracy

Slide by Pedro Domingos
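A minimal sketch of this idea over dict-based trees is below. It simplifies the slide's loop into a single bottom-up pass: children are pruned first, then a node is replaced by the majority-class leaf of its training examples whenever that does not hurt validation accuracy. Names and the tree representation are illustrative, not from the slides.

```python
from collections import Counter

def predict(node, instance):
    while isinstance(node, dict):
        node = node["branches"][instance[node["attribute"]]]
    return node

def errors(tree, data):
    """Misclassified (instance, label) pairs; a bare label acts as a leaf."""
    return sum(predict(tree, x) != y for x, y in data)

def prune(tree, train, val):
    """Bottom-up reduced-error pruning (simplified single pass)."""
    if not isinstance(tree, dict) or not train:
        return tree
    a = tree["attribute"]
    for v in list(tree["branches"]):
        tr = [(x, y) for x, y in train if x[a] == v]
        va = [(x, y) for x, y in val if x[a] == v]
        tree["branches"][v] = prune(tree["branches"][v], tr, va)
    majority = Counter(y for _, y in train).most_common(1)[0][0]
    if errors(majority, val) <= errors(tree, val):
        return majority   # collapsing this subtree helps (or ties) on validation
    return tree
```

On the Color example from the next slide, the red/blue subtree makes 4 validation errors while the majority-class leaf ("negative") makes only 2, so `prune` collapses the subtree to that leaf.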
Pruning Decision Trees
• Pruning of the decision tree is done by replacing a whole subtree by a leaf node.
• The replacement takes place if a decision rule establishes that the expected error rate in the subtree is greater than in the single leaf.
• For example:

Training set, at a Color node:
– red branch: 1 positive, 0 negative
– blue branch: 0 positive, 2 negative

Validation set, at the same Color node:
– red branch: 1 positive, 3 negative
– blue branch: 1 positive, 1 negative

On the validation set the subtree gets 2 correct and 4 incorrect. If we had simply predicted the majority class (negative), we would make 2 errors instead of 4.

Based on example from M. desJardins & T. Finin
Effect of Reduced-Error Pruning

• On the training data it looks great
• But that's not the case for the test (validation) data

Based on slide by Pedro Domingos
Effect of Reduced-Error Pruning

The tree is pruned back to the red line, where it gives more accurate results on the test data

Based on slide by Pedro Domingos
Summary: Decision Tree Learning
• Widely used in practice
• Strengths include:
– Fast and simple to implement
– Can convert to rules
– Handles noisy data
• Weaknesses include:
– Univariate splits/partitioning using only one attribute at a time, which limits the types of possible trees
– Large decision trees may be hard to understand
– Requires fixed-length feature vectors
– Non-incremental (i.e., batch method)
Summary: Decision Tree Learning
• Representation: decision trees
• Bias: prefer small decision trees
• Search algorithm: greedy
• Heuristic function: information gain or information content or others
• Overfitting / pruning

Slide by Pedro Domingos