Fuzzy-Rough Feature Significance for Fuzzy Decision Trees

Advanced Reasoning Group, Department of Computer Science, The University of Wales, Aberystwyth

Richard Jensen, Qiang Shen


Page 1: Fuzzy-Rough Feature Significance for Fuzzy Decision Trees Advanced Reasoning Group Department of Computer Science The University of Wales, Aberystwyth.

Fuzzy-Rough Feature Significance for Fuzzy Decision Trees

Advanced Reasoning Group
Department of Computer Science

The University of Wales, Aberystwyth

Richard Jensen
Qiang Shen

Page 2:

Outline

• Utility of decision tree induction

• Importance of attribute selection

• Introduction of fuzzy-rough concepts

• Evaluation of the fuzzy-rough metric

• Results of F-ID3 vs FR-ID3

• Conclusions

Page 3:

Decision Trees

• Popular classification algorithm in data mining and machine learning

• Fuzzy decision trees (FDTs) follow similar principles to crisp decision trees

• FDTs allow greater flexibility

• Partitioning of the instance space; attributes are selected to derive partitions

• Hence, attribute selection is an important factor in decision tree quality

Page 4:

Fuzzy Decision Trees

• Object membership
  • Traditionally, node membership of {0,1}
  • Here, membership is any value in the range [0,1]
  • Calculated from conjunction of membership degrees along path to the node

• Fuzzy tests
  • Carried out within nodes to determine the membership of feature values to fuzzy sets

• Stopping criteria

• Measure of feature significance
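As a rough illustration of object membership, the sketch below computes a node membership as the conjunction of fuzzy-test results along the path to that node. The triangular membership functions, the use of `min` as the conjunction, and the names `triangular` and `node_membership` are assumptions made for illustration; the slides do not specify them.

```python
def triangular(a, b, c):
    """Return a triangular membership function peaking at b (assumes a < b < c)."""
    def mu(v):
        if v <= a or v >= c:
            return 0.0
        if v <= b:
            return (v - a) / (b - a)
        return (c - v) / (c - b)
    return mu


def node_membership(instance, path, t_norm=min):
    """Combine the fuzzy-test degrees along a path with a t-norm.

    `path` is a list of (feature name, membership function) pairs,
    one per fuzzy test between the root and the node.
    """
    degree = 1.0  # every object fully belongs to the root
    for feature, mu in path:
        degree = t_norm(degree, mu(instance[feature]))
    return degree


# Hypothetical fuzzy sets and instance, for illustration only.
warm = triangular(10, 20, 30)
tall = triangular(150, 180, 210)
instance = {"temp": 15, "height": 180}
membership = node_membership(instance, [("temp", warm), ("height", tall)])
```

With crisp tests every degree is 0 or 1, so the same function reproduces the traditional {0,1} node membership.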

Page 5:

Decision Tree Algorithm

Training set S and (optionally) depth of decision tree l

Start to form decision tree from the top level,

Do loop until (1) the depth of the tree gets to l or (2) there is no node to expand

a) Gauge significance of each attribute of S not already expanded in this branch

b) Expand the attribute with the most significance

c) Stop expansion of the leaf node of attribute if maximum significance obtained

End do loop
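The loop above can be sketched as a recursive procedure with a pluggable significance measure. This is a minimal crisp sketch under stated assumptions, not the authors' implementation: the names `induce_tree` and `dependency`, the nested-dict tree format, and the toy data are all hypothetical, and step (c) is approximated by stopping when a node contains a single class.

```python
from collections import Counter, defaultdict


def dependency(rows, labels, attr):
    """Illustrative significance measure (an assumption): the fraction of
    rows whose attribute-value group contains a single class."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    pure = sum(len(g) for g in groups.values() if len(set(g)) == 1)
    return pure / len(rows)


def induce_tree(rows, labels, attributes, significance, max_depth=None, depth=0):
    """Return a nested dict; leaves are class labels."""
    majority = Counter(labels).most_common(1)[0][0]
    depth_reached = max_depth is not None and depth >= max_depth
    # Stop: depth limit l reached, nothing left to expand, or node is pure.
    if depth_reached or not attributes or len(set(labels)) == 1:
        return majority
    # a) gauge each attribute not yet expanded in this branch,
    # b) expand the one with the most significance.
    best = max(attributes, key=lambda a: significance(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    children = {}
    for value in sorted({row[best] for row in rows}):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        children[value] = induce_tree(
            [rows[i] for i in keep], [labels[i] for i in keep],
            remaining, significance, max_depth, depth + 1)
    return {"attribute": best, "children": children}


# Toy data: attribute "a" determines the class, "b" is irrelevant.
rows = [{"a": 0, "b": 0}, {"a": 0, "b": 1}, {"a": 1, "b": 0}, {"a": 1, "b": 1}]
labels = ["n", "n", "y", "y"]
tree = induce_tree(rows, labels, ["a", "b"], dependency)
```

Passing a different `significance` function is the only change needed to swap one measure for another, which mirrors the way the two inducers compared later differ only in their significance measure.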

Page 6:

Feature Significance

• Previous FDT inducers use fuzzy entropy

• Little research in the area of alternatives

• Fuzzy-rough feature significance has been used previously in feature selection with much success

• This can also be used to gauge feature importance within FDT construction

• The fuzzy-rough measure extends concepts from crisp rough set theory

Page 7:

Crisp Rough Sets

$[x]_B$ is the set of all points which are indiscernible with point $x$ in terms of feature subset $B$.

[Figure: set X with its lower approximation and upper approximation, and an equivalence class $[x]_B$]
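To make the crisp notions concrete, the sketch below builds the equivalence classes induced by a feature subset B and uses them to compute the lower and upper approximations of a set X, following the standard rough set definitions. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def equivalence_classes(objects, B):
    """Group objects that are indiscernible on the features in B."""
    classes = defaultdict(set)
    for name, features in objects.items():
        classes[tuple(features[f] for f in B)].add(name)
    return list(classes.values())


def lower_upper(objects, B, X):
    """Lower approximation: union of classes wholly inside X.
    Upper approximation: union of classes that overlap X."""
    lower, upper = set(), set()
    for eq in equivalence_classes(objects, B):
        if eq <= X:
            lower |= eq
        if eq & X:
            upper |= eq
    return lower, upper


# Toy data, for illustration only.
objects = {
    "o1": {"a": 0, "b": 0},
    "o2": {"a": 0, "b": 1},
    "o3": {"a": 1, "b": 0},
    "o4": {"a": 1, "b": 1},
}
X = {"o1", "o2", "o3"}
lower, upper = lower_upper(objects, ["a"], X)
```

Here `o3` and `o4` are indiscernible on feature `a` but only `o3` is in X, so their class falls in the upper approximation and not the lower one.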

Page 8:

Fuzzy Equivalence Classes

• Incorporate vagueness

• Handle real-valued data

• Cope with noisy data

At the centre of Fuzzy-Rough Feature Selection

[Figure: crisp equivalence class compared with fuzzy equivalence class. Image: Rough Fuzzy Hybridization: A New Trend in Decision Making, S. K. Pal and A. Skowron (eds), Springer-Verlag, Singapore, 1999]

Page 9:

Fuzzy-Rough Significance

• Deals with real-valued features via fuzzy sets

• Fuzzy lower approximation:

• Fuzzy positive region:

• Evaluation function:

• Feature importance is estimated with this
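The formulas on this slide did not survive extraction. The sketch below assumes the definitions commonly given in the fuzzy-rough feature selection literature; whether the slide showed exactly these forms is an assumption. For fuzzy equivalence classes $F \in U/P$ and a decision class $X \in U/Q$, the fuzzy lower approximation is taken as $\mu_{\underline{P}X}(x) = \sup_{F \in U/P} \min\big(\mu_F(x), \inf_{y \in U} \max\{1 - \mu_F(y), \mu_X(y)\}\big)$, the fuzzy positive region as $\mu_{POS_P(Q)}(x) = \sup_{X \in U/Q} \mu_{\underline{P}X}(x)$, and the evaluation function as $\gamma'_P(Q) = \sum_{x \in U} \mu_{POS_P(Q)}(x) / |U|$. The function names and the data are hypothetical.

```python
def fuzzy_lower(partition, mu_X):
    """Membership of each object to the fuzzy lower approximation of X.

    `partition` is a list of fuzzy equivalence classes, each a list of
    membership degrees indexed by object; `mu_X` is indexed the same way.
    """
    n = len(mu_X)
    # inf over y of max(1 - mu_F(y), mu_X(y)); it does not depend on x.
    infs = [min(max(1 - F[y], mu_X[y]) for y in range(n)) for F in partition]
    return [max(min(F[x], inf_F) for F, inf_F in zip(partition, infs))
            for x in range(n)]


def gamma_prime(partition, decision_classes):
    """Average membership of the objects to the fuzzy positive region."""
    n = len(decision_classes[0])
    lowers = [fuzzy_lower(partition, X) for X in decision_classes]
    positive = [max(lower[x] for lower in lowers) for x in range(n)]
    return sum(positive) / n


# Two decision classes over four objects (crisp here for readability).
decision = [[1, 1, 0, 0], [0, 0, 1, 1]]
informative = [[1, 1, 0, 0], [0, 0, 1, 1]]      # matches the decision
uninformative = [[1, 0, 1, 0], [0, 1, 0, 1]]    # unrelated to the decision
fuzzy = [[0.8, 0.6, 0.2, 0.1], [0.2, 0.4, 0.8, 0.9]]

g_informative = gamma_prime(informative, decision)
g_uninformative = gamma_prime(uninformative, decision)
g_fuzzy = gamma_prime(fuzzy, decision)
```

On crisp inputs the measure reduces to the crisp rough set dependency degree, which is why the informative and uninformative partitions sit at the two extremes and the fuzzy partition falls between them.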

Page 10:

Evaluation

• Is the γ' metric a useful gauge of feature significance?

• γ' metric compared with leading feature rankers:
  • Information Gain, Gain Ratio, Chi², Relief, OneR

• Applied to test data:
  • 30 random feature values for 400 objects
  • 2 or 3 features used to determine classification

• Task: locate those features that affect the decision

Page 11:

Evaluation…

• Results for $x * y * z^2 > 0.125$

• Results for $(x + y)^3 < 0.125$

• FR, IG and GR perform best

• FR metric locates the most important features
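The test data used in this evaluation can be sketched as follows: 400 objects with 30 random feature values each, where the class depends on only two or three of the features. Drawing values uniformly from [0, 1), placing the relevant features x, y, z in the first three columns, and the names `make_dataset`, `rule_product` and `rule_sum` are assumptions; the slides do not state them.

```python
import random


def rule_product(row):
    """Class rule x * y * z^2 > 0.125 on the first three features."""
    x, y, z = row[0], row[1], row[2]
    return x * y * z ** 2 > 0.125


def rule_sum(row):
    """Class rule (x + y)^3 < 0.125 on the first two features."""
    x, y = row[0], row[1]
    return (x + y) ** 3 < 0.125


def make_dataset(rule, n_objects=400, n_features=30, seed=0):
    """Random feature values plus a 0/1 label derived from `rule`."""
    rng = random.Random(seed)
    data = [[rng.random() for _ in range(n_features)] for _ in range(n_objects)]
    labels = [int(rule(row)) for row in data]
    return data, labels


data, labels = make_dataset(rule_product)
```

A feature ranker succeeds on such data if it scores the two or three relevant columns above the remaining random ones.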

Page 12:

FDT Experiments

• Fuzzy ID3 (F-ID3) compared with Fuzzy-Rough ID3 (FR-ID3)

• Only difference between methods is the choice of feature significance measure

• Datasets used taken from the machine learning repository

• Data split into two equal halves: training and testing

• Resulting trees converted to equivalent rulesets
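Converting a tree to an equivalent ruleset amounts to emitting one rule per root-to-leaf path, so the ruleset size equals the number of leaves. The sketch below shows this for a crisp tree; the nested-dict tree format, the function name `tree_to_rules`, and the example tree are hypothetical and do not come from the slides.

```python
def tree_to_rules(tree, conditions=()):
    """Return a list of (conditions, class label) pairs, one per leaf.

    Internal nodes are dicts with "attribute" and "children" keys;
    anything else is treated as a leaf holding a class label.
    """
    if not isinstance(tree, dict):
        return [(list(conditions), tree)]
    rules = []
    for value, child in tree["children"].items():
        test = (tree["attribute"], value)
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules


# Hypothetical tree, for illustration only.
example_tree = {
    "attribute": "outlook",
    "children": {
        "sunny": {"attribute": "windy", "children": {True: "no", False: "yes"}},
        "rain": "no",
    },
}
rules = tree_to_rules(example_tree)
ruleset_size = len(rules)
```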

Page 13:

Results

• Real-valued data

• Average ruleset size
  • 56.7 for F-ID3
  • 88.6 for FR-ID3

• F-ID3 performs marginally better than FR-ID3

Page 14:

Results…

• Crisp data

• Average ruleset size
  • 30.2 for F-ID3
  • 28.8 for FR-ID3

• FR-ID3 performs marginally better than F-ID3

Page 15:

Conclusion

• Decision trees are a popular means of classification

• The selection of branching attributes is key to resulting tree quality

• The use of a fuzzy-rough metric for this purpose looks promising

• Future work
  • Further experimental evaluation
  • Fuzzy-rough feature reduction pre-processor