
Feature Selection

Poonam Buch


2

The Problem

The success of machine learning algorithms is usually dependent on the quality of data they operate on.

If the data is inadequate, high-dimensional, or contains extraneous and irrelevant information, machine learning algorithms produce less accurate and less understandable results.


3

Motivation

Our primary focus is on obtaining the best overall classification performance, regardless of the number of features needed to obtain it.

Feature selection can result in enhanced performance, a reduced hypothesis space, and reduced storage requirements.


4

Feature selection is necessary to make the learning task more efficient and more accurate.

Well-chosen features conserve computation.


5

Methods of Feature Selection

Wrapper method: estimates the accuracy of candidate feature subsets by re-sampling the data and running the actual induction algorithm on each subset.

Filter method: eliminates undesirable features from the data before induction begins, using the entire training data during selection. It is faster than the wrapper method.
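A minimal sketch of the filter approach (my own illustration, not from the slides): a per-feature score is computed on the full training data before any induction, the top k features are kept, and only then is a classifier trained. scikit-learn's SelectKBest is used here, with chi-squared as a stand-in for any filter metric.

# Filter-style feature selection: score features first, induce afterwards.
from sklearn.feature_selection import SelectKBest, chi2   # chi2 stands in for any filter metric
from sklearn.naive_bayes import MultinomialNB

def filter_then_train(X_train, y_train, k=100):
    selector = SelectKBest(score_func=chi2, k=k)           # rank features, keep the best k
    X_reduced = selector.fit_transform(X_train, y_train)   # uses the entire training data
    clf = MultinomialNB().fit(X_reduced, y_train)          # induction happens only afterwards
    return selector, clf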


6

Metrics used by the filter method of feature selection:

Information gain (IG): chooses the attribute that best partitions the training instances into subsets corresponding to the values of that attribute.

Bi-normal separation (BNS): defined as F^-1(tpr) - F^-1(fpr), where F^-1 is the standard normal distribution's inverse cumulative probability function.
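A small sketch of both metrics for a single binary feature (word present / absent), given counts of positive and negative documents. The function and variable names are my own, and the clipping constant is an assumption made only to keep the inverse CDF finite.

import numpy as np
from scipy.stats import norm

def bns(tp, fp, pos, neg, eps=0.0005):
    # tp, fp = positive / negative documents containing the word; pos, neg = class sizes
    tpr = np.clip(tp / pos, eps, 1 - eps)    # clip so norm.ppf stays finite
    fpr = np.clip(fp / neg, eps, 1 - eps)
    # F^-1(tpr) - F^-1(fpr); the absolute value is sometimes taken so that
    # strong negative indicators also score highly.
    return norm.ppf(tpr) - norm.ppf(fpr)

def entropy(p):
    probs = np.array([p, 1 - p])
    probs = probs[probs > 0]
    return -(probs * np.log2(probs)).sum()

def info_gain(tp, fp, pos, neg):
    n = pos + neg
    h_prior = entropy(pos / n)                                  # class entropy before the split
    p_word = (tp + fp) / n                                      # P(word present)
    h_present = entropy(tp / (tp + fp)) if tp + fp else 0.0
    h_absent = entropy((pos - tp) / (n - tp - fp)) if n - tp - fp else 0.0
    return h_prior - (p_word * h_present + (1 - p_word) * h_absent)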


Performance Goals

Precision: tp / (tp + fp)
Recall: tp / (tp + fn)
F-measure: the harmonic mean of precision and recall.

tp = number of positive cases containing the word (true positives).
fp = number of negative cases containing the word (false positives).
fn = false negatives.
tn = true negatives.
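The three quantities above, computed from raw counts; a trivial sketch, not tied to anything specific in the paper.

def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0

def f_measure(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0   # harmonic mean of precision and recall

# Example: tp=40, fp=10, fn=20  ->  precision 0.80, recall ~0.667, F-measure ~0.727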


7

This paper evaluates twelve metrics on a benchmark of 229 text classification instances.

Algorithm:
For each dataset d:
  For N = 5 trials:
    For each feature selection metric (and various subsets of features):
      * Train a classifier on the training set split.
      * Measure performance on the testing set split.
    End
    Evaluate the performance based on precision, recall, F-measure and accuracy.
  End
End
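A rough rendering of this loop in Python, assuming each dataset is already a vectorized (X, y) pair with binary labels; the two scikit-learn scoring functions and the linear SVM are stand-ins for the paper's twelve metrics and its classifier, not the authors' actual code.

from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif
from sklearn.svm import LinearSVC
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

def evaluate_benchmark(datasets, k=100, n_trials=5):
    metrics = {"chi2": chi2, "mutual_info": mutual_info_classif}   # stand-ins for the 12 metrics
    results = []
    for X, y in datasets:                                          # each benchmark problem
        for trial in range(n_trials):                              # N = 5 random splits
            X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=trial)
            for name, score_func in metrics.items():
                sel = SelectKBest(score_func, k=min(k, X.shape[1])).fit(X_tr, y_tr)
                clf = LinearSVC().fit(sel.transform(X_tr), y_tr)
                y_hat = clf.predict(sel.transform(X_te))
                results.append({
                    "metric": name, "trial": trial,
                    "precision": precision_score(y_te, y_hat),
                    "recall": recall_score(y_te, y_hat),
                    "f1": f1_score(y_te, y_hat),
                    "accuracy": accuracy_score(y_te, y_hat),
                })
    return results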


8

Some Important Results

F-measure averaged over the 229 problems for each metric, varying the number of features selected: BNS performs better than using all features.

BNS obtained, on average, better recall than any other metric; but if precision is the goal, IG performs best.


9

Under low skew, IG performs best and matches the performance obtained using all features. Under high skew, BNS performs substantially better than IG.

Under low skew, BNS is the only metric that performs better than using all features. Under high skew, it performs best by a wide margin for any number of features selected.


10

Strong point: the paper succeeds in comparing all the feature selection metrics, and the results are well analyzed from the point of view of F-measure, recall, precision, and accuracy.

Weak point: in trying to analyze all the feature selection metrics, the paper fails to explain why a given metric performs better than another.


Conclusion

The paper presents an extensive study of feature selection metrics for high-dimensional data. It contributes a novel evaluation methodology that addresses the common problem of selecting one or more metrics that have the best chance of obtaining the best performance for a given dataset.

BNS paired with Odds ratio yields good precision.

BNS paired with F1 optimizes recall.