Transcript of Naive+KNN.1018

  • Slide 1/16

    Naive Bayes Classifier

    The naive Bayes classifier assigns an instance s_k with attribute
    values (A_1 = v_1, A_2 = v_2, ..., A_m = v_m) to the class C_i with
    maximum Prob(C_i | (v_1, v_2, ..., v_m)) over all i.

    The naive Bayes classifier exploits Bayes' rule and assumes
    independence of attributes.

  • Slide 2/16

    Likelihood of s_k belonging to C_i:

    $$\mathrm{Prob}(C_i \mid (v_1, v_2, \ldots, v_m)) = \frac{P((v_1, v_2, \ldots, v_m) \mid C_i)\, P(C_i)}{P(v_1, v_2, \ldots, v_m)}$$

    Likelihood of s_k belonging to C_j:

    $$\mathrm{Prob}(C_j \mid (v_1, v_2, \ldots, v_m)) = \frac{P((v_1, v_2, \ldots, v_m) \mid C_j)\, P(C_j)}{P(v_1, v_2, \ldots, v_m)}$$

    Therefore, when comparing Prob(C_i | (v_1, v_2, ..., v_m)) and
    Prob(C_j | (v_1, v_2, ..., v_m)), we only need to compute
    P((v_1, v_2, ..., v_m) | C_i) P(C_i) and P((v_1, v_2, ..., v_m) | C_j) P(C_j),
    since the denominator P(v_1, v_2, ..., v_m) is the same for both classes.
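    The slides invoke the attribute-independence assumption here without
    writing it out; under that assumption the class-conditional
    probability factors into per-attribute terms (a standard identity,
    added here for completeness):

    $$P((v_1, v_2, \ldots, v_m) \mid C_i) = \prod_{j=1}^{m} P(A_j = v_j \mid C_i)$$

    Each factor P(A_j = v_j | C_i) can be estimated from counts in the
    training data, which is exactly what the weather-data example on the
    following slides does.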


  • Slide 4/16

    An Example of the Naive Bayes Classifier

    The weather data, with counts and probabilities:

    outlook           temperature       humidity          windy          play
              yes no           yes no           yes no          yes no  yes  no
    sunny      2   3  hot       2   2  high      3   4  false   6   2    9   5
    overcast   4   0  mild      4   2  normal    6   1  true    3   3
    rainy      3   2  cool      3   1

    sunny     2/9 3/5 hot      2/9 2/5 high     3/9 4/5 false  6/9 2/5 9/14 5/14
    overcast  4/9 0/5 mild     4/9 2/5 normal   6/9 1/5 true   3/9 3/5
    rainy     3/9 2/5 cool     3/9 1/5

    A new day:

    outlook  temperature  humidity  windy  play
    sunny    cool         high      true   ?

  • Slide 5/16

    Likelihood of yes:

    $$\frac{2}{9} \times \frac{3}{9} \times \frac{3}{9} \times \frac{3}{9} \times \frac{9}{14} \approx 0.0053$$

    Likelihood of no:

    $$\frac{3}{5} \times \frac{1}{5} \times \frac{4}{5} \times \frac{3}{5} \times \frac{5}{14} \approx 0.0206$$

    Therefore, the prediction is No.
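    As a quick check of the arithmetic above, here is a minimal Python
    sketch (my own illustration, not from the slides) that computes both
    likelihoods directly from the counts in the weather table; the
    dictionary keys and variable names are my own:

        from functools import reduce

        # Counts from the weather table: (count given yes, count given no)
        counts = {
            ("outlook", "sunny"): (2, 3),
            ("temperature", "cool"): (3, 1),
            ("humidity", "high"): (3, 4),
            ("windy", "true"): (3, 3),
        }
        n_yes, n_no = 9, 5  # class counts; priors are 9/14 and 5/14

        new_day = [("outlook", "sunny"), ("temperature", "cool"),
                   ("humidity", "high"), ("windy", "true")]

        # Multiply the prior by each per-attribute conditional probability
        lik_yes = reduce(lambda p, av: p * counts[av][0] / n_yes, new_day,
                         n_yes / (n_yes + n_no))
        lik_no = reduce(lambda p, av: p * counts[av][1] / n_no, new_day,
                        n_no / (n_yes + n_no))

        print(f"Likelihood of yes: {lik_yes:.4f}")  # 0.0053
        print(f"Likelihood of no:  {lik_no:.4f}")   # 0.0206
        print("Prediction:", "yes" if lik_yes > lik_no else "no")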

  • Slide 6/16

    The Naive Bayes Classifier for Data Sets with
    Numerical Attribute Values

    One common practice to handle numerical attribute values is to
    assume normal distributions for numerical attributes.

  • Slide 7/16

    The numeric weather data with summary statistics:

    outlook           temperature          humidity             windy          play
              yes no          yes    no            yes    no          yes no  yes  no
    sunny      2   3           83    85             86    85  false   6   2    9   5
    overcast   4   0           70    80             96    90  true    3   3
    rainy      3   2           68    65             80    70
                               64    72             65    95
                               69    71             70    91
                               75                   80
                               75                   70
                               72                   90
                               81                   75

    sunny     2/9 3/5  mean    73  74.6   mean    79.1  86.2  false  6/9 2/5 9/14 5/14
    overcast  4/9 0/5  std dev 6.2  7.9   std dev 10.2   9.7  true   3/9 3/5
    rainy     3/9 2/5

  • Slide 8/16

    Let x_1, x_2, ..., x_n be the values of a numerical attribute in the
    training data set.

    $$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$$

    $$\sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \mu)^2$$

    $$f(w) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(w-\mu)^2}{2\sigma^2}}$$
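    A minimal Python sketch of these three formulas (my own
    illustration; the function names gaussian_stats and gaussian_density
    are my own):

        import math

        def gaussian_stats(xs):
            """Sample mean and standard deviation (n - 1 in the denominator)."""
            n = len(xs)
            mu = sum(xs) / n
            var = sum((x - mu) ** 2 for x in xs) / (n - 1)
            return mu, math.sqrt(var)

        def gaussian_density(w, mu, sigma):
            """Normal density f(w), used as the class-conditional likelihood."""
            return (math.exp(-(w - mu) ** 2 / (2 * sigma ** 2))
                    / (math.sqrt(2 * math.pi) * sigma))

        # Temperature values on 'yes' days from the numeric weather table
        temp_yes = [83, 70, 68, 64, 69, 75, 75, 72, 81]
        mu, sigma = gaussian_stats(temp_yes)  # mu = 73.0, sigma ≈ 6.2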

  • Slide 9/16

    For example, consider a new day with outlook = sunny,
    temperature = 66, humidity = 90, and windy = true.

    Likelihood of yes:

    $$\frac{2}{9} \times 0.0340 \times 0.0221 \times \frac{3}{9} \times \frac{9}{14} \approx 0.000036$$

    Likelihood of no:

    $$\frac{3}{5} \times 0.0291 \times 0.0380 \times \frac{3}{5} \times \frac{5}{14} \approx 0.000136$$

    where, for instance,

    $$f(\text{temperature} = 66 \mid \text{yes}) = \frac{1}{\sqrt{2\pi} \times 6.2}\, e^{-\frac{(66-73)^2}{2 \times 6.2^2}} \approx 0.0340$$

    Again the likelihood of no is larger, so the prediction is No.
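    Continuing the sketch above, the same gaussian_density helper
    reproduces the numeric factors on this slide (again my own
    illustration, not part of the slides):

        # f(temperature = 66 | yes), using mu = 73 and sigma = 6.2 from the table
        print(round(gaussian_density(66, 73, 6.2), 4))  # 0.034

        # Full likelihood of yes for the new day (sunny, 66, 90, true)
        lik_yes = (2/9) * gaussian_density(66, 73, 6.2) \
                * gaussian_density(90, 79.1, 10.2) * (3/9) * (9/14)
        print(f"{lik_yes:.6f}")  # about 0.000036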

  • Slide 10/16

    Instance-Based Learning

    In instance-based learning, we take the k nearest training samples
    of a new instance (v_1, v_2, ..., v_m) and assign the new instance
    to the class that has the most instances among the k nearest
    training samples.

    Classifiers that adopt instance-based learning are commonly called
    KNN classifiers.
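    A minimal KNN sketch in Python (my own illustration; it assumes
    Euclidean distance and a simple majority vote, and the name
    knn_predict is my own):

        import math
        from collections import Counter

        def knn_predict(train, query, k):
            """train: list of (vector, label) pairs; returns the majority
            label among the k training samples nearest to query."""
            nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
            votes = Counter(label for _, label in nearest)
            return votes.most_common(1)[0][0]

        # Toy usage: two classes in 2-D
        train = [((1.0, 1.0), "X"), ((1.2, 0.8), "X"),
                 ((4.0, 4.0), "O"), ((4.2, 3.9), "O"), ((3.8, 4.1), "O")]
        print(knn_predict(train, (1.1, 1.0), k=3))  # "X"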

  • Slide 11/16

    The basic version of the KNN classifier works only for data sets
    with numerical values. However, extensions have been proposed for
    handling data sets with categorical attributes.

    If the number of training samples is sufficiently large, then it can
    be proved statistically that the KNN classifier can deliver the
    accuracy achievable by learning from the training data set.

  • Slide 12/16

    However, if the number of training

    samples is not large enough, the KNN

    classifier may not work well.

  • Slide 13/16

    If the data set is noiseless, then the 1NN classifier should work
    well. In general, the noisier the data set is, the higher k should
    be set. However, the optimal k value should be figured out through
    cross validation.

    The ranges of attribute values should be normalized before the KNN
    classifier is applied. There are two common normalization
    approaches:

    $$w = \frac{v - v_{\min}}{v_{\max} - v_{\min}}$$

    $$w = \frac{v - \mu}{\sigma}$$

    where $\mu$ and $\sigma^2$ are the mean and the variance of the
    attribute values, respectively.
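    A small Python sketch of the two normalization approaches (my own
    illustration; the function names are my own, and the z-score version
    here uses the population standard deviation):

        def min_max_normalize(values):
            """Rescale to [0, 1]: w = (v - v_min) / (v_max - v_min)."""
            lo, hi = min(values), max(values)
            return [(v - lo) / (hi - lo) for v in values]

        def z_score_normalize(values):
            """Standardize: w = (v - mu) / sigma."""
            n = len(values)
            mu = sum(values) / n
            sigma = (sum((v - mu) ** 2 for v in values) / n) ** 0.5
            return [(v - mu) / sigma for v in values]

        print(min_max_normalize([64, 72, 81]))  # [0.0, 0.47..., 1.0]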

  • Slide 14/16

    Cross Validation

    Most data classification algorithms require some parameters to be
    set, e.g., k in the KNN classifier and the tree pruning threshold in
    the decision tree.

    One way to find an appropriate parameter setting is through k-fold
    cross validation, normally with k = 10.

    In k-fold cross validation, the training data set is divided into k
    subsets. Then k runs of the classification algorithm are conducted,
    with each subset serving as the test set once, while the remaining
    (k-1) subsets are used as the training set.

  • Slide 15/16

    The parameter values that yield maximum

    accuracy in cross validation are then

    adopted.
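    A bare-bones sketch of the k-fold procedure just described (my own
    illustration; k_fold_accuracy is my own name, and the usage below
    reuses the hypothetical knn_predict and toy train data from the
    earlier sketch):

        def k_fold_accuracy(samples, k_folds, classify):
            """samples: list of (vector, label); classify(train, vector)
            returns a label. Returns accuracy over all k held-out folds."""
            folds = [samples[i::k_folds] for i in range(k_folds)]
            correct = 0
            for i, test in enumerate(folds):
                # Train on the other (k-1) folds, test on the held-out fold
                rest = [s for j, f in enumerate(folds) if j != i for s in f]
                correct += sum(classify(rest, x) == y for x, y in test)
            return correct / len(samples)

        # Adopt the number of neighbors with the best cross-validation accuracy
        best_k = max([1, 3], key=lambda k: k_fold_accuracy(
            train, 5, lambda tr, x: knn_predict(tr, x, k)))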

  • Slide 16/16

    Example of the KNN Classifiers

    If a 1NN classifier is employed, then the prediction of the new
    instance is X. If a 3NN classifier is employed, then the prediction
    of the new instance is O.
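    Using the knn_predict sketch from earlier, here is a toy
    configuration (my own, since the slide's figure is not in the
    transcript) where 1NN and 3NN disagree in exactly this way:

        points = [((0.0, 0.0), "X"),
                  ((3.0, 3.0), "O"), ((3.5, 3.0), "O"), ((3.0, 3.5), "O")]
        query = (1.4, 1.4)  # single nearest neighbor is the X,
                            # but two of the three nearest are O

        print(knn_predict(points, query, k=1))  # X
        print(knn_predict(points, query, k=3))  # O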