Page 1: INC 551  Artificial Intelligence

INC 551 Artificial Intelligence

Lecture 11

Machine Learning (Continued)

Page 2: INC 551  Artificial Intelligence

Bayes Classifier

Bayes Rule:  P(A | B) = P(B | A) P(A) / P(B)

Page 3: INC 551  Artificial Intelligence

Play Tennis Example

John wants to play tennis every day.

However, on some days the conditions are not good, so he decides not to play.

The following table is the record for the last 14 days.

Page 4: INC 551  Artificial Intelligence

Outlook   Temperature  Humidity  Wind    PlayTennis
Sunny     Hot          High      Weak    No
Sunny     Hot          High      Strong  No
Overcast  Hot          High      Weak    Yes
Rain      Mild         High      Weak    Yes
Rain      Cool         Normal    Weak    Yes
Rain      Cool         Normal    Strong  No
Overcast  Cool         Normal    Strong  Yes
Sunny     Mild         High      Weak    No
Sunny     Cool         Normal    Weak    Yes
Rain      Mild         Normal    Weak    Yes
Sunny     Mild         Normal    Strong  Yes
Overcast  Mild         High      Strong  Yes
Overcast  Hot          Normal    Weak    Yes
Rain      Mild         High      Strong  No

Page 5: INC 551  Artificial Intelligence

Question:

Today’s condition is

<Sunny, Mild Temperature, Normal Humidity, Strong Wind>

Do you think John will play tennis?

Page 6: INC 551  Artificial Intelligence

Find P(condition | PlayTennis)

We need to use the naïve Bayes assumption: assume that all features are independent given the class.

P(sunny, mild, normal, strong | PlayTennis)
  = P(sunny | PlayTennis) · P(mild | PlayTennis)
  · P(normal | PlayTennis) · P(strong | PlayTennis)

Now, let’s look at each property

Page 7: INC 551  Artificial Intelligence

P(sunny | PlayTennis=yes)       = 2/9 ≈ 0.22
P(sunny | PlayTennis=no)        = 3/5 = 0.6
P(mildTemp | PlayTennis=yes)    = 4/9 ≈ 0.44
P(mildTemp | PlayTennis=no)     = 2/5 = 0.4
P(normalHumid | PlayTennis=yes) = 6/9 ≈ 0.66
P(normalHumid | PlayTennis=no)  = 1/5 = 0.2
P(strongWind | PlayTennis=yes)  = 3/9 ≈ 0.33
P(strongWind | PlayTennis=no)   = 3/5 = 0.6

Page 8: INC 551  Artificial Intelligence

P(sunny, mild, normal, strong | PlayTennis=yes)
  = 0.22 × 0.44 × 0.66 × 0.33 ≈ 0.022

P(sunny, mild, normal, strong | PlayTennis=no)
  = 0.6 × 0.4 × 0.2 × 0.6 = 0.0288

Using Bayes rule:

P(PlayTennis | condition) = P(condition | PlayTennis) · P(PlayTennis) / P(condition)

With the priors P(PlayTennis=yes) = 9/14 ≈ 0.643 and P(PlayTennis=no) = 5/14 ≈ 0.357:

P(PlayTennis=yes | condition) = 0.022 × 0.643 / P(condition) = 0.01415 / P(condition)

P(PlayTennis=no | condition) = 0.0288 × 0.357 / P(condition) = 0.01028 / P(condition)

Page 9: INC 551  Artificial Intelligence

Since P(condition) is the same, we can conclude that John is more likely to play tennis today.

Note that we do not need to compute P(condition) to get the answer. However, if you want the actual numbers, P(condition) can be calculated by normalizing the probabilities:

P(condition) = P(condition | yes) P(yes) + P(condition | no) P(no)
             = 0.01415 + 0.01028 = 0.02443

Page 10: INC 551  Artificial Intelligence

P(PlayTennis=yes | condition) = 0.01415 / 0.02443 ≈ 0.58

P(PlayTennis=no | condition) = 0.01028 / 0.02443 ≈ 0.42

Therefore, John is more likely to play tennis today, with a 58% chance.
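The whole tennis calculation can be sketched in Python (a minimal sketch; the records and feature order follow the table on page 4, and the function name `posterior` is just an illustrative choice):

```python
# Naive Bayes "play tennis" example.
# Each record: (outlook, temperature, humidity, wind, play)
data = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

def posterior(condition):
    """Return {label: P(label | condition)} for a 4-feature condition."""
    scores = {}
    for label in ("Yes", "No"):
        rows = [r for r in data if r[4] == label]
        p = len(rows) / len(data)          # prior P(label)
        for i, value in enumerate(condition):
            # likelihood P(feature_i = value | label), counted from the table
            p *= sum(1 for r in rows if r[i] == value) / len(rows)
        scores[label] = p
    total = sum(scores.values())           # this is P(condition)
    return {label: p / total for label, p in scores.items()}

print(posterior(("Sunny", "Mild", "Normal", "Strong")))
# P(Yes | condition) ≈ 0.58, P(No | condition) ≈ 0.42
```

Normalizing by the sum of the two unnormalized scores is exactly the P(condition) trick used on the slides.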

Page 11: INC 551  Artificial Intelligence

Learning and Bayes Classifier

Learning is the adjustment of probability values to compute a posterior probability when new data is added.

Page 12: INC 551  Artificial Intelligence

Classifying Object Example

Suppose we want to classify objects into two classes, A and B. There are two features that we can measure from each object, f1 and f2. We sample four objects randomly to form a database and classify them by hand.

Sample  f1   f2   Class
1       5.2  1.2  B
2       2.3  5.4  A
3       1.5  4.4  A
4       4.5  2.1  B

Now, we have another sample with f1 = 3.2 and f2 = 4.2, and we want to know what class it is.

Page 13: INC 551  Artificial Intelligence

We want to find P(Class | feature)

Using Bayes rule:

P(Class | feature) = P(feature | Class) · P(Class) / P(feature)

From the table, we count the number of events:

P(Class=A) = 2/4 = 0.5
P(Class=B) = 2/4 = 0.5

Page 14: INC 551  Artificial Intelligence

Find P(feature | Class)

Again, we use the naïve Bayes assumption: assume that all features are independent given the class.

P(f1, f2 | Class) = P(f1 | Class) · P(f2 | Class)

To find P(f1 | Class), we need to assume a probability distribution, because the features are continuous values. The most common choice is the Gaussian (normal) distribution.

Page 15: INC 551  Artificial Intelligence

Gaussian distribution

P(x) = 1/√(2πσ²) · exp(−(x − µ)² / (2σ²))

There are two parameters: the mean µ and the standard deviation σ.

Using the maximum likelihood principle, the mean and the standard deviation can be estimated from the samples in the database.

Page 16: INC 551  Artificial Intelligence

Class A
f1: Mean = (2.3 + 1.5)/2 = 1.9,  SD = 0.4
f2: Mean = (5.4 + 4.4)/2 = 4.9,  SD = 0.5

Class B
f1: Mean = (5.2 + 4.5)/2 = 4.85,  SD = 0.35
f2: Mean = (1.2 + 2.1)/2 = 1.65,  SD = 0.45

P(f1 = x | A) = 1/√(2π(0.4)²) · exp(−(x − 1.9)² / (2(0.4)²))
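The class-conditional means and standard deviations can be reproduced with a short sketch (a maximum-likelihood, i.e. population, estimate that divides by n, which is what gives the slide's values for two-sample classes; the helper name `ml_estimates` is illustrative):

```python
import math

def ml_estimates(values):
    """Maximum-likelihood mean and standard deviation of a sample list."""
    mean = sum(values) / len(values)
    # ML estimate divides by n (not n - 1, which would be the unbiased estimate)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

print(ml_estimates([2.3, 1.5]))  # class A, f1: mean 1.9, SD 0.4
print(ml_estimates([5.4, 4.4]))  # class A, f2: mean 4.9, SD 0.5
print(ml_estimates([5.2, 4.5]))  # class B, f1: mean 4.85, SD 0.35
print(ml_estimates([1.2, 2.1]))  # class B, f2: mean 1.65, SD 0.45
```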

Page 17: INC 551  Artificial Intelligence

The object that we want to classify has f1 = 3.2, f2 = 4.2.

P(f1 = 3.2 | A) = 1/√(2π(0.4)²) · exp(−(3.2 − 1.9)² / (2(0.4)²)) ≈ 0.0051

P(f2 = 4.2 | A) = 1/√(2π(0.5)²) · exp(−(4.2 − 4.9)² / (2(0.5)²)) ≈ 0.2995

P(f1 = 3.2 | B) = 1/√(2π(0.35)²) · exp(−(3.2 − 4.85)² / (2(0.35)²)) ≈ 1.7016e-05

P(f2 = 4.2 | B) = 1/√(2π(0.45)²) · exp(−(4.2 − 1.65)² / (2(0.45)²)) ≈ 9.4375e-08

Page 18: INC 551  Artificial Intelligence

Therefore,

P(f1, f2 | Class=A) = 0.0051 × 0.2995 ≈ 0.0015

P(f1, f2 | Class=B) = 1.7016e-05 × 9.4375e-08 ≈ 1.6059e-12

From Bayes:

P(Class | feature) = P(feature | Class) · P(Class) / P(feature)

P(A | feature) = 0.0015 × 0.5 / P(feature)

P(B | feature) = 1.6059e-12 × 0.5 / P(feature)

Therefore, we should classify the sample as Class A.
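The full Gaussian naïve Bayes classification can be sketched in Python (a minimal sketch; it re-estimates each class-conditional mean and SD by maximum likelihood from the four training samples, as the slides do, and the function names are illustrative):

```python
import math

# Training database: (f1, f2, class)
samples = [
    (5.2, 1.2, "B"),
    (2.3, 5.4, "A"),
    (1.5, 4.4, "A"),
    (4.5, 2.1, "B"),
]

def gaussian(x, mean, sd):
    """Gaussian density with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / math.sqrt(2 * math.pi * sd ** 2)

def classify(x):
    """Return the class with the larger P(feature | class) * P(class)."""
    scores = {}
    for label in ("A", "B"):
        rows = [s for s in samples if s[2] == label]
        prior = len(rows) / len(samples)      # P(class) = 2/4 = 0.5
        likelihood = 1.0
        for i in range(2):                    # features f1, f2
            values = [s[i] for s in rows]
            mean = sum(values) / len(values)
            # maximum-likelihood (population) standard deviation
            sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
            likelihood *= gaussian(x[i], mean, sd)
        scores[label] = likelihood * prior
    return max(scores, key=scores.get)

print(classify((3.2, 4.2)))  # -> "A", matching the slide's conclusion
```

Since P(feature) is the same denominator for both classes, comparing the unnormalized scores is enough to pick the winner.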

Page 19: INC 551  Artificial Intelligence

Nearest Neighbor Classification

NN is considered a model-free classification method.

Nearest Neighbor's Principle
The unknown sample is classified to the same class as the training sample with the closest distance.

Page 20: INC 551  Artificial Intelligence

[Figure: training samples plotted on Feature 1 vs. Feature 2; the closest distance from the unknown sample is to a circle]

We classify the sample as a circle.

Page 21: INC 551  Artificial Intelligence

Distance between Samples

Samples X and Y have multi-dimensional feature values, for example:

X = (0, 1, 3, 2)
Y = (3, 5, 1, 2)

The distance between samples X and Y can be calculated by this formula:

D(X, Y) = (|x1 − y1|^k + |x2 − y2|^k + … + |xN − yN|^k)^(1/k) = (Σi |xi − yi|^k)^(1/k)

Page 22: INC 551  Artificial Intelligence

D(X, Y) = (Σi |xi − yi|^k)^(1/k)

If k = 1, the distance is called the Manhattan distance.
If k = 2, the distance is called the Euclidean distance.
If k = ∞, the distance is the maximum absolute feature difference.

Euclidean distance is the best known and the usual choice.
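The distance family above can be sketched in Python, using the X and Y vectors from the previous slide (the function names `minkowski` and `chebyshev` are illustrative; the k = ∞ case is computed directly as the maximum difference):

```python
import math

def minkowski(x, y, k):
    """Minkowski distance of order k between equal-length feature vectors."""
    return sum(abs(a - b) ** k for a, b in zip(x, y)) ** (1 / k)

def chebyshev(x, y):
    """The k -> infinity limit: the maximum absolute feature difference."""
    return max(abs(a - b) for a, b in zip(x, y))

x, y = (0, 1, 3, 2), (3, 5, 1, 2)
print(minkowski(x, y, 1))  # Manhattan: 3 + 4 + 2 + 0 = 9
print(minkowski(x, y, 2))  # Euclidean: sqrt(9 + 16 + 4 + 0) ≈ 5.385
print(chebyshev(x, y))     # maximum difference: 4
```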

Page 23: INC 551  Artificial Intelligence

Sample  f1   f2   Class
1       5.2  1.2  B
2       2.3  5.4  A
3       1.5  4.4  A
4       4.5  2.1  B

Classifying Object with NN

Now, we have another sample with f1 = 3.2 and f2 = 4.2, and we want to know its class.

Page 24: INC 551  Artificial Intelligence

Compute the Euclidean distance from it to all other samples:

D(x, s1) = √((3.2 − 5.2)² + (4.2 − 1.2)²) ≈ 3.6056
D(x, s2) = √((3.2 − 2.3)² + (4.2 − 5.4)²) = 1.5
D(x, s3) = √((3.2 − 1.5)² + (4.2 − 4.4)²) ≈ 1.7117
D(x, s4) = √((3.2 − 4.5)² + (4.2 − 2.1)²) ≈ 2.4698

The unknown sample has the closest distance to thesecond sample. Therefore, we classify it to be thesame class as the second sample, which is Class A.
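The nearest-neighbor decision above can be sketched in Python (a minimal sketch over the four-sample database; `nearest_neighbor` is an illustrative name):

```python
import math

# Training database: (f1, f2, class)
samples = [
    (5.2, 1.2, "B"),
    (2.3, 5.4, "A"),
    (1.5, 4.4, "A"),
    (4.5, 2.1, "B"),
]

def nearest_neighbor(x):
    """Return the class of the training sample with the closest Euclidean distance."""
    closest = min(samples, key=lambda s: math.hypot(x[0] - s[0], x[1] - s[1]))
    return closest[2]

print(nearest_neighbor((3.2, 4.2)))  # sample 2 is closest (distance 1.5) -> "A"
```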

Page 25: INC 551  Artificial Intelligence

K-Nearest Neighbor (KNN)

Instead of using only the single closest sample to decide the class, we use the k closest samples and take a majority vote among their classes.

Page 26: INC 551  Artificial Intelligence

[Figure: Feature 1 vs. Feature 2 plot, example with k = 3]

The data is classified as a circle

Page 27: INC 551  Artificial Intelligence

[Figure: Feature 1 vs. Feature 2 plot, example with k = 5]

The data is classified as a star.
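KNN extends the nearest-neighbor sketch with a majority vote (a minimal sketch on the same four-sample database; with so few samples only small k values are meaningful, and the function name `knn` is illustrative):

```python
import math
from collections import Counter

# Training database: (f1, f2, class)
samples = [
    (5.2, 1.2, "B"),
    (2.3, 5.4, "A"),
    (1.5, 4.4, "A"),
    (4.5, 2.1, "B"),
]

def knn(x, k):
    """Classify x by majority vote among its k nearest training samples."""
    by_distance = sorted(samples,
                         key=lambda s: math.hypot(x[0] - s[0], x[1] - s[1]))
    votes = Counter(s[2] for s in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn((3.2, 4.2), 1))  # same as plain NN -> "A"
print(knn((3.2, 4.2), 3))  # two of the three nearest are class "A" -> "A"
```

An odd k is usually chosen for two-class problems so the vote cannot tie.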