Naïve Bayes Classifier


Transcript of Naïve Bayes Classifier

  • Naïve Bayes Classifier

    COMP24111 Machine Learning

    Ke Chen

  • Outline

    Background

    Probability Basics

    Probabilistic Classification

    Naïve Bayes

    Example: Play Tennis

    Relevant Issues

    Conclusions

  • Background

    There are three ways to establish a classifier:

    a) Model a classification rule directly

       Examples: k-NN, decision trees, perceptron, SVM

    b) Model the probability of class membership given the input data

       Example: perceptron with the cross-entropy cost

    c) Build a probabilistic model of the data within each class

       Examples: naïve Bayes, model-based classifiers

    a) and b) are examples of discriminative classification.

    c) is an example of generative classification.

    b) and c) are both examples of probabilistic classification.

  • Probability Basics

    Prior, conditional and joint probability for random variables

    Prior probability: $P(X)$

    Conditional probability: $P(X_1 \mid X_2)$, $P(X_2 \mid X_1)$

    Joint probability: $\mathbf{X} = (X_1, X_2)$, $P(\mathbf{X}) = P(X_1, X_2)$

    Relationship: $P(X_1, X_2) = P(X_2 \mid X_1)\,P(X_1) = P(X_1 \mid X_2)\,P(X_2)$

    Independence: $P(X_2 \mid X_1) = P(X_2)$, $P(X_1 \mid X_2) = P(X_1)$,
    $P(X_1, X_2) = P(X_1)\,P(X_2)$

    Bayesian rule:

    $$P(C \mid \mathbf{X}) = \frac{P(\mathbf{X} \mid C)\,P(C)}{P(\mathbf{X})}
      \qquad \text{Posterior} = \frac{\text{Likelihood} \times \text{Prior}}{\text{Evidence}}$$
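    As a quick numeric illustration of the Bayesian rule, the following sketch
    computes posteriors for two classes (the priors and likelihoods are made-up
    illustrative values, not from the slides):

        # Two classes with illustrative (assumed) priors and likelihoods P(x|c)
        prior = {"c1": 0.4, "c2": 0.6}
        likelihood = {"c1": 0.9, "c2": 0.2}

        # Evidence: P(x) = sum over classes of P(x|c) * P(c)
        evidence = sum(likelihood[c] * prior[c] for c in prior)

        # Bayesian rule: P(c|x) = P(x|c) * P(c) / P(x)
        posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}
        print(posterior)  # {'c1': 0.75, 'c2': 0.25}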

  • Probability Basics

    Quiz: We have two fair six-sided dice. When they are rolled, the following
    events may occur: (A) die 1 lands on side 3, (B) die 2 lands on side 1,
    and (C) the two dice sum to eight. Answer the following questions:

    1) $P(A) = ?$
    2) $P(B) = ?$
    3) $P(C) = ?$
    4) $P(A \mid B) = ?$
    5) $P(C \mid A) = ?$
    6) $P(A, B) = ?$
    7) $P(A, C) = ?$
    8) Is $P(A, C)$ equal to $P(A)\,P(C)$?
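    The quiz can be checked by brute force, enumerating all 36 equally likely
    outcomes of two dice (assumed fair); a minimal sketch:

        from fractions import Fraction
        from itertools import product

        outcomes = list(product(range(1, 7), repeat=2))  # all 36 (die1, die2) pairs

        def prob(event):
            """Probability of an event over the 36 equally likely outcomes."""
            return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

        A = lambda o: o[0] == 3          # die 1 lands on side 3
        B = lambda o: o[1] == 1          # die 2 lands on side 1
        C = lambda o: o[0] + o[1] == 8   # the two dice sum to eight

        print(prob(A), prob(B), prob(C))                # 1/6 1/6 5/36
        print(prob(lambda o: A(o) and B(o)))            # P(A, B) = 1/36
        print(prob(lambda o: A(o) and C(o)))            # P(A, C) = 1/36
        print(prob(lambda o: A(o) and B(o)) / prob(B))  # P(A|B) = 1/6
        print(prob(lambda o: A(o) and C(o)) / prob(A))  # P(C|A) = 1/6
        print(prob(lambda o: A(o) and C(o)) == prob(A) * prob(C))  # False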

  • Probabilistic Classification

    Establishing a probabilistic model for classification

    Discriminative model: $P(C \mid \mathbf{X})$, where $C = c_1, \ldots, c_L$
    and $\mathbf{X} = (X_1, \ldots, X_n)$

    A discriminative probabilistic classifier takes an input
    $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and outputs the posterior
    probabilities $P(c_1 \mid \mathbf{x}), P(c_2 \mid \mathbf{x}), \ldots,
    P(c_L \mid \mathbf{x})$.

  • Probabilistic Classification

    Establishing a probabilistic model for classification (cont.)

    Generative model: $P(\mathbf{X} \mid C)$, where $C = c_1, \ldots, c_L$
    and $\mathbf{X} = (X_1, \ldots, X_n)$

    One generative probabilistic model is built for each class: fed an input
    $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, the model for class $i$ outputs
    the likelihood $P(\mathbf{x} \mid c_i)$, for $i = 1, \ldots, L$.

  • Probabilistic Classification

    MAP classification rule (MAP: Maximum A Posteriori)

    Assign $\mathbf{x}$ to $c^*$ if

    $$P(C = c^* \mid \mathbf{X} = \mathbf{x}) > P(C = c \mid \mathbf{X} = \mathbf{x}),
      \quad c \neq c^*, \; c = c_1, \ldots, c_L$$

    Generative classification with the MAP rule

    Apply the Bayesian rule to convert likelihoods into posterior probabilities:

    $$P(C = c_i \mid \mathbf{X} = \mathbf{x})
      = \frac{P(\mathbf{X} = \mathbf{x} \mid C = c_i)\,P(C = c_i)}{P(\mathbf{X} = \mathbf{x})}
      \propto P(\mathbf{X} = \mathbf{x} \mid C = c_i)\,P(C = c_i),
      \quad i = 1, 2, \ldots, L$$

    Then apply the MAP rule.
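    In code, the MAP rule is essentially a one-liner; a minimal sketch (the
    function and argument names are my own, and the priors and per-class
    likelihoods are assumed to be given):

        def map_classify(x, classes, prior, likelihood):
            """Assign x to the class maximizing P(x|c) * P(c).

            The evidence P(x) is identical for every class, so it can be
            dropped when comparing posteriors.
            """
            return max(classes, key=lambda c: likelihood(x, c) * prior[c])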

  • Naïve Bayes

    Bayes classification:

    $$P(C \mid \mathbf{X}) \propto P(\mathbf{X} \mid C)\,P(C)
      = P(X_1, \ldots, X_n \mid C)\,P(C)$$

    Difficulty: learning the joint probability $P(X_1, \ldots, X_n \mid C)$.

    Naïve Bayes classification

    Assumption: all input attributes are conditionally independent given the class!

    $$\begin{aligned}
    P(X_1, X_2, \ldots, X_n \mid C)
      &= P(X_1 \mid X_2, \ldots, X_n, C)\,P(X_2, \ldots, X_n \mid C) \\
      &= P(X_1 \mid C)\,P(X_2, \ldots, X_n \mid C) \\
      &= P(X_1 \mid C)\,P(X_2 \mid C) \cdots P(X_n \mid C)
    \end{aligned}$$

    MAP classification rule: for $\mathbf{x} = (x_1, x_2, \ldots, x_n)$,

    $$[P(x_1 \mid c^*) \cdots P(x_n \mid c^*)]\,P(c^*)
      > [P(x_1 \mid c) \cdots P(x_n \mid c)]\,P(c),
      \quad c \neq c^*, \; c = c_1, \ldots, c_L$$

  • Naïve Bayes

    Naïve Bayes Algorithm (for discrete input attributes)

    Learning Phase: Given a training set S,

    for each target value $c_i$ ($c_i = c_1, \ldots, c_L$):
    $\hat{P}(C = c_i) \leftarrow$ estimate $P(C = c_i)$ with examples in S;

    for every attribute value $x_{jk}$ of each attribute $X_j$
    ($j = 1, \ldots, n$; $k = 1, \ldots, N_j$):
    $\hat{P}(X_j = x_{jk} \mid C = c_i) \leftarrow$ estimate
    $P(X_j = x_{jk} \mid C = c_i)$ with examples in S.

    Output: conditional probability tables; $N_j \times L$ elements for each $X_j$.

    Test Phase: Given an unknown instance $\mathbf{X}' = (a_1, \ldots, a_n)$,
    look up the tables to assign the label $c^*$ to $\mathbf{X}'$ if

    $$[\hat{P}(a_1 \mid c^*) \cdots \hat{P}(a_n \mid c^*)]\,\hat{P}(c^*)
      > [\hat{P}(a_1 \mid c) \cdots \hat{P}(a_n \mid c)]\,\hat{P}(c),
      \quad c \neq c^*, \; c = c_1, \ldots, c_L$$
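    Both phases fit in a short Python sketch for discrete attributes (this
    assumes the training set is a list of (attribute-tuple, label) pairs; the
    function and variable names are my own):

        from collections import Counter, defaultdict

        def learn(train_set):
            """Learning phase: estimate P(C=c) and P(X_j=x | C=c) by counting."""
            class_counts = Counter(label for _, label in train_set)
            prior = {c: n / len(train_set) for c, n in class_counts.items()}

            counts = defaultdict(int)  # counts[(j, x, c)] over examples in S
            for attrs, label in train_set:
                for j, x in enumerate(attrs):
                    counts[(j, x, label)] += 1
            cond = {k: v / class_counts[k[2]] for k, v in counts.items()}
            return prior, cond

        def classify(instance, prior, cond):
            """Test phase: pick the label maximizing P(c) * prod_j P(a_j|c)."""
            def score(c):
                s = prior[c]
                for j, a in enumerate(instance):
                    s *= cond.get((j, a, c), 0.0)  # unseen value gives zero
                return s
            return max(prior, key=score)

    Note that an unseen attribute value zeroes the whole product; the remedy
    discussed under Relevant Issues below addresses exactly this.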

  • Example: Play Tennis

    [Training data table: 14 examples over the attributes Outlook,
    Temperature, Humidity and Wind, with target Play = Yes/No]

  • Example: Learning Phase

    Outlook       Play=Yes   Play=No
    Sunny         2/9        3/5
    Overcast      4/9        0/5
    Rain          3/9        2/5

    Temperature   Play=Yes   Play=No
    Hot           2/9        2/5
    Mild          4/9        2/5
    Cool          3/9        1/5

    Humidity      Play=Yes   Play=No
    High          3/9        4/5
    Normal        6/9        1/5

    Wind          Play=Yes   Play=No
    Strong        3/9        3/5
    Weak          6/9        2/5

    P(Play=Yes) = 9/14       P(Play=No) = 5/14

  • Example: Test Phase

    Given a new instance,
    x = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)

    Look up the tables:

    P(Outlook=Sunny | Play=Yes) = 2/9        P(Outlook=Sunny | Play=No) = 3/5
    P(Temperature=Cool | Play=Yes) = 3/9     P(Temperature=Cool | Play=No) = 1/5
    P(Humidity=High | Play=Yes) = 3/9        P(Humidity=High | Play=No) = 4/5
    P(Wind=Strong | Play=Yes) = 3/9          P(Wind=Strong | Play=No) = 3/5
    P(Play=Yes) = 9/14                       P(Play=No) = 5/14

    MAP rule:

    P(Yes|x) ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) = 0.0053
    P(No|x)  ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) = 0.0206

    Given that P(Yes|x) < P(No|x), we label x as No.
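    The two scores are easy to verify from the table values (a quick sanity
    check):

        from math import prod

        # Looked-up conditional probabilities times the class prior
        yes = prod([2/9, 3/9, 3/9, 3/9, 9/14])  # P(Yes|x) up to the evidence
        no  = prod([3/5, 1/5, 4/5, 3/5, 5/14])  # P(No|x)  up to the evidence
        print(round(yes, 4), round(no, 4))       # 0.0053 0.0206 -> predict No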


  • Relevant Issues

    Violation of the independence assumption

    For many real-world tasks,
    $P(X_1, \ldots, X_n \mid C) \neq P(X_1 \mid C) \cdots P(X_n \mid C)$.

    Nevertheless, naïve Bayes works surprisingly well anyway!

    Zero conditional probability problem

    If no training example contains the attribute value $X_j = a_{jk}$, then
    $\hat{P}(X_j = a_{jk} \mid C = c_i) = 0$, and during test the whole product
    vanishes:
    $\hat{P}(x_1 \mid c_i) \cdots \hat{P}(a_{jk} \mid c_i) \cdots \hat{P}(x_n \mid c_i) = 0$.

    For a remedy, conditional probabilities are estimated with

    $$\hat{P}(X_j = a_{jk} \mid C = c_i) = \frac{n_c + m p}{n + m}$$

    where
    $n$: number of training examples for which $C = c_i$;
    $n_c$: number of training examples for which $X_j = a_{jk}$ and $C = c_i$;
    $p$: prior estimate (usually $p = 1/t$ for $t$ possible values of $X_j$);
    $m$: weight given to the prior (number of "virtual" examples, $m \geq 1$).
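    A sketch of this smoothed estimate in Python (the function name is my own;
    with m = t and p = 1/t it reduces to Laplace smoothing, i.e. one virtual
    example per attribute value):

        def m_estimate(n_c, n, t, m=1.0):
            """Smoothed estimate of P(X_j = a_jk | C = c_i).

            n_c: training examples with X_j = a_jk and C = c_i
            n:   training examples with C = c_i
            t:   number of possible values of X_j (prior p = 1/t)
            m:   weight given to the prior ("virtual" examples)
            """
            p = 1.0 / t
            return (n_c + m * p) / (n + m)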

  • Relevant Issues

    Continuous-valued input attributes

    Infinitely many possible values for an attribute

    Conditional probability is modeled with the normal distribution:

    $$\hat{P}(X_j \mid C = c_i)
      = \frac{1}{\sqrt{2\pi}\,\sigma_{ji}}
        \exp\!\left(-\frac{(X_j - \mu_{ji})^2}{2\sigma_{ji}^2}\right)$$

    $\mu_{ji}$: mean (average) of attribute values $X_j$ of examples for which $C = c_i$
    $\sigma_{ji}$: standard deviation of attribute values $X_j$ of examples for which $C = c_i$

    Learning Phase: for $\mathbf{X} = (X_1, \ldots, X_n)$, $C = c_1, \ldots, c_L$

    Output: $n \times L$ normal distributions and $P(C = c_i)$, $i = 1, \ldots, L$

    Test Phase: for $\mathbf{X}' = (X'_1, \ldots, X'_n)$,

    calculate conditional probabilities with all the normal distributions, and

    apply the MAP rule to make a decision.
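    A minimal sketch of the corresponding learning and density computations
    (standard library only; the function names are my own):

        from math import exp, pi, sqrt

        def fit_gaussian(values):
            """Learning phase per (attribute, class): estimate mean and std."""
            mu = sum(values) / len(values)
            var = sum((v - mu) ** 2 for v in values) / len(values)
            return mu, sqrt(var)

        def gaussian_pdf(x, mu, sigma):
            """Normal density used to model P(X_j = x | C = c_i)."""
            return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)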

  • Conclusions

    Naïve Bayes is based on the independence assumption

    Training is very easy and fast; it just requires considering each
    attribute in each class separately

    Testing is straightforward; it just requires looking up tables or
    calculating conditional probabilities with normal distributions

    Naïve Bayes is a popular generative model

    Performance is competitive with most state-of-the-art classifiers
    even when the independence assumption is violated

    Many successful applications, e.g., spam mail filtering

    A good candidate for a base learner in ensemble learning

    Apart from classification, naïve Bayes can do more