3. Bayesian Decision Theory - Sophia - Inria


Pattern Recognition: Bayesian theory

Bayesian Decision Theory

• Fundamental statistical approach to the problem of pattern classification

• Assumptions:
  1. The decision problem is posed in probabilistic terms
  2. Ideal case: the probability structure underlying the categories is known perfectly

Statistical Decision Theory

• What is a pattern?

• In statistical pattern recognition, a pattern is a d-dimensional feature vector

Statistical Decision Theory

• The sea bass/salmon example
  • State of nature
  • Prior

• The state of nature is a random variable ω: ω = ω1 for sea bass; ω = ω2 for salmon.

• The catch of salmon and sea bass is equiprobable: P(ω1) = P(ω2) (prior)
  P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity)

Statistical Decision Theory

• Decision rule with only the prior information
  • Decide ω1 if P(ω1) > P(ω2); otherwise decide ω2

• Use of the class-conditional information
  • p(x | ω1) and p(x | ω2) describe the difference in lightness between the populations of sea bass and salmon

Statistical Decision Theory

• Bayes formula (demo):

  P(ωj | x) = p(x | ωj) P(ωj) / p(x)

  where, in the case of two categories,

  p(x) = p(x | ω1) P(ω1) + p(x | ω2) P(ω2)
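The Bayes formula, P(ωj | x) = p(x | ωj) P(ωj) / p(x) with p(x) the sum of p(x | ωj) P(ωj) over the categories, can be sketched in a few lines. This is only an illustration: the function name `posteriors` and the numeric likelihood values are assumptions, not taken from the slides.

```python
def posteriors(likelihoods, priors):
    """Return P(w_j | x) for each category j.

    likelihoods[j] is p(x | w_j) evaluated at the observed x,
    priors[j] is P(w_j).
    """
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)  # p(x) = sum_j p(x | w_j) P(w_j)
    return [j / evidence for j in joint]


# Hypothetical values: p(x | w1) = 0.6, p(x | w2) = 0.2, equal priors.
post = posteriors([0.6, 0.2], [0.5, 0.5])
print(post)
```

With equal priors the posterior is driven entirely by the class-conditional values, so the first category gets three times the posterior mass of the second.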

Statistical Decision Theory

• Decision given the posterior probabilities

  x is an observation for which:

  if P(ω1 | x) > P(ω2 | x): True state of nature = ω1
  if P(ω1 | x) < P(ω2 | x): True state of nature = ω2

• Therefore, whenever we observe a particular x, the probability of error is:

  P(error | x) = P(ω1 | x) if we decide ω2
  P(error | x) = P(ω2 | x) if we decide ω1

Statistical Decision Theory

• Minimizing the probability of error
  • Decide ω1 if P(ω1 | x) > P(ω2 | x);
  • otherwise decide ω2

  Therefore: P(error | x) = min [P(ω1 | x), P(ω2 | x)]

  (Bayes decision rule)
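As a small sketch of the two-category Bayes decision rule, the function below picks the category with the larger posterior and reports P(error | x) = min [P(ω1 | x), P(ω2 | x)]. The name `bayes_decide` and the posterior values are illustrative assumptions.

```python
def bayes_decide(post1, post2):
    """Return (decision, P(error | x)) for a two-category problem.

    post1 and post2 are the posteriors P(w1 | x) and P(w2 | x).
    Ties are resolved in favour of w2, matching "otherwise decide w2".
    """
    decision = 1 if post1 > post2 else 2
    p_error = min(post1, post2)  # posterior of the category we did not choose
    return decision, p_error


# Hypothetical posteriors for one observation x.
print(bayes_decide(0.75, 0.25))
```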

Probability of Error

Statistical Decision Theory

Generalization of the preceding ideas

– Use of more than one feature
– Use of more than two states of nature
– Allowing actions other than deciding on the state of nature
– Introducing a loss function which is more general than the probability of error

Statistical Decision Theory

• Allowing actions other than classification primarily allows the possibility of rejection, that is, refusing to make a decision in close or bad cases

• The loss function states how costly each action taken is

Statistical Decision Theory

• Let {ω1, ω2,…, ωc} be the set of c states of nature (“categories”)

• Let {α1, α2,…, αa} be the set of a possible actions

• Let λ(αi | ωj) be the loss incurred for taking action αi when the state of nature is ωj

Statistical Decision Theory

• Conditional risk (expected loss of taking action αi given x):

  R(αi | x) = Σj λ(αi | ωj) P(ωj | x), j = 1,…, c

• Overall risk: R = ∫ R(α(x) | x) p(x) dx, where α(x) is the decision function

• Minimizing R ⟺ minimizing the conditional risk R(αi | x), i = 1,…, a, for every x

Minimum Risk Classification

• Bayes decision rule: select the action αi for which R(αi | x) is minimum

  The resulting overall risk R is then minimum; it is called the Bayes risk and is the best performance that can be achieved.
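The minimum-risk rule can be illustrated with a short sketch that computes the conditional risk R(αi | x) as the sum over j of λ(αi | ωj) P(ωj | x) and selects the action with the smallest value. The loss table `LOSS`, including its third "reject" action with a flat cost of 0.25, is a hypothetical example and not part of the slides.

```python
def conditional_risks(loss, post):
    """R(a_i | x) = sum_j loss[i][j] * P(w_j | x) for every action a_i."""
    return [sum(l_ij * p_j for l_ij, p_j in zip(row, post)) for row in loss]


def bayes_action(loss, post):
    """Return (index of the minimum-risk action, list of all conditional risks)."""
    risks = conditional_risks(loss, post)
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# Hypothetical loss table: rows are actions, columns are true states of nature.
LOSS = [
    [0.0, 1.0],    # action 0: decide w1
    [1.0, 0.0],    # action 1: decide w2
    [0.25, 0.25],  # action 2: reject (assumed flat cost)
]

print(bayes_action(LOSS, [0.6, 0.4]))  # a close case
print(bayes_action(LOSS, [0.9, 0.1]))  # a clear case
```

In the close case the reject action has the lowest conditional risk, while in the clear case deciding ω1 is cheaper than rejecting.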

Two-Category Classification

• Two-category classification
  • α1: deciding ω1
  • α2: deciding ω2

• λij = λ(αi | ωj): loss incurred for deciding ωi when the true state of nature is ωj

• Conditional risk:

  R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)
  R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)

Two-Category Classification

Our rule is the following:

  if R(α1 | x) < R(α2 | x), action α1: “decide ω1” is taken

This results in the equivalent rule: decide ω1 if:

  (λ21 − λ11) p(x | ω1) P(ω1) > (λ12 − λ22) p(x | ω2) P(ω2)

and decide ω2 otherwise

Two-Category Classification

• Likelihood ratio:

  The preceding rule is equivalent to the following rule (assuming that λ21 > λ11):

  If p(x | ω1) / p(x | ω2) > [(λ12 − λ22) / (λ21 − λ11)] · [P(ω2) / P(ω1)]

  • Then take action α1 (decide ω1)
  • Otherwise take action α2 (decide ω2)
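A minimal sketch of the likelihood-ratio test follows. It compares p(x | ω1) / p(x | ω2) with the threshold [(λ12 − λ22) / (λ21 − λ11)] · [P(ω2) / P(ω1)], which does not depend on x. The function name and all numeric values are assumptions chosen for illustration; the code assumes λ21 > λ11.

```python
def likelihood_ratio_decide(px_w1, px_w2, prior1, prior2, loss):
    """Two-category likelihood-ratio test.

    loss[i][j] holds lambda_(i+1)(j+1): the cost of deciding w_(i+1)
    when the true state is w_(j+1). Assumes loss[1][0] > loss[0][0].
    Returns (decision, likelihood ratio, threshold).
    """
    l11, l12 = loss[0]
    l21, l22 = loss[1]
    threshold = (l12 - l22) / (l21 - l11) * (prior2 / prior1)
    ratio = px_w1 / px_w2
    decision = 1 if ratio > threshold else 2
    return decision, ratio, threshold


# Hypothetical numbers: the ratio is 3; the threshold depends only on losses and priors.
print(likelihood_ratio_decide(0.75, 0.25, 0.5, 0.5, [[0, 2], [1, 0]]))
print(likelihood_ratio_decide(0.75, 0.25, 0.5, 0.5, [[0, 4], [1, 0]]))
```

Raising λ12, the cost of wrongly deciding ω1, raises the threshold, so the same observation can flip from ω1 to ω2.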

Two-Category Classification

• Optimal decision property

• “If the likelihood ratio exceeds a threshold value independent of the input pattern x, we can take optimal actions”

Minimum-Error-Rate Classification

• Actions are decisions on classes

• If action αi is taken and the true state of nature is ωj, then the decision is correct if i = j and in error if i ≠ j

• Seek a decision rule that minimizes the probability of error, which is the error rate

Minimum-Error-Rate Classification

Introduction of the zero-one loss function:

  λ(αi | ωj) = 0 if i = j, 1 if i ≠ j, for i, j = 1,…, c

Therefore, the conditional risk is:

  R(αi | x) = Σj λ(αi | ωj) P(ωj | x) = Σ(j ≠ i) P(ωj | x) = 1 − P(ωi | x)

“The risk corresponding to this loss function is the average probability of error”
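Under the zero-one loss (0 for a correct decision, 1 for any error), the conditional risk of deciding ωi reduces to 1 − P(ωi | x). The sketch below checks this numerically; the helper names and the three posterior values are illustrative assumptions.

```python
def zero_one_loss(c):
    """c x c loss table: 0 on the diagonal (correct), 1 elsewhere (error)."""
    return [[0 if i == j else 1 for j in range(c)] for i in range(c)]


def conditional_risk(loss_row, post):
    """R(a_i | x) = sum_j loss_row[j] * P(w_j | x)."""
    return sum(l * p for l, p in zip(loss_row, post))


# Hypothetical posteriors for a three-category problem.
post = [0.5, 0.25, 0.25]
risks = [conditional_risk(row, post) for row in zero_one_loss(3)]
print(risks)                       # each entry equals 1 - P(w_i | x)
print([1 - p for p in post])
```

The action with the smallest risk is therefore the category with the largest posterior.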

Minimum-Error-Rate Classification

• Minimizing the risk requires maximizing P(ωi | x), since R(αi | x) = 1 − P(ωi | x)

• For minimum error rate:
  • Decide ωi if P(ωi | x) > P(ωj | x) ∀ j ≠ i

Minimum-Error-Rate Classification

• Investigate the loss function:

  Let [(λ12 − λ22) / (λ21 − λ11)] · [P(ω2) / P(ω1)] = θλ, then decide ω1 if: p(x | ω1) / p(x | ω2) > θλ

  If λ is the zero-one loss function, which means λ11 = λ22 = 0 and λ12 = λ21 = 1, then θλ = P(ω2) / P(ω1)

Decision Regions: Effect of Loss Functions

Classifiers, Discriminant Functions and Decision Surfaces

THE MULTICATEGORY CASE

Set of discriminant functions gi(x), i = 1,…, c

The classifier assigns a feature vector x to class ωi if: gi(x) > gj(x) ∀ j ≠ i

Classifiers, Discriminant Functions and Decision Surfaces

• Let gi(x) = −R(αi | x) (max. discriminant corresponds to min. risk)

• For the minimum error rate, we take gi(x) = P(ωi | x) (max. discriminant corresponds to max. posterior)

  In this case, we can also write:

  gi(x) = p(x | ωi) P(ωi) or
  gi(x) = ln p(x | ωi) + ln P(ωi) (ln: natural logarithm)
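As an illustration of classification with discriminant functions, the sketch below uses the logarithmic form gi(x) = ln p(x | ωi) + ln P(ωi) and assigns x to the category with the largest score. The function name `classify` and the likelihood and prior values are assumptions for the example.

```python
import math


def classify(likelihoods, priors):
    """Return (index of the winning category, list of log discriminants).

    likelihoods[i] is p(x | w_i) at the observed x, priors[i] is P(w_i).
    """
    scores = [math.log(l) + math.log(p) for l, p in zip(likelihoods, priors)]
    winner = max(range(len(scores)), key=scores.__getitem__)
    return winner, scores


# Hypothetical three-category problem.
winner, scores = classify([0.1, 0.4, 0.2], [0.5, 0.2, 0.3])
print(winner, scores)
```

Because the logarithm is monotonically increasing, the winner is the same as when comparing the products p(x | ωi) P(ωi) directly.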

General Statistical Classifier

Classifiers, Discriminant Functions and Decision Surfaces

• Feature space divided into c decision regions: if gi(x) > gj(x) ∀ j ≠ i then x is in Ri
  (Ri means assign x to ωi)

• The regions are separated by decision boundaries

The Two-Category Case

• A classifier with two discriminant functions g1 and g2 is called a “dichotomizer”

• Let g(x) ≡ g1(x) − g2(x)
  • Decide ω1 if g(x) > 0;
  • Otherwise decide ω2

Classifiers, Discriminant Functions and Decision Surfaces

The Normal Density

Univariate density
• Density which is analytically tractable
• Continuous density
• A lot of processes are asymptotically Gaussian
• Handwritten characters, speech sounds are examples or prototypes corrupted by a random process (central limit theorem)

  p(x) = [1 / (√(2π) σ)] exp[−½ ((x − μ) / σ)²]

Where:
  μ = mean or expected value of x
  σ² = the variance of x
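For reference, the univariate normal density p(x) = [1 / (√(2π) σ)] exp[−½ ((x − μ) / σ)²] can be evaluated with the standard library alone. The function name `normal_pdf` is an illustrative choice.

```python
import math


def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


# The density peaks at the mean and is symmetric around it.
print(normal_pdf(0.0, 0.0, 1.0))
print(normal_pdf(1.0, 0.0, 1.0), normal_pdf(-1.0, 0.0, 1.0))
```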

The Normal Density

• Multivariate density
  • Multivariate normal density in d dimensions is:

  p(x) = [1 / ((2π)^(d/2) |Σ|^(1/2))] exp[−½ (x − μ)^t Σ^(-1) (x − μ)]

where:
  x = (x1, x2, …, xd)^t (t stands for the transpose)
  μ = (μ1, μ2, …, μd)^t mean vector
  Σ = d×d covariance matrix
  |Σ| and Σ^(-1) are the determinant and inverse, respectively
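The multivariate normal density, p(x) = exp[−½ (x − μ)^t Σ^(-1) (x − μ)] / ((2π)^(d/2) |Σ|^(1/2)), is sketched below for the special case d = 2 so that the determinant and inverse can be written out by hand without any external library. The function name `mvn_pdf_2d` and the restriction to two dimensions are assumptions made for brevity.

```python
import math


def mvn_pdf_2d(x, mu, cov):
    """Bivariate normal density; cov is a 2x2 positive-definite matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c                      # |Sigma|
    inv = [[d / det, -b / det],              # Sigma^(-1) for a 2x2 matrix
           [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Squared Mahalanobis distance (x - mu)^t Sigma^(-1) (x - mu)
    maha = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    dim = 2
    norm = (2.0 * math.pi) ** (dim / 2) * math.sqrt(det)
    return math.exp(-0.5 * maha) / norm


# With a diagonal covariance the density factorizes into two univariate normals.
print(mvn_pdf_2d([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
print(mvn_pdf_2d([2.0, 0.0], [0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]]))
```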

Discriminant Functions for the Normal Density

• We saw that the minimum error-rate classification can be achieved by the discriminant function

  gi(x) = ln p(x | ωi) + ln P(ωi)

• Case of multivariate normal distribution

  gi(x) = −½ (x − μi)^t Σi^(-1) (x − μi) − (d/2) ln 2π − ½ ln |Σi| + ln P(ωi)

Discriminant and Classification for Different Cases

• Case Σi = σ²I (I stands for the identity matrix)
  • Features are statistically independent
  • Each feature has the same variance

  gi(x) = wi^t x + wi0 (linear discriminant function)

where:
  wi = μi / σ²
  wi0 = −(1 / (2σ²)) μi^t μi + ln P(ωi)
  (wi0 is called the threshold for the category i)
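When the features are independent with a common variance σ², the discriminant is linear: gi(x) = wi^t x + wi0 with wi = μi / σ² and wi0 = −μi^t μi / (2σ²) + ln P(ωi). The sketch below evaluates it; the function name, the means, and the priors are hypothetical values chosen so the results are easy to check by hand.

```python
import math


def linear_discriminant(x, mu, sigma2, prior):
    """g_i(x) = w_i^t x + w_i0 for the case Sigma_i = sigma2 * I."""
    w = [m / sigma2 for m in mu]
    w0 = -sum(m * m for m in mu) / (2.0 * sigma2) + math.log(prior)
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


# Hypothetical means (0, 0) and (2, 0) with unit variance.
mu1, mu2 = [0.0, 0.0], [2.0, 0.0]

# Equal priors: a point closer to mu1 scores higher for category 1 ...
print(linear_discriminant([0.5, 1.0], mu1, 1.0, 0.5),
      linear_discriminant([0.5, 1.0], mu2, 1.0, 0.5))
# ... and any point with first coordinate 1 lies on the decision boundary.
print(linear_discriminant([1.0, 3.0], mu1, 1.0, 0.5),
      linear_discriminant([1.0, 3.0], mu2, 1.0, 0.5))
```

With equal priors the rule is a nearest-mean classifier; unequal priors move the boundary away from the more probable mean.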

Discriminant and Classification for Different Cases

• A classifier that uses linear discriminant functions is called “a linear machine”

• The decision surfaces for a linear machine are pieces of hyperplanes defined by: gi(x) = gj(x)

Equal Covariances

Discriminant and Classification for Different Cases

• The hyperplane separating Ri and Rj, w^t (x − x0) = 0 with w = μi − μj and

  x0 = ½ (μi + μj) − [σ² / ‖μi − μj‖²] ln[P(ωi) / P(ωj)] (μi − μj)

  (σ² is the common feature variance), is always orthogonal to the line linking the means

Shift in Priors

Discrimination and Classification for Different Cases

• Case Σi = Σ (the covariance matrices of all classes are identical but arbitrary)

  Hyperplane separating Ri and Rj: w^t (x − x0) = 0 with w = Σ^(-1) (μi − μj) and

  x0 = ½ (μi + μj) − [ln(P(ωi) / P(ωj)) / ((μi − μj)^t Σ^(-1) (μi − μj))] (μi − μj)

  (the hyperplane separating Ri and Rj is generally not orthogonal to the line between the means)
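For a shared but arbitrary covariance Σ, the separating hyperplane w^t (x − x0) = 0 has normal w = Σ^(-1) (μi − μj) and passes through x0 = ½ (μi + μj) − [ln(P(ωi) / P(ωj)) / ((μi − μj)^t Σ^(-1) (μi − μj))] (μi − μj). The sketch below computes both for d = 2 with hand-written 2x2 helpers; all names and numbers are illustrative assumptions.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def boundary(mu_i, mu_j, cov, prior_i, prior_j):
    """Return (w, x0) of the hyperplane w^t (x - x0) = 0 between R_i and R_j."""
    diff = [mu_i[0] - mu_j[0], mu_i[1] - mu_j[1]]
    w = matvec(inv2(cov), diff)              # Sigma^(-1) (mu_i - mu_j)
    scale = math.log(prior_i / prior_j) / dot(diff, w)
    x0 = [0.5 * (mu_i[k] + mu_j[k]) - scale * diff[k] for k in range(2)]
    return w, x0


# Hypothetical correlated features, equal priors.
w, x0 = boundary([2.0, 0.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]], 0.5, 0.5)
print(w, x0)
```

Here the means differ only along the first axis, yet w has a non-zero second component, so the boundary is tilted relative to the line joining the means.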

Decision Surfaces

Discrimination and Classification for Different Cases

Case Σi = arbitrary
• The covariance matrices are different for each category

  gi(x) = x^t Wi x + wi^t x + wi0

where:
  Wi = −½ Σi^(-1)
  wi = Σi^(-1) μi
  wi0 = −½ μi^t Σi^(-1) μi − ½ ln |Σi| + ln P(ωi)

(The decision surfaces are hyperquadrics: hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, etc.)
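With a different covariance per category the discriminant is quadratic in x. The sketch below evaluates it in the equivalent form gi(x) = −½ (x − μi)^t Σi^(-1) (x − μi) − ½ ln |Σi| + ln P(ωi), dropping the constant (d/2) ln 2π that is common to all categories, for d = 2. The function name and the two example categories are assumptions.

```python
import math


def quadratic_discriminant(x, mu, cov, prior):
    """Quadratic discriminant for a 2-D Gaussian category (constant term dropped)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^t Sigma^(-1) (x - mu), using the closed-form 2x2 inverse
    maha = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return -0.5 * maha - 0.5 * math.log(det) + math.log(prior)


# Two hypothetical categories with the same mean but different spread.
tight = ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 0.5)
wide = ([0.0, 0.0], [[4.0, 0.0], [0.0, 4.0]], 0.5)

for point in ([0.0, 0.0], [3.0, 0.0]):
    print(point,
          quadratic_discriminant(point, *tight),
          quadratic_discriminant(point, *wide))
```

Near the shared mean the tight category wins and far from it the wide one wins, so the decision boundary in this example is a circle, one of the hyperquadric shapes.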

Decision Boundaries

Funny Dice Example

Classifiers and Error Computation

• Bayes Classifier
• Neyman-Pearson Classifier
• Minimax Test (Minimax Classifier)
• Error Computation and Error Bounds
• Linear Classifiers

Bayesian Decision Theory – Discrete Features

Components of x are binary or integer valued; x can take only one of m discrete values v1, v2, …, vm

Case of independent binary features in the two-category problem

Let x = (x1, x2, …, xd)^t, where each xi is either 0 or 1, with probabilities:

  pi = P(xi = 1 | ω1)
  qi = P(xi = 1 | ω2)

Bayesian Decision Theory – Discrete Features

The discriminant function in this case is:

  g(x) = Σi wi xi + w0, i = 1,…, d

where:
  wi = ln[pi (1 − qi) / (qi (1 − pi))], i = 1,…, d
  w0 = Σi ln[(1 − pi) / (1 − qi)] + ln[P(ω1) / P(ω2)]

Decide ω1 if g(x) > 0 and ω2 if g(x) ≤ 0
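For independent binary features, with pi = P(xi = 1 | ω1) and qi = P(xi = 1 | ω2), the log-likelihood ratio plus the log prior ratio is linear in x: g(x) = Σ wi xi + w0, where wi = ln[pi (1 − qi) / (qi (1 − pi))] and w0 = Σ ln[(1 − pi) / (1 − qi)] + ln[P(ω1) / P(ω2)]. The sketch below evaluates it; the function name and the probabilities 0.8 and 0.5 are illustrative assumptions.

```python
import math


def binary_discriminant(x, p, q, prior1, prior2):
    """g(x) for independent binary features; decide w1 if g(x) > 0, else w2.

    p[i] = P(x_i = 1 | w1), q[i] = P(x_i = 1 | w2), each strictly between 0 and 1.
    """
    g = math.log(prior1 / prior2)
    for xi, pi, qi in zip(x, p, q):
        w_i = math.log(pi * (1 - qi) / (qi * (1 - pi)))
        g += w_i * xi + math.log((1 - pi) / (1 - qi))
    return g


# Hypothetical three-feature problem with equal priors.
p = [0.8, 0.8, 0.8]
q = [0.5, 0.5, 0.5]
for x in ([1, 1, 1], [1, 0, 1], [0, 0, 0]):
    g = binary_discriminant(x, p, q, 0.5, 0.5)
    print(x, g, "decide w1" if g > 0 else "decide w2")
```

Each feature that is "on" adds its weight wi, so the magnitude of wi indicates how informative that feature is for the decision.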