Introduction to Statistical Pattern Recognition
Part II
1/20/2011 ECE 523: Introduction to Biometrics 1
Outline
• Bayes Detection Rule Revisited
• Probability of Error
• Evaluating the Classifier
• Matlab illustrations
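Bayes Decision Rule
• Two-class case
Decide $w_1$ if $P(w_1 \mid x) > P(w_2 \mid x)$; otherwise decide $w_2$.
• N-class case
Given a feature vector x, assign it to class wj if:
$$P(w_j \mid x) > P(w_i \mid x), \quad i = 1, \dots, N, \; i \neq j$$
Expanding P(wj|x) and P(wi|x) with Bayes' theorem (the common factor P(x) cancels):
$$P(x \mid w_j)\,P(w_j) > P(x \mid w_i)\,P(w_i), \quad i = 1, \dots, N, \; i \neq j$$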
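Bayes Decision Rule
• N-class case
Given a feature vector x, assign it to class wj if:
$$P(x \mid w_j)\,P(w_j) > P(x \mid w_i)\,P(w_i), \quad i = 1, \dots, N, \; i \neq j$$
• Likelihood Ratio: 2-class case
Decide $w_1$ if the likelihood ratio exceeds the threshold set by the priors; otherwise decide $w_2$:
$$\underbrace{\frac{P(x \mid w_1)}{P(x \mid w_2)}}_{\text{likelihood ratio}} > \underbrace{\frac{P(w_2)}{P(w_1)}}_{\text{threshold}}$$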
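The two forms of the two-class rule can be compared numerically. The following is a minimal sketch, assuming two univariate normal classes; the means, standard deviations, priors, and test points are illustrative assumptions, and the helper `normpdf_base` is a hypothetical name introduced here so that no toolbox is required.

```matlab
% Illustrative (assumed) parameters for two univariate normal classes
mu1 = -1; sigma1 = 1; P1 = 0.6;   % class w1
mu2 =  1; sigma2 = 1; P2 = 0.4;   % class w2

% Normal density written out so that no toolbox is required (hypothetical helper)
normpdf_base = @(x, mu, sigma) exp(-0.5*((x - mu)./sigma).^2) ./ (sigma*sqrt(2*pi));

x = [-2 0 0.5 3];                 % feature values to classify

% Rule 1: compare P(x|w1)P(w1) with P(x|w2)P(w2)
decide1_posterior = normpdf_base(x, mu1, sigma1)*P1 > normpdf_base(x, mu2, sigma2)*P2;

% Rule 2: compare the likelihood ratio with the threshold P(w2)/P(w1)
lr = normpdf_base(x, mu1, sigma1) ./ normpdf_base(x, mu2, sigma2);
decide1_lr = lr > P2/P1;

% Both logical vectors mark the points assigned to class w1
disp([decide1_posterior; decide1_lr]);
```

Both rules should flag the same points, since dividing both sides of the first inequality by P(x|w2)P(w1) gives the second.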
Bayes Decision Rule: Probability of Error (N-class)
• An error is made when we classify an observation as class wi when it is really in the j-th class. Denoting the complement of region $\Omega_i$ as $\Omega_i^c$, the probability of error is
$$P(\text{error}) = \sum_{i=1}^{N} \int_{\Omega_i^c} P(x \mid w_i)\,P(w_i)\,dx$$
Bayes Decision Rule: Probability of Error (2-class)
• We can set the amount of error we will tolerate for misclassifying one of the classes
Case I: Fish Sorting Example (Salmon vs. Sea Bass)
[Figure: Posterior 1 and Posterior 2 versus Feature-x; curves labeled Salmon and Sea Bass; decision boundary x*; error regions I and II]
Salmon: $20/lb
Sea Bass: $10/lb
To satisfy customers, which error should be minimized: Error I or Error II?
Bayes Decision Rule: Probability of Error (2-class)
Case II: Cancerous vs. Healthy Tissue
[Figure: Posterior 1 and Posterior 2 versus Feature-x; curves labeled Healthy and Cancerous; decision boundary x*; error regions I and II]
Taking into account the patient's well-being, which error should be minimized: Error I or Error II?
Bayes Decision Rule: Probability of Error (2-class)
[Figure: Posterior 1 and Posterior 2 versus Feature-x; curves labeled Target Class and Non-target Class; decision boundary x*; error region I]
Region I shows the probability of false alarm, i.e., the probability of wrongly classifying an observation as target (class w1) when it really belongs to class w2.
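The area of Region I can be computed directly for a given boundary. The following is a minimal sketch, assuming the non-target class w2 is normal, that observations with x < x* are labeled as target, and that the parameter values and the boundary are illustrative assumptions (they are not read from the figure). The helper `normcdf_base` is a hypothetical name for a normal CDF written with `erfc`, so only base MATLAB is needed.

```matlab
% Assumed, illustrative values (not taken from the figure)
mu2 = 1; sigma2 = 1; P2 = 0.4;    % non-target class w2
xstar = 0;                        % candidate boundary; decide w1 when x < xstar

% Normal CDF via erfc (hypothetical helper, base MATLAB only)
normcdf_base = @(x, mu, sigma) 0.5*erfc(-(x - mu)./(sigma*sqrt(2)));

% P(FA) = integral from -inf to xstar of P(x|w2) P(w2) dx
pfa = P2 * normcdf_base(xstar, mu2, sigma2);

% Cross-check by numerical integration of the prior-weighted density
f = @(x) P2 * exp(-0.5*((x - mu2)./sigma2).^2) ./ (sigma2*sqrt(2*pi));
pfa_numeric = integral(f, -Inf, xstar);

fprintf('P(FA) closed form: %.6f, numeric: %.6f\n', pfa, pfa_numeric);
```

Moving `xstar` to the left shrinks Region I (fewer false alarms) at the cost of more missed targets.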
Example
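We will look at a univariate classification problem with two classes. The class-conditionals are given by the normal distributions as follows:
$$P(x \mid w_1) = \phi(x; -1, 1), \qquad P(x \mid w_2) = \phi(x; 1, 1)$$
where $\phi(x; \mu, \sigma^2)$ denotes the normal density with mean $\mu$ and variance $\sigma^2$.
The priors are
$$P(w_1) = 0.6, \qquad P(w_2) = 0.4$$
Adjust the decision boundary to achieve a desired probability of false alarm, P(FA) = 0.05, e.g., (a) the probability that cancerous tissue is classified as healthy or (b) the probability that sea bass is classified as salmon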
Example
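We need to find the value of x* such that
$$P(\mathrm{FA}) = \int_{-\infty}^{x^*} P(x \mid w_2)\,P(w_2)\,dx = 0.05$$
x* is a quantile of the class-conditional density of w2, i.e.,
$$\int_{-\infty}^{x^*} P(x \mid w_2)\,dx = \frac{0.05}{P(w_2)} = \frac{0.05}{0.4} = 0.125$$
xstar = norminv(0.05/0.4,1,1);
% xstar = -0.15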
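The quantile can be checked without the Statistics Toolbox. The following is a minimal sketch, assuming the class w2 parameters implied by the `norminv` call (mean 1, standard deviation 1, prior 0.4); it replaces `norminv` with the base MATLAB function `erfcinv` and then plugs the boundary back into the false alarm integral.

```matlab
% Values read off the norminv call: class w2 ~ N(1, 1) with prior 0.4
mu2 = 1; sigma2 = 1; P2 = 0.4;
pfa_target = 0.05;

% Inverse normal CDF via erfcinv (base MATLAB, no toolbox)
p = pfa_target / P2;                          % required quantile level, 0.125
xstar = mu2 - sigma2*sqrt(2)*erfcinv(2*p);    % same role as norminv(p, mu2, sigma2)

% Plug the boundary back in to confirm the false alarm probability
pfa_check = P2 * 0.5*erfc(-(xstar - mu2)/(sigma2*sqrt(2)));
fprintf('xstar = %.4f, P(FA) = %.4f\n', xstar, pfa_check);
```

The identity used is that the standard normal quantile at level p equals `-sqrt(2)*erfcinv(2*p)`.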
Evaluating the Classifier
• Need to evaluate its usefulness by measuring the percentage of observations we correctly classify
• Important to report the probability of false alarms
Evaluating the Classifier
Independent Test Sample
• If sample collection is large, divide it into training and testing sets
• Training set – build the classifier
• Testing set – classify observations in the test set using our classification rule
• Estimated classification rate – proportion of correctly classified observations
• A common mistake that novice researchers make is to build a classifier using their sample and then test it on that same sample
Evaluating the Classifier: Independent Test Sample
Database
• Iris flower data set – introduced by Sir Ronald Aylmer Fisher (1936)
• Dataset consists of 50 samples from each of three species of Iris flowers
• Four features measured from each sample, i.e., length and width of sepal and petal in centimeters
[Figure: photographs of Iris setosa, Iris versicolor, and Iris virginica]
Evaluating the Classifier: Independent Test Sample
Probability of Correct Classification – Independent Test Sample (Formal Procedure)
• Randomly separate $n$ samples into two sets of size $n_{train}$ and $n_{test}$, where $n_{train} + n_{test} = n$
• Build the classifier (e.g., Bayes Decision Rule) using the training set
• Present each pattern from the test set to the classifier and obtain a class label for it. Since we know the correct class label for these observations beforehand, we can count the number of patterns ($N_{CC}$) correctly classified
• Probability of correct classification is
$$P(CC) = \frac{N_{CC}}{n_{test}}$$
Evaluating the Classifier: Independent Test Sample
Matlab illustration (consider only the two species that are hard to separate, i.e., Iris versicolor and Iris virginica)
• Randomly separate $n$ samples into two sets of size $n_{train}$ and $n_{test}$, where $n_{train} + n_{test} = n$
% Load data
load iris
% Get data for training and testing set
% Use only first two features
% Odd-numbered rows are used for training, even-numbered rows for testing
indtrain = 1:2:50;
indtest = 2:2:50;
versitest = versicolor(indtest,1:2);
versitrain = versicolor(indtrain,1:2);
virgitest = virginica(indtest,1:2);
virgitrain = virginica(indtrain,1:2);
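The illustration splits the rows by odd and even index. A random split, as the procedure describes, could be sketched as follows. This is a minimal sketch under assumptions: the matrix `data` is synthetic stand-in data (it could be replaced by, e.g., `versicolor(:,1:2)`), and the split size `ntrain = 25` and the fixed seed are choices made here for illustration.

```matlab
% Stand-in data: 50 observations with 2 features (synthetic, for illustration)
rng(0);                          % fix the seed so the split is reproducible
data = randn(50, 2);

n = size(data, 1);
ntrain = 25;                     % assumed split size; ntest = n - ntrain
perm = randperm(n);              % random ordering of 1..n

indtrain = perm(1:ntrain);       % first ntrain shuffled indices
indtest  = perm(ntrain+1:end);   % remaining indices

trainset = data(indtrain, :);
testset  = data(indtest, :);
```

Because `indtrain` and `indtest` come from one permutation, the two sets are disjoint and together cover every observation.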
Evaluating the Classifier: Independent Test Sample
• Build the classifier (e.g., Bayes Decision Rule) using the training set, assume multivariate normal model for these data
muver = mean(versitrain);
covver = cov(versitrain);
muvir = mean(virgitrain);
covvir = cov(virgitrain);
Evaluating the Classifier: Independent Test Sample
• Present each pattern from the test set to the classifier and obtain a class label for it. Since we know the correct class label for these observations beforehand, we can count the number of patterns ($N_{CC}$) correctly classified
• Use equal priors
% Put all of the test data into one matrix.
X = [versitest; virgitest];
% These are the probability of x given versicolor.
pxgver = csevalnorm(X, muver, covver);
% These are the probability of x given virginica.
pxgvir = csevalnorm(X, muvir, covvir);
% Check which are correctly classified
ind = find(pxgver(1:25) > pxgvir(1:25));
ncc = length(ind);
ind = find(pxgvir(26:50) > pxgver(26:50));
ncc = ncc + length(ind);
pcc = ncc/50
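The function `csevalnorm` is not part of base MATLAB; it appears to come from the toolbox that accompanies the Martinez and Martinez handbook. If it is unavailable, the multivariate normal density can be evaluated directly. The following is a minimal sketch of such a stand-in, assuming one observation per row; the input values are made up for illustration and are not taken from the iris data.

```matlab
% Illustrative inputs (made up for this sketch, not taken from the iris data)
X     = [6.0 2.9; 5.5 2.5; 7.0 3.2];   % one observation per row
mu    = [6.0 2.9];                      % class mean (1 x d)
Sigma = [0.30 0.10; 0.10 0.12];         % class covariance (d x d)

[m, d] = size(X);
Xc = X - repmat(mu, m, 1);              % center each row at the mean
q  = sum((Xc / Sigma) .* Xc, 2);        % squared Mahalanobis distance per row
p  = exp(-0.5*q) / sqrt((2*pi)^d * det(Sigma));   % density value per row
```

Here `Xc / Sigma` multiplies each centered row by the inverse covariance without forming the inverse explicitly. With equal priors, comparing `p` computed under each class's mean and covariance reproduces the comparison made in the illustration.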
Evaluating the Classifier
Cross-validation
• Systematically partition the data into training and testing sets
• $n - k$ observations are used to build the classifier, and the remaining $k$ patterns are used to test it
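The partitioning itself can be sketched with index arithmetic. This is a minimal sketch under assumptions: the sample size `n = 10` and `k = 2` are illustrative, `k` is assumed to divide `n` evenly, and the classifier step is left as a placeholder comment.

```matlab
n = 10;                          % assumed sample size for illustration
k = 2;                           % number of held-out patterns per round
nrounds = n / k;                 % assumes k divides n evenly

ntested = 0;
for r = 1:nrounds
    indtest  = (r-1)*k + (1:k);          % the k held-out indices
    indtrain = setdiff(1:n, indtest);    % the remaining n - k indices
    % ... build the classifier on indtrain, classify indtest ...
    ntested = ntested + numel(indtest);
end
```

Every observation is held out exactly once across the rounds. If the rows are ordered by class, shuffling them first (for example with `randperm`) keeps the held-out blocks from depending on that ordering.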
Evaluating the Classifier: Cross-validation
Cross-validation (Formal Procedure) at $k = 1$ (also known as the leave-one-out method)
1. Set the number of correctly classified patterns to 0, i.e., $N_{CC} = 0$
2. Keep out one observation, call it $x_i$
3. Build the classifier using the remaining $n - 1$ observations
4. Present the observation $x_i$ to the classifier and obtain a class label using the classifier from the previous step
5. If the class label is correct, increment $N_{CC}$, i.e., $N_{CC} = N_{CC} + 1$
6. Repeat steps 2-5 for each pattern in the sample
7. Probability of correct classification is
$$P(CC) = \frac{N_{CC}}{n}$$
Evaluating the Classifier: Cross-validation
Matlab Illustration
• Use iris data and estimate probability of correct classification
• Use cross-validation with $k = 1$
• Use versicolor and virginica only
• Equal priors
• Use first two features only
• Build the classifier (e.g., Bayes Decision Rule) using the training set, assume multivariate normal model for these data
Evaluating the Classifier: Cross-validation
% Load data
load iris
% Set ncc= 0
ncc = 0;
% Use only first two features
virginica(:,3:4) = [];
versicolor(:,3:4) = [];
% Sample size
[nver,d] = size(versicolor);
[nvir,d] = size(virginica);
n = nvir + nver;
Evaluating the Classifier: Cross-validation
% Loop first through all of the patterns corresponding
% to versicolor.
muvir = mean(virginica);
covvir = cov(virginica);
% These will be the same for this part.
for i = 1:nver
    % Get the test point and the training set
    versitrain = versicolor;
    % This is the testing point.
    x = versitrain(i,:);
    % Delete from training set.
    % The result is the training set.
    versitrain(i,:) = [];
    muver = mean(versitrain);
    covver = cov(versitrain);
    pxgver = csevalnorm(x,muver,covver);
    pxgvir = csevalnorm(x,muvir,covvir);
    if pxgver > pxgvir
        % then we correctly classified it
        ncc = ncc+1;
    end
end
Evaluating the Classifier: Cross-validation
% Loop through all of the patterns of virginica.
muver = mean(versicolor);
covver = cov(versicolor);
% Those remain the same for the following.
for i = 1:nvir
    % Get the test point and training set.
    virtrain = virginica;
    x = virtrain(i,:);
    virtrain(i,:) = [];
    muvir = mean(virtrain);
    covvir = cov(virtrain);
    pxgver = csevalnorm(x,muver,covver);
    pxgvir = csevalnorm(x,muvir,covvir);
    if pxgvir > pxgver
        % then we correctly classified it
        ncc = ncc+1;
    end
end
pcc = ncc/n
Homework #2
(A)
• Use iris data and estimate probability of correct classification
• Use cross-validation with $k = 2$
• Use versicolor and virginica only
• Equal priors
• Use first two features only
• Build the classifier (e.g., Bayes Decision Rule) using the training set, assume multivariate normal model for these data
(B)
• Use iris data and estimate probability of correct classification
• Use cross-validation with $k = 2$
• Use versicolor and virginica only
• Equal priors
• Use all four features
• Build the classifier (e.g., Bayes Decision Rule) using the training set, assume multivariate normal model for these data
Future topics
• Receiver Operating Characteristics (ROCs)
• Face Detection in Color Images using Skin Models
References
R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd edition, John Wiley & Sons, Inc., 2000
S. Aksoy, CS 551 (Pattern Recognition) Course Website, http://www.cs.bilkent.edu.tr/~saksoy/courses/cs551-Spring2010/index.html
W. Martinez and A. Martinez, Computational Statistics Handbook with MATLAB, 2nd edition, Chapman and Hall/CRC, Inc., 2007