Coarse Classification via Discrete Cosine Transform and Quantization
description
Transcript of Coarse Classification via Discrete Cosine Transform and Quantization
1
Coarse Classification via Discrete Cosine
Transform and Quantization
Presented byTe-Wei
Chiang
July 18, 2005
2
Outline
Introduction Feature Extraction and
Quantization Statistical Coarse Classification Experimental Results Conclusion
3
1 Introduction
Paper documents -> Computer codes
OCR(Optical Character Recognition)
4
Requirement of Coarse Classification
high accuracy rate, high reduction capacity, and quick classification time.
5
Two subproblems in the design of classification systems
Feature extraction Classification
6
Two Philosophies of Classification
Statistical Structural
7
Advantages of the Statistical Approach
fixed-length vector, robustness for noisy patterns, less ambiguities, and easy to implement.
8
Weaknesses of the Statistical Approach
Smear or average the parameters High dimensionality
9
Goal of this Paper
The purpose of this paper is to design a general coarse classification scheme: low dependent on domain-specific
knowledge. To achieve this goal, we need:
reliable and general features, and general classification method
10
2 Feature Extraction and Quantization
Find general features that can be applied properly to most application areas, rather than the best features.
Discrete Cosine Transform (DCT) is applied to extract statistical features.
11
2.1. Discrete Cosine Transform (DCT)
The DCT coefficients F(u, v) of an N×N image represented by x(i, j) can be defined as
where
1
0
1
0
),()()(2
),(N
i
N
j
jixvuN
vuF ),2
)12(cos()
2
)12(cos(
N
vj
N
ui
.1
,021
)(otherwise
wforw
12
The DCT coefficients of the character image of “ 佛” .
13
2.2. Quantization
The 2-D DCT coefficient F(u,v) is quantized to F’(u,v) according to the following equation:
Thus, dimension of the feature vector can be reduced after quantization.
14
Illustration of the Quantization Method
15
3 Statistical Coarse Classification
The statistical classification system is operated in two modes: Training(learning) Classification(testing)
16
3.1.Proposed coarse classification
scheme
17
Illustration of Extracting the 2-D DCT Coefficients
18
3.2. Grid code transformation (GCT)
Obtain the quantized DCT coefficient qij
Transform qij to positive integer dij
Such that object Oi can be transformed to a D-digit GC.
19
3.3.Grid code sorting and elimination
Remove the redundant information.
A reduced set of candidate classes can be retrieved from the lookup table according to the GC of the test sample.
20
4 Experimental Results
In our application, the objects to be classified are handwritten characters in Chinese paleography.
Since most of the characters in these rare books were contaminated by various noises, it is a challenge to achieve a high recognition rate.
21
18600 samples (about 640 classes) are extracted from Kin-Guan ( 金剛 ) bible. Each character image was
transformed into a 48×48 bitmap. 1000 of the 186000 samples are used
for testing and the others are used for training.
22
23
5 Conclusions
This paper presents a coarse classification scheme based on DCT and quantization.
Due to the energy compacting property of DCT, the most significant features of a pattern can be extracted and quantized for the generation of the grid codes.
24
Hence the potential candidates which are similar to the test pattern can be efficiently found by searching the training patterns whose grid codes are similar to that of the test pattern.
25
Future works
Since features of different types complement one another in classification performance, by using features of different types simultaneously, classification accuracy could be further improved.