Aocr Hmm Presentation
![Page 1: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/1.jpg)
AOCR
Arabic Optical Character Recognition
ABDEL RAHMAN GHAREEB KASEM
ADEL SALAH ABU SEREEA
MAHMOUD ABDEL MONEIM ABDEL MONEIM
MAHMOUD MOHAMMED ABDEL WAHAB
![Page 2: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/2.jpg)
Main contents
Introduction to AOCR
Preprocessing
Feature extraction
AOCR system implementation
Experimental results
Conclusion & future directions
Applications
![Page 3: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/3.jpg)
![Page 4: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/4.jpg)
Introduction
Why AOCR?
What is OCR?
What is the problem in AOCR?
What is the solution?
Pre-segmentation.
Auto-segmentation.
![Page 5: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/5.jpg)
![Page 6: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/6.jpg)
Preprocessing
1. Image rotation
2. Segmentation: line segmentation, word segmentation
3. Image enhancement
![Page 7: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/7.jpg)
Preprocessing
1. Image rotation
Problem of tilted image
![Page 8: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/8.jpg)
Preprocessing 1. Process rotated image
![Page 9: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/9.jpg)
Rotate by -1 degree
Preprocessing 1. Process rotated image
![Page 10: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/10.jpg)
Rotate by -2 degrees
Preprocessing 1. Process rotated image
![Page 11: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/11.jpg)
Rotate by -3 degrees
Preprocessing 1. Process rotated image
![Page 12: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/12.jpg)
Rotate by -4 degrees
Preprocessing 1. Process rotated image
![Page 13: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/13.jpg)
Rotate by -4 degrees
Preprocessing 1. Process rotated image
![Page 14: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/14.jpg)
Preprocessing 1. Process rotated image
Threshold effect
(Figure labels: clear zeros; threshold at the mean value versus 0.2 × the mean value.)
![Page 15: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/15.jpg)
Preprocessing 1. Process rotated image
Gray scale vs. black/white in the rotation process
(Figure panels: original image, gray scale, black/white.)
![Page 16: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/16.jpg)
Preprocessing
1. Process rotated image
2. Segmentation: line segmentation, word segmentation
3. Image enhancement
![Page 17: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/17.jpg)
Preprocessing
2. Segmentation.
What is the Segmentation process?
Why do we need segmentation in Arabic OCR?
What is the algorithm used in Segmentation?
![Page 18: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/18.jpg)
Preprocessing
2. Segmentation: line-level segmentation
![Page 19: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/19.jpg)
Preprocessing
2. Segmentation: line-level segmentation
![Page 20: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/20.jpg)
Preprocessing
2. Segmentation: word-level segmentation
![Page 21: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/21.jpg)
Preprocessing
2. Segmentation
![Page 22: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/22.jpg)
Preprocessing
1. Process rotated image
2. Segmentation: line segmentation, word segmentation
3. Image enhancement
![Page 23: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/23.jpg)
Preprocessing
3. Image enhancement
![Page 24: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/24.jpg)
Preprocessing
3. Image enhancement
Noise reduction by morphological operations
![Page 25: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/25.jpg)
Very important note:
Apply image enhancement operations to small images, not to one large image.
(Figure: the same sample Arabic text shown as one large image, marked X, and as separate small images.)
![Page 26: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/26.jpg)
![Page 27: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/27.jpg)
Feature Extraction
(Figure: sample Arabic word image.)
![Page 28: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/28.jpg)
• Feature Selection
We select features such that they:
Are suitable for the HMM technique (i.e., window-scanning-based features).
Are suitable for word-level recognition (not character-level).
Retain as much information as possible.
Achieve high accuracy with a small processing time.
![Page 29: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/29.jpg)
Satisfying the previous points
Each feature is designed to work with the slice technique.
(Figure: sample Arabic text with the labels n1 to n7 and the resulting feature vector.)
![Page 30: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/30.jpg)
Features deal with whole words, not single characters, because the algorithm is based on a segmentation-free concept.
We avoid structural features because they are hard to implement and require a large processing time.
![Page 31: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/31.jpg)
To achieve high accuracy with the lowest processing time, we use simple features and overlap adjacent slices to smooth the extracted data.
(Figure: sample word image with the overlap between adjacent slices marked.)
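The slice technique with overlap can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `slice_columns` and the parameters `window` and `step` are assumptions, with `window=2, step=1` standing for "two pixels with one pixel of overlap".

```python
def slice_columns(width, window, step):
    """Return (start, end) column ranges of overlapping vertical slices.

    Consecutive slices share `window - step` columns (the overlap).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    slices = []
    start = 0
    while start + window <= width:
        slices.append((start, start + window))
        start += step
    return slices


# A 5-column word image scanned with two-pixel slices and one pixel of overlap.
print(slice_columns(5, window=2, step=1))
```

Each feature is then computed once per slice, so a word image becomes a sequence of feature vectors.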
![Page 32: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/32.jpg)
(1) Background Count
Calculate the vertical distances (in pixels) of the background regions in each slice, where each background region is bounded by two foreground regions.
(Figure: sample word image with background and foreground regions labeled.)
![Page 33: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/33.jpg)
Example: slice width of two pixels with one pixel of overlap.
Feature vector of the selected slice: (d1, d2, d3)
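A minimal sketch of the background count for one slice is shown below. It assumes a binary image stored as a list of rows (1 = foreground), treats a row as foreground if any pixel of the slice in that row is foreground, and pads the result to a fixed length of three values; the name `background_gaps` and all three choices are assumptions for illustration.

```python
def background_gaps(image, start, end, size=3):
    """Heights of background runs enclosed between two foreground runs."""
    # Collapse the slice to one column: a row is foreground if any pixel is.
    column = [int(any(row[start:end])) for row in image]
    gaps, run, seen_foreground = [], 0, False
    for pixel in column:  # scan from top to bottom
        if pixel:
            if seen_foreground and run:
                gaps.append(run)  # background bounded on both sides
            seen_foreground, run = True, 0
        elif seen_foreground:
            run += 1
    # Pad or truncate so that every slice yields the same vector length.
    return (gaps + [0] * size)[:size]


image = [[0, 0], [1, 0], [0, 0], [0, 0], [1, 0], [0, 1], [0, 0], [1, 0], [0, 0]]
print(background_gaps(image, 0, 2))
```

Background above the first and below the last foreground run is ignored, because it is not bounded by two foreground regions.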
![Page 34: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/34.jpg)
Feature Figure
![Page 35: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/35.jpg)
(2) Baseline Count
Calculate the number of black pixels above the baseline (as a positive value) and the number of black pixels below the baseline (as a negative value) in each slice.
![Page 36: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/36.jpg)
Example: slice width of two pixels with one pixel of overlap.
(Figure labels: thinning, baseline.)
X1: number of black pixels above the baseline.
X2: number of black pixels below the baseline.
Feature vector: (X1, X2)
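The baseline count can be sketched as below. The image layout (list of rows, 1 = black), the name `baseline_count`, and the choice to exclude the baseline row itself from both counts are assumptions; the slides do not say how the baseline row is handled.

```python
def baseline_count(image, start, end, baseline_row):
    """Return (X1, X2): black pixels above (+) and below (-) the baseline."""
    rows_above = image[:baseline_row]
    rows_below = image[baseline_row + 1:]
    above = sum(sum(row[start:end]) for row in rows_above)
    below = sum(sum(row[start:end]) for row in rows_below)
    return (above, -below)


image = [[1, 1], [0, 1], [1, 1], [0, 0], [1, 0]]
print(baseline_count(image, 0, 2, baseline_row=2))
```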
![Page 37: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/37.jpg)
Feature Figure
![Page 38: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/38.jpg)
(3) Centroid
For each slice we compute its centroid (cx, cy), so the feature vector contains a sequence of centroids.
Example: slice width of two pixels with one pixel of overlap.
Feature vector: (Cx, Cy)
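A minimal sketch of the centroid feature follows. Coordinates are measured relative to the slice's left edge and the image top, and an empty slice returns (0.0, 0.0); both conventions, and the name `slice_centroid`, are assumptions.

```python
def slice_centroid(image, start, end):
    """Return (cx, cy), the mean position of foreground pixels in the slice."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x in range(start, end):
            if row[x]:
                xs.append(x - start)  # x relative to the slice
                ys.append(y)
    if not xs:
        return (0.0, 0.0)  # assumed value for a slice with no ink
    return (sum(xs) / len(xs), sum(ys) / len(ys))


image = [[1, 1], [0, 0], [1, 1]]
print(slice_centroid(image, 0, 2))
```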
![Page 39: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/39.jpg)
(4) Cross Count
For each slice we count the number of crossings from background (white) to foreground (black).
Example: slice width of two pixels with one pixel of overlap.
Feature vector: (2)
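The cross count can be sketched as a top-to-bottom scan that counts background-to-foreground transitions. Collapsing the slice to one column and treating the area above the image as background are assumptions, as is the name `cross_count`.

```python
def cross_count(image, start, end):
    """Count white-to-black transitions when scanning the slice downwards."""
    column = [int(any(row[start:end])) for row in image]
    crossings, previous = 0, 0  # the area above the image counts as background
    for pixel in column:
        if pixel and not previous:
            crossings += 1
        previous = pixel
    return crossings


image = [[0], [1], [0], [0], [1], [1], [0]]
print(cross_count(image, 0, 1))
```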
![Page 40: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/40.jpg)
(5) Euclidean Distance
We find the average foreground pixel in the regions above and below the baseline, then measure the Euclidean distance from the baseline to each average point, with a positive value for the point above and a negative value for the point below.
![Page 41: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/41.jpg)
Example: slice width of one pixel without overlap.
(Figure labels: thinning, baseline.)
D1: Euclidean distance above the baseline.
D2: Euclidean distance below the baseline.
Feature vector: (D1, D2)
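For a one-pixel-wide slice, the distance from the baseline to the average point reduces to a vertical distance, which the sketch below computes. The name `baseline_distances`, the exclusion of the baseline row, and the value 0.0 for an empty region are assumptions; the slides do not specify these details.

```python
def baseline_distances(image, column, baseline_row):
    """Return (D1, D2) for one pixel column: +distance above, -distance below."""
    above = [r for r in range(baseline_row) if image[r][column]]
    below = [r for r in range(baseline_row + 1, len(image)) if image[r][column]]
    # Mean row of the foreground pixels on each side of the baseline.
    d1 = baseline_row - sum(above) / len(above) if above else 0.0
    d2 = -(sum(below) / len(below) - baseline_row) if below else 0.0
    return (d1, d2)


image = [[1], [0], [1], [0], [0], [1], [0]]
print(baseline_distances(image, 0, baseline_row=3))
```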
![Page 42: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/42.jpg)
Feature Figure
![Page 43: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/43.jpg)
(6) Horizontal Histogram
For each slice we compute its horizontal histogram (the sum along each row of the slice).
Example: slice width of four pixels with one pixel of overlap.
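A minimal sketch of the horizontal histogram of one slice, assuming a binary image stored as a list of rows (1 = black); the name `horizontal_histogram` is hypothetical.

```python
def horizontal_histogram(image, start, end):
    """Sum the black pixels along each row of the slice."""
    return [sum(row[start:end]) for row in image]


image = [[1, 1, 0, 1], [0, 0, 0, 0], [1, 0, 0, 0]]
print(horizontal_histogram(image, 0, 4))
```

The result has one value per image row, so its length equals the image height rather than the slice width.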
![Page 44: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/44.jpg)
Feature Figure
![Page 45: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/45.jpg)
(7) Vertical Histogram
For each slice we compute its vertical histogram (the sum down each column of the slice).
Example: slice width of two pixels with one pixel of overlap.
Feature vector: (X1, X2)
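A minimal sketch of the vertical histogram of one slice, under the same assumed image layout (list of rows, 1 = black); the name `vertical_histogram` is hypothetical.

```python
def vertical_histogram(image, start, end):
    """Sum the black pixels down each column of the slice."""
    return [sum(row[x] for row in image) for x in range(start, end)]


image = [[1, 0], [1, 0], [0, 1]]
print(vertical_histogram(image, 0, 2))
```

With a two-pixel slice the result has two values, matching the (X1, X2) feature vector of the example.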
![Page 46: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/46.jpg)
Feature Figure
![Page 47: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/47.jpg)
(8) Weighted Vertical Histogram
The same as the previous feature, except that each row of the image is multiplied by a weight; the weight vector applied to the whole image has a triangular shape.
![Page 48: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/48.jpg)
Example: slice width of two pixels with one pixel of overlap.
Weight vector: values range between 1 and -1.
Feature vector: (X1, X2)
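The weighted vertical histogram can be sketched as below. The exact weight profile is not recoverable from the slides; `ramp_weights`, which falls linearly from +1 at the top row to -1 at the bottom row, is only a guess at it, and any other weight list can be passed instead. Both function names are hypothetical.

```python
def ramp_weights(height):
    """Assumed weight profile: +1 at the top row down to -1 at the bottom."""
    if height == 1:
        return [1.0]
    return [1.0 - 2.0 * r / (height - 1) for r in range(height)]


def weighted_vertical_histogram(image, start, end, weights):
    """Column sums of the slice after scaling each row by its weight."""
    return [
        sum(w * row[x] for w, row in zip(weights, image))
        for x in range(start, end)
    ]


image = [[1, 0], [1, 1], [0, 1]]
print(weighted_vertical_histogram(image, 0, 2, ramp_weights(len(image))))
```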
![Page 49: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/49.jpg)
Feature Figure
![Page 50: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/50.jpg)
![Page 51: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/51.jpg)
Implementation of HMM-Based AOCR Using HTK
Data preparation
Creating Monophone HMMs
Recognizer Evaluation
![Page 52: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/52.jpg)
Data preparation
The Task Grammar
The Dictionary
Recording the Data
Creating the Transcription Files
Coding the Data
![Page 53: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/53.jpg)
The Task Grammar
Isolated AOCR grammar -> mini project
Connected AOCR grammar -> final project
![Page 54: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/54.jpg)
Isolated AOCR Grammar
$name = a1 | a2 | a3 | a4 | a5 | …………… | a28 | a29;
( SENT-START <$name> SENT-END )
a1 -> ا
a2 -> ب
a3 -> ت
a4 -> ث
a29 -> space
![Page 55: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/55.jpg)
Connected AOCR Grammar
$name = a1 | a2 | a3 | a4 | a5 | …………… | a124 | a125;
( SENT-START <$name> SENT-END )
a1 -> ا
a2 -> ـا
a11 -> ــبــ
a23 -> ـجـ
a124 -> لله
a125 -> ـــــــ
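Because the grammar is a flat list of model names, the grammar file can be generated rather than typed. The sketch below is an illustration only; the function name `build_grammar` is hypothetical, and it simply reproduces the pattern shown above for a given number of models (29 for the isolated case, 125 for the connected case).

```python
def build_grammar(num_models):
    """Build the task grammar text for models a1..aN."""
    names = " | ".join("a%d" % i for i in range(1, num_models + 1))
    return "$name = %s;\n( SENT-START <$name> SENT-END )\n" % names


# Grammar text for the connected case with 125 models.
grammar_text = build_grammar(125)
print(grammar_text[:40])
```

The resulting text would be saved to a file and passed to HParse to obtain the word network.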
![Page 56: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/56.jpg)
Why grammar?
(Figure: word network running from Start through the parallel models a1, a2, a3, ..., a124, a125 to End.)
![Page 57: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/57.jpg)
How is it created?
HParse creates it:
Grammar -> HParse -> word network (wdnet)
![Page 58: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/58.jpg)
The Dictionary
Our dictionary is limited???
![Page 59: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/59.jpg)
The Dictionary
![Page 60: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/60.jpg)
Recording the Data
2-D signal (image) -> feature extraction -> 1-D vector -> transformer -> .wav
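The last step of this chain, storing a 1-D feature vector as a waveform file, can be sketched with the Python standard library. Scaling to the 16-bit range by the peak value, the sample rate of 16000, and the name `features_to_wav` are all assumptions; the slides do not describe how the transformer maps values to samples.

```python
import io
import struct
import wave


def features_to_wav(samples, fileobj, rate=16000):
    """Write a 1-D feature sequence as mono 16-bit PCM audio."""
    peak = max((abs(s) for s in samples), default=0) or 1
    frames = b"".join(
        struct.pack("<h", int(round(s / peak * 32767))) for s in samples
    )
    with wave.open(fileobj, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # two bytes per sample
        out.setframerate(rate)
        out.writeframes(frames)


buffer = io.BytesIO()  # a file path such as "S0001.wav" also works
features_to_wav([0, 3, 5, 2, -1], buffer)
print(len(buffer.getvalue()), "bytes written")
```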
![Page 61: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/61.jpg)
Creating the Transcription Files
Word level MLF
Phone level MLF
![Page 62: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/62.jpg)
Word level MLF
#!MLF!#
"*/1.lab"
فصل
.
"*/2.lab"
في الفرق بين الخالق والمخلوق
.
"*/3.lab"
وما ابراهيم وآل ابراهيم الحنفاء والأنبياء فهم
.
"*/4.lab"
يعلمون انه لا بد من الفرق بين الخالق والمخلوق
.
(Figure: the four text-line images that correspond to these labels.)
![Page 63: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/63.jpg)
Phone level MLF
#!MLF!#
"*/1.lab"
a74
a51
a88
.
"*/2.lab"
a74
a108
a123
a1
a86
a75
a38
a77
a123
(The listing is cut off at this point on the slide; the remaining entries follow the same pattern.)
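An MLF of this shape can be assembled from a list of label sequences. The sketch below only reproduces the layout shown above (header, quoted label-file pattern, one label per line, a closing period); the function name `build_mlf` and the input format are hypothetical.

```python
def build_mlf(entries):
    """Build master label file text from (file_stem, labels) pairs."""
    lines = ["#!MLF!#"]
    for file_stem, labels in entries:
        lines.append('"*/%s.lab"' % file_stem)
        lines.extend(labels)  # one label per line
        lines.append(".")     # end of this label file
    return "\n".join(lines) + "\n"


# Phone-level entry for the first line image of the example.
print(build_mlf([("1", ["a74", "a51", "a88"])]))
```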
![Page 64: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/64.jpg)
Coding the Data
Waveform files (S0001.wav, S0002.wav, S0003.wav, etc.) + configuration file + script file -> HCopy -> MFCC files (S0001.mfc, S0002.mfc, S0003.mfc, etc.)
![Page 65: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/65.jpg)
Creating Monophone HMMs
Creating Flat Start Monophones
Re-estimation
![Page 66: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/66.jpg)
Creating Monophone HMMs
The first step in HMM training is to define a prototype model.
The parameters of this model are not important; its purpose is to define the model topology.
![Page 67: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/67.jpg)
The Prototype
~o <VecSize> 39 <MFCC_0_D_A>
~h "proto"
<BeginHMM>
<NumStates> 5
<State> 2
<Mean> 39
0.0 0.0 0.0 . . . . . . .
<Variance> 39
1.0 1.0 1.0 . . . . . . . .
<State> 3
<Mean> 39
0.0 0.0 0.0 . . . . . . .
<Variance> 39
1.0 1.0 1.0 . . . . . . .
<State> 4
<Mean> 39
0.0 0.0 0.0 . . . . . . .
<Variance> 39
1.0 1.0 1.0 . . . . . . .
<TransP> 5
0.0 1.0 0.0 0.0 0.0
0.0 0.6 0.4 0.0 0.0
0.0 0.0 0.6 0.4 0.0
0.0 0.0 0.0 0.7 0.3
0.0 0.0 0.0 0.0 0.0
<EndHMM>
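Since the experiments vary the number of states, a prototype of this shape is convenient to generate. The following sketch is an assumption-laden illustration: `build_prototype` is a hypothetical helper, it uses zero means and unit variances as in the slide, and it applies a single stay probability to every emitting state (the slide uses 0.7 for the last one).

```python
def build_prototype(vec_size=39, num_states=5, stay=0.6, kind="MFCC_0_D_A"):
    """Build HTK prototype text with a left-to-right topology."""
    lines = ["~o <VecSize> %d <%s>" % (vec_size, kind), '~h "proto"',
             "<BeginHMM>", "<NumStates> %d" % num_states]
    for state in range(2, num_states):  # emitting states only
        lines += ["<State> %d" % state,
                  "<Mean> %d" % vec_size, " ".join(["0.0"] * vec_size),
                  "<Variance> %d" % vec_size, " ".join(["1.0"] * vec_size)]
    lines.append("<TransP> %d" % num_states)
    for i in range(num_states):
        row = [0.0] * num_states
        if i == 0:
            row[1] = 1.0  # entry state always moves to the first emitting state
        elif i < num_states - 1:
            row[i], row[i + 1] = stay, 1.0 - stay
        lines.append(" ".join("%.1f" % p for p in row))
    lines.append("<EndHMM>")
    return "\n".join(lines) + "\n"


print(build_prototype(vec_size=3, num_states=5))
```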
![Page 68: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/68.jpg)
Initialization Process
proto -> HCompV -> hmm0 (initialized proto and vFloors)
![Page 69: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/69.jpg)
Initialized prototype
~o <VecSize> 39 <MFCC_0_D_A>
~h "proto"
<BeginHMM>
<NumStates> 5
<State> 2
<Mean> 39
-5.029420e+000 1.948325e+000 -5.192460e+000 . . . . .
<Variance> 39
1.568812e+001 1.038746e+001 2.110239e+001 . . . . .
<State> 3
<Mean> 39
-5.029420e+000 1.948325e+000 -5.192460e+000 . . . . . .
<Variance> 39
1.568812e+001 1.038746e+001 2.110239e+001 . . . . .
<State> 4
<Mean> 39
-5.029420e+000 1.948325e+000 -5.192460e+000 . . . . . . .
<Variance> 39
1.568812e+001 1.038746e+001 2.110239e+001 . . . . . . .
<TransP> 5
0.0 1.0 0.0 0.0 0.0
0.0 0.6 0.4 0.0 0.0
0.0 0.0 0.6 0.4 0.0
0.0 0.0 0.0 0.7 0.3
0.0 0.0 0.0 0.0 0.0
<EndHMM>
![Page 70: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/70.jpg)
Vfloors Contents
~v varFloor1
<Variance> 39
1.568812e-001 1.038746e-001 2.110239e-001 . . . . . .
![Page 71: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/71.jpg)
Creating Initialized Models
(Figure: the initialized proto is copied once for each model a1, a2, ..., a125 to form hmmdefs; header: ~o <VecSize> 39 <MFCC_0_D_A>.)
![Page 72: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/72.jpg)
Creating Macros File
macros = ~o <VecSize> 39 <MFCC_0_D_A> + contents of the vFloors file
![Page 73: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/73.jpg)
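Re-estimation Process
(Figure: the initialized proto from HCompV gives hmmdefs and macros; HERest takes hmmdefs and macros, the training files (MFC files), the phone-level transcription and the monophones list, and produces re-estimated hmmdefs and macros.)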
![Page 74: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/74.jpg)
Recognition Process
(Figure: HVite takes the trained models, the test files, the word network (wnet) and the dictionary (dict), and outputs the recognized words.)
![Page 75: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/75.jpg)
Recognizer Evaluation
(Figure: HResults compares the reference transcription with the recognized transcription and reports the accuracy.)
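As a reminder of what the reported figure means, HTK's HResults derives its scores from the number of reference labels N and the counts of deletions D, substitutions S and insertions I: percent correct is (N - D - S) / N and accuracy is (N - D - S - I) / N. The helper below is a hypothetical sketch of that arithmetic, not part of HTK.

```python
def htk_scores(n, deletions, substitutions, insertions):
    """Return (percent_correct, percent_accuracy) in the HResults sense."""
    if n <= 0:
        raise ValueError("n must be positive")
    hits = n - deletions - substitutions
    correct = 100.0 * hits / n
    accuracy = 100.0 * (hits - insertions) / n
    return (correct, accuracy)


# 100 reference labels with 2 deletions, 3 substitutions and 1 insertion.
print(htk_scores(100, 2, 3, 1))
```

Accuracy is never higher than percent correct, because insertions are penalized only in the accuracy figure.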
![Page 76: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/76.jpg)
![Page 77: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/77.jpg)
Experimental Results
![Page 78: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/78.jpg)
1- Main Problem
1-1 Requirements:
Connected character recognition.
Multi-sizes.
Multi-fonts.
Handwritten.
![Page 79: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/79.jpg)
1-2 Variables:
Tool.
Method used to train and test.
Model parameters.
Feature parameters.
![Page 80: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/80.jpg)
Tool:
How can it operate with images?
Data input to HTK:
Discrete: input images (failed).
Continuous: input a continuous waveform (succeeded).
![Page 81: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/81.jpg)
2- Isolated Character Recognition
2-1 Single Size (16), Single Font (Simplified Arabic Fixed).
2-2 Multi-Sizes Character Recognition.
2-3 Variable Lengths Character Recognition.
![Page 82: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/82.jpg)
2-1 Single Size (16), Single Font (Simplified Arabic Fixed)
Best method.
Best number of states.
Best window size.
![Page 83: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/83.jpg)
Best method:
A model for each character (35 models) vs. a model for each character in each position (116 models)
(Vertical histogram, 11 states, window = 2.5)
No. of models    Accuracy
35               99.14%
116              100%
![Page 84: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/84.jpg)
Best number of states:
(Vertical histogram, number of models = 35, window = 2 pixels)
No. of states    Accuracy
3                96.55%
11               99.14%
![Page 85: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/85.jpg)
Best window size:
(2-D histogram, number of models = 124, 11 states)
(Chart: accuracy, axis from 97.00% to 100.50%, versus window size; axis ticks 1.2, 1.7 and 1.5 as listed.)
![Page 86: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/86.jpg)
2-2 Multi-Sizes Character Recognition
Sizes (12-14-16):
(2-D histogram, number of models = 124, 11 states)
(Chart: accuracy, axis from 85.00% to 105.00%, versus window size 1.6, 1.84 and 2.)
![Page 87: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/87.jpg)
2-3 Variable Lengths Character Recognition
Train with different lengths:
The vertical histogram gives higher accuracy than the 2-D histogram.
(Vertical histogram, number of models = 35, window = 2 pixels)
(Chart: accuracy, axis from 0.00% to 120.00%, versus window size 4.6, 3.7, 2.8, 2.33 and 1.66.)
![Page 88: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/88.jpg)
Make a model for the dash:
Training:
Train with characters (without dash) and a dash model.
Train with different lengths and a dash model.
Train with different lengths and a dash model; if a character has a dash at its end, define it as a character model followed by the dash model (the correct way).
![Page 89: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/89.jpg)
Make a model for the dash:
• Testing:
• Vertical histogram: failed to recognize the dash model with all methods (it is recognized as a space).
• 2-D histogram: for window size = 2.6, accuracy = 100%.
![Page 90: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/90.jpg)
3- Connected Character Recognition
3-1 Single Size (16), Single Font (Simplified Arabic Fixed).
3-2 Parameter Optimization.
3-3 Multi-Sizes Character Recognition.
3-4 Fusion by feature concatenation.
![Page 91: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/91.jpg)
3-1 Single Size (16), Single Font (Simplified Arabic Fixed)
Best method (from a simple experiment on 10 words):
The correct way to do word recognition is to train the character models on words or lines.
Assumptions:
Training data: 25 pages (495 lines).
Font: Simplified Arabic Fixed (font size = 16).
Images: 300 dpi, black and white.
Testing data: 4 pages (74 lines).
Feature properties: window = 2 × frame.
![Page 92: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/92.jpg)
Vertical histogram:
(Chart: accuracy, axis from 88.00% to 95.00%, versus window size 6, 6.5, 7, 7.5 and 8.)
![Page 93: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/93.jpg)
2-D histogram:
(Chart: accuracy, axis from 89.00% to 97.00%, versus window size 4.99, 5.33 and 5.89.)
![Page 94: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/94.jpg)
3-2 Parameter Optimization
Line level vs. word level.
Optimum number of mixtures.
Optimum number of states.
Optimum initial transition probability.
Optimum window vs. frame ratio.
![Page 95: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/95.jpg)
• Line level vs. word level
Assumptions:
Font: Simplified Arabic Fixed (font size = 16).
Testing data: same as the training data.
Feature type: vertical histogram, window = 2 × frame.
Images: 300 dpi, black and white.
Level         Accuracy
Line level    84.99%
Word level    85.36%
![Page 96: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/96.jpg)
Conclusion:
We will concentrate on line segmentation instead of word segmentation because of the disadvantages of word segmentation:
The window size is limited because word images are small.
Accuracy decreases as the number of mixtures increases.
Line segmentation is simpler than word segmentation in preprocessing.
![Page 97: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/97.jpg)
• Optimum number of mixtures
One-dimensional features:
Training data: 495 lines.
Testing data: same as the training data.
Feature type: vertical histogram, window = 2 × frame, window size = 6.5 pixels.
(Chart: accuracy, axis from 70.00% to 100.00%, versus number of mixtures 1, 3, 5, 10 and 15.)
![Page 98: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/98.jpg)
Two-dimensional features:
Training data: 495 lines.
Testing data: same as the training data.
Feature type: 2-D histogram, window = 2 × frame, window size = 5.33 pixels, N = 4.
(Chart: accuracy, axis from 86.00% to 98.00%, versus number of mixtures 1, 5, 7 and 10.)
![Page 99: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/99.jpg)
• Optimum number of states
One-dimensional features:
(Chart: accuracy, axis from 80.00% to 100.00%, versus number of states 6, 8, 11 and 13.)
![Page 100: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/100.jpg)
Two-dimensional features:
Assumptions: as previous.
Results:
Number of states    Accuracy
8                   92.52%
11                  95.02%
![Page 101: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/101.jpg)
• Optimum initial transition probability
Almost equally likely probabilities: failed.
Random probabilities: very bad.
Each state may only stay in itself or go to the next state, with the probability of staying higher than the probability of moving to the next state: succeeded.
0 1 0 0 0
0 0.7 0.3 0 0
0 0 0.6 0.4 0
and so on.
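The successful initialization rule can be written as a small generator for the transition matrix. This is a sketch under assumptions: the function name `left_to_right_transitions` is hypothetical, the first and last rows belong to HTK's non-emitting entry and exit states, and each stay probability must lie strictly between 0.5 and 1 to respect the rule above.

```python
def left_to_right_transitions(stay_probabilities):
    """Build a left-to-right matrix: stay, or move to the next state only."""
    n = len(stay_probabilities) + 2  # plus entry and exit states
    matrix = [[0.0] * n for _ in range(n)]
    matrix[0][1] = 1.0  # entry state goes straight to the first emitting state
    for i, stay in enumerate(stay_probabilities, start=1):
        if not 0.5 < stay < 1.0:
            raise ValueError("staying must be more likely than moving on")
        matrix[i][i] = stay
        matrix[i][i + 1] = 1.0 - stay
    return matrix  # last row stays all zero (exit state)


for row in left_to_right_transitions([0.7, 0.6, 0.6]):
    print(" ".join("%.1f" % p for p in row))
```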
![Page 102: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/102.jpg)
• Optimum window vs. frame ratio
• Assumptions: as previous (2-D feature).
• Results:
Overlapping ratio    Accuracy
0.4                  91.70%
0.6                  92.52%
0.5                  93.92%
![Page 103: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/103.jpg)
Maximum accuracy for all features:
Feature type          Max. accuracy
2-D histogram         95.96%
Euclidean distance    87.16%
Cross count           91.51%
Weighted histogram    95.75%
Baseline count        89.70%
Background count      91.61%
Vertical histogram    96.97%
![Page 104: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/104.jpg)
3-3 Multi-Sizes Character Recognition
Resizing the test data only:
Training data: Simplified Arabic Fixed, font size = 16.
Testing data: Simplified Arabic Fixed, font size = 12-16-18 (after resize), 60 lines.
Feature type: vertical histogram.
Font size    Accuracy
18           76.21%
14           79.74%
16           96.97%
![Page 105: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/105.jpg)
Resizing the training and test data:
Training data:
• Simplified Arabic Fixed.
• Font size = 14-16-18 (after resize).
• (324 * 3) lines.
Testing data:
• (324 * 3) lines.
• Same as training.
Feature type: vertical histogram.
Accuracy = 92.15%
![Page 106: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/106.jpg)
3-4 Feature concatenation
Concatenates the vertical histogram and the 2-D histogram.
Scale (vertical histogram)    Window size    Accuracy
4                             4.2            69.02%
4                             5.57           77.17%
No scale                      5              84.09%
![Page 107: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/107.jpg)
![Page 108: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/108.jpg)
Future works
Improving the printed text system:
Database: increase its size to support multiple sizes and multiple fonts.
Preprocessing improvements:
Improve the image enhancement to solve the problem of noisy pages.
Develop a robust system for problems that depend on the nature of the input pages (deleting frames, borders, pictures, tables, etc.).
Search for new features and combine them to improve the accuracy.
![Page 109: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/109.jpg)
Training and testing improvements:
Tying the models.
Using the adaptation supported by the HTK tool, which may improve the multi-size system (size independence).
Using the triphone technique to solve the problems of overlapping.
Improving the response time (implementing all preprocessing programs in C++).
Increasing the accuracy by feature fusion.
![Page 110: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/110.jpg)
Build a multi-language (language-independent) system.
Develop the handwritten system, especially because HMMs can handle this problem efficiently.
Develop the online system.
![Page 111: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/111.jpg)
Applications
![Page 112: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/112.jpg)
Automatic form recognition
Bank check reading
(Figure: sample Arabic check form, translated: Banque Misr; Check no.: ...; Payee name: ...; Amount in words: ...; Amount in figures: ...; Signature: ...)
![Page 113: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/113.jpg)
Digital libraries:
All books, magazines, newspapers, etc. can be stored as softcopies on PCs and CDs.
(Figure: sample Arabic text.)
![Page 114: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/114.jpg)
Transcription of historical archives & "non-death" of paper
All archived papers and documents can be stored as softcopy files.
(Figure: sample Arabic text.)
![Page 115: Aocr Hmm Presentation](https://reader034.fdocuments.in/reader034/viewer/2022051112/559b607c1a28ab025f8b4664/html5/thumbnails/115.jpg)