Khmer OCR ITC
Uploaded by solin-tem
2
Khmer OCR
• OCR System
• Khmer OCR Project
• State of the Art
• Work Done
• Current Work
3
OCR System
OCR
4
Khmer OCR Project
• 2011-2012
• Team
  – 1 researcher
  – 1 intern student (5th year)
• Develop a Khmer OCR system
  – Font independent
  – Size independent
5
State of the Art
Author | Limitation | Result
C. Chey, P. Kumhom and K. Chamnongthai | 10 characters (បពជកភណឃសវទ) | 92%
C. Chey, P. Kumhom and K. Chamnongthai | 20 fonts | 92.85% (size 22), 91.66% (size 18), 89.27% (size 12)
L. Ing and A. Muaz | Limon R1, size 22 | 98.88%
V. Kruy | Font and size independent | 97%
Tesseract
• One of the top 3 engines in the 1995 UNLV accuracy test
• Most accurate open source OCR engine
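The slides do not say how the percentages in the table above were computed. As a minimal sketch, assuming they are character-level recognition rates, accuracy could be measured as one minus the edit distance between the OCR output and the ground truth, divided by the ground-truth length. The function names here are illustrative and do not come from the project.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Assumed metric: 1 - edit_distance / len(ref), clamped at 0."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))
```

This version compares Unicode code points. For Khmer it may be more meaningful to compare whole character clusters, since one visible glyph can span several code points.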
6
Work Done
• Training Tesseract for Khmer font
  – Khmer OS font
  – 2210 character clusters
  – 11 MB
• Problems
  – Some characters not detected
  – Some characters misdetected
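To illustrate what a "character cluster" is, the sketch below splits Khmer text into clusters with a regular expression. It assumes a simplified cluster definition, not the full Unicode syllable grammar and not necessarily the one used to build the 2210-cluster training set: a base consonant or independent vowel, followed by any mix of subscript consonants (coeng U+17D2 plus a consonant) and dependent vowels or signs.

```python
import re
from typing import List

# Simplified character ranges from the Unicode Khmer block (an assumption
# made for illustration; the real syllable structure has ordering rules).
_BASE = "\u1780-\u17b3"                 # consonants and independent vowels
_MARKS = "\u17b4-\u17d1\u17d3\u17dd"    # dependent vowels and signs
_CLUSTER = re.compile(f"[{_BASE}](?:\u17d2[{_BASE}]|[{_MARKS}])*")


def khmer_clusters(text: str) -> List[str]:
    """Return the Khmer character clusters found in text, in order.

    Non-Khmer characters, Khmer digits and punctuation are skipped.
    """
    return _CLUSTER.findall(text)
```

For example, the word for "Khmer" (code points U+1781 U+17D2 U+1798 U+17C2 U+179A) splits into two clusters: the first four code points form one stacked glyph, and the final consonant stands alone. Counting the distinct clusters in a corpus this way gives a rough idea of how many shapes a recognizer has to learn.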
7
Current Work
• Improve the work done by Vanna Kruy
  – Improve performance
  – Create an easy-to-use GUI
  – Make it easy to add new fonts
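As one hypothetical way to support the "easy to add new fonts" goal, a GUI could discover the available trained data by scanning a directory. The sketch assumes each font or language is packaged as a Tesseract `.traineddata` file (the format used by Tesseract 3.x and later; the slides do not state the version). The function name and the directory layout are illustrative only.

```python
from pathlib import Path
from typing import List


def list_trained_languages(tessdata_dir: str) -> List[str]:
    """Return the base names of all *.traineddata files in a directory.

    Assumption: dropping a new file such as 'khm.traineddata' into the
    directory is enough to make it selectable in the GUI.
    """
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))
```

With this design, adding a font means copying one file into the directory, and the list shown to the user updates the next time it is read.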
8
Thanks for your attention!
Questions?