handwritten oriya writer identification using zone segmentation technique


  • 8/10/2019 handwritten oriya writer identification using zone segmentation technique

    1/8

    Writer Identification Of Handwritten Oriya Script

    Barid Baran Nayak, Dept. of ECE, NIT Rourkela, India
    Partha Pratim Roy, Dept. of CS, IIT Roorkee, India
    Umapada Pal, CVPR Unit, ISI Kolkata, India

    Abstract

    In handwritten writer identification and character recognition, we
    perform an image-based analysis: a scanned digital image containing
    handwritten script is taken as input, and the system translates it
    into a machine-editable digital text format. The Oriya language
    presents great challenges due to the large number of letters in its
    alphabet, the sophisticated ways in which they combine, and the fact
    that many letters are roundish and look alike. In this project, an
    attempt is made to recognize writers using Histogram of Gradient
    features of the character image. The features so obtained are passed
    to an HMM, which outputs the identification result.

    Keywords: character recognition, writer identification, histogram of
    gradient, Hidden Markov Model (HMM)
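
    The identification step summarized in the abstract (features passed to
    an HMM that returns the writer) can be illustrated with a minimal
    sketch. It assumes one discrete HMM per writer and feature vectors that
    have already been quantized to integer symbols; the names
    `forward_likelihood`, `identify_writer`, and the toy models are
    hypothetical and are not taken from the paper.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of a discrete observation sequence under one HMM,
    computed with the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def identify_writer(obs, models):
    """Return the writer whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Toy models (assumed values): each writer has a one-state HMM over 2 symbols.
models = {
    "writer_A": ([1.0], [[1.0]], [[0.9, 0.1]]),
    "writer_B": ([1.0], [[1.0]], [[0.1, 0.9]]),
}
best = identify_writer([0, 0, 1], models)
```

    For long sequences the products underflow, so a practical version would
    work with log-probabilities or scaled forward variables.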


    INTRODUCTION:-

    Oriya is one of the many official languages of India; it is the
    official language of Odisha and the second official language of
    Jharkhand. Since it is an old language, many old documents exist whose
    writers are unknown. This project addresses that problem: its main aim
    is to identify the writer of a document. The other part of the project
    is to identify each written character.

    Due to the presence of complex features such as the headline, vowels,
    and modifiers, character segmentation in Oriya script is not easy. The
    positions of vowels and compound characters also make segmenting words
    into characters very complex. To address this problem, we tried a novel
    method that breaks words up zone-wise and then applies HMM-based
    recognition. In particular, the word image is segmented into three
    zones: upper, middle, and lower. The components in the middle zone are
    modelled using HMMs. This zone segmentation approach reduces the number
    of distinct component classes compared with the total number of classes
    in the Oriya character set. Once the middle zone is recognized,
    HMM-based forced alignment is applied in this zone to mark the
    boundaries of individual components. The segmentation paths are later
    extended to the other zones. Next, the residue components, if any, in
    the upper and lower zones are combined within their respective
    boundaries to achieve the final word-level recognition.
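
    The three-zone split described above can be sketched as follows. The
    text does not say how the zone boundaries are found, so this example
    assumes a horizontal projection profile: the middle zone is taken as
    the band from the first to the last row whose ink count reaches a
    fraction of the peak. The function names and the `frac` parameter are
    hypothetical.

```python
def row_profile(img):
    """Count ink pixels (value 1) in each row of a binary word image."""
    return [sum(row) for row in img]


def split_zones(img, frac=0.5):
    """Split a binary word image into (upper, middle, lower) row bands.

    Assumption (not from the paper): the middle zone spans the first to
    the last row whose ink count is at least `frac` times the peak count.
    """
    profile = row_profile(img)
    peak = max(profile)
    if peak == 0:
        raise ValueError("image contains no ink")
    dense = [i for i, count in enumerate(profile) if count >= frac * peak]
    top, bottom = dense[0], dense[-1] + 1
    return img[:top], img[top:bottom], img[bottom:]


# Toy word image: one sparse row above and two sparse rows below a dense band.
word = [
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
]
upper, middle, lower = split_zones(word)
```

    Only the middle band would then go to the HMM recognizer; the upper and
    lower bands hold the residue components that are attached afterwards.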

    Earlier, a template-based approach was followed for recognition. In
    this approach, an unknown pattern was superimposed on the ideal
    template, and the degree of correlation between the two was used for
    classification. This approach became ineffective because of noise and
    variations in handwriting. Hence, nowadays a feature-based approach is
    used.
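
    As an illustration of the feature-based approach, the sketch below
    computes a simplified histogram of gradient orientations for one image
    cell. It is not the authors' configuration: the single cell, the 8
    bins, the unsigned orientations, and the name `gradient_histogram` are
    assumptions made for the example.

```python
import math


def gradient_histogram(img, bins=8):
    """Orientation histogram of gradients for one grayscale image cell.

    Simplified illustration: central-difference gradients on interior
    pixels, unsigned orientations in [0, 180) degrees, votes weighted by
    gradient magnitude, and L2 normalisation of the final histogram.
    """
    height, width = len(img), len(img[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against angle rounding up to 180.0.
            hist[int(angle / bin_width) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm else hist


# A vertical stroke edge: dark on the left, bright on the right.
edge = [[0, 0, 1, 1, 1] for _ in range(5)]
features = gradient_histogram(edge)
```

    A full descriptor would tile the character image into several cells and
    concatenate the per-cell histograms into one feature vector.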

    DIAGRAMS

    Figure 1: Original Oriya script


    Line segmentation:

    Word segmentation:


    Zone segmentation

    Figure 4: (a) Original word. (b) Zone-segmented word (upper, middle,
    lower).

    Figure 5: Character segmentation from words.


    Results:

    [Figure: bar chart comparing zone segmentation with non-zone
    segmentation; vertical axis from 0 to 120, one data series.]

    Conclusion:

    Writer identification was successfully carried out and significant
    results were obtained. A scheme for segmenting unconstrained Oriya
    handwritten text into lines, words, and characters is proposed in this
    paper. First, the text image is segmented into lines, and then the
    lines are segmented into individual words. Next, for character
    segmentation from words, isolated and connected (touching) characters
    in a word are detected. Using structural, topological, and water
    reservoir concept-based features, the touching characters of the word
    are then segmented into isolated characters. To the best of our
    knowledge, this is the first work of its kind on Oriya text. The
    proposed water reservoir-based approach can also be used for other
    Indian scripts where touching patterns show similar behavior.
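
    The line and word segmentation steps can be sketched with projection
    profiles. The text does not state how the cuts are chosen, so this
    example assumes that lines are separated by blank rows and words by
    blank column gaps of at least `min_gap` pixels; all function names and
    the gap threshold are hypothetical.

```python
def ink_runs(profile, min_gap=1):
    """Return (start, end) pairs, end exclusive, of stretches with ink.

    Blank gaps shorter than `min_gap` are kept inside a run.
    """
    runs, start, gap = [], None, 0
    for i, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def segment_lines(page):
    """Cut a binary page image into text lines at blank rows."""
    profile = [sum(row) for row in page]
    return [page[s:e] for s, e in ink_runs(profile)]


def segment_words(line, min_gap=2):
    """Cut a binary line image into words at wide blank column gaps."""
    profile = [sum(col) for col in zip(*line)]
    return [[row[s:e] for row in line] for s, e in ink_runs(profile, min_gap)]


# Toy page: two text lines separated by one blank row.
page = [
    [1, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
]
lines = segment_lines(page)
words_per_line = [len(segment_words(line)) for line in lines]
```

    Touching characters inside a word are not handled by this sketch; they
    need the structural and water reservoir features named above.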
