A Survey of Methods and Strategies in Character Segmentation


IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 18, NO. 7, JULY 1996

Richard G. Casey and Eric Lecolinet, Member, IEEE

Abstract-Character segmentation has long been a critical area of the OCR process. The higher recognition rates for isolated characters vs. those obtained for words and connected character strings well illustrate this fact. A good part of recent progress in reading unconstrained printed and written text may be ascribed to more insightful handling of segmentation. This paper provides a review of these advances. The aim is to provide an appreciation for the range of techniques that have been developed, rather than to simply list sources. Segmentation methods are listed under four main headings. What may be termed the "classical" approach consists of methods that partition the input image into subimages, which are then classified. The operation of attempting to decompose the image into classifiable units is called "dissection." The second class of methods avoids dissection, and segments the image either explicitly, by classification of prespecified windows, or implicitly by classification of subsets of spatial features collected from the image as a whole. The third strategy is a hybrid of the first two, employing dissection together with recombination rules to define potential segments, but using classification to select from the range of admissible segmentation possibilities offered by these subimages. Finally, holistic approaches that avoid segmentation by recognizing entire character strings as units are described.

Index Terms-Optical character recognition, character segmentation, survey, holistic recognition, Hidden Markov Models, graphemes, contextual methods, recognition-based segmentation.

1 INTRODUCTION

1.1 The Role of Segmentation in Recognition Processing

Character segmentation is an operation that seeks to decompose an image of a sequence of characters into subimages of individual symbols. It is one of the decision processes in a system for optical character recognition (OCR). Its decision, that a pattern isolated from the image is that of a character (or some other identifiable unit), can be right or wrong. It is wrong sufficiently often to make a major contribution to the error rate of the system.

In what may be called the "classical" approach to OCR (Fig. 1), segmentation is the initial step in a three-step procedure. Given a starting point in a document image:

1) Find the next character image.
2) Extract distinguishing attributes of the character image.
3) Find the member of a given symbol set whose attributes best match those of the input, and output its identity.

This sequence is repeated until no additional character images are found.

Fig. 1. Classical optical character recognition (OCR). The process of character recognition consists of a series of stages (format analysis, character segmentation, feature extraction, classification), with each stage passing its results on to the next in pipeline fashion. There is no feedback loop that would permit an earlier stage to make use of knowledge gained at a later point in the process.

An implementation of step 1, the segmentation step, requires answering a simply posed question: "What constitutes a character?" The many researchers and developers who have tried to provide an algorithmic answer to this question find themselves in a Catch-22 situation. A character is a pattern that resembles one of the symbols the system is designed to recognize. But to determine such a resemblance, the pattern must be segmented from the document image.

• R.G. Casey is with the IBM Almaden Research Center, San Jose, CA 95120. E-mail: casey@ix.netcom.com.
• E. Lecolinet is with the École Nationale Supérieure des Télécommunications (ENST), Paris, France.

Manuscript received Mar. 18, 1995; revised May 22, 1996. Recommended for acceptance by R. Kasturi. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number P96058.


Each stage depends on the other, and in complex cases it is paradoxical to seek a pattern that will match a member of the system's recognition alphabet of symbols without incorporating detailed knowledge of the structure of those symbols into the process.

Furthermore, the segmentation decision is not a local decision, independent of previous and subsequent decisions. Producing a good match to a library symbol is necessary, but not sufficient, for reliable recognition. That is, a poor match on a later pattern can cast doubt on the correctness of the current segmentation/recognition result. Even a series of satisfactory pattern matches can be judged incorrect if contextual requirements on the system output are not satisfied. For example, the letter sequence "cl" can often closely resemble a "d," but usually such a choice will not constitute a contextually valid result.

Thus, it is seen that the segmentation decision is interdependent with local decisions regarding shape similarity, and with global decisions regarding contextual acceptability. This sentence summarizes the refinement of character segmentation processes in the past 40 years or so. Initially, designers sought to perform segmentation as per the "classical" sequence listed above. As faster, more powerful electronic circuitry has encouraged the application of OCR to more complex documents, designers have realized that step 1 cannot be divorced from the other facets of the recognition process.

In fact, researchers have been aware of the limitations of the classical approach for many years. Researchers in the 1960s and 1970s observed that segmentation caused more errors than shape distortions in reading unconstrained characters, whether hand- or machine-printed. The problem was often masked in experimental work by the use of databases of well-segmented patterns, or by scanning character strings printed with extra spacing. In commercial applications, stringent requirements for document preparation were imposed. By the beginning of the 1980s, workers were beginning to encourage renewed research interest [73] to permit extension of OCR to less constrained documents.

The problems of segmentation persist today. The well-known tests of commercial printed text OCR systems by the University of Nevada, Las Vegas [64], [65], consistently ascribe a high proportion of errors to segmentation. Even when perfect patterns, the bitmapped characters that are input to digital printers, were recognized, commercial systems averaged 0.5% spacing errors. This is essentially a segmentation error by a process that attempts to isolate a word subimage. The article [6] emphatically illustrates the woes of current machine-print recognition systems as segmentation difficulties increase (see Fig. 2). The degradation in performance of NIST tests of handwriting recognition on segmented [86] and unsegmented [88] images underscores the continuing need for refinement and fresh approaches in this area. On the positive side of the ledger, the study [29] shows the dramatic improvements that can be obtained when a thoroughgoing segmentation scheme replaces one of prosaic design.

Some authors previously have surveyed segmentation, often as part of a more comprehensive work, e.g., cursive recognition [36], [19], [20], [55], [58], [81], or document analysis [23], [29]. In the present paper, we present a survey whose focus is character segmentation, and which attempts to provide broad coverage of the topic.

Fig. 2. Segmentation problems in machine print OCR. The figure illustrates performance of a 1991 prototype of a commercial reader as character spacing is varied. Proceeding from top to bottom, the images (at left) correspond to condensed, normal, and expanded spacing, respectively. Note that recognition output (at right) for the well-spaced images is quite accurate, but degrades badly when characters begin to merge. This system has continued to evolve and the illustration is not representative of performance of later releases. (Adapted from [6].)

1.2 Organization of Methods

A major problem in discussing segmentation is how to classify methods. Tappert et al. [81], for example, speak of "external" vs. "internal" segmentation, depending on whether recognition is required in the process. Dunn and Wang [20] use "straight segmentation" and "segmentation-recognition" for a similar dichotomization.

A somewhat different point of view is proposed in this paper. The division according to use or nonuse of recognition in the process fails to make clear the fundamental distinctions among present-day approaches. For example, it is not uncommon in text recognition to use a spelling corrector as a post-processor. This stage may propose the substitution of two letters for a single letter output by the classifier. This is in effect a use of recognition to resegment the subimage involved. However, the process represents only a trivial advance on traditional methods that segment independently of recognition.

In this paper, the distinction between methods is based on how segmentation and classification interact in the overall process. In the example just cited, segmentation is done in two stages, one before and one after image classification. Basically, an unacceptable recognition result is re-examined and modified by an (implied) resegmentation. This is a rather "loose" coupling of segmentation and classification. A more profound interaction between the two aspects of recognition occurs when a classifier is invoked to select the segments from a set of possibilities. In this family of approaches, segmentation and classification are integrated.


To some observers it even appears that the classifier performs segmentation since, conceptually at least, it could select the desired segments by exhaustive evaluation of all possible sets of subimages of the input image.

After reviewing available literature, we have concluded that there are three "pure" strategies for segmentation, plus numerous hybrid approaches that are weighted combinations of these three. The elemental strategies are:

1) The classical approach, in which segments are identified based on "character-like" properties. This process of cutting up the image into meaningful components is given a special name, "dissection," in discussions below.
2) Recognition-based segmentation, in which the system searches the image for components that match classes in its alphabet.
3) Holistic methods, in which the system seeks to recognize words as a whole, thus avoiding the need to segment into characters.

In strategy 1, the criterion for good segmentation is the agreement of general properties of the segments obtained with those expected for valid characters. Examples of such properties are height, width, separation from neighboring components, disposition along a baseline, etc. In method 2, the criterion is recognition confidence, perhaps including syntactic or semantic correctness of the overall result. Holistic methods (strategy 3) in essence revert to the classical approach with words as the alphabet to be read. The reader interested in an early illustration of these basic techniques may glance ahead to Fig. 6 for examples of dissection processes, Fig. 13 for a recognition-based strategy, and Fig. 16 for a holistic approach.

Although examples of these basic strategies are offered below, much of the literature reviewed for this survey reports a blend of methods, using combinations of dissection, recognition searching, and word characteristics. Thus, although the paper necessarily has a discrete organization, the situation is perhaps better conceived as in Fig. 3. Here, the three fundamental strategies occupy orthogonal axes; hybrid methods can be represented as weighted combinations of these, lying at points in the intervening space. There is a continuous space of segmentation strategies rather than a discrete set of classes with well-defined boundaries. Of course, such a space exists only conceptually; it is not meaningful to assign precise weights to the elements of a particular combination.

Taking the fundamental strategies as a base, this paper is organized as indicated in Fig. 4. In Section 2, we discuss methods that are highly weighted towards strategy 1. These approaches perform segmentation based on general image features and then classify the resulting segments. Interaction with classification is restricted to reprocessing of ambiguous recognition results.

In Section 3, we present methods illustrative of recognition-based segmentation, strategy 2. These algorithms avoid early imposition of segmentation boundaries. As Fig. 4 shows, such methods fall into two subcategories. Windowing methods segment the image blindly at many boundaries chosen without regard to image features, and then try to choose an optimal segmentation by evaluating the classification of the subimages generated. Feature-based methods detect the physical location of image features, and seek to segment this representation into well-classified subsets. Thus, the former employs recognition to search for "hard" segmentation boundaries, the latter for "soft" (i.e., implicitly defined) boundaries.


Fig. 3. The space of segmentation strategies. The fundamental segmentation strategies identified in the text are shown as occupying orthogonal axes. Many methods adopted in practice employ elements of more than one basic strategy. Although it is impossible to assign precise coordinates in this space, in concept such methods lie in the interior of the space bounded by the three planes shown.


Fig. 4. Hierarchy of segmentation methods. The text follows the classification scheme shown.


Sections 2 and 3 describe contrasting strategies: one in which segmentation is based on image features, and a second in which classification is used to select from segmentation candidates generated without regard to image content. In Section 4, we discuss a hybrid strategy in which a preliminary segmentation is implemented, based on image features, but a later combination of the initial segments is performed and evaluated by classification in order to choose the best segmentation from the various hypotheses offered.

The techniques in Sections 2 to 4 are letter-based and, thus, are not limited to a specific lexicon. They can be applied to the recognition of any vocabulary. In Section 5, holistic methods are presented, which attempt to recognize a word as a whole. While avoiding the character segmentation problem, they are limited in application to a predefined lexicon. Markov models appear frequently in the literature, justifying further subclassification of holistic and recognition-based strategies, as indicated in Fig. 4. Section 6 offers some remarks and conclusions on the state of research in the segmentation area. Except when approaches of a sufficiently general nature are discussed, the discussion is limited to Western character sets as well as to "off-line" character recognition, that is, to segmentation of character data obtained by optical scanning rather than at inception ("on-line") using tablets, light pens, etc. Nevertheless, it is important to realize that much of the thinking behind advanced work is influenced by related efforts in speech and on-line recognition, and in the study of human reading processes.

2 DISSECTION TECHNIQUES FOR SEGMENTATION

Methods discussed in this section are based on what will be termed "dissection" of an image. By dissection is meant the decomposition of the image into a sequence of subimages using general features (as, for example, in Fig. 5). This is opposed to later methods that divide the image into subimages independent of content. Dissection is an intelligent process in that an analysis of the image is carried out; however, classification into symbols is not involved at this point.

Fig. 5. The dissection method of Hoffman and McCullough. An evaluation function based on a running count of horizontal black-white and white-black transitions is plotted below the image. The horizontal black bars above the image indicate the activation region for the function. Vertical lines indicate human estimates of optimal segmentation columns, while Xs indicate columns chosen by the algorithm (from [43]).

In literature describing systems where segmentation and classification do not interact, dissection is the entire segmentation process: The two terms are equivalent. However,

in many current studies, as we shall see, segmentation is a complex process, and there is a need for a term such as "dissection" to distinguish the image-cutting subprocess from the overall segmentation, which may use contextual knowledge and/or character shape description.

2.1 Dissection Directly into Characters

In the late 1950s and early 1960s, during the earliest attempts to automate character recognition, research was focused on the identification of isolated images. Workers mainly sought methods to characterize and classify character shapes. In some cases, individual characters were mapped onto grids and pixel coordinates entered on punched cards [40], [49] in order to conduct controlled development and testing. As CRT scanners and magnetic storage became available, well-separated characters were scanned, segmented by dissection based on whitespace measurement, and stored on tape. When experimental devices were built to read actual documents, they dealt with constrained printing or writing that facilitated segmentation.

For example, bank check fonts were designed with strong leading edge features that indicated when a character was properly registered in the rectangular array from which it was recognized [24]. Handprinted characters were printed in boxes that were invisible to the scanner, or else the writer was constrained in ways that aided both segmentation and recognition. A very thorough survey of status in 1961 [79] gives only implicit acknowledgment of the existence of the segmentation problem. Segmentation is not shown at all in the master diagram constructed to accompany discussion of recognition stages. In the several pages devoted to preprocessing (mainly thresholding), the function is indicated only peripherally as part of the operation of registering a character image.

The twin facts that early OCR development dealt with constrained inputs, while research was mainly concerned with representation and classification of individual symbols, explain why segmentation is so rarely mentioned in pre-70s literature. It was considered a secondary issue.

2.1.1 White Space and Pitch

In machine printing, vertical whitespace often serves to separate successive characters. This property can be extended to handprint by providing separated boxes in which to print individual symbols. In applications such as billing, where document layout is specifically designed for OCR, additional spacing is built into the fonts used. The notion of detecting the vertical white space between successive characters has naturally been an important concept in dissecting images of machine print or handprint.

In many machine print applications involving limited font sets, each character occupies a block of fixed width. The pitch, or number of characters per unit of horizontal distance, provides a basis for estimating segmentation points. The sequence of segmentation points obtained for a given line of print should be approximately equally spaced at the distance corresponding to the pitch. This provides a global basis for segmentation, since separation points are not independent.
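The white-space and pitch ideas above can be illustrated with a short sketch. The following is a minimal example under stated assumptions, not code from any system surveyed here: the text line is taken to be a binary numpy array with 1 denoting ink, cut candidates are placed at the centers of all-white column runs, and an optional step snaps them to an assumed pitch grid. The helper names (gap_centers, snap_to_pitch) and the tolerance value are hypothetical.

    import numpy as np

    def column_projection(img):
        """Number of ink pixels in each column of a binary line image."""
        return img.sum(axis=0)

    def gap_centers(img):
        """Centers of maximal runs of all-white columns: candidate cut points."""
        proj = column_projection(img)
        cuts, start = [], None
        for x, v in enumerate(proj):
            if v == 0 and start is None:
                start = x
            elif v > 0 and start is not None:
                cuts.append((start + x - 1) // 2)
                start = None
        if start is not None:
            cuts.append((start + len(proj) - 1) // 2)
        return cuts

    def snap_to_pitch(cuts, pitch, tol=0.4):
        """Keep one cut per pitch period, preferring the gap nearest the expected grid."""
        if not cuts:
            return []
        kept = [cuts[0]]
        for expected in np.arange(cuts[0] + pitch, cuts[-1] + pitch / 2, pitch):
            nearby = [c for c in cuts if abs(c - expected) <= tol * pitch]
            kept.append(min(nearby, key=lambda c: abs(c - expected)) if nearby
                        else int(round(expected)))  # no gap found: force a cut on the grid
        return kept

    # Toy line: two blobs separated by three white columns.
    line = np.zeros((8, 13), dtype=int)
    line[2:7, 1:5] = 1
    line[2:7, 8:12] = 1
    print(gap_centers(line))                          # white-space cuts, e.g., [0, 6, 12]
    print(snap_to_pitch(gap_centers(line), pitch=7))  # the same cuts, constrained to a pitch grid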


between this minimum value and the peaks on each side is calculated. The ratio of the sum to the minimum value itself (plus 1, presumably to avoid division by zero) is the discriminator used to select segmentation boundaries. This ratio exhibits a preference for a low valley with high peaks on both sides.

A prefiltering was implemented in [83] in order to intensify the projection function. The filter ANDed adjacent columns prior to projection, as in Fig. 6c. This has the effect of producing a deeper valley at columns where only portions of the vertical edges of two adjacent characters are merged.
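A small sketch may help fix these ideas. It combines a column-AND prefilter in the spirit of [83] with a valley discriminator of the kind described above (rise to the peak on each side of a local minimum, divided by the minimum plus 1). The exact formulas and thresholds in the cited work may differ, so treat this as an illustration of the shape of the computation, not as the published algorithm.

    import numpy as np

    def and_filtered_projection(img):
        """Project after ANDing each column with its right neighbor (prefilter idea of [83])."""
        anded = img[:, :-1] & img[:, 1:]
        return anded.sum(axis=0)

    def valley_scores(proj):
        """Score each local minimum by (rise to left peak + rise to right peak) / (min + 1)."""
        scores = {}
        for x in range(1, len(proj) - 1):
            if proj[x] <= proj[x - 1] and proj[x] <= proj[x + 1]:
                left_peak = max(proj[:x])
                right_peak = max(proj[x + 1:])
                scores[x] = (left_peak - proj[x] + right_peak - proj[x]) / (proj[x] + 1)
        return scores

    # Toy image: two bodies merged only along the top three rows; the AND-filtered
    # projection dips where they meet, and that deep valley scores highest.
    img = np.array([[1] * 10] * 3 + [[1, 1, 1, 1, 0, 1, 1, 1, 1, 1]] * 4, dtype=int)
    proj = and_filtered_projection(img)
    scores = valley_scores(proj)
    best = max(scores, key=scores.get)                # column selected as the cut
    print(scores, "-> cut at column", best)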

A different kind of prefiltering was used in [57] to sharpen discrimination in the vicinity of holes and cavities of a composite pattern. In addition to the projection itself, the difference between upper and lower profiles of the pattern was used in a formula analogous to that of [1]. Here, the upper profile is a function giving the maximum y-value of the black pixels for each column in the pattern array. The lower profile is defined similarly on the minimum y-value in each column.

A vertical projection is less satisfactory for the slanted characters commonly occurring in handprint. In one study [28], projections were performed at two-degree increments between -16 and +16 degrees from the vertical. Vertical strokes and steeply angled strokes such as occur in a letter A were detected as large values of the derivative of a projection. Cuts were implemented along the projection angle. Rules were implemented to screen out cuts that traversed multiple lines, and also to rejoin small floating regions, such as the left portion of a T crossbar, that might be created by the cutting algorithm. A similar technique is employed in [89].

2.1.3 Connected Component Processing

Projection methods are primarily useful for good quality machine printing, where adjacent characters can ordinarily be separated at columns. A one-dimensional analysis is feasible in such a case.

However, the methods described above are not generally adequate for segmentation of proportional fonts or handprinted characters. Thus, pitch-based methods cannot be applied when the width of the characters is variable. Likewise, projection analysis has limited success when characters are slanted, or when inter-character connections and character strokes have similar thickness.

Segmentation of handprint or kerned machine printing calls for a two-dimensional analysis, for even nontouching characters may not be separable along a single straight line. A common approach (see Fig. 7) is based on determining connected black regions (connected components, or blobs). Further processing may then be necessary to combine or split these components into character images.

There are two types of followup processing. One is based on the bounding box, i.e., the location and dimensions of each connected component. The other is based on detailed analysis of the images of the connected components.

2.1.3.1 Bounding Box Analysis

The distribution of bounding boxes tells a great deal about the proper segmentation of an image consisting of noncursive characters. By testing their adjacency relationships to perform merging, or their size and aspect ratios to trigger splitting mechanisms, much of the segmentation task can be accurately performed at a low cost in computation.
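As a concrete illustration of this kind of low-cost processing, the sketch below merges horizontally overlapping boxes (e.g., a dot over an "i" body) and flags boxes whose aspect ratio suggests several touching characters. The Box class, thresholds, and rules are hypothetical stand-ins, not the rules of any particular system cited here.

    from dataclasses import dataclass

    @dataclass
    class Box:
        x0: int
        y0: int
        x1: int
        y1: int

        def width(self):
            return self.x1 - self.x0 + 1

        def height(self):
            return self.y1 - self.y0 + 1

    def horizontal_overlap(a, b):
        return min(a.x1, b.x1) - max(a.x0, b.x0) + 1

    def merge_fragments(boxes, min_overlap=1):
        """Merge components that overlap horizontally (adjacency test for joining)."""
        boxes = sorted(boxes, key=lambda b: b.x0)
        merged = [boxes[0]]
        for b in boxes[1:]:
            last = merged[-1]
            if horizontal_overlap(last, b) >= min_overlap:
                merged[-1] = Box(min(last.x0, b.x0), min(last.y0, b.y0),
                                 max(last.x1, b.x1), max(last.y1, b.y1))
            else:
                merged.append(b)
        return merged

    def estimate_character_count(boxes, max_aspect=1.2):
        """Width/height test for triggering a splitting mechanism on wide boxes."""
        return [(b, max(1, round(b.width() / (max_aspect * b.height())))) for b in boxes]

    parts = [Box(0, 2, 4, 9), Box(1, 0, 2, 1),       # an 'i'-like body plus its dot
             Box(8, 2, 25, 9)]                       # a wide blob, probably two touching letters
    for box, n in estimate_character_count(merge_fragments(parts)):
        print(box, "estimated characters:", n)       # n > 1 would be sent to a detailed splitter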

Fig. 7. Connected components. This example illustrates characters that consist of two components (e.g., the "u" in "you"), as well as components consisting of more than one character (e.g., the "ry" in "very"). The bounding box of each component is also shown. The latter is often used as a basis for dissection methods, as discussed in the text.

This approach has been applied, for example, in segmenting handwritten postcodes [14] using knowledge of the number of symbols in the code: six for the Canadian codes used in experiments. Connected components were joined or split according to rules based on height and width of their bounding boxes. The rather simple approach correctly classified 93% of 300 test codes, with only 2.7% incorrect segmentation and 4.3% rejection.

Connected components have also served to provide a basis for the segmentation of scanned handwriting into words [74]. Here, it is assumed that words do not touch, but may be fragmented. Thus, the problem is to group fragments (connected components) into word images. Eight different distance measures between components were investigated. The best methods were based on the size of the white run lengths between successive components, with a reduction factor applied to estimated distance if the components had significant horizontal overlap. 90% correct word segmentation was achieved on 1,453 postal address images.

An experimental comparison of character segmentation by projection analysis vs. segmentation by connected components is reported in [87]. Both segmenters were tested on a large database (272,870 handprinted digits) using the same follow-on classifier. Characters separated by connected component segmentation resulted in 97.5% recognition accuracy, while projection analysis (along a line of variable slope) yielded almost twice the errors at 95.3% accuracy. Connected component processing was also carried out four times faster than projection analysis, a somewhat surprising result.

2.1.3.2 Splitting of Connected Components

Analysis of projections or bounding boxes offers an efficient way to segment nontouching characters in hand- or machine-printed data. However, more detailed processing is necessary in order to separate joined characters reliably. The intersection of two characters can give rise to special


image features. Consequently, dissection methods have been developed to detect these features and to use them in splitting a character string image into subimages. Such methods often work as a follow-on to bounding box analysis. Only image components failing certain dimensional tests are subjected to detailed examination.

Another concern is that separation along a straight-line path can be inaccurate when the writing is slanted, or when characters are overlapping. Accurate segmentation calls for an analysis of the shape of the pattern to be split, together with the determination of an appropriate segmentation path (see Fig. 8).

Fig. 8. Dissection based on search and deflect. The initial path of the cut trajectory is along a column corresponding to a minimum in the upper profile of the image. When black is encountered the path is modified recursively, seeking better positions at which to enforce a cut. In this way, multiple cuts can be made at positions which are in the shadow region of the image.

Depending upon the application, certain a priori knowledge may be used in the splitting process. For example, an algorithm may assume that some but not all input characters can be connected. In an application such as form data, the characters may be known to be digits or capital letters, placing a constraint on dimensional variations. Another important case commercially is handwritten zip codes, where connected strings of more than two or three characters are rarely encountered, and the total number of symbols is known.

These different points have been addressed by several authors. One of the earliest studies to use contour analysis for the detection of likely segmentation points was reported in [69]. The algorithm, designed to segment digits, uses local vertical minima encountered in following the bottom contour as "landmark points." Successive minima detected in the same connected component are assumed to belong to different characters, which are to be separated. Consequently, contour following is performed from the leftmost minimum counter-clockwise until a turning point is found. This point is presumed to lie in the region of intersection of the two characters and a cut is performed vertically. An algorithm that not only detects likely segmentation points, but also computes an appropriate segmentation path, as in Fig. 8, was proposed in [76] for segmenting digit strings. The operation comprises two steps. First, a vertical scan is performed in the middle of an image assumed to contain two possibly connected characters. If the number of black to white transitions is 0 or 2, then the digits are either not connected

or else simply connected, respectively, and can therefore be separated easily by means of a vertical cut. If the number of transitions found during this scan exceeds 2, the writing is probably slanted and a special algorithm using a "Hit and Deflect Strategy" is called. This algorithm is able to compute a curved segmentation path by iteratively moving a scanning point. This scanning point starts from the maximum peak in the bottom profile of the lower half of the image. It then moves upwards by means of simple rules which seek to avoid cutting the characters until further movement is impossible. In most cases, only one cut is necessary to separate slanted characters that are simply connected.
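The following is a deliberately simplified sketch of a hit-and-deflect cut, assuming a 0/1 numpy array with 1 denoting ink: the cut point walks upward from a chosen starting column and, on hitting ink, deflects a few columns left or right rather than cutting through. The published algorithms in [76], [51], [52] use richer rules (profile peaks, path-quality criteria); this only conveys the idea of a curved, ink-avoiding path.

    import numpy as np

    def hit_and_deflect_path(img, start_col, max_deflect=3):
        """Walk from the bottom row to the top, sidestepping ink when it is hit."""
        h, w = img.shape
        path, col = [], start_col
        for row in range(h - 1, -1, -1):
            if img[row, col] == 0:
                path.append((row, col))
                continue
            for step in range(1, max_deflect + 1):   # "hit": try to deflect around the stroke
                if col + step < w and img[row, col + step] == 0:
                    col += step
                    break
                if col - step >= 0 and img[row, col - step] == 0:
                    col -= step
                    break
            # if no white cell is found within max_deflect, the cut is forced through ink
            path.append((row, col))
        return path

    # Toy image: a slanted stroke; the path bends around it instead of cutting straight down.
    img = np.zeros((6, 8), dtype=int)
    for r in range(6):
        img[r, 2 + r // 2] = 1
    print(hit_and_deflect_path(img, start_col=4))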

This scheme was refined in a later technique [51], [52], which determines not only "how" to segment characters but also "when" to segment them. Detecting which characters have to be segmented is a difficult task that has not always been addressed. One approach consists in using recognition as a validation of the segmentation phase and resegmenting in case of failure. Such a strategy will be addressed in Section 3. A different approach, based on the concept of prerecognition, is proposed in [51].

The basic idea of the technique is to follow connected component analysis with a simple recognition logic whose role is not to label characters but rather to detect which components are likely to be single, connected, or broken characters. Splitting of an image classified as connected is then accomplished by finding characteristic landmarks of the image that are likely to be segmentation points, rejecting those that appear to be situated within a character, and implementing a suitable cutting path.

The method employs an extension of the Hit and Deflect scheme proposed in [76]. First, the valleys of the upper and lower profiles of the component are detected. Then, several possible segmentation paths are generated. Each path must start from an upper or lower valley. Several heuristic criteria are considered for choosing the "best" path (the path must be "central" enough, paths linking an upper and a lower valley are preferred, etc.).

The complete algorithm works as a closed loop system, all segmentations being proposed and then confirmed or discarded by the prerecognition module: Segmentation can only take place on components identified as connected characters by prerecognition, and segmentations producing broken characters are discarded. The system is able to segment n-tuplets of connected characters which can be multiply connected or even merged. It was first applied to zip code segmentation for the French Postal Service.

In [85], an algorithm was constructed based on a categorization of the vertexes of stroke elements at contact points of touching numerals. Segmentation consists in detecting the most likely contact point among the various vertexes proposed by analysis of the image, and performing a cut similar in concept to that illustrated in Fig. 8.

Methods for defining splitting paths have been examined in a number of other studies as well. The algorithm of [17] performs background analysis to extract the face-up and face-down valleys, strokes, and loop regions of component images. A "marriage score matrix" is then used to decide which pair of valleys is the most appropriate. The


separating path is deduced by combining three lines respectively segmenting the upper valley, the stroke, and the lower valley.

A distance transform is applied to the input image in [3] in order to compute the splitting path. The objective is to find a path that stays as far from character strokes as possible without excessive curvature. This is achieved by employing the distance transform as a cost function, and using complementary heuristics to seek a minimal-cost path. A shortest-path method investigated in [84] produces an "optimum" segmentation path using dynamic programming. The path is computed iteratively by considering successive rows in the image. A one-dimensional cost array contains the accumulated cost of a path emanating from a predetermined starting point at the top of the image to each column of the current row. The costs to reach the following row are then calculated by considering all vertical and diagonal moves that can be performed from one point of the current row to a point of the following row (a specific cost being associated to each type of move). Several tries can be made from different starting points. The selection of the best solution is based on classification confidence (which is obtained using a neural network). Redundant shortest-path calculations are avoided in order to improve segmentation speed.

2.1.3.3 Landmarks

In recognition of cursive writing, it is common to analyze the image of a character string in order to define lower, middle, and upper zones. This permits the ready detection of ascenders and descenders, features that can serve as "landmarks" for segmentation of the image. This technique was applied to on-line recognition in pioneering work by Frischkopf and Harmon [36]. Using an estimate of character width, they dissected the image into patterns centered about the landmarks, and divided remaining image components on width alone. This scheme does not succeed with letters such as "u," "n," "m," which do not contain landmarks. However, the basic method for detecting ascenders and descenders has been adopted by many other researchers in later years.

2.2 Dissection with Contextual Postprocessing: Graphemes

The segmentation obtained by dissection can later be subjected to evaluation based on linguistic context, as shown in [7]. Here, a Markov model is postulated to represent splitting and merging as well as misclassification in a recognition process. The system seeks to correct such errors by minimizing an edit distance between recognition output and words in a given lexicon. Thus, it does not directly evaluate alternative segmentation hypotheses, it merely tries to correct poorly made ones. The approach is influenced by earlier developments in speech recognition. A non-Markovian system reported in [12] uses a spell-checker to correct repeatedly made merge and split errors in a complete text, rather than in single words as above.

An alternative approach still based on dissection is to divide the input image into subimages that are not necessarily individual characters. The dissection is performed at stable image features that may occur within or between

characters, as for example, a sharp downward indentation can occur in the center of an "M" or at the connection of two touching characters. The preliminary shapes, called "graphemes" or "pseudo-characters" (see Fig. 9), are intended to fall into readily identifiable classes. A contextual mapping function from grapheme classes to symbols can then complete the recognition process. In doing so, the mapping function may combine or split grapheme classes, i.e., implement a many-to-one or one-to-many mapping. This amounts to an (implicit) resegmentation of the input. The dissection step of this process is sometimes called "presegmentation" or, when the intent is to leave no composite characters, "over-segmentation."

Fig. 9. Graphemes. The bounding boxes indicate subimages isolated by a cutting algorithm. The algorithm typically cuts "u" or "n" into two pieces as shown, but this is anticipated by the contextual recombination (joining) rules at right. Other (splitting) rules are shown for cases where the cutting algorithm failed to split two characters. These rules are applied in many combinations seeking to transform the grapheme classes obtained during recognition into a valid character string.

The first reported use of this concept was probably in [72], a report on a system for off-line cursive script recognition. In this work, a dissection into graphemes was first performed based on the detection of characteristic areas of the image. The classes recognized by the classifier did not correspond to letters, but to specific shapes that could be reliably segmented (typically combinations of letters, but also portions of letters). Consequently, only 17 nonexclusive classes were considered.

As in [72], the grapheme concept has been applied mainly to cursive script by later researchers. Techniques for dissecting cursive script are based on heuristic rules derived from visual observation. There is no "magic" rule and it is not feasible to segment all handwritten words into perfectly separated characters in the absence of recognition. Thus, word units resulting from segmentation are not only expected to be entire characters, but also parts or combinations of characters (the graphemes). The relationship between characters and graphemes must remain simple enough to allow definition of an efficient post-processing stage. In practice, this means that a single character decomposes into at most two graphemes, and conversely, a single grapheme represents at most a two- or three-character sequence.
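A toy sketch of the contextual mapping stage may clarify how join and split rules of the kind shown in Fig. 9 are applied. The grapheme labels, the rule tables, and the lexicon below are invented for illustration; real systems derive them from the behavior of their particular dissection algorithm.

    # Each grapheme label may stand for several letters (one-to-many), and some
    # adjacent grapheme pairs may jointly stand for a single letter (many-to-one).
    UNARY = {"arc": ["u", "n", "r"], "loop": ["o", "a"], "tall": ["l", "t", "d"]}
    PAIR = {("arc", "arc"): ["u", "n"]}              # two half-arcs left by an over-cut letter

    def interpretations(graphemes):
        """Enumerate letter strings consistent with the join/split rules."""
        if not graphemes:
            return [""]
        results = []
        if len(graphemes) >= 2 and tuple(graphemes[:2]) in PAIR:
            for letter in PAIR[tuple(graphemes[:2])]:
                results += [letter + rest for rest in interpretations(graphemes[2:])]
        for letter in UNARY.get(graphemes[0], []):
            results += [letter + rest for rest in interpretations(graphemes[1:])]
        return results

    def read_word(graphemes, lexicon):
        """Keep only interpretations that are contextually valid (here: present in a lexicon)."""
        return [w for w in interpretations(graphemes) if w in lexicon]

    print(read_word(["tall", "arc", "arc", "loop"], {"tuna", "duo", "dune"}))
    # -> ['tuna', 'duo']: the same grapheme string maps to a 4-letter or a 3-letter reading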

The line segments that form connections between characters in cursive script are known as "ligatures." Thus,


some dissection techniques for script seek "lower ligatures," connections near the baseline that link most lowercase characters. A simple way to locate ligatures is to detect the minima of the upper outline of words. Unfortunately, this method leaves several problems unresolved:

    0 Letters "o," "b," "v," and "w" are usually followed by"upper" ligatures.

    0 Letters "U" and "w" contain "intraletter ligatures,"i.e., a subpart of these letters cannot be differentiatedfrom a ligature in the absence of context.

    0 Artifacts sometimes cause erroneous Segmentation.In typical systems, these problems are treated at a latercontextual stage that jointly treats both segmentation andrecognition errors. Such processing is included in the sys-tem since cursive writing is often ambiguous without thehelp of lexical context. However, the quality of segmenta-tion still remains very much dependent on the effectivenessof the dissection scheme that produces the graphemes.
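As a small illustration of the upper-outline minima heuristic, the sketch below reports columns where the upper profile of a 0/1 numpy word image (1 = ink, row 0 at the top) is locally deepest. Consistent with the limitations just listed, the same test would also fire on the intraletter valley of a "u" or after an "o," which is why a later contextual stage remains necessary. The profile convention and the selection rule are illustrative assumptions.

    import numpy as np

    def upper_profile(img):
        """Row index of the topmost ink pixel in each column (image height if the column is empty)."""
        h, w = img.shape
        tops = np.full(w, h)
        for x in range(w):
            rows = np.flatnonzero(img[:, x])
            if rows.size:
                tops[x] = rows[0]
        return tops

    def ligature_candidates(img):
        """Columns where the upper outline is locally deepest: candidate ligature cuts."""
        tops = upper_profile(img)
        cands = []
        for x in range(1, len(tops) - 1):
            deep = tops[x] >= tops[x - 1] and tops[x] >= tops[x + 1]
            strict = tops[x] > tops[x - 1] or tops[x] > tops[x + 1]
            if deep and strict and tops[x] < img.shape[0]:
                cands.append(x)
        return cands

    # Toy word image: two letter bodies joined by a low connecting stroke around columns 4-6.
    img = np.zeros((6, 11), dtype=int)
    img[1:6, 1:4] = 1
    img[1:6, 7:10] = 1
    img[4, 4:7] = 1
    print(ligature_candidates(img))    # columns at the ends of the connecting stroke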

Dissection techniques based on the principle of detecting ligatures were developed in [22], [61], and [53]. The last study was based on a dual approach:

• The detection of possible presegmentation zones.
• The use of a "prerecognition" algorithm, whose aim was not to recognize characters, but to evaluate whether a subimage defined by the presegmenter was likely to constitute a valid character.

Presegmentation zones were detected by analyzing the upper and lower profiles and open concavities of the words. Tentative segmentation paths were defined in order to separate words into isolated graphemes. These paths were chosen to respect several heuristic rules expressing continuity and connectivity constraints. However, these presegmentations were only validated if they were consistent with the decisions of the prerecognition algorithm. An important property of this method was independence from character slant, so that no special preprocessing was required.

A similar presegmenter was presented in [42]. In this case, analysis of the upper contour, and a set of rules based on contour direction, closure detection, and zone location were used. Upper contour analysis was also used in [47] for a presegmentation algorithm that served as part of the second stage of a hybrid recognition system. The first stage of this system also implemented a form of the hit and deflect strategy previously mentioned.

A technique for segmenting handwritten strings of variable length was described in [27]. It employs upper and lower contour analysis and a splitting technique based on the hit and deflect strategy.

Segmentation can also be based on the detection of minima of the lower contour as in [#]. In this study, presegmentation points were chosen in the neighborhood of these minima and emergency segmentation performed between points that were highly separated. The method requires handwriting to be previously deslanted in order to ensure proper separation.

A recent study which aims to locate "key letters" in cursive words employs background analysis to perform letter segmentation [18]. In this method, segmentation is based on

the detection and analysis of face-up and face-down valleys and open loop regions of the word image.

3 RECOGNITION-BASED SEGMENTATION

Methods considered here also segment words into individual units (which are usually letters). However, the principle of operation is quite different. In principle, no feature-based dissection algorithm is employed. Rather, the image is divided systematically into many overlapping pieces without regard to content. These are classified as part of an attempt to find a coherent segmentation/recognition result. Systems using such a principle perform "recognition-based segmentation": Letter segmentation is a by-product of letter recognition, which may itself be driven by contextual analysis. The main interest of this category of methods is that they bypass the segmentation problem: No complex "dissection" algorithm has to be built and recognition errors are basically due to failures in classification. The approach has also been called "segmentation-free" recognition. The point of view of this paper is that recognition necessarily involves segmentation, explicit or implicit though it be. Thus, the possibly misleading connotations of "segmentation-free" will be avoided in our own terminology.

Conceptually, these methods are derived from a scheme in [48] and [11] for the recognition of machine-printed words. The basic principle is to use a mobile window of variable width to provide sequences of tentative segmentations which are confirmed (or not) by character recognition. Multiple sequences are obtained from the input image by varying the window placement and size. Each sequence is assessed as a whole based on recognition results.

In recognition-based techniques, recognition can be performed by following either a serial or a parallel optimization scheme. In the first case, e.g., [11], recognition is done iteratively in a left-to-right scan of words, searching for a "satisfactory" recognition result. The parallel method [48] proceeds in a more global way. It generates a lattice of all (or many) possible feature-to-letter combinations. The final decision is found by choosing an optimal path through the lattice.

The windowing process can operate directly on the image pixels, or it can be applied in the form of weightings or groupings of positional feature measurements made on the images. Methods employing the former approach are presented in Section 3.1, while the latter class of methods is explored in Section 3.2.

Word level knowledge can be introduced during the recognition process in the form of statistics, or as a lexicon of possible words, or by a combination of these tools. Statistical representation, which has become popular with the use of Hidden Markov Models (HMMs), is discussed in Section 3.2.1.

3.1 Methods that Search the Image

Recognition-based segmentation consists of the following two steps:

1) Generation of segmentation hypotheses (windowing step).
2) Choice of the best hypothesis (verification step).

How these two operations are carried out distinguishes the different systems.
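The overall control structure can be sketched independently of any particular recognizer. The fragment below abstracts the image as a string purely to keep the example short: candidate windows of a few admissible widths are generated (step 1) and scored by a stand-in classifier (step 2), and the jointly best-scoring sequence of windows is returned. The widths, the classify stub, and its score table are hypothetical.

    from functools import lru_cache

    WIDTHS = (2, 3, 4)                     # admissible window widths (illustrative)

    def classify(segment):
        """Stub recognizer: return (label, confidence) for a window. Replace with a real classifier."""
        scores = {"ab": 0.9, "cd": 0.8, "abc": 0.3, "abcd": 0.1, "bcd": 0.4}
        return segment, scores.get(segment, 0.01)

    def best_segmentation(word):
        @lru_cache(maxsize=None)
        def solve(i):
            """Best (total score, labels) for the suffix starting at position i."""
            if i == len(word):
                return 0.0, []
            best_score, best_labels = float("-inf"), None
            for w in WIDTHS:                                    # step 1: windowing
                if i + w <= len(word):
                    label, score = classify(word[i:i + w])      # step 2: verification
                    tail_score, tail = solve(i + w)
                    if tail is not None and score + tail_score > best_score:
                        best_score, best_labels = score + tail_score, [label] + tail
            return best_score, best_labels
        return solve(0)

    print(best_segmentation("abcd"))       # the window pair 'ab' + 'cd' beats the single window 'abcd'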


As easy to state as these principles are, they were a long time in developing. Probably the earliest theoretical and experimental application of the concept is reported by Kovalevsky [48]. The task was recognition of typewritten Cyrillic characters of poor quality. Although character spacing was fixed, Kovalevsky's model assumed that the exact value of pitch and the location of the origin for the print line were known only approximately. He developed a solution under the assumption that segmentation occurred along columns. Correlation with prototype character images was used as a method of classification.

Kovalevsky's model (Fig. 10) assumes that the probability of observing a given version of a prototype character is a spherically symmetric function of the difference between the two images. Then the optimal objective function for segmentation is the sum of the squared distances between segmented images and matching prototypes. The set of segmented images that minimizes this sum is the optimal segmentation. He showed that the problem of finding this solution can be formulated as one of determining the path of maximum length in a graph, and that this path can be found by dynamic programming. This process was implemented in hardware to produce a working OCR system.

Fig. 10. Kovalevsky's method. The lower two curves give the similarities obtained by comparing stored prototypes against a window whose right edge is placed on the input image (shown at top) at the positions indicated by the abscissa. For window size 1 the only prototype is a blank column. For windows of size 5 there are multiple character prototypes, and the highest similarity among these is plotted. The cumulative plot gives the maximum total similarity obtainable from a sequence of windows up to the plotted point. Several good matches are obtainable by windowing parts of two characters. However, such false matches are not part of the optimal sequence of windows, indicated by the circled points of the upper graph.

Kovalevsky's work appears to have been neglected for some time. A number of years later, [11] reported a recursive splitting algorithm for machine-printed characters. This algorithm, also based on prototype matching, systematically tests all combinations of admissible separation boundaries until it either exhausts the set of cutpoints, or else finds an acceptable segmentation (see Fig. 11). An acceptable segmentation is one in which every segmented pattern matches a library prototype within a prespecified distance tolerance.


Fig. 11. Recursive segmentation. The example shows the results of applying windows of decreasing width to the left side of an input image. When the subimage in the window is recognized (in this case by matching a prototype character stored in the system's memory), then the procedure is recursively applied to the residue image. Recognition (and segmentation) is accomplished if a complete series of matching windows is found. In the top three rows, no match is obtained for the residue image, but successful segmentation is finally obtained as shown at the bottom.
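A compact sketch of this recursive scheme is given below, under the simplifying assumptions that each image is just a tuple of per-column features and that matching is a squared distance to stored prototypes under a fixed tolerance; the prototype table, widths, and tolerance are invented for illustration.

    PROTOTYPES = {            # label -> column-feature prototype (toy values)
        "I": (4, 4),
        "L": (6, 2, 2),
        "T": (2, 6, 2),
    }
    TOLERANCE = 2.0

    def match(segment):
        """Closest prototype of the same width, as (label, squared distance), or None."""
        best = None
        for label, proto in PROTOTYPES.items():
            if len(proto) == len(segment):
                d = sum((a - b) ** 2 for a, b in zip(proto, segment))
                if best is None or d < best[1]:
                    best = (label, d)
        return best

    def recursive_segment(columns):
        """Return any complete sequence of acceptably matching windows covering the input."""
        if not columns:
            return []
        max_width = max(len(p) for p in PROTOTYPES.values())
        for width in range(min(len(columns), max_width), 0, -1):
            found = match(columns[:width])
            if found and found[1] <= TOLERANCE:               # window accepted
                rest = recursive_segment(columns[width:])     # recurse on the residue image
                if rest is not None:
                    return [found[0]] + rest
        return None                                           # dead end: the caller backtracks

    print(recursive_segment((2, 6, 2, 4, 4, 6, 2, 2)))        # -> ['T', 'I', 'L']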

A technique combining dynamic programming and neural net recognition was proposed in [10]. This technique, called Shortest Path Segmentation, selects the optimal consistent combination of cuts from a predefined set of windows. Given this set of candidate cuts, all possible legal segments are constructed by combination. A graph whose nodes represent acceptable segments is then created and these nodes are connected when they correspond to compatible neighbors. The paths of this graph represent all the legal segmentations of the word. Each node of the graph is then assigned a distance obtained by the neural net recognizer. The shortest path through the graph thus corresponds to the best recognition and segmentation of the word.

The method of selective attention [30] takes neural networks even further in the handling of segmentation problems. In this approach (Fig. 12), a neural net seeks recognizable patterns in an image input, but is inhibited automatically after recognition in order to ignore the region of the recognized character and search for new character images in neighboring regions.


Fig. 12. Selective attention. (a) An input pattern. (b) The recognizer gradually reinforces pixels that correspond to objects in its template library, and inhibits those that do not, yielding a partial recognition. (c) After a delay, attention is switched to unmatched regions, and another match to the library is found (after Fukushima).

3.2 Methods that Segment a Feature Representation

3.2.1 Hidden Markov Models

A Hidden Markov Model (often abbreviated HMM) models variations in printing or cursive writing as an underlying probabilistic structure which is not directly observable. This structure consists of a set of states plus transition probabilities between states. In addition, the observations that the system makes on an image are represented as random variables whose distribution depends on the state. These observations constitute a sequential feature representation of the input image. The survey [34] provides an introduction to its use in recognition applications.

For the purpose of this survey, three levels of underlying Markov model are distinguished, each calling for a different type of feature representation:

1) The Markov model represents letter-to-letter variations of the language. Typically such a model is based on bigram frequencies (first order model) or trigram frequencies (second order model). The features are gathered on individual characters or graphemes, and segmentation must be done in advance by dissection. Such systems are included in Section 2 above.
2) The Markov model represents state-to-state transitions within a character. These transitions provide a sequence of observations on the character. Features are typically measured in the left-to-right direction. This facilitates the representation of a word as a concatenation of character models. In such a system, segmentation is (implicitly) done in the course of matching the model against a given sequence of feature values gathered from a word image. That is, it decides where one character model leaves off and the next one begins, in the series of features analyzed. Examples of this approach are given in this section.
3) The Markov model represents the state-to-state variations within a specific word belonging to a lexicon of admissible word candidates. This is a holistic model as described in Section 5, and entails neither explicit nor implicit segmentation into characters.

In this section, we are concerned with HMMs of type 2, which model sequences of feature values obtained from individual letters. For example, Fig. 13 shows a sample feature vector produced from the word "cat." This sequence can be segmented into three letters in many different ways, of which two are shown. The probability that a particular segmentation resulted from the word "cat" is the product of the probabilities of segment 1 resulting from "c," segment 2 from "a," etc. The probability of a different lexicon word can likewise be calculated. To choose the most likely word from a set of alternatives the designer of the system may select either the composite model that gives the segmentation having greatest probability, or else that model which maximizes the a posteriori probability of the observations, i.e., the sum over all segmentations. In either case, the optimization is organized to avoid redundant calculations, in the former case by using the well-known Viterbi algorithm.

Fig. 13. Hidden Markov Models. This vastly oversimplified example is intended only to illustrate the principal concepts: (a) Training letters and (b) typical sequences of feature values obtained from (a). (c) The Markov models underlying (b), indicating the state sequence, and showing that certain states may be re-entered. Each state outputs a value from a feature distribution whose mean is indicated in the diagram above. The model for a word is the concatenation of such letter models. (d) A sequence of feature values obtained from a word, indicating several different segmentations. The HMM solution is found by evaluating many possible segmentations, of which two are shown.

Such HMMs are a powerful tool to model the fact that letters do not always have distinct segmentation boundaries. It is clear that in the general case perfect letter dissection cannot be achieved. This problem can be compensated by the HMMs, as they are able to learn by observing letter segmentation behavior on a training set. Context (word and letter frequencies, syntactic rules) can also be included, by defining transition probabilities between letter states.

Elementary HMMs describing letters can be combined to form either several model-discriminant word HMMs or else a single path-discriminant model. In model-discriminant HMMs, one model is constructed for each different word


[32], [15], [75], [4], while in the path-discriminant HMM only one global model is constructed [50]. In the former case, each word model is assessed to determine which is most likely to have produced a given set of observations. In the latter case, word recognition is performed by finding the most likely paths through the unique model, each path being equivalent to a sequence of letters. Path-discriminant HMMs can handle large vocabularies, but are generally less accurate than model-discriminant HMMs. They may incorporate a lexicon comparison module in order to ignore invalid letter sequences obtained by path optimization.

Calculation of the best paths in the HMM model is usually done by means of the Viterbi algorithm. Transition and observed feature probabilities can be learned using the Baum-Welch algorithm. Starting from an initial evaluation, HMM probabilities can be re-estimated using frequencies of observations measured on the training set [33].
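To make the type-2 formulation concrete, the sketch below concatenates toy letter models (each state carries a discrete emission table and a self-loop probability) into a word model and runs a Viterbi pass over an observation sequence. The frame-to-letter alignment it returns is the implicit segmentation discussed above. The models, feature alphabet, and probabilities are invented; a real system would train them, e.g., with Baum-Welch.

    import math

    # Each letter: a list of states; each state = (emission probabilities, self-loop probability).
    LETTER_MODELS = {
        "c": [({"arc": 0.8, "bar": 0.2}, 0.5)],
        "a": [({"arc": 0.6, "loop": 0.4}, 0.5), ({"bar": 0.7, "arc": 0.3}, 0.5)],
        "t": [({"bar": 0.9, "arc": 0.1}, 0.5)],
    }

    def word_model(word):
        """Concatenate letter models, remembering which letter owns each state."""
        states, owner = [], []
        for ch in word:
            for st in LETTER_MODELS[ch]:
                states.append(st)
                owner.append(ch)
        return states, owner

    def viterbi(word, obs):
        states, owner = word_model(word)
        n, neg = len(states), float("-inf")
        score = [[neg] * n for _ in obs]
        back = [[0] * n for _ in obs]
        score[0][0] = math.log(states[0][0].get(obs[0], 1e-6))
        for t in range(1, len(obs)):
            for s in range(n):
                stay = score[t - 1][s] + math.log(states[s][1])
                move = score[t - 1][s - 1] + math.log(1 - states[s - 1][1]) if s else neg
                score[t][s], back[t][s] = max((stay, s), (move, s - 1))
                score[t][s] += math.log(states[s][0].get(obs[t], 1e-6))
        # Backtrack from the final state; the letter owning each frame gives the segmentation.
        s, path = n - 1, []
        for t in range(len(obs) - 1, -1, -1):
            path.append(owner[s])
            s = back[t][s]
        return list(reversed(path)), score[-1][-1]

    obs = ["arc", "arc", "loop", "bar", "bar", "bar"]
    print(viterbi("cat", obs))     # -> (['c', 'c', 'a', 'a', 't', 't'], log-probability)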

First order Markov models are employed in most applications; in [50], an example of a second order HMM is given. Models for cursive script ordinarily assume discrete feature values. However, continuous probability densities may also be used, as in [3].

3.2.2 Non-Markov Approaches

A method stemming from concepts used in machine vision for recognition of occluded objects is reported in [16]. Here, various features and their positions of occurrence are recorded for an image. Each feature contributes an amount of evidence for the existence of one or more characters at the position of occurrence. The positions are quantized into bins such that the evidence for each character indicated in a bin can be summed to give a score for classification. These scores are subjected to contextual processing using a predefined lexicon in order to recognize words. The method is being applied to text printed in a known proportional font.

A method that recognizes word feature graphs is presented in [71]. This system attempts to match subgraphs of features with predefined character prototypes. Different alternatives are represented by a directed network whose nodes correspond to the matched subgraphs. Word recognition is performed by searching for the path that gives the best interpretation of the word features. The characters are detected in the order defined by the matching quality. These can overlap or can be broken or underlined.

This family of recognition-based approaches has more often been aimed at cursive handwriting recognition. Probabilistic relaxation was used in [37] to read off-line handwritten words. The model worked on a hierarchical description of words derived from a skeletal representation. Relaxation was performed on the nodes of a stroke graph and of a letter graph where all possible segmentations were kept. Complexity was progressively reduced by keeping only the most likely solutions. N-gram statistics were also introduced to discard illegal combinations of letters. A major drawback of this technique is that it requires intensive computation.

Tappert employed Elastic Matching to match the drawing of an unknown cursive word with the possible sequences of letter prototypes [80]. As it was an on-line method, the unknown word was represented by means of the angles and y-locations of the strokes joining digitization points. Matching was considered as a path optimization problem in a lattice where the sum of distances between these word features and the sequences of letter prototypes had to be minimized. Dynamic programming was used with a warping function that permitted the process to skip unnecessary features. Digram statistics and segmentation constraints were eventually added to improve performance.
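The lattice optimization can be pictured with the following dynamic-programming sketch; it is a generic DTW-style alignment with an invented angle feature and distance, not a reproduction of Tappert's exact cost and warping rules.

    def elastic_match(word_features, prototype, dist):
        """Dynamic-programming alignment cost between an observed feature sequence
        and a sequence of letter-prototype points (a generic elastic-matching sketch)."""
        n, m = len(word_features), len(prototype)
        INF = float("inf")
        cost = [[INF] * (m + 1) for _ in range(n + 1)]
        cost[0][0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = dist(word_features[i - 1], prototype[j - 1])
                # Allow a match, skipping an unnecessary input feature,
                # or stretching (re-using) a prototype point.
                cost[i][j] = d + min(cost[i - 1][j - 1],   # match
                                     cost[i - 1][j],       # extra input feature
                                     cost[i][j - 1])       # prototype point re-used
        return cost[n][m]

    # Example with made-up angle features (degrees): the recognized word is the
    # prototype sequence giving the minimum alignment cost.
    angles = [10, 80, 85, 90, 15]
    proto = [12, 85, 20]
    print(elastic_match(angles, proto, dist=lambda a, b: abs(a - b)))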

Several authors proposed a Hypothesis Testing and Verification scheme to recognize handprinted [44] or on-line cursive words [5], [67]. For example, in the system proposed in [5], a sequence of structural features (like x- and y-extrema, curvature signs, cusps, crossings, penlifts, and closures) was extracted from the word to generate all the legible sequences of letters. Then, the aspect of the word (which was deduced from ascender and descender detection) was taken into account to choose the best solution(s) among the list of generated words. In [67], words and letters were represented by means of tree dictionaries: Possible words were described by a letter tree (also called a trie) and letters were described by a feature tree. The letters were predicted by finding in the letter tree the paths compatible with the extracted features and were verified by checking their compatibility with the word dictionary.

Hierarchical grouping of on-line features was proposed in [39]. The words were described by means of a hierarchical description where primitive features were progressively grouped into more sophisticated representations. The first level corresponded to the turning points of the drawing, the second level dealt with more sophisticated features called primary shapes, and finally, the third level was a trellis of tentative letters and ligatures. Ambiguities were resolved by contextual analysis using letter quadgrams to reduce the number of possible words and a dictionary lookup to select the valid solution(s).

A different approach uses the concept of regularities and singularities [77]. In this system, a stroke graph representing the word is obtained after skeletonization. The singular parts, which are supposed to convey most of the information, were deduced by eliminating the regular part of the word (the sinusoid-like path joining all cursive ligatures). The most robust features and characters (the anchors) were then detected from a description chain derived from these singular parts, and dynamic matching was used for analyzing the remaining parts.

A top-down directed word verification method called backward matching (see Fig. 14) is proposed in [54]. In cursive word recognition, all letters do not have the same discriminating power, and some of them are easier to recognize. So, in this method, recognition is not performed in a left-to-right scan, but follows a meaningful order which depends on the visual and lexical significance of the letters. Moreover, this order also follows an edge-toward-center movement, as in human vision [82]. Matching between symbolic and physical descriptions can be performed at the letter, feature, and even sub-feature levels. As the system knows in advance what it is searching for, it can make use of high-level contextual knowledge to improve recognition, even at low-level stages. This system is an attempt to provide a general framework allowing efficient cooperation between low-level and high-level recognition processes.


4 MIXED STRATEGIES: OVERSEGMENTING
Two radically different segmentation strategies have been considered to this point. One (Section 2) attempts to choose the correct segmentation points (at least for generating graphemes) by a general analysis of image features. The other strategy (Section 3) is at the opposite extreme. No dissection is carried out. Classification algorithms simply do a form of model-matching against image contents.

[Figure shows the word "Then" analyzed at the word level, the letter level (T t h e n), and the feature level.]
Fig. 14. Backward matching. Recognition is performed by matching an image against various candidate words, the most distinctive letters being matched first. In this example, the algorithm first seeks the letter T in the image, then t, h, and so on. These letters are more informative visually because they contain ascenders, and lexically because they are consonants.

In this section, intermediate approaches, essentially hybrids of the first two, are discussed. This family of methods also uses presegmenting, with requirements that are not as strong as in the grapheme approach. A dissection algorithm is applied to the image, but the intent is to oversegment, i.e., to cut the image in sufficiently many places that the correct segmentation boundaries are included among the cuts made, as in Fig. 15. Once this is assured, the optimal segmentation is defined by a subset of the cuts made. Each subset implies a segmentation hypothesis, and classification is brought to bear to evaluate the different hypotheses and choose the most promising segmentation.

    Fig. 15. Oversegmenting. Note that letters that contain valleys havebeen dissected into multiple parts. However, no merged charactersremain, so that a correct segmentation can be produced by recombi-nation of some of the segments.

The strategy in a simple form is illustrated in [79]. Here, a great deal of effort was expended in analyzing the shapes of pairs of touching digits in the neighborhood of contact, leading to algorithms for determining likely separation boundaries. However, multiple separating points were tested, i.e., the touching character pair was oversegmented. Each candidate segmentation was tested separately by classification, and the split giving the highest recognition confidence was accepted. This approach reduced segmentation errors 100-fold compared with the previously used segmentation technique that did not employ recognition confidence.

Because touching was assumed limited to pairs, the above method could be implemented by splitting a single image along different cutting paths. Thus, each segmentation hypothesis was generated in a single step. When the number of characters in the image to be dissected is not known a priori, or if there are many touching characters, e.g., cursive writing, then it is usual to generate the various hypotheses in two steps. In the first step, a set of likely cutting paths is determined, and the input image is divided into elementary components by separating along each path. In the second step, segmentation hypotheses are generated by forming combinations of the components. All combinations meeting certain acceptability constraints (such as size, position, etc.) are produced and scored by classification confidence. An optimization algorithm, typically implemented on dynamic programming principles and possibly making use of contextual knowledge, does the actual selection. A number of researchers began using this basic approach at about the same time, e.g., [46], [9], [13]. Lexical matching is included in the overall process in [26] and [78].
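The selection step can be sketched as follows; classify() stands in for a real character classifier returning a label and a confidence score, and the size constraint max_span is an invented acceptability rule standing in for the constraints mentioned above.

    def best_grouping(components, classify, max_span=3):
        """Choose the best way to merge consecutive elementary components into characters.

        components: list of component images produced by the oversegmentation cuts
        classify:   stand-in for a real classifier; returns (label, confidence) for the
                    merged image built from components[i:j]
        max_span:   acceptability constraint: at most this many components per character
        """
        n = len(components)
        best = [float("-inf")] * (n + 1)   # best[i]: best total score covering components[:i]
        best[0] = 0.0
        choice = [None] * (n + 1)
        for j in range(1, n + 1):
            for i in range(max(0, j - max_span), j):
                label, conf = classify(components[i:j])
                score = best[i] + conf
                if score > best[j]:
                    best[j], choice[j] = score, (i, label)
        # Trace back the chosen subset of cuts to recover the character string.
        labels, j = [], n
        while j > 0:
            i, label = choice[j]
            labels.append(label)
            j = i
        return "".join(reversed(labels))

Contextual knowledge (e.g., lexical matching) can be folded into the same optimization by adjusting the scores of candidate groupings before the traceback.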

It is also possible to carry out an oversegmenting procedure sequentially by evaluating trial separation boundaries [2]. In this work, a neural net was trained to detect likely cutting columns for machine printed characters using neighborhood characteristics. Using these as a base, the optimization algorithm recursively explored a tree of possible segmentation hypotheses. The left column was fixed at each step, and various right columns were evaluated using recognition confidence. Recursion is used to vary the left column as well, but pruning rules are employed to avoid testing all possible combinations.

5 HOLISTIC STRATEGIES
A holistic process recognizes an entire word as a unit. A major drawback of this class of methods is that their use is usually restricted to a predefined lexicon: Since they do not deal directly with letters but only with words, recognition is necessarily constrained to a specific lexicon of words. This point is especially critical when training on word samples is required: A training stage is thus mandatory to expand or modify the lexicon of possible words. This property makes this kind of method more suitable for applications where the lexicon is statically defined (and not likely to change), like check recognition. They can also be used for on-line recognition on a personal computer (or notepad), the recognition algorithm being then tuned to the writing of a specific user as well as to the particular vocabulary concerned.


Whole word recognition was introduced by Earnest at the beginning of the 1960s [21]. Although it was designed for on-line recognition, his method followed an off-line methodology: Data was gathered by means of a photo-stylus in a binary matrix and no temporal information was used.


Recognition was based on the comparison of a collection of simple features extracted from the whole word against a lexicon of "codes" representing the "theoretical" shape of the possible words. Feature extraction was based on the determination of the middle zone of the words, and ascenders and descenders were found by considering the part of the writing exceeding this zone. The lexicon of possible word codes was obtained by means of a transcoding table describing all the usual ways of writing letters.

This strategy still typifies recent holistic methods. Systems still use middle zone determination to detect the ascenders and descenders. The type of extracted features also remains globally the same (ascenders, descenders, directional strokes, cusps, diacritical marks, etc.). Finally, holistic methods, as illustrated in Fig. 16, usually follow a two-step scheme:

• The first step performs feature extraction.
• The second step performs global recognition by comparing the representation of the unknown word with those of the references stored in the lexicon.

Thus, conceptually, holistic methods use the "classical approach" defined in Section 1, with complete words as the symbols to be recognized. The main advances in recent techniques reside in the way comparison between hypotheses and references is performed. Recent comparison techniques are more flexible and better take into account the dramatic variability of handwriting. These techniques (which were originally introduced for speech recognition) are generally based on Dynamic Programming, with optimization criteria based either on distance measurements or on a probabilistic framework. The first type of method is based on Edit Distance, DP-matching, or similar algorithms, while the second one uses Markov or Hidden Markov Chains.
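As an illustration of the distance-based variant, the sketch below compares holistic feature strings with a plain edit distance; the symbol inventory and lexicon codes are invented, and practical systems weight substitutions and insertions according to feature reliability rather than using unit costs.

    def edit_distance(observed, reference, sub_cost=1.0, indel_cost=1.0):
        """Edit distance between two holistic feature strings, e.g. 'A' for an
        ascender, 'D' for a descender, 'o' for a loop (an invented inventory)."""
        n, m = len(observed), len(reference)
        d = [[0.0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            d[i][0] = i * indel_cost
        for j in range(1, m + 1):
            d[0][j] = j * indel_cost
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                same = 0.0 if observed[i - 1] == reference[j - 1] else sub_cost
                d[i][j] = min(d[i - 1][j - 1] + same,       # substitution or match
                              d[i - 1][j] + indel_cost,     # spurious detected feature
                              d[i][j - 1] + indel_cost)     # missing expected feature
        return d[n][m]

    # The recognized word is the lexicon entry whose holistic code is closest.
    lexicon = {"cat": "oA", "tall": "AoAA"}   # made-up holistic codes
    observed = "oA"
    print(min(lexicon, key=lambda w: edit_distance(observed, lexicon[w])))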

Fig. 16. Holistic recognition. Recognition consists of comparing a lexicon of word descriptions against a sequence of features obtained from an unsegmented word image. The detected features, shown below the images, include loops, oriented strokes, ascenders, and descenders.

Dynamic Programming was employed in [62] and [70] for check and city name recognition. Words were represented by a list of features indicating the presence of ascenders, descenders, directional strokes, and closed loops. The "middle zone" was not delimited by straight lines, but by means of smooth curves following the central part of the word, even if slanted or irregular in size. A relative y-location was associated with every feature, and uncertainty coefficients were introduced to make this representation more tolerant to distortion by avoiding binary decisions. A similar scheme was used in [68] and [56], but with different features. In the first case, features were based on the notion of "guiding points" (which are the intersections of the letters and the median line of the word), whereas in the latter case they are derived from the contours of words.

One of the first systems using Markov Models was developed by Farag in 1979 [25]. In this method, each word is seen as a sequence of oriented strokes which are coded using the Freeman code. The model of representation is a nonstationary Markov Chain of the first or second order. Each word of the lexicon is represented as a list of stochastic transition matrices, and each matrix contains the transition probabilities from the jth stroke to the following one. The recognized word is the reference Wi of the lexicon which maximizes the joint probability P(Z, Wi), where Z is the unknown word.
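The scoring can be pictured with the following simplified sketch; it uses a single stationary first-order transition table per word, whereas Farag's chains are nonstationary (one matrix per stroke position), and all probabilities here are placeholders.

    import math

    def word_log_likelihood(stroke_codes, init_prob, trans_prob):
        """Log-probability of a sequence of Freeman direction codes (0-7) under a
        first-order Markov chain describing one lexicon word. Simplified: the
        original method uses one transition matrix per stroke position."""
        logp = math.log(init_prob.get(stroke_codes[0], 1e-9))
        for prev, cur in zip(stroke_codes, stroke_codes[1:]):
            logp += math.log(trans_prob.get((prev, cur), 1e-9))
        return logp

    def recognize(stroke_codes, word_models):
        # word_models: dict word -> (init_prob, trans_prob); values would be estimated
        # from training samples of each lexicon word.
        return max(word_models,
                   key=lambda w: word_log_likelihood(stroke_codes, *word_models[w]))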

Hidden Markov Models are used in [63] for the recognition of literal digits and in [33] for off-line cheque recognition. Angular representation is used in the first system to represent the features, while structural off-line primitives are used in the second case. Moreover, this second system also implements several Markov models at different recognition stages (word recognition and cheque amount recognition). Context is taken into account via prior probabilities of words and word trigrams.

Another method for the recognition of noisy images of isolated words such as in checks was recently proposed in [35]. In the learning stage, lines are extracted from binary images of words and accumulated in prototypes called "holographs." During the test phase, correlation is used to obtain a distance between an unknown word and each word prototype. Using these distances, each candidate word is represented in the prototype space. Each class is approximated with a Gaussian density inside this space, and these densities are used to calculate the probability that the word belongs to each class. Other simple holistic features (ascenders and descenders, loops, length of the word) are also used in combination with this main method.

In the machine-printed text area, characters are regular so that feature representations are stable, and in a long document repetitions of the most common words occur with predictable frequency. In [45], these characteristics were combined to cluster the ten most common short words with good accuracy, as a precursor to word recognition. It was suggested that identification of the clusters could be done on the basis of unigram and bigram frequencies.

More general applications require a dynamic generation stage of holistic descriptions. This stage converts words from ASCII form to the holistic representation required by the recognition algorithm. Word representation is generated from generic information about letter and ligature representations using a reconstruction model. Word reconstruction is required by applications dealing with a dynamically defined lexicon, for instance the postal application [60], where the list of possible city names is derived from zip code recognition. Another interesting characteristic of this last technique is that it is not used to find the best solution but to filter the lexicon by reducing its size (a different technique being then used to complete recognition). The system was able to achieve 50% size reduction with under 2% error.
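A minimal sketch of such a lexicon-filtering step is given below; the compatibility test (counts of ascenders and descenders within a tolerance) is an invented stand-in for the holistic descriptions actually used in [60], and the candidate profiles would be generated dynamically from the ASCII word list.

    def reduce_lexicon(candidates, observed_profile, tolerance=1):
        """Keep only lexicon words whose holistic profile is compatible with the image.

        candidates:       dict word -> (num_ascenders, num_descenders) built from ASCII
        observed_profile: (num_ascenders, num_descenders) estimated from the word image
        tolerance:        allowed miscount per feature type (illustrative constraint)
        """
        asc, desc = observed_profile
        return [w for w, (a, d) in candidates.items()
                if abs(a - asc) <= tolerance and abs(d - desc) <= tolerance]

    # The reduced lexicon is then passed to a more expensive recognizer.
    print(reduce_lexicon({"Boston": (2, 0), "Paris": (1, 0), "jungle": (1, 2)}, (1, 0)))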

6 CONCLUDING REMARKS
Methods for treating the problem of segmentation in character recognition have developed remarkably in the last decade. A variety of techniques has emerged, influenced by developments in related fields such as speech and online recognition. In this paper, we have proposed an organization of these methods under three basic strategies, with hybrid approaches also identified. It is hoped that this comprehensive discussion will provide insight into the concepts involved, and perhaps provoke further advances in the area.

The difficulty of performing accurate segmentation is determined by the nature of the material to be read and by its quality. Generally, missegmentation rates for unconstrained material increase progressively from machine print to handprint to cursive writing. Thus, simple techniques based on white separations between characters suffice for clean fixed-pitch typewriting. For cursive script from many writers and a large vocabulary, at the other extreme, methods of ever increasing sophistication are being pursued. Current research employs models not only of characters, but also of words, phrases, and even entire documents, and powerful tools such as HMMs, neural nets, and contextual methods are being brought to bear. While we have focused on the segmentation problem, it is clear that segmentation and classification have to be treated in an integrated manner to obtain high reliability in complex cases.

The paper has concentrated on an appreciation of principles and methods. We have not attempted to compare the effectiveness of algorithms, or to discuss the crucial topic of evaluation. In truth, it would be very difficult to assess techniques separate from the systems for which they were developed. We believe that wise use of context and classifier confidence has led to improved accuracies, but there is little experimental data to permit an estimation of the amount of improvement to be ascribed to advanced techniques. Perhaps with the wider availability of standard databases, experimentation will be carried out to shed light on this issue.

We have included a list of references sufficient to provide a more detailed understanding of the approaches described. We apologize to researchers whose important contributions may have been overlooked.

ACKNOWLEDGMENTS
An earlier, abbreviated version of this survey was presented at ICDAR 95 in Montreal, Canada. Prof. George Nagy and Dr. Jianchang Mao read early drafts of the paper and offered critical commentaries that have been of great use in the revision process. However, to the authors falls full responsibility for faults of omission or commission that remain.

Richard G. Casey's research for this paper was performed during his sabbatical at the Ecole Nationale Supérieure des Télécommunications in Paris.

REFERENCES
[1] H.S. Baird, S. Kahan, and T. Pavlidis, "Components of an Omnifont Page Reader," Proc. Eighth Int'l Conf. Pattern Recognition, Paris, pp. 344-348, 1986.
[2] T. Bayer, U. Kressel, and M. Hammelsbeck, "Segmenting Merged Characters," Proc. 11th Int'l Conf. Pattern Recognition, vol. 2, conf. B: Pattern Recognition, Methodology, and Systems, pp. 346-349, 1992.
[3] E.J. Bellegarda, J.R. Bellegarda, D. Nahamoo, and K.S. Nathan, "A Probabilistic Framework for On-line Handwriting Recognition," Pre-Proc. IWFHR III, Buffalo, N.Y., p. 225, May 1993.
[4] S. Bercu and G. Lorette, "On-line Handwritten Word Recognition: An Approach Based on Hidden Markov Models," Pre-Proc. IWFHR III, Buffalo, N.Y., p. 385, May 1993.
[5] M. Berthod and S. Ahyan, "On Line Cursive Script Recognition: A Structural Approach with Learning," Proc. Fifth Int'l Conf. Pattern Recognition, p. 723, 1980.
[6] M. Bokser, "Omnidocument Technologies," Proc. IEEE, vol. 80, no. 7, pp. 1,066-1,078, July 1992.
[7] R. Bozinovic and S.N. Srihari, "String Correction Algorithm for Cursive Script Recognition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 4, no. 6, pp. 655-663, June 1982.
[8] R.M. Bozinovic and S.N. Srihari, "Off-Line Cursive Script Recognition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, no. 1, p. 68, Jan. 1989.
[9] T. Breuel, "Design and Implementation of a System for Recognition of Handwritten Responses on US Census Forms," Proc. IAPR Workshop Document Analysis Systems, Kaiserslautern, Germany, Oct. 1994.

[10] C.J.C. Burges, J.I. Be, and C.R. Nohl, "Recognition of Handwritten Cursive Postal Words Using Neural Networks," Proc. USPS Fifth Advanced Technology Conf., p. A-117, Nov./Dec. 1992.
[11] R.G. Casey and G. Nagy, "Recursive Segmentation and Classification of Composite Patterns," Proc. Sixth Int'l Conf. Pattern Recognition, p. 1,023, 1982.
[12] R.G. Casey, "Text OCR by Solving a Cryptogram," Proc. Eighth Int'l Conf. Pattern Recognition, Paris, pp. 349-351, Oct. 1986.
[13] R.G. Casey, "Segmentation of Touching Characters in Postal Addresses," Proc. Fifth US Postal Service Technology Conf., Washington, D.C., 1992.
[14] M. Cesar and R. Shinghal, "Algorithm for Segmenting Handwritten Postal Codes," Int'l J. Man Machine Studies, vol. 33, no. 1, pp. 63-80, July 1990.
[15] M.Y. Chen and A. Kundu, "An Alternative to Variable Duration HMM in Handwritten Word Recognition," Pre-Proc. IWFHR III, Buffalo, N.Y., p. 82, May 1993.
[16] C. Chen and J. DeCurtins, "Word Recognition in a Segmentation-Free Approach to OCR," Proc. Int'l Conf. Document Analysis and Recognition, Tsukuba City, Japan, pp. 573-576, Oct. 1993.
[17] M. Cheriet, Y.S. Huang, and C.Y. Suen, "Background Region-Based Algorithm for the Segmentation of Connected Digits," Proc. 11th Int'l Conf. Pattern Recognition, vol. 2, p. 619, Sept. 1992.
[18] M. Cheriet, "Reading Cursive Script by Parts," Pre-Proc. IWFHR III, Buffalo, N.Y., p. 403, May 1993.
[19] G. Dimauro, S. Impedovo, and G. Pirlo, "From Character to Cursive Script Recognition: Future Trends in Scientific Research," Proc. 11th Int'l Conf. Pattern Recognition, vol. 2, p. 516, Aug. 1992.
[20] C.E. Dunn and P.S.P. Wang, "Character Segmenting Techniques for Handwritten Text-A Survey," Proc. 11th Int'l Conf. Pattern Recognition, vol. 2, p. 577, Aug. 1992.
[21] L.D. Earnest, "Machine Recognition of Cursive Writing," Information Processing, C. Cherry, ed., pp. 462-466, London: Butterworth, 1962.
[22] R.W. Ehrich and K.J. Koehler, "Experiments in the Contextual Recognition of Cursive Script," IEEE Trans. Computers, vol. 24, no. 2, p. 182, Feb. 1975.
[23] D.G. Elliman and I.T. Lancaster, "A Review of Segmentation and Contextual Analysis Techniques for Text Recognition," Pattern Recognition, vol. 23, no. 3/4, pp. 337-346, 1990.


[24] R.J. Evey, "Use of a Computer to Design Character Recognition Logic," Proc. Eastern Joint Computer Conf., pp. 205-211, 1959.
[25] R.F.H. Farag, "Word-Level Recognition of Cursive Script," IEEE Trans. Computers, vol. 28, no. 2, pp. 172-175, Feb. 1979.

[26] J.T. Favata and S.N. Srihari, "Recognition of General Handwritten Words Using a Hypothesis Generation and Reduction Methodology," Proc. Fifth USPS Advanced Technology Conf., p. 237, Nov./Dec. 1992.
[27] R. Fenrich, "Segmenting of Automatically Located Handwritten Numeric Strings," From Pixels to Features III, S. Impedovo and J.C. Simon, eds., chapter 1, p. 47, Elsevier, 1992.
[28] P.D. Friday and C.G. Leedham, "A Pre-Segmenter for Separating Characters in Unconstrained Hand-Printed Text," Proc. Int'l Conf. Image Proc., Singapore, Sept. 1989.
[29] H. Fujisawa, Y. Nakano, and K. Kurino, "Segmentation Methods for Character Recognition: From Segmentation to Document Structure Analysis," Proc. IEEE, vol. 80, no. 7, pp. 1,079-1,092, July 1992.

[30] K. Fukushima and T. Imagawa, "Recognition and Segmentation of Connected Characters with Selective Attention," Neural Networks, vol. 6, no. 1, pp. 33-41, 1993.
[31] P. Gader, M. Magdi, and J-H. Chiang, "Segmentation-Based Handwritten Word Recognition," Proc. USPS Fifth Advanced Technology Conf., Nov./Dec. 1992.
[32] A.M. Gillies, "Cursive Word Recognition Using Hidden Markov Models," Proc. USPS Fifth Advanced Technology Conf., Nov./Dec. 1992.

[33] M. Gilloux, J.M. Bertille, and M. Leroux, "Recognition of Handwritten Words in a Limited Dynamic Vocabulary," Pre-Proc. IWFHR III, Buffalo, N.Y., p. 417, 1993.
[34] M. Gilloux, "Hidden Markov Models in Handwriting Recognition," Fundamentals in Handwriting Recognition, S. Impedovo, ed., NATO ASI Series F: Computer and Systems Sciences, vol. 124, Springer Verlag, 1994.
[35] N. Gorsky, "Off-line Recognition of Bad Quality Handwritten Words Using Prototypes," Fundamentals in Handwriting Recognition, S. Impedovo, ed., NATO ASI Series F: Computer and Systems Sciences, vol. 124, Springer Verlag, 1994.
[36] L.D. Harmon, "Automatic Recognition of Print and Script," Proc. IEEE, vol. 60, no. 10, pp. 1,165-1,177, Oct. 1972.
[37] K.C. Hayes, "Reading Handwritten Words Using Hierarchical Relaxation," Computer Graphics and Image Processing, vol. 14, pp. 344-364, 1980.

[38] R.B. Hennis, "The IBM 1975 Optical Page Reader: System Design," IBM J. Research and Development, pp. 346-353, Sept. 1968.

[39] C.A. Higgins and R. Whitrow, "On-Line Cursive Script Recognition," Proc. Int'l Conf. Human-Computer Interaction-INTERACT '84, Elsevier, 1985.
[40] W.H. Highleyman, "Data for Character Recognition Studies," IEEE Trans. Electronic Computers, pp. 135-136, Mar. 1963.
[41] T.K. Ho, J.J. Hull, and S.N. Srihari, "A Word Shape Analysis Approach to Recognition of Degraded Word Images," Pattern Recognition Letters, no. 13, p. 821, 1992.
[42] M. Holt, M. Beglou, and S. Datta, "Slant-Independent Letter Segmentation for Off-line Cursive Script Recognition," From Pixels to Features III, S. Impedovo and J.C. Simon, eds., p. 41, Elsevier, 1992.
[43] R.L. Hoffman and J.W. McCullough, "Segmentation Methods for Recognition of Machine-Printed Characters," IBM J. Research and Development, pp. 153-165, Mar. 1971.
[44] J.J. Hull and S.N. Srihari, "A Computational Approach to Visual Word Recognition: Hypothesis Generation and Testing," Proc. Computer Vision and Pattern Recognition, pp. 156-161, June 1986.

[45] J. Hull, S. Khoubyari, and T.K. Ho, "Word Image Matching as a Technique for Degraded Text Recognition," Proc. 12th Int'l Conf. Pattern Recognition, vol. 2, conf. B, pp. 665-668, Sept. 1992.
[46] F. Kimura, S. Tsuruoka, M. Shridhar, and Z. Chen, "Context-Directed Handwritten Word Recognition for Postal Service Applications," Proc. Fifth US Postal Service Technology Conf., Washington, D.C., 1992.
[47] F. Kimura, M. Shridhar, and N. Narasimhamurthi, "Lexicon Directed Segmentation-Recognition Procedure for Unconstrained Handwritten Words," Pre-Proc. IWFHR III, Buffalo, N.Y., p. 122, May 1993.
[48] V.A. Kovalevsky, Character Readers and Pattern Recognition. Washington, D.C.: Spartan Books, 1968.



[49] F. Kuhl, "Classification and Recognition of Hand-Printed Characters," IEEE Nat'l Convention Record, pp. 75-93, Mar. 1963.
[50] A. Kundu, Y. He, and P. Bahl, "Recognition of Handwritten Words: First and Second Order Hidden Markov Model Based Approach," Pattern Recognition, vol. 22, no. 3, p. 283, 1989.

[51] E. Lecolinet and J-V. Moreau, "A New System for Automatic Segmentation and Recognition of Unconstrained Zip Codes," Proc. Sixth Scandinavian Conf. Image Analysis, Oulu, Finland, p. 585, June 1989.
[52] E. Lecolinet, "Segmentation d'images de mots manuscrits," PhD thesis, Université Pierre et Marie Curie, Paris, Mar. 1990.
[53] E. Lecolinet and J-P. Crettez, "A Grapheme-Based Segmentation Technique for Cursive Script Recognition," Proc. Int'l Conf. Document Analysis and Recognition, Saint Malo, France, p. 740, Sept. 1991.
[54] E. Lecolinet, "A New Model for Context-Driven Word Recognition," Proc. Symp. Document Analysis and Information Retrieval, Las Vegas, p. 135, Apr. 1993.
[55] E. Lecolinet and O. Baret, "Cursive Word Recognition: Methods and Strategies," Fundamentals in Handwriting Recognition, S. Impedovo, ed., NATO ASI Series F: Computer and Systems Sciences, vol. 124, pp. 235-263, Springer Verlag, 1994.
[56] M. Leroux, J-C. Salome, and J. Badard, "Recognition of Cursive Script Words in a Small Lexicon," Proc. Int'l Conf. Document Analysis and Recognition, Saint Malo, France, p. 774, Sept. 1991.
[57] S. Liang, M. Ahmadi, and M. Shridhar, "Segmentation of Touching Characters in Printed Document Recognition," Proc. Int'l Conf. Document Analysis and Recognition, Tsukuba City, Japan, pp. 569-572, Oct. 1993.
[58] G. Lorette and Y. Lecourtier, "Is Recognition and Interpretation of Handwritten Text: A Scene Analysis Problem?" Pre-Proc. IWFHR III, Buffalo, N.Y., p. 184, May 1993.
[59] Y. Lu, "On the Segmentation of Touching Characters," Proc. Int'l Conf. Document Analysis and Recognition, Tsukuba, Japan, pp. 440-443, Oct. 1993.

[60] S. Madhvanath and V. Govindaraju, "Holistic Lexicon Reduction," Pre-Proc. IWFHR III, Buffalo, N.Y., p. 71, May 1993.
[61] M. Maier, "Separating Characters in Scripted Documents," Proc. Eighth Int'l Conf. Pattern Recognition, Paris, p. 1,056, 1986.
[62] J-V. Moreau, B. Plessis, O. Bourgeois, and J-L. Plagnaud, "A Postal Check Reading System," Proc. Int'l Conf. Document Analysis and Recognition, Saint Malo, France, p. 758, Sept. 1991.
[63] R. Nag, K.H. Wong, and F. Fallside, "Script Recognition Using Hidden Markov Models," IEEE ICASSP, Tokyo, pp. 2,071-2,074, 1986.

[64] T. Nartker, ISRI 1992 Annual Report, Univ. of Nevada, Las Vegas, 1992.
[65] T. Nartker, ISRI 1993 Annual Report, Univ. of Nevada, Las Vegas, 1993.
[66] K. Ohta, I. Kaneko, Y. Itamoto, and Y. Nishijima, "Character Segmentation of Address Reading/Letter Sorting Machine for the Ministry of Posts and Telecommunications of Japan," NEC Research and Development, vol. 34, no. 2, pp. 248-256, Apr. 1993.
[67] H. Ouladj, G. Lorette, E. Petit, J. Lemoine, and M. Gaudaire, "From Primitives to Letters: A Structural Method to Automatic Cursive Handwriting Recognition," Proc. Sixth Scandinavian Conf. Image Analysis, Finland, p. 593, June 1989.
[68] T. Paquet and Y. Lecourtier, "Handwriting Recognition: Application on Bank Cheques," Proc. Int'l Conf. Document Analysis and Recognition, Saint Malo, France, p. 749, Sept. 1991.
[69] S.K. Parui, B.B. Chaudhuri, and D.D. Majumder, "A Procedure for Recognition of Connected Handwritten Numerals," Int'l J. Systems Science, vol. 13, no. 9, pp. 1,019-1,029, 1982.

[70] B. Plessis, A. Sicsu, L. Heute, E. Lecolinet, O. Debon, and J-V. Moreau, "A Multi-Classifier Strategy for the Recognition of Handwritten Cursive Words," Proc. Int'l Conf. Document Analysis and Recognition, Tsukuba City, Japan, pp. 642-645, Oct. 1993.
[71] J. Rocha and T. Pavlidis, "New Method for Word Recognition Without Segmentation," Proc. SPIE Character Recognition Technologies, vol. 1,906, pp. 74-80, 1993.

[72] K.M. Sayre, "Machine Recognition of Handwritten Words: A Project Report," Pattern Recognition, vol. 5, pp. 213-228, 1973.
[73] J. Schuermann, "Reading Machines," Proc. Sixth Int'l Conf. Pattern Recognition, Munich, Germany, 1982.
[74] G. Seni and E. Cohen, "External Word Segmentation of Off-Line Handwritten Text Lines," Pattern Recognition, vol. 27, no. 1, pp. 41-52, Jan. 1994.


[75] A.W. Senior and F. Fallside, "An Off-line Cursive Script Recognition System Using Recurrent Error Propagation Networks," Pre-Proc. IWFHR III, Buffalo, N.Y., p. 132, May 1993.
[76] M. Shridhar and A. Badreldin, "Recognition of Isolate