SimpleLPR

A Simple License Plate Recognition System for Spanish License Plates

Xavier Girones Sancho

Student in Master’s Degree in Artificial Intelligence, Universitat Rovira i Virgili,

Av. Països Catalans 26, E-43007 Tarragona, Catalonia, Spain

e-mail: [email protected]

Abstract. This paper presents a straightforward method of identifying and recognizing vehicle license plates, which has been successfully used in a “real world” product. The system consists of three main modules: plate area detection, character segmentation and clustering, and plate number recognition. In the plate area detection module, a multiresolution histogram analysis method is proposed. This method implements an effective acceptability filter. Then, character candidates are singled out and arranged in clusters. Finally, each plate character is recognized by template matching and the best overall plate candidate is selected. The system was evaluated on an image set of 120 vehicle pictures and achieved a recognition rate of roughly 90%.

Keywords: Articles, LPR, License Plate Recognition.

1 Introduction

Automatic vehicle license plate recognition (LPR) has been one of the classical subjects in computer vision since its inception. While far from trivial, the intrinsic characteristics of license plates make this problem one of the most accessible in the realm of pattern recognition in natural images. Applications of LPR are numerous: traffic control and monitoring, parking access, and vehicle management. It can also be used to detect violations such as entering restricted areas without clearance, occupying reserved lanes or parking spaces, exceeding speed limits, driving a stolen car, or running a red light.

There are many commercial systems on the market, some of them intended for general purpose use and some optimized for a specific duty. Usually, optimizations take the form of improved recognition accuracy or speed at the expense of constraints on the image to be recognized. Constraints can apply to the position and distance of the camera relative to the license plate, the inclination, the level of illumination, the absence of shadows and so on. There are nearly as many different approaches to LPR as there are systems on the market, since an LPR system typically consists of several functional blocks, and each block can be implemented in many different ways. For instance, a plate area detection stage could be implemented using the Hough transform to find the plate rectangle or, more commonly, by locating the plate texture using invariant moments, Fourier, Gabor, wavelet, histogram or interlacing analysis, to name only a few.

This paper describes a straightforward method for detecting license plates and recognizing their corresponding license plate number (SimpleLPR). The product in which this method was implemented was on a tight schedule, and the entire development cycle took less than two months. The trade-off was that the simplest methods available had to be used, as there was no time to experiment with relatively sophisticated techniques such as wavelet analysis, neural networks or support vector machine classifiers, which would likely have led to improved performance and recognition rates. In spite of this limitation, the system performed surprisingly well (see http://www.warelogic.com).

2 Description

Figure 1 shows a diagram of the architecture of the current solution. It can be regarded as a pipeline of processing blocks, each stage taking the output of its predecessor as input and delivering its own output to its successor. At the functional level, three main processing blocks can be identified.

1. License plate area detection. Acceptability filter. Finds suitable areas in the image based on their histogram distribution.

2. Thresholding, shape segmentation and clustering. Finds an optimal threshold for the candidate plate area by analyzing its histogram information. Segments shapes and finds clusters of aligned character candidates.

3. Character identification by template matching and plate verification. For each shape, finds its correlation value with all members in the predefined set of character templates. This set consists of templates of all characters that can be found in Spanish license plates. Only candidates that can result in valid license plate numbers are considered.

First, either a front or rear image of the vehicle must be acquired. An effort has been made to allow unconstrained images as input. Images can come from a variety of sources: they can be pictures from a digital camera or frames from a video stream. They can be in either RGB or grayscale color space. Neither special optics nor infrared illumination sources are required, in contrast to many LPR systems on the market. The implication is that our system does not rely on crisp images and accurate colors. Hence, it must be able to deal with a certain amount of blurring and graininess in the input pictures. Nevertheless, to achieve an acceptable detection rate a few conditions have to be met. Preferably, the plate should be viewed within ±30 deg of the normal vector of the license plate plane, as the current solution does not provide for affine rectification. In addition, there should be no obstacles occluding the license plate characters and, ideally, character height should be 32 pixels or taller.

[Figure 1: pipeline diagram. Image acquisition → vehicle license plate multiresolution histogram analysis → segmentation and clustering → template matching and verification (per-character matches shown: 4 - 8595, 5 - 8872, 6 - 9560, 2 - 8550, B - 8982, M - 8832, C - 8658) → best candidate selection (4562BMC).]

Fig. 1. Main blocks in the SimpleLPR system

2.1 Plate Area Detection by Histogram Multiresolution Analysis

The main goal of this block is to work as an acceptability filter, namely to discard all areas in the image where there is almost no likelihood of finding a license plate. It is tuned to produce as few false rejections as possible (a low false rejection rate, FRR), at the expense of an increased false acceptance rate (FAR). The plate area detection algorithm takes advantage of the particularities of the histogram distribution [1] of Spanish license plates. The ordinary Spanish license plate features black characters on a white background. Hence, its histogram should display a bimodal distribution, with two well-defined peaks. Figure 2 shows a typical license plate of the kind this system is tailored for, and Figure 3 shows its associated histogram for the lightness channel.

However, a plain search for areas in the image that exhibit a bimodal distribution would not be acceptable in terms of FAR, as there would be too many false accepts. To narrow the search further, SimpleLPR also takes into account relationships amongst neighboring image areas. In particular, the following steps are taken:

Fig. 2. Picture of a license plate

Fig. 3. Histogram of the plate area

1. Split the input image into 24×4 pixel windows and calculate their histogram distributions. If images are in RGB color space, this step is repeated for each channel.

2. Build a dyadic histogram pyramid where each element $H^{L}_{i,j}$ on the new level $L$ is defined by the recurrence formula:

$$H^{L}_{i,j} = H^{L-1}_{2i,2j} + H^{L-1}_{2i,2j+1} + H^{L-1}_{2i+1,2j} + H^{L-1}_{2i+1,2j+1} \tag{1}$$

The process stops just before applying the recurrence formula would leave fewer than 8 elements along either axis. Namely,

$$L_{MAX} = \min\left(\left\lfloor \log_2 \frac{W}{8 \cdot 24} \right\rfloor, \left\lfloor \log_2 \frac{H}{8 \cdot 4} \right\rfloor\right)$$

where $W$ and $H$ are the image width and height respectively.

3. Build a Gaussian pyramid [1] of the input image with the same height as the histogram pyramid. If the image was originally in RGB color space, it is first converted to grayscale.

4. Starting at the coarsest level of the histogram pyramid, for each level of detail and while no plates are found:
   (a) Look for 4-connected groups with more than 4 elements that meet the conditions below:
       i. Intra-element. Each histogram must conform to a bimodal distribution. In addition, if the image is RGB, peak locations on each channel must be the same up to scale.
       ii. Inter-element. The locations of the peaks of all elements in the group must be the same up to a threshold. If the image is RGB, this has to be enforced channel-wise.
   (b) When such a group is found, activate the next stage in the pipeline (Section 2.2), passing along:
       – The location and bounding box of the group
       – Peak intensities for each channel
       – The level of detail (LOD)
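Steps 1 and 2 can be illustrated with a minimal sketch in plain Python. It assumes a grayscale image given as a list of rows of integer intensities; the function names (`max_level`, `window_histograms`, `next_level`, `histogram_pyramid`) are hypothetical and not taken from the SimpleLPR implementation. Clamping the level count at zero for very small images is also an assumption.

```python
import math

WIN_W, WIN_H = 24, 4   # window size from step 1
MIN_ELEMS = 8          # fewest elements allowed along either axis

def max_level(width, height):
    """L_MAX = min(floor(log2(W/(8*24))), floor(log2(H/(8*4)))), clamped at 0 (assumption)."""
    lw = math.floor(math.log2(width / (MIN_ELEMS * WIN_W)))
    lh = math.floor(math.log2(height / (MIN_ELEMS * WIN_H)))
    return max(min(lw, lh), 0)

def window_histograms(gray, bins=256):
    """Level 0: one histogram per 24x4 window; gray holds ints in [0, bins)."""
    rows, cols = len(gray) // WIN_H, len(gray[0]) // WIN_W
    grid = [[[0] * bins for _ in range(cols)] for _ in range(rows)]
    for y in range(rows * WIN_H):
        for x in range(cols * WIN_W):
            grid[y // WIN_H][x // WIN_W][gray[y][x]] += 1
    return grid

def next_level(grid):
    """Eq. (1): each parent histogram is the bin-wise sum of its 2x2 children."""
    rows, cols = len(grid) // 2, len(grid[0]) // 2
    return [[[a + b + c + d
              for a, b, c, d in zip(grid[2 * i][2 * j], grid[2 * i][2 * j + 1],
                                    grid[2 * i + 1][2 * j], grid[2 * i + 1][2 * j + 1])]
             for j in range(cols)]
            for i in range(rows)]

def histogram_pyramid(gray, bins=256):
    """Return the list of levels, from level 0 (finest) up to L_MAX (coarsest)."""
    levels = [window_histograms(gray, bins)]
    for _ in range(max_level(len(gray[0]), len(gray))):
        levels.append(next_level(levels[-1]))
    return levels
```

Because a parent histogram is just the sum of its four children, no pixel is visited more than once: only level 0 touches the image, and every coarser level is derived from the level below it.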

Figure 4 shows the histogram windows for LOD 4 and 5 where the condition stated in 4(a)i is met. In the next step, after the condition in 4(a)ii is enforced, only windows belonging to the license plate area will remain.

The algorithm described above is especially well suited to this task because, aside from plate location candidates, it returns the intensities of the light and dark histogram peaks. As shown in Section 2.2, these values are later used to calculate an optimum threshold for binarization.
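The two conditions in step 4(a) could be sketched as follows for a single channel. The paper does not specify how bimodality is tested, so the criterion in `bimodal_peaks` (strongest bin in each half of the histogram, a minimum separation and a minimum share of pixels per half) is purely an assumption, as are the function names and default thresholds. The grouping compares every element with the group's seed element, which is one possible reading of "the same up to a threshold".

```python
from collections import deque

def bimodal_peaks(hist, min_separation=64, min_share=0.15):
    """Return (dark_peak, light_peak) bin indices, or None if not bimodal.

    Assumed criterion: the strongest bins of the dark and light halves must be
    far apart, and each half must hold a minimum share of the pixels.
    """
    total = sum(hist)
    if total == 0:
        return None
    mid = len(hist) // 2
    dark = max(range(mid), key=hist.__getitem__)
    light = max(range(mid, len(hist)), key=hist.__getitem__)
    dark_share = sum(hist[:mid]) / total
    light_share = sum(hist[mid:]) / total
    if light - dark < min_separation or min(dark_share, light_share) < min_share:
        return None
    return dark, light

def plate_groups(peaks, tol=16, min_size=5):
    """Find 4-connected groups with more than 4 elements and consistent peaks.

    peaks is a grid whose cells are (dark, light) tuples or None.
    Returns a list of groups, each a list of (row, col) positions.
    """
    rows, cols = len(peaks), len(peaks[0])
    seen = [[False] * cols for _ in range(rows)]
    groups = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or peaks[r][c] is None:
                continue
            seed = peaks[r][c]
            seen[r][c] = True
            group, queue = [], deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    cell = peaks[ny][nx]
                    if seen[ny][nx] or cell is None:
                        continue
                    # Inter-element condition: peaks must match the seed's peaks.
                    if all(abs(a - b) <= tol for a, b in zip(cell, seed)):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(group) >= min_size:
                groups.append(group)
    return groups
```

The peak pair of an accepted group is what the next stage would receive as its peak intensities, alongside the bounding box of the group's windows.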

Fig. 4. Areas selected by the multiresolution histogram filter for scale levels 4 and 5. Window sizes are 24×4 and 96×16 respectively

2.2 Thresholding, Shape Segmentation and Clustering

Once an area in the image has been identified as a license plate candidate, the next step is to isolate the characters that make up the plate number. From the previous section we get the candidate bounding box (CBB), the level of detail (LOD) and the peak intensities per channel (PIC). LOD is used to select an image in the Gaussian pyramid that becomes the working image. Since each pyramid level halves both image dimensions, selecting the right working scale results in a 4× performance gain for each level we go up from level 0. The license plate character isolation is accomplished in four steps:

1. Binarization. The PIC values are used to calculate a threshold level that lies between the dark and light peaks. Then, an auxiliary binary image with the same dimensions as the working image is created and its pixel values are set in accordance with the following rule:

$$I_{AUX}(x, y) = \begin{cases} 0 & : I_{WRK}(x, y) < \mathit{Threshold} \\ 1 & : I_{WRK}(x, y) > \mathit{Threshold} \end{cases} \tag{2}$$

2. Shape segmentation. The flood fill [1] algorithm is used on the binarized image to detect shapes that become the character candidates. If a shape is too large or too small, it is pruned.

3. Clustering. A variant of the k-means [2] clustering algorithm is used to locate groups of shapes arranged along a straight line. The rules to decide that a group of shapes forms a cluster are:
   (a) Their centers lie on a straight line. This is determined via principal component analysis [1] (PCA).
   (b) The distance between neighboring items is regular up to a threshold.
   (c) The height of the shapes, measured from the line that connects their centers, is constant up to a threshold.
   (d) There are more than four elements in the cluster.

Figures 5 and 6 illustrate the operation of the clustering algorithm. The shapes in the clusters are joined with a straight line. Bounding boxes are also represented.

Fig. 5. B&W image with four separated text groups

Fig. 6. The four text groups are singled out as a result of the execution of the clustering algorithm on the image in Figure 5

4. Rotation compensation. The template matching algorithm discussed in Section 2.3 requires character candidates and templates to be aligned in order to operate properly. This step performs a rotation transform on the plate candidate that aligns the cluster baseline with the x axis. This is illustrated in Figures 7 and 8.

Fig. 7. Binarized image of the license plate prior to rotation

Fig. 8. License plate after the rotation compensation
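The four steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the SimpleLPR code: the threshold is taken as the midpoint of the two peaks (the paper only says it lies between them), pixels equal to the threshold are treated as dark, flood fill uses 4-connectivity, and the area limits and function names are hypothetical. The k-means variant used for clustering is not reproduced; only the PCA line test from rule 3(a) and the baseline angle needed for step 4 are shown.

```python
import math

def binarize(image, dark_peak, light_peak):
    """Eq. (2), with the threshold assumed to be the midpoint of the peaks."""
    threshold = (dark_peak + light_peak) / 2
    return [[1 if px > threshold else 0 for px in row] for row in image]

def dark_shapes(binary, min_area=2, max_area=10_000):
    """Flood fill over dark (0) pixels; returns one (x0, y0, x1, y1) box per shape."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if binary[y0][x0] != 0 or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            stack, pixels = [(y0, x0)], []
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Prune shapes that are too small or too large.
            if min_area <= len(pixels) <= max_area:
                ys, xs = zip(*pixels)
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

def baseline_angle(centers):
    """PCA on shape centers: principal-axis angle (radians) and the
    minor/major eigenvalue ratio (close to 0 when the centers are collinear)."""
    n = len(centers)
    mx = sum(x for x, _ in centers) / n
    my = sum(y for _, y in centers) / n
    sxx = sum((x - mx) ** 2 for x, _ in centers) / n
    syy = sum((y - my) ** 2 for _, y in centers) / n
    sxy = sum((x - mx) * (y - my) for x, y in centers) / n
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    root = math.hypot(sxx - syy, 2 * sxy)
    major, minor = (sxx + syy + root) / 2, (sxx + syy - root) / 2
    return angle, (minor / major if major > 0 else 0.0)
```

Rotating the plate candidate by the negative of the returned angle aligns the cluster baseline with the x axis, which is what step 4 requires before template matching.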

2.3 Character Identification by Template Matching and Plate Verification

The goal of this block is to turn a list of grayscale raster character candidates into a license plate number (LPN). Candidates are matched with the templates of all possible symbols using cross-correlation [1]:

$$c(x, y) = \sum_{s} \sum_{t} f(s, t)\, w(x + s, y + t) \tag{3}$$
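As a minimal sketch of Eq. (3), the snippet below evaluates the correlation at zero offset, assuming the candidate and every template have already been scaled to the same raster size. The paper does not describe any normalization, so the ±1 pixel encoding used in the example (ink = +1, background = −1), the toy 3×3 glyphs and the function names are all assumptions made for illustration.

```python
def correlation(f, w):
    """Eq. (3) at offset (0, 0): sum of products of two equally sized rasters."""
    return sum(fv * wv
               for f_row, w_row in zip(f, w)
               for fv, wv in zip(f_row, w_row))

def rank_templates(candidate, templates, top_n=3):
    """Return the top_n (symbol, score) pairs, best match first."""
    scores = [(symbol, correlation(candidate, raster))
              for symbol, raster in templates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

# Toy 3x3 templates, ink = +1 and background = -1 (hypothetical data).
TEMPLATES = {
    "I": [[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]],
    "L": [[1, -1, -1], [1, -1, -1], [1, 1, 1]],
    "T": [[1, 1, 1], [-1, 1, -1], [-1, 1, -1]],
}

# A "T" with one noisy pixel in the bottom-right corner.
noisy_t = [[1, 1, 1], [-1, 1, -1], [-1, 1, 1]]
ranking = rank_templates(noisy_t, TEMPLATES, top_n=2)
print(ranking)  # [('T', 7), ('I', 3)]
```

Keeping several ranked symbols per candidate, rather than only the best one, is what allows the later verification stage to recover from near-ties.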

Table 1 shows some correlation values of character candidates with templates computed using (3). Of course, one could compose a plate number by simply keeping the symbols with maximum cross-correlation, but this would be a flawed approach. An effective algorithm should take into account the following considerations:

– There are symbols that are nearly indistinguishable from each other in terms of correlation. For instance 0 and O, 1 and I, 8 and B.

– There can be extra shapes at the front or at the tail of the candidate list that do not correspond to license plate characters. Chances are that random shapes near the license plate with the right size and alignment are included in the candidate list. For example, depending on the environment conditions one could get BT6300AKIII.

As the problem definition involves constraints and weights, a constrained optimization problem solver (COPS) must be employed. SimpleLPR implements an ad hoc COPS that attempts to accomplish the following goals:

– LPN candidates must be valid Spanish license plates.

– Only the 3 topmost symbols with maximum correlation values are considered on each character candidate.

– Maximize the LPN sum of correlation values.

– Maximize the LPN string length. For example, 0123BTV would take precedence over O1238T.

Character candidate    Template symbol and correlation value (best match first)
(image) 1              T 5473    I 4840    Y 4788    1 4768
(image) 2              6 6044    G 5482    8 5456    0 5315
(image) 3              3 5974    S 5199    8 5145    9 4958
(image) 4              0 6111    O 5939    D 5841    G 5497
(image) 5              0 6202    O 6039    D 5983    G 5510
(image) 6              A 4187    4 4156    X 3684    J 3653
(image) 7              K 5528    X 4723    F 4311    R 4057

Table 1. Character candidates along with the 4 topmost similar templates and their correlation values
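The goals listed above can be illustrated with a brute-force sketch. It is not the ad hoc solver used in SimpleLPR, whose internals the paper does not describe. The two plate formats encoded in the regular expressions are simplified assumptions, the names `is_valid_plate` and `best_plate` are hypothetical, and ranking by length first and correlation sum second is one possible reading of the last two goals. The input data are the correlation values from Table 1.

```python
import re
from itertools import product

# Assumed, simplified formats (not specified in the paper):
#   current format:    4 digits + 3 consonants, e.g. 4562BMC
#   provincial format: 1-2 letters + 4 digits + 1-2 letters, e.g. O1238T
CURRENT_FORMAT = re.compile(r"\d{4}[BCDFGHJKLMNPRSTVWXYZ]{3}")
PROVINCIAL_FORMAT = re.compile(r"[A-Z]{1,2}\d{4}[A-Z]{1,2}")

def is_valid_plate(text):
    return bool(CURRENT_FORMAT.fullmatch(text) or PROVINCIAL_FORMAT.fullmatch(text))

def best_plate(candidates, top_n=3, min_len=6, max_len=8):
    """candidates: left-to-right list of [(symbol, correlation), ...], best first.

    Tries every contiguous run of candidates (which drops stray shapes at the
    front or tail) and every combination of their top_n symbols, keeping the
    valid plate with the greatest (length, correlation sum).
    """
    best_key, best_text = None, None
    for start in range(len(candidates)):
        last = min(start + max_len, len(candidates))
        for end in range(start + min_len, last + 1):
            window = [c[:top_n] for c in candidates[start:end]]
            for combo in product(*window):
                text = "".join(symbol for symbol, _ in combo)
                if not is_valid_plate(text):
                    continue
                key = (len(text), sum(score for _, score in combo))
                if best_key is None or key > best_key:
                    best_key, best_text = key, text
    return best_text

# Correlation values from Table 1, one row per character candidate.
TABLE_1 = [
    [("T", 5473), ("I", 4840), ("Y", 4788), ("1", 4768)],
    [("6", 6044), ("G", 5482), ("8", 5456), ("0", 5315)],
    [("3", 5974), ("S", 5199), ("8", 5145), ("9", 4958)],
    [("0", 6111), ("O", 5939), ("D", 5841), ("G", 5497)],
    [("0", 6202), ("O", 6039), ("D", 5983), ("G", 5510)],
    [("A", 4187), ("4", 4156), ("X", 3684), ("J", 3653)],
    [("K", 5528), ("X", 4723), ("F", 4311), ("R", 4057)],
]
print(best_plate(TABLE_1))  # T6300AK
```

Exhaustive enumeration is affordable here because each window holds at most 8 candidates with 3 symbols each; a production solver would more likely prune invalid prefixes early.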

3 Results

The system was initially evaluated on a set of 120 pictures of vehicle license plates. The computer was a 2.4 GHz Pentium IV. Images were acquired using a low-end Creative DC-2320 digital camera. Their sizes ranged from 640×480 pixels to 1600×1200 pixels. The recognition rate was approximately 90%, as 109 pictures were successfully recognized. Typical detection times ranged from 0.1 seconds to 1.2 seconds depending on the image size and the relative size of the plate within the image. The remaining 11 plates, which were incorrectly recognized or not recognized at all, can be classified according to their error types:

– 3 cases where the plate characters merged with the plate borders due to dirt.
– 2 cases where two plate characters were merged due to a scratch.
– 1 case where a plate character had a bump.
– 2 cases where the pictures were taken with more than 30 deg slant.
– 1 case where the license plate characters were shorter than 32 pixels.
– 1 case where there was a hard shadow that rendered the plate impossible to binarize using a single threshold.
– 1 case where the ambient light was blue.

In spite of its limitations, the system has been successfully used to assist the classification of vehicle pictures in applications where the license plate number previously had to be entered manually.

4 Conclusions

A straightforward license plate recognition system has been presented. It performs reasonably well on most license plate images. It is tolerant to scale and rotation transforms, but it does not provide for affine rectification in the case of pictures taken at oblique angles. In addition, its long recognition times make it unsuitable for real time video applications. The system presented here can be considered a corroboration of the 80/20 rule. With relatively little development effort and using the plainest techniques available, it achieves a decent 90% success rate with recognition times of roughly one second or less. However, it is the author’s opinion that bridging the gap to a 99% recognition rate cannot be accomplished without redesigning the system from scratch. In particular, the current character segmentation algorithm relies on connectivity and thresholding, which are inherently weak, and general affine rectification should also be provided. In addition, cross-correlation performs poorly when compared with ANN- or SVM-based classifiers. The author intends to address these issues in future work, which will be targeted at real time video surveillance applications.

References

1. R. C. Gonzalez, R. E. Woods, Digital Image Processing (second edition). Prentice Hall, 2002.

2. S. Theodoridis, K. Koutroumbas, Pattern Recognition (second edition). Elsevier, 2003.