[Lecture Notes in Computer Science] Advances in Multimedia Information Processing - PCM 2009 Volume...

P. Muneesawang et al. (Eds.): PCM 2009, LNCS 5879, pp. 826–835, 2009. © Springer-Verlag Berlin Heidelberg 2009

Extraction of 2D Barcode Using Keypoint Selection and Line Detection

Lim Kart Leong and Wang Yue

Institute for Infocomm Research, Agency for Science and Technology Research, 1 Fusionopolis Way, Singapore 138632

{kllim,ywang}@i2r.a-star.edu.sg

Abstract. We present a new method for 2D barcode extraction based on keypoint selection for locating the barcode and line detection for finding its orientation. In the interest point detection stage, we apply SURF for fast and robust keypoint detection. Next, we perform non-linear SVM classification to detect the barcode location. We then apply local thresholding based on the barcode location in the image, followed by a morphological operation to segment the barcode from the background. Finally, we apply edge detection and the Hough transform to extract the barcode content.

Keywords: 2D Barcode Detection, SURF, SVM, Kernel, Edge Detection, Hough Transform.

1 Introduction

In this paper, we propose a new algorithm for decoding a 2D barcode from the background of an image. The barcode is displayed on a TV screen and captured at a distance using a mobile camera phone. To the best of our knowledge, there is currently no work that deals specifically with extracting a 2D barcode automatically from a complex background. We place a few restrictions on the problem. Firstly, we assume that the mobile phone is equipped with a camera of at least 2 megapixels; most current mid-range mobile phones already offer 2 megapixels or more. Secondly, the maximum capture distance is 3 meters from a 29” TV, or a proportional distance for other screen sizes. This follows from our technical requirement of at least 4 pixels of width per barcode block. For our barcode design, we adopted a simplified version of the Data Matrix barcode. Our algorithm is robust to rotation, scale, illumination, viewpoint and affine changes. Our goal in developing the algorithm is to explore the technical feasibility of deployment in commercial or entertainment applications such as advertisement, information sharing and game interaction involving 2D barcodes. Fig. 5 shows an example of a 2D barcode captured against the background of a domestic setting.

2 Our Method

In our barcode extraction method, we first run keypoint feature extraction on the image. Then, we treat localization as a two-class problem of separating barcode keypoint features from background. The keypoint selection approach is advantageous over heuristic or brute force computer vision approaches [4] since the feature or local descriptor is robust to variations such as scale, rotation, illumination, affine and viewpoint changes [5], [7]. Furthermore, by training an SVM [3] model for keypoint selection, the actual computation cost during localization is greatly reduced without compromising accuracy. Once the barcode location is found, there are a few ways to extract the content. One way is to pass the location of the barcode to a template matching method such as in [4] so as to reduce search redundancy across the entire image. Another method, which we present here, is a simple method based on binary segmentation and line detection. Our extraction algorithm can be analyzed in five main stages: (i) Feature Extraction, (ii) Feature Training and Selection, (iii) Local Thresholding, (iv) Binary Segmentation and (v) Edge Transform and Line Detection.

2.1 Feature Extraction

The 2D barcode is a heterogeneous textured object, made up of contrasting dark and bright intensity blocks which create interest points that typical corner or blob detectors can easily pick up. These detected keypoints are useful because they possess good repeatability [5], [6], [7]. Fig. 1 shows keypoint detection on a barcode using different corner and blob detectors. From the figure, CSS [6] is limited to detecting mostly corners, which may not be statistically sufficient for good local representation. SIFT [7] is shown to respond intensively along the edges and corners. SURF [5] is shown to give far fewer keypoints than SIFT while exhibiting a somewhat similar keypoint distribution across the texture.

Fig. 1. Keypoint detection on a 2D barcode image using (a) CSS, (b) SURF and (c) SIFT. SURF is a better choice here because it yields fewer keypoints that are more evenly spread over strategic positions across the texture.

To make sense of the keypoints, a unique local representation, or keypoint descriptor, is computed at each keypoint location. Some common local descriptor methods are normalized gray patches, color and edge histograms, and gradient responses. According to [5], [7], local descriptors that rely on gradient responses, such as SURF and SIFT, are the most robust to image variations. Important criteria for a good local descriptor are robustness to image distortions, fast computation and low dimensionality.

2.2 Feature Training and Selection

Prior to SVM training, we perform min-max normalization to scale the training samples to values between -1 and +1. Linear SVM constructs a discriminant function $y(x) = w^T x + b$ for separating two classes by identifying the individual training samples, or support vectors, that lie along the linear margin separating the two classes, while kernel SVM extends the same principle into the kernel space:

$$y(x) = \sum_{i=1}^{N} \alpha_i d_i K(x_i, x) + b \qquad (1)$$

where $\alpha_i$, $d_i$, $x_i$, $x$, $K(x_i, x)$ and $b$ refer to the Lagrange multiplier, the class label, the training sample of the optimal support vectors, the unknown sample, the kernel function and the bias respectively.
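The min-max normalization step mentioned above can be sketched as follows. This is a minimal illustration, assuming each training sample is a plain list of descriptor values and that the scaling range is learned per dimension from the training set; the function names `fit_min_max` and `scale_to_unit_range` are hypothetical and not taken from the paper.

```python
def fit_min_max(samples):
    """Return the per-dimension minimum and maximum over the training samples."""
    dims = range(len(samples[0]))
    lo = [min(s[k] for s in samples) for k in dims]
    hi = [max(s[k] for s in samples) for k in dims]
    return lo, hi


def scale_to_unit_range(sample, lo, hi):
    """Map each feature linearly so that lo maps to -1 and hi maps to +1."""
    scaled = []
    for value, a, b in zip(sample, lo, hi):
        if b == a:
            # Assumption: a constant feature carries no information, map it to 0.
            scaled.append(0.0)
        else:
            scaled.append(2.0 * (value - a) / (b - a) - 1.0)
    return scaled
```

The same `lo` and `hi` learned from the training samples would then be reused for test descriptors, so that training and test features share one scale.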

The optimal support vectors are obtained by solving an optimization problem of minimizing the number of support vectors for each class under the constraint of maximizing the margin width of the shared class boundary. This problem can be expressed in a simplified form called the dual problem, which only requires solving for $\alpha$:

$$Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j K(x_i, x_j) \qquad (2)$$

We solve for α by using quadprog() in the Matlab Optimization Toolbox. Solving for α also yields the bias b of the hyperplane and identifies the optimal support vectors among the training samples.
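For reference, the dual in (2) is usually stated together with its constraints, which the text leaves implicit. The block below is a sketch assuming the standard soft-margin formulation; the symbol $C$ (the soft-margin parameter) and the index $s$ (any support vector with $0 < \alpha_s < C$) are introduced here for illustration and do not appear in the original notation. Because quadprog() minimizes its objective, $-Q(\alpha)$ would be the function passed to it.

```latex
\max_{\alpha}\; Q(\alpha)
\quad \text{subject to} \quad
\sum_{i=1}^{N} \alpha_i d_i = 0,
\qquad 0 \le \alpha_i \le C, \quad i = 1, \dots, N

% Bias recovered from any support vector x_s with 0 < \alpha_s < C:
b = d_s - \sum_{i=1}^{N} \alpha_i d_i K(x_i, x_s)
```

Training samples with $\alpha_i > 0$ after optimization are the support vectors that enter the sum in (1).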

SVM offers several enhancements, such as the soft margin and non-linear kernels. With a soft margin, the SVM can vary the margin of the boundary to achieve better overall classification than with a hard margin. According to Cover’s theorem, non-linear kernels project the data into a higher dimensional feature space so as to increase the linear separability of the problem. Examples of kernel types are the polynomial, Gaussian radial basis function (RBF) and hyperbolic tangent kernels. The Gaussian RBF kernel is given as:

$$K(x_1, x_2) = \exp\left( -\frac{\| x_1 - x_2 \|^2}{2\sigma^2} \right) \qquad (3)$$

The classification of an unknown sample x is performed by using (1):

$$f(x) = \begin{cases} \text{barcode} & \text{if } \operatorname{sgn}[y(x)] = +1 \\ \text{background} & \text{if } \operatorname{sgn}[y(x)] = -1 \end{cases} \qquad (4)$$

Fig. 5 shows the SVM classification result on test images after training.
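The decision function (1), the Gaussian RBF kernel (3) and the classification rule (4) can be sketched together as below. This is an illustration only: it assumes the support vectors, multipliers and bias have already been obtained from training, that the barcode class carries the label +1, and that a decision value of exactly zero is treated as background. All function names are hypothetical.

```python
import math


def rbf_kernel(x1, x2, sigma):
    """Gaussian RBF kernel of Eq. (3)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_value(x, support_vectors, alphas, labels, bias, sigma):
    """Kernel SVM discriminant y(x) of Eq. (1)."""
    total = sum(alpha * d * rbf_kernel(sv, x, sigma)
                for alpha, d, sv in zip(alphas, labels, support_vectors))
    return total + bias


def classify_keypoint(x, support_vectors, alphas, labels, bias, sigma):
    """Classification rule of Eq. (4), assuming barcode is the +1 class."""
    y = decision_value(x, support_vectors, alphas, labels, bias, sigma)
    return "barcode" if y > 0 else "background"
```

Only the support vectors enter the sum, so the cost of classifying one keypoint grows with the number of support vectors rather than with the size of the training set.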

2.3 Local Thresholding

We use the SVM to predict the barcode keypoints in the image. We then take the vector mean of the coordinates of all detected barcode keypoints to obtain a mean point inside the region of interest, and apply a threshold method such as Otsu’s [8] to the local region around that point so that the threshold adapts to the barcode. The advantage of an adaptive threshold is that it is more robust to non-uniform illumination and is not affected by the overall intensity of the background. From our experiments, we found that for big barcode images, global Otsu thresholding works fairly well since the barcode content dominates the background. For small barcodes, however, the global threshold no longer works. Conversely, a high fixed threshold of 240 out of 255 almost always works for small barcodes but usually fails for big barcodes. This is illustrated in Fig. 2. We do not require a method to identify whether the unknown barcode in an image is big or small, as we rely on the SVM-detected location to perform robust local adaptive thresholding.
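A minimal sketch of this local thresholding step is given below, assuming an 8-bit grayscale image stored as a list of rows and keypoints given as (x, y) pairs. The text does not state how the extent of the local region is chosen, so the square window and its `half_size` parameter are assumptions made purely for illustration, as are the function names.

```python
def otsu_threshold(pixels):
    """Otsu's method: pick the threshold that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    count0 = 0  # pixels with value <= t
    sum0 = 0    # intensity sum of those pixels
    for t in range(256):
        count0 += hist[t]
        sum0 += t * hist[t]
        if count0 == 0:
            continue
        count1 = total - count0
        if count1 == 0:
            break
        mean0 = sum0 / count0
        mean1 = (sum_all - sum0) / count1
        between = count0 * count1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def mean_point(keypoints):
    """Vector mean of the (x, y) coordinates of the barcode keypoints."""
    n = len(keypoints)
    return (sum(x for x, _ in keypoints) / n,
            sum(y for _, y in keypoints) / n)


def local_binarize(image, keypoints, half_size):
    """Binarize a square window centred on the mean barcode keypoint."""
    height, width = len(image), len(image[0])
    cx, cy = mean_point(keypoints)
    cx, cy = int(round(cx)), int(round(cy))
    x0, x1 = max(0, cx - half_size), min(width, cx + half_size + 1)
    y0, y1 = max(0, cy - half_size), min(height, cy + half_size + 1)
    window = [row[x0:x1] for row in image[y0:y1]]
    t = otsu_threshold([p for row in window for p in row])
    binary = [[1 if p > t else 0 for p in row] for row in window]
    return binary, t
```

Because the histogram is built only from the window, a bright or cluttered background elsewhere in the image has no influence on the chosen threshold.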

Fig. 2. Various threshold schemes on big and small barcodes. (a, d) Fixed value threshold. (b, e) Global: Otsu thresholding on the entire image. (c, f) Local: Otsu thresholding on the local region located by the SVM classifier. Only the local method is robust to changes in barcode scale.


2.4 Segmentation

It can be seen from the design of the barcode in Fig. 1 that the primary pattern is enclosed by a white border. Based on that observation, our segmentation strategy, given a binary image, is to apply a binary flood-fill operation [1] that replaces connected black pixels with white while avoiding white objects, which removes all black background surrounding white objects. We also preset a thin border of the binary image to black before performing the flood-fill in order to avoid discontinuity in the flood-fill operation. The result is shown in Fig. 3(c).
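The flood-fill step can be sketched as follows, assuming a binary image stored as a list of rows with 0 for black and 1 for white. The one-pixel border width, the 4-connectivity and the seed at the top-left corner are assumptions made for this illustration; the name `fill_background` is hypothetical.

```python
from collections import deque


def fill_background(binary):
    """Turn every black pixel connected to the image border white."""
    height, width = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    # Preset a thin black border so the fill can travel around the whole image.
    for x in range(width):
        out[0][x] = out[height - 1][x] = 0
    for y in range(height):
        out[y][0] = out[y][width - 1] = 0
    # Breadth-first flood fill over 4-connected black pixels.
    out[0][0] = 1
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and out[ny][nx] == 0:
                out[ny][nx] = 1
                queue.append((nx, ny))
    return out
```

Black pixels enclosed by a closed white contour, such as the barcode cells inside the white border, are never reached by the fill and therefore stay black.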

In Fig. 3(c) there are some minor but visible residues from the earlier flood-fill operation. To clean up the residues, we apply a morphological open operation [1] to remove small objects, defined as connected components of fewer than N pixels, where N is set in advance from the minimum number of pixels occupied by the L-shaped structure of the barcode when the image is taken at the furthest distance. The open operation also removes some of the actual barcode data but essentially preserves the full L-structure. This is advantageous since the burden on the line detection task is further reduced. We can easily recover the missing barcode data by applying the coordinates of the L-structure to the pre-open image. Fig. 3(d) illustrates the result.
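The removal of connected components smaller than N pixels can be sketched as below. This is an illustration under assumptions: components are taken to be 4-connected, the image is a list of rows of 0/1 values, and the `foreground` argument selects which pixel value counts as an object, since the text does not state the polarity used at this stage. The name `remove_small_components` is hypothetical.

```python
from collections import deque


def remove_small_components(binary, min_pixels, foreground=0):
    """Erase 4-connected foreground components with fewer than min_pixels pixels."""
    height, width = len(binary), len(binary[0])
    background = 1 - foreground
    out = [row[:] for row in binary]
    seen = [[False] * width for _ in range(height)]
    for sy in range(height):
        for sx in range(width):
            if seen[sy][sx] or out[sy][sx] != foreground:
                continue
            # Collect one connected component by breadth-first search.
            component = [(sx, sy)]
            seen[sy][sx] = True
            queue = deque(component)
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and out[ny][nx] == foreground):
                        seen[ny][nx] = True
                        component.append((nx, ny))
                        queue.append((nx, ny))
            if len(component) < min_pixels:
                for x, y in component:
                    out[y][x] = background
    return out
```

With `min_pixels` set just below the size of the smallest expected L-structure, isolated residues and small data cells disappear while the L-structure survives.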

Fig. 3. Result of flood-fill operation on binary images. (a) Original binary image. (b) Flood-fill operation after 40% of iterations. (c) Flood-fill operation after 100% of iterations. (d) Image denoising by open operation after flood-filling.


2.5 Edge Transform and Line Detection

By performing a 3x3 local gradient computation on the segmented image (the L-structure data), we obtain both the gradient magnitude and the direction vector $(dx_n, dy_n)$ for each pixel. By eliminating all pixels with low gradient magnitude, we retain only the edge points of the object in the image [2]. We then calculate the Hough parameters $(R_n, D_n)$ for each edge pixel $(x_n, y_n)$, using its gradient direction $(dx_n, dy_n)$ in (5) to get $D_n = \theta_n$, and then $R_n$ from (6).

$$\theta_n = \tan^{-1}(dy_n / dx_n) \qquad (5)$$

$$x_n \cos D_n + y_n \sin D_n = R_n \qquad (6)$$

However, the 3x3 gradient window can only offer accuracy up to a specific angular resolution. To improve the resolution at each edge pixel for better line detection, we cast additional votes within a small tolerance of e.g. $\pm 15^\circ$ around the original gradient direction, using a small step size. Thus, when performing the Hough transform on all the edge points, we use $D_n = \{\theta_n \pm 15^\circ\}$ in (5) and (6).
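Equations (5) and (6), together with the tolerance sweep, can be sketched as below. The text only specifies a 3x3 gradient window, so the Sobel kernel used here is an assumption, as are the 1 degree default step size and the function names. `math.atan2` is used in place of a plain arctangent so that the quadrant is preserved and the case of dx = 0 does not divide by zero.

```python
import math


def sobel_gradient(image, x, y):
    """3x3 Sobel gradient (dx, dy) at an interior pixel of a list-of-rows image."""
    dx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
          - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
    dy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
          - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
    return dx, dy


def hough_votes(x, y, dx, dy, tolerance_deg=15.0, step_deg=1.0):
    """Return (R, D) votes for D swept over theta +/- tolerance, D in degrees."""
    theta = math.degrees(math.atan2(dy, dx))  # Eq. (5)
    steps = int(round(2.0 * tolerance_deg / step_deg))
    votes = []
    for k in range(steps + 1):
        d = theta - tolerance_deg + k * step_deg
        rad = math.radians(d)
        r = x * math.cos(rad) + y * math.sin(rad)  # Eq. (6)
        votes.append((r, d))
    return votes
```

Each edge pixel therefore contributes a short list of (R, D) pairs rather than a single pair, which compensates for the coarse angular resolution of the 3x3 window.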

For the vote accumulation part, instead of using the standard 2D Hough bin, we reduce the memory storage (a large image would otherwise require an accumulator matrix of comparable size, most of it redundant) by working with a 1D Hough bin which we call the line accumulator bin. The purpose is to identify all possible lines (we refer to a line as a set of Hough parameters) using the Hough parameters previously computed for each edge pixel. We first sort on the parameter R and assign an accumulator bin to the first line. We continue adding bins until no more possible lines are found. To reduce the number of bins, we combine lines that differ only slightly, within a tolerance of $R \pm 2^\circ$ and $D \pm 1$. By the same token, we add an accumulator vote to the corresponding bin for each combined line found. To detect the peak line, i.e. the longest line, we simply select the accumulator bin with the highest vote. This is shown in Fig. 4(c).
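The line accumulator can be sketched as follows. This is one possible reading of the description, assuming each vote is an (R, D) pair, that a bin is represented by the first line assigned to it, and that the tolerances are passed as the hypothetical parameters `r_tol` and `d_tol`; the default values are illustrative only.

```python
def accumulate_lines(votes, r_tol=2.0, d_tol=1.0):
    """Group (R, D) votes into line bins; each bin is [R, D, vote_count]."""
    bins = []
    for r, d in sorted(votes):  # sort on R first, then D
        for b in bins:
            if abs(r - b[0]) <= r_tol and abs(d - b[1]) <= d_tol:
                b[2] += 1  # close enough to an existing line: add a vote
                break
        else:
            bins.append([r, d, 1])  # otherwise open a new bin
    return bins


def peak_line(bins):
    """Return the bin with the highest vote count."""
    return max(bins, key=lambda b: b[2])
```

The memory used grows with the number of distinct lines found rather than with the image size, which is the motivation given for replacing the 2D accumulator.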

To find the line perpendicular to the peak line, we look for another peak line that occurs at around $90^\circ \pm 15^\circ$ from the first peak line. The $\pm 15^\circ$ tolerance compensates for cases where the two lines are not exactly perpendicular. After obtaining the peaks corresponding to the two perpendicular lines, we can use the two sets of Hough parameters $(R_n, D_n)$ to solve for the intersection of the lines, which gives the coordinate $(x, y)$ of the first corner point, B, using the following:

$$x = \frac{\dfrac{R_1}{\sin D_1} - \dfrac{R_2}{\sin D_2}}{\cot D_1 - \cot D_2}, \qquad y = \frac{\dfrac{R_2}{\cos D_2} - \dfrac{R_1}{\cos D_1}}{\tan D_2 - \tan D_1} \qquad (7)$$
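The search for the perpendicular peak and the corner computation can be sketched as below. The intersection is written with Cramer's rule, which is algebraically equivalent to (7) after multiplying numerator and denominator by the sines or cosines, but avoids dividing by zero when one line is exactly horizontal or vertical. The bin layout `[R, D, vote_count]` with D in degrees and all function names are assumptions made for this illustration.

```python
import math


def angle_gap(d1, d2):
    """Smallest angle between two line orientations, in the range [0, 90] degrees."""
    gap = abs(d1 - d2) % 180.0
    return min(gap, 180.0 - gap)


def perpendicular_peak(bins, peak, tol_deg=15.0):
    """Highest-voted bin lying within 90 +/- tol_deg degrees of the peak line."""
    candidates = [b for b in bins
                  if abs(angle_gap(b[1], peak[1]) - 90.0) <= tol_deg]
    return max(candidates, key=lambda b: b[2]) if candidates else None


def intersect(r1, d1_deg, r2, d2_deg):
    """Intersection of two lines given as x cos D + y sin D = R."""
    c1, s1 = math.cos(math.radians(d1_deg)), math.sin(math.radians(d1_deg))
    c2, s2 = math.cos(math.radians(d2_deg)), math.sin(math.radians(d2_deg))
    det = c1 * s2 - s1 * c2
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel")
    x = (r1 * s2 - r2 * s1) / det
    y = (r2 * c1 - r1 * c2) / det
    return x, y
```

For example, the lines x = 3 (D = 0, R = 3) and y = 4 (D = 90, R = 4) intersect at the point (3, 4), which would be taken as corner B.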


The two other corner points are found by first scanning the image for all edge pixels that belong to either of the peak lines, then selecting the two edge pixels with the largest offset magnitude from point B. This is shown in Fig. 4(d).

Once we have the exact orientation of the barcode from the 3 corner points and we have prior knowledge about the barcode cell dimension, we can easily sample the local cells to extract the content inside the barcode as shown in Fig. 4(e).
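Sampling the cells from the three corner points can be sketched as follows. This is a simplified illustration that assumes the barcode is related to the image by an affine map (no perspective correction), that corner B and the two other corners mark the outer ends of the L-structure, and that the numbers of cell columns and rows are known in advance. The function name and argument order are hypothetical.

```python
def sample_cells(binary, corner_b, corner_p, corner_q, n_cols, n_rows):
    """Read one value per cell at the cell centres of the grid spanned by B->P and B->Q."""
    bx, by = corner_b
    ux, uy = corner_p[0] - bx, corner_p[1] - by  # direction along the columns
    vx, vy = corner_q[0] - bx, corner_q[1] - by  # direction along the rows
    cells = []
    for j in range(n_rows):
        row = []
        for i in range(n_cols):
            s = (i + 0.5) / n_cols  # fractional position of the cell centre
            t = (j + 0.5) / n_rows
            x = bx + s * ux + t * vx
            y = by + s * uy + t * vy
            row.append(binary[int(y)][int(x)])
        cells.append(row)
    return cells
```

Sampling at cell centres rather than cell corners leaves a margin of half a cell on every side, which tolerates small errors in the detected corner points.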

Fig. 4. Various plots showing the operation of edge transform and line detection. (a) Image after open operation. (b) Computing the image gradient for edge detection. (c) Finding the peak line from the histogram of the line accumulator bin. (d) Identifying the L-shape and corners. (e) Extracting barcode content by local sampling. (f) Final result of the segmentation algorithm.


3 Experiments

We have created a data set of over 1500 images of TV-displayed barcodes, captured against the background of various indoor environments using mobile phone cameras ranging from 1M to 2M pixels. To ensure that the barcode is represented by sufficient pixels when captured from far away, the image resolution is usually quite large, typically above 1M pixels.

A quick overview of our algorithm is as follows:

1. Perform SURF feature extraction
2. Train a kernel SVM classifier
3. Locate barcode with SVM classifier
4. Run local threshold on grayscale image
5. Segment barcode with binary flood-fill operation
6. De-noise with morphological open operation
7. Compute image gradient and edge transform
8. Run Hough transform on edge pixels
9. Detect orientation of barcode L-shape
10. Extract barcode content

Our main focus here is object localization, and there are some slight differences in the way the classifier is applied. Firstly, although we require the classifier to correctly label keypoints within the barcode as the barcode class, we can afford to misclassify some barcode keypoints as background. This is inevitable since the image background is unpredictable and the two-class problem is usually not linearly separable. We simply discard all keypoints detected as background, so long as there is at least one positive keypoint. Secondly, our localization problem cannot afford false positives, i.e. keypoints labeled as barcode that fall outside the barcode texture. To avoid this problem, we take two measures. We prefer a keypoint method that generates fewer but sufficient keypoints in the image, and we require a kernel mapping that provides the best separation between barcode and background.

Since the pattern of the barcode varies less than the background, we allocate more training samples to the background. In our experiment, we randomly select 1000 samples from the background and 500 samples from the barcode for SVM training. The setting of the soft margin is less critical, as a value of 1 suffices for most cases. The parameter σ is important as it determines the bias of the non-linear separation between barcode and background samples. A high value allows more barcode samples to be detected but also increases misclassification of background samples as barcode, while a low value ensures very few false detections of barcode samples at the expense of a lower detection rate for barcode samples. A setting of σ = 0.5 during training was found to work extremely well for both training and test images. From our experiments, we found SURF to be a better choice than SIFT, and the Gaussian RBF kernel to be more suitable than polynomial, sigmoid or other RBF kernels.

One major problem with SVM is the increased memory storage and slow computation of the training process as the number of samples increases. In the literature, a popular approach to this problem is to use a ‘bag of features’ to quantize all possible features into fixed bins, much like a histogram. We do not require this quantization step, since our test results showed that as few as 1500 random keypoints from 4 training images are already sufficient for good barcode localization on most of our test data set, with an accuracy of over 90% on 200 test images. Some results of the proposed algorithm are shown in Fig. 5.

Fig. 5. Kernel SVM detection of barcode keypoints in various indoor settings. Left images: red crosshairs refer to keypoints from the background (discarded) while green circles refer to barcode keypoints. The result shows that the proposed method can accurately identify the location of the barcode by correctly classifying unknown keypoint features extracted from the images. Right images: segmentation result after locating the barcode. The L-shape is used to orient the local sampling of the barcode content.


4 Conclusion

We have presented a new method for 2D barcode extraction from the background by using an object recognition method together with a segmentation method. The learning is performed with a kernel SVM for class separation, which is in turn used to locate the barcode against the background. The segmentation part is performed by a simple and robust series of methods including local thresholding, binary morphology, edge detection and line detection. Using the proposed method, we achieved a good detection rate and good segmentation results for barcode extraction on both training and test images. In future, we intend to investigate the feasibility of the method on various commercial barcode designs as well as porting the algorithm to mobile phone devices.

References

1. Soille, P.: Morphological Image Analysis: Principles and Applications, pp. 173–174. Springer, Heidelberg (1999)

2. O’Gorman, F., Clowes, M.B.: Finding Picture Edges Through Collinearity of Feature Points. In: Internat. Joint Conf. on Artifi. Intel., pp. 543–555 (1973)

3. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. Prentice Hall, Englewood Cliffs (1999)

4. Chin, T.-J., Goh, H., Tan, N.-M.: Exact Integral Images At Generic Angles For 2D Barcode Detection. In: ICPR, pp. 1–4 (2008)

5. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: Speeded Up Robust Features. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. Part I. LNCS, vol. 3951, pp. 404–417. Springer, Heidelberg (2006)

6. He, X.-C., Yung, H.-C.: Corner Detection Based On Global And Local Curvature Properties. Optical Engineering 47(5) (2008)

7. Lowe, D.: Distinctive Image Features From Scale-Invariant Keypoints. International Journal of Computer Vision 60, 91–110 (2004)

8. Otsu, N.: A Threshold Selection Method From Gray-Level Histograms. IEEE Trans. Sys., Man., Cyber. 9, 62–66 (1979)
