Implementing Rectangle Detection using Windowed Hough Transform

Akhil Singh, Music Engineering, University of Miami

Abstract: This paper implements Jung and Schramm's method of using the Hough transform for rectangle recognition, with a few preprocessing steps and a windowed application of the transform so that the algorithm runs faster. Images are scanned with a sliding Hough transform window, peaks of the transform (which correspond to line segments) are extracted, and rectangles are detected based on certain geometric conditions. The algorithm was effective in identifying rectangles in synthetic images and in a few simple natural images. The method fails for overlapped images; an alternate method was used for such images.

Index Terms: Hough Transform, Rectangle Detection, Image Processing

1 INTRODUCTION

Object detection has many applications, such as detecting vehicle number plates, counting or classifying cells in microbiology, and aerial imaging. In this project, we focus on the detection of rectangles in a given image. The implementation and results of the windowed Hough transform suggested by Jung and Schramm are shown, and a few modifications to the original algorithm are suggested and implemented.

The easiest way to detect rectangles is to tackle the problem at the basic level of lines and edges. Such a method is limited to parallel and orthogonal lines. It fails when the rectangles are rotated, or when they lack sharp edges or intersections.

Different methods have been proposed to tackle this problem. The windowed Hough transform [2] was found to be a very efficient way to detect rectangles. The paper clearly mentions that false detections result when aligned rectangles are close to each other. In this project, we start with an implementation of their idea and modify it to get our desired results.

2 IMPLEMENTATION

2.1 Hough Transform

The Hough transform is a feature extraction technique used in image analysis and digital image processing to find geometrical shapes in an image by a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform [1]. The idea of the transform is that any line in the xy plane can be described as ρ = x cos θ + y sin θ, where ρ is the normal distance and θ is the normal angle of the straight line. The transform of an image is therefore a 2D function C(ρ, θ) that represents the number of edge points satisfying this equation. Fig. 1 shows the graphical representation of the transformation.

Fig. 1: Representation of a straight line in Hough parameters[1]

The local maxima of the two-dimensional function C(ρ, θ) obtained after the transform can be used to detect the straight line segments passing through edge points. These line segments and their intersections help us identify various geometrical structures based on their generic equations. In our case, the geometrical structure is a rectangle.
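The voting procedure can be sketched in a few lines. This is a minimal illustration in plain Python, not the implementation used for the results in this report; the function name `hough_accumulator`, the one-degree angular resolution and the θ range of [-90°, 90°) are assumptions made for the example.

```python
import math

def hough_accumulator(edge_points, rho_max, d_rho=1.0, n_theta=180):
    """Vote each edge point (x, y) into a (rho, theta) accumulator.

    theta covers [-90, 90) degrees in n_theta steps;
    rho covers [-rho_max, rho_max] in steps of d_rho.
    """
    n_rho = int(round(2 * rho_max / d_rho)) + 1
    acc = [[0] * n_theta for _ in range(n_rho)]
    thetas = [-math.pi / 2 + k * math.pi / n_theta for k in range(n_theta)]
    for x, y in edge_points:
        for k, theta in enumerate(thetas):
            rho = x * math.cos(theta) + y * math.sin(theta)
            r = int(round((rho + rho_max) / d_rho))
            if 0 <= r < n_rho:
                acc[r][k] += 1
    return acc, thetas

# Synthetic test: the vertical line x = 5, sampled at 61 points.
# Its normal distance is 5 and its normal angle is 0 degrees.
points = [(5, y) for y in range(-30, 31)]
acc, thetas = hough_accumulator(points, rho_max=40)

# The global maximum of the accumulator gives the line parameters.
votes, r, k = max(
    (acc[r][k], r, k) for r in range(len(acc)) for k in range(len(thetas))
)
rho_peak = r * 1.0 - 40
theta_peak_deg = math.degrees(thetas[k])
print(votes, rho_peak, theta_peak_deg)
```

All 61 points vote for the same cell only at the true line parameters, which is why the peak height equals the number of edge points on the line.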

2.2 Rectangle Patterns in the Hough Space

In this section, we discuss the approach that Jung and Schramm suggested to detect rectangles in Hough space. All the images in this section are taken from their paper. A generic rectangle has 5 degrees of freedom: the two coordinates of the center, the width, the height and the orientation. Computing a 5D accumulator array would be computationally very expensive and time consuming. The paper describes an easier implementation that is computationally cheaper but still effective.

Consider a rectangle with vertices P1 = (x1, y1), P2 = (x2, y2), P3 = (x3, y3) and P4 = (x4, y4), with P1P2 and P3P4 being parallel sides of length a, and P2P3 and P4P1 being parallel sides of length b. Also assume that the origin of the coordinate system is located at the center of the rectangle, as shown in Fig. 2.

Fig. 2: Rectangle centered at the origin of the coordinate system[2]

The image of this rectangle after the application of the Hough transform (in Hough space) is shown in Fig. 3. The four peaks, labeled H1 = (ρ1, θ1), H2 = (ρ2, θ2), H3 = (ρ3, θ3) and H4 = (ρ4, θ4), correspond to the four sides of the rectangle (P2P3, P1P4, P3P4 and P1P2 respectively) from Fig. 2.

Fig. 3: Rectangle centered at the origin of the coordinate system[2]

The paper then lists five geometrical observations that can be made from these peaks.

1) They appear in pairs: H1 and H2 are at θ = α1, and H3 and H4 are at θ = α0.

2) Peaks belonging to the same pair are symmetric about the θ axis, i.e., ρ1 + ρ2 = 0 and ρ3 + ρ4 = 0.

3) The two pairs are separated by 90° along the θ axis, i.e., ∆θ = |α1 − α0| = 90° (adjacent sides of the rectangle are perpendicular).

4) The lengths of the line segments are represented by the heights of the peaks, i.e., C(ρ1, θ1) = C(ρ2, θ2) = b and C(ρ3, θ3) = C(ρ4, θ4) = a.

5) The vertical distances (along the ρ axis) between peaks within each pair are exactly the sides of the rectangle, i.e., ρ1 − ρ2 = a and ρ3 − ρ4 = b.

The above relations are valid only if there is just one rectangle present in the region. With multiple edges or rectangles the results would not be accurate, because any other structure in the image has a global influence on the Hough space. Relations 1, 2 and 3 are more generic and should hold globally.
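The five relations can be checked numerically for an ideal rectangle. The sketch below writes down the expected peaks for a centered rectangle and verifies each relation; the helper name `rectangle_side_peaks` and the tuple layout (ρ, θ in degrees, peak height) are assumptions made for this illustration, and the peak heights are idealized as exact side lengths.

```python
def rectangle_side_peaks(a, b, alpha_deg):
    """Ideal Hough peaks (rho, theta_deg, height) of a centered rectangle.

    The two sides of length b have normal angle alpha_deg and lie at
    distance a / 2 from the center; the two sides of length a have normal
    angle alpha_deg - 90 and lie at distance b / 2 from the center.
    """
    alpha1 = alpha_deg
    alpha0 = alpha_deg - 90.0
    return [
        (a / 2, alpha1, b),    # H1
        (-a / 2, alpha1, b),   # H2
        (b / 2, alpha0, a),    # H3
        (-b / 2, alpha0, a),   # H4
    ]

a, b = 20.0, 10.0
h1, h2, h3, h4 = rectangle_side_peaks(a, b, alpha_deg=30.0)

relations = {
    "1: peaks appear in pairs": h1[1] == h2[1] and h3[1] == h4[1],
    "2: symmetric in rho": h1[0] + h2[0] == 0 and h3[0] + h4[0] == 0,
    "3: pairs 90 degrees apart": abs(h1[1] - h3[1]) == 90.0,
    "4: heights give side lengths": h1[2] == h2[2] == b and h3[2] == h4[2] == a,
    "5: rho gaps give side lengths": h1[0] - h2[0] == a and h3[0] - h4[0] == b,
}
for name, holds in relations.items():
    print(name, holds)
```

In a real accumulator these equalities hold only approximately, which is why the detection step later replaces them with thresholded inequalities.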

2.3 Jung and Schramm’s algorithm

The basic idea of their algorithm is to apply the Hough transform to the image, identify peaks that satisfy the conditions mentioned in the previous section, and detect the rectangle. As mentioned earlier, the conditions fail when there are additional edges in the area. To overcome this limitation, they use a windowed approach.

2.3.1 Windowed Hough Transform

Consider a rectangle centered at (x0, y0). For the algorithm to be able to detect the rectangle, the windowed region must be large enough to contain all the edges of any possible rectangle centered at the given location. On the other hand, it should be as small as possible to avoid edges belonging to other structures.

To tackle this problem, they proposed using a ring with internal diameter Dmin and external diameter Dmax as the search region. These parameters are user defined: Dmin should be equal to the smallest possible side of a rectangle and Dmax should be approximately equal to the largest possible diagonal. They claim that such a choice of parameters ensures that any rectangle in the image will have all of its edges within the search region. The windowed Hough transform implementation is shown in Fig. 4.

Fig. 4: Windowed Hough transform implementation[2]

Fig. 4 shows the edge map of a synthetic image computed with Canny's operator, along with the ring-shaped search region with internal and external diameters Dmin and Dmax.
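A minimal sketch of the ring-shaped search region, assuming the edge map is available as a list of (x, y) pixel coordinates. The function name `ring_window` and the sample values are hypothetical; the outer diameter is chosen slightly larger than the rectangle diagonal to avoid rounding issues at the corners.

```python
def ring_window(edge_points, center, d_min, d_max):
    """Keep the edge points whose distance from `center` lies between
    d_min / 2 and d_max / 2, in coordinates relative to the center."""
    cx, cy = center
    r_in_sq = (d_min / 2) ** 2
    r_out_sq = (d_max / 2) ** 2
    kept = []
    for x, y in edge_points:
        dx, dy = x - cx, y - cy
        if r_in_sq <= dx * dx + dy * dy <= r_out_sq:
            kept.append((dx, dy))
    return kept

# Perimeter of a 20 x 10 axis-aligned rectangle centered at (50, 50).
perimeter = [(x, y) for x in range(40, 61) for y in (45, 55)]
perimeter += [(x, y) for x in (40, 60) for y in range(46, 55)]

# Clutter: one far-away edge pixel and one pixel at the window center.
clutter = [(90, 50), (50, 50)]

# d_min = shortest side, d_max a little above the diagonal (about 22.4).
kept = ring_window(perimeter + clutter, center=(50, 50), d_min=10, d_max=24)
print(len(perimeter), len(kept))
```

Returning coordinates relative to the window center matches the assumption of Section 2.2 that the rectangle is centered at the origin, so the symmetry relations on ρ can be applied directly.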

This approach is very efficient when the rectangles are well separated. When rectangles of different sizes are very close to each other, the window also captures edges of neighboring rectangles. Later in the report, an alternate way to improve this situation is suggested and implemented.

Once Dmin and Dmax are specified, the Hough transform is computed using quantized values of θ and ρ. The discretization steps dθ and dρ are calculated based on the size of the image, as suggested by Furukawa and Shinagawa [3]. Without quantization, θ and ρ could take infinitely many values, which would make the algorithm computationally intractable. In this case, W0 = H0 = Dmax (as the Hough transform is performed on the ring-shaped window shown in Fig. 4), which gives the discretization steps:

$$d\theta = \frac{3\pi}{4 D_{max}}, \qquad d\rho = \frac{3}{4}$$
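As a worked example of these step sizes, the sketch below evaluates them for a hypothetical window of Dmax = 60 pixels and derives the resulting accumulator dimensions. The bin counts assume that θ spans a range of π and ρ spans [-Dmax/2, Dmax/2]; that choice of ranges is an assumption of this example.

```python
import math

def discretization_steps(d_max):
    """Return (d_theta in radians, d_rho in pixels) for a window of size d_max."""
    d_theta = 3 * math.pi / (4 * d_max)
    d_rho = 3 / 4
    return d_theta, d_rho

d_max = 60
d_theta, d_rho = discretization_steps(d_max)

# Accumulator size under the assumed ranges of theta and rho.
n_theta = round(math.pi / d_theta)
n_rho = round(d_max / d_rho) + 1

print(math.degrees(d_theta), d_rho, n_theta, n_rho)
```

Larger windows get a finer angular step, so the number of θ bins grows linearly with Dmax.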

2.3.2 Detecting Peaks

After performing the Hough transform, the next step is to identify the peaks, i.e., to detect the line segments within the search region. Let C(ρ, θ) represent the number of edge points satisfying the line equation of the Hough transform. The easiest way to find peaks would be to set a threshold TC and extract all points above it. This can easily go wrong for noisy images.

An alternative way to detect peaks is to identify and analyze the butterfly patterns that arise in the vicinity of the peaks. In the paper, a simplified version of the butterfly evaluator suggested by Furukawa and Shinagawa was used to enhance the Hough image. The enhanced image is given by:

$$C_{enh}(\rho, \theta) = h\,w\,\frac{C(\rho, \theta)^2}{\int_{-h/2}^{h/2}\int_{-w/2}^{w/2} C(\rho + y, \theta + x)\,dx\,dy}, \tag{2}$$

where h and w are the height and width of the rectangular region used for this enhancement, C(ρ, θ) represents the number of edge points satisfying the line equation of the Hough transform (ρ = x cos θ + y sin θ) and Cenh is the enhanced Hough image. Since ρ and θ are quantized, the integral is computed through a convolution with a rectangular mask.

The local maxima of the enhanced image Cenh(ρ, θ) that satisfy C(ρ, θ) ≥ TC, where TC is the minimum number of edge pixels required, are stored as peaks.
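The enhancement and peak selection can be sketched on a small accumulator stored as a list of lists. This is an illustration under several assumptions: h and w are odd, cells outside the accumulator contribute zero to the neighbourhood sum, local maxima are tested against the 8 neighbouring cells with a strict inequality, and the wrap-around of θ is ignored. The function names are hypothetical.

```python
def enhance(acc, h, w):
    """C_enh = h * w * C^2 / (sum of C over an h-by-w neighbourhood)."""
    n_rho, n_theta = len(acc), len(acc[0])
    out = [[0.0] * n_theta for _ in range(n_rho)]
    for r in range(n_rho):
        for t in range(n_theta):
            total = 0
            for dr in range(-(h // 2), h // 2 + 1):
                for dt in range(-(w // 2), w // 2 + 1):
                    rr, tt = r + dr, t + dt
                    if 0 <= rr < n_rho and 0 <= tt < n_theta:
                        total += acc[rr][tt]
            if total > 0:
                out[r][t] = h * w * acc[r][t] ** 2 / total
    return out

def find_peaks(acc, enh, t_c):
    """Local maxima of enh whose raw vote count is at least t_c."""
    n_rho, n_theta = len(acc), len(acc[0])
    peaks = []
    for r in range(n_rho):
        for t in range(n_theta):
            if acc[r][t] < t_c:
                continue
            neighbours = [
                enh[rr][tt]
                for rr in range(max(0, r - 1), min(n_rho, r + 2))
                for tt in range(max(0, t - 1), min(n_theta, t + 2))
                if (rr, tt) != (r, t)
            ]
            if all(enh[r][t] > v for v in neighbours):
                peaks.append((r, t))
    return peaks

# Toy accumulator: one strong peak with weaker side lobes, plus a noise cell.
acc = [[0] * 5 for _ in range(5)]
acc[2][2] = 10
acc[2][1] = acc[2][3] = 4
acc[0][4] = 3

enh = enhance(acc, h=3, w=3)
peaks = find_peaks(acc, enh, t_c=5)
print(enh[2][2], peaks)
```

The isolated noise cell receives a fairly high enhanced value because nothing surrounds it, which is why the threshold on the raw count C is still needed alongside the local-maximum test.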

An image showing the Hough transform of the test region with peaks is shown in Fig. 5. We can observe six peaks: four from the rectangle bounded by the test region and two from edges of neighboring objects.

Fig. 5: Hough transform of the test region with detected peaks[2]

2.3.3 Detecting Rectangles

If H1 = (ρ1, θ1), H2 = (ρ2, θ2), ..., Hm = (ρm, θm) are the m peaks detected in the specified region, we need to find the four peaks that satisfy the conditions specified in Section 2.2.

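Pairs of peaks Hi and Hj are tested against the following conditions:

$$\Delta\theta = |\theta_i - \theta_j| < T_\theta,$$

$$\Delta\rho = |\rho_i + \rho_j| < T_\rho,$$

$$|C(\rho_i, \theta_i) - C(\rho_j, \theta_j)| < T_L\,\frac{C(\rho_i, \theta_i) + C(\rho_j, \theta_j)}{2}.$$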

Tθ is the angular threshold: it determines whether the peaks correspond to parallel lines, which implies θi ≈ θj. Tρ is the distance threshold: it determines whether the lines are symmetric with respect to the θ axis, which implies ρi ≈ −ρj. TL is the normalized length threshold: it determines whether the lines have approximately the same length (C(ρi, θi) ≈ C(ρj, θj)).
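The three pairing conditions translate directly into a predicate on two peaks. In this sketch a peak is a tuple (ρ, θ in degrees, height C); the function name and the threshold values used in the example are illustrative assumptions, not values taken from this report. Peaks near θ = ±90°, where the angle wraps around, are not handled.

```python
def is_parallel_symmetric_pair(p, q, t_theta, t_rho, t_l):
    """Test the parallelism, symmetry and equal-length conditions
    for two peaks given as (rho, theta_deg, height)."""
    rho_i, theta_i, c_i = p
    rho_j, theta_j, c_j = q
    d_theta = abs(theta_i - theta_j)   # parallel lines: theta_i ~ theta_j
    d_rho = abs(rho_i + rho_j)         # symmetric lines: rho_i ~ -rho_j
    d_len = abs(c_i - c_j)             # similar lengths
    return (
        d_theta < t_theta
        and d_rho < t_rho
        and d_len < t_l * (c_i + c_j) / 2
    )

thresholds = dict(t_theta=3.0, t_rho=3.0, t_l=0.4)  # illustrative values

h1 = (10.2, 30.5, 40)
h2 = (-9.8, 29.5, 42)   # opposite side of the same rectangle
h5 = (9.8, 29.5, 42)    # parallel line on the same side of the center

print(is_parallel_symmetric_pair(h1, h2, **thresholds))
print(is_parallel_symmetric_pair(h1, h5, **thresholds))
```

The second call fails only the symmetry test, which shows that parallelism alone is not enough to pair two peaks as opposite sides of a rectangle centered in the window.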

To further extend the detection pattern, they specify that each pair of peaks Hi and Hj that satisfies all the conditions in this section produces an extended peak

$$P_k = (\pm\xi_k, \alpha_k),$$

where

$$\alpha_k = \frac{1}{2}(\theta_i + \theta_j), \qquad \xi_k = \frac{1}{2}|\rho_i - \rho_j|.$$

The final step is to compare all pairs of extended peaks Pk and Pl and keep those that correspond to orthogonal pairs of parallel lines. Two such extended peaks contain information about all four peaks that potentially form a rectangle. A rectangle is detected if:

$$\Delta\alpha = \big|\,|\alpha_k - \alpha_l| - 90^\circ\,\big| < T_\alpha$$

where Tα is an angular threshold that determines whether the pairs of lines represented by Pk and Pl are orthogonal.

When a rectangle is detected, the intersections of the two pairs of parallel lines give the vertices, αk gives the orientation, and 2ξk and 2ξl give the lengths of the sides.
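A sketch of this last step, assuming peaks are (ρ, θ in degrees) tuples expressed relative to the window center. Each vertex is the intersection of one line from each pair, obtained by solving the two line equations x cos α + y sin α = ±ξ with Cramer's rule. The function names and the threshold value are hypothetical.

```python
import math

def extended_peak(p, q):
    """Merge a parallel pair of (rho, theta_deg) peaks into (xi, alpha_deg)."""
    return abs(p[0] - q[0]) / 2, (p[1] + q[1]) / 2

def is_rectangle(pk, pl, t_alpha):
    """Orthogonality test between two extended peaks."""
    return abs(abs(pk[1] - pl[1]) - 90.0) < t_alpha

def rectangle_vertices(pk, pl, center=(0.0, 0.0)):
    """Intersect the lines (+-xi_k, alpha_k) with the lines (+-xi_l, alpha_l)."""
    (xi_k, alpha_k), (xi_l, alpha_l) = pk, pl
    ak, al = math.radians(alpha_k), math.radians(alpha_l)
    det = math.cos(ak) * math.sin(al) - math.sin(ak) * math.cos(al)
    vertices = []
    # Sign order chosen so that consecutive vertices share a side.
    for s_k, s_l in [(1, 1), (1, -1), (-1, -1), (-1, 1)]:
        r1, r2 = s_k * xi_k, s_l * xi_l
        x = (r1 * math.sin(al) - r2 * math.sin(ak)) / det
        y = (r2 * math.cos(ak) - r1 * math.cos(al)) / det
        vertices.append((x + center[0], y + center[1]))
    return vertices

# Peaks of an axis-aligned rectangle: two vertical and two horizontal sides.
pk = extended_peak((10.0, 0.0), (-10.0, 0.0))
pl = extended_peak((5.0, -90.0), (-5.0, -90.0))

if is_rectangle(pk, pl, t_alpha=3.0):
    vertices = rectangle_vertices(pk, pl)
    sides = (2 * pk[0], 2 * pl[0])
    print(vertices, sides)
```

The `center` argument shifts the vertices back to image coordinates when the window is not located at the image origin.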

2.3.4 Removing Duplicated Rectangles

Having multiple thresholds (Tθ, Tρ, Tα and TL) can lead to the detection of duplicated rectangles for neighboring centers. When a sliding window is used to scan the image, several neighboring window centers can lie close to the center of the same object. The Hough transform is computed for the window at each of these centers, so the same rectangle is detected several times with slightly different orientations. This happens because of the small margin of error allowed by the thresholds.

One way around this problem would be to set tighter (lower) thresholds. However, this may cause actual rectangles to be missed.

A more efficient way is to calculate an error measure for each detected rectangle and to choose the rectangle for which the error is smallest. The paper suggests that there are five error measures related to each rectangle: parallelism errors (∆θk, ∆θl), distance errors (∆ρk, ∆ρl) and an orthogonality error (∆α). The proposed error measure is given by:

$$E(P_k, P_l) = \sqrt{a\left(\Delta\theta_k^2 + \Delta\theta_l^2 + \Delta\alpha^2\right) + b\left(\Delta\rho_k^2 + \Delta\rho_l^2\right)}$$

where a and b are weights for the angular and distance errors, respectively. The angular errors ∆θ and ∆α are measured in degrees and the distance error ∆ρ in pixels. Visually, a difference of one pixel is more significant than a difference of one degree. To compensate for this, the weight for the distance error should be larger (typical values are a = 1, b = 4). The result can be seen in Fig. 6.
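The error measure and the selection of the best duplicate can be sketched as follows. Angular errors are taken in degrees and distance errors in pixels; the two candidate detections and their error values are made up for the illustration.

```python
import math

def rectangle_error(d_theta_k, d_theta_l, d_alpha, d_rho_k, d_rho_l, a=1.0, b=4.0):
    """Weighted error E(P_k, P_l) of one detected rectangle."""
    angular = d_theta_k ** 2 + d_theta_l ** 2 + d_alpha ** 2
    distance = d_rho_k ** 2 + d_rho_l ** 2
    return math.sqrt(a * angular + b * distance)

# Hypothetical duplicates of one rectangle, found from neighboring centers.
candidates = {
    "window at (50, 50)": rectangle_error(0.5, 0.5, 1.0, 0.2, 0.1),
    "window at (51, 50)": rectangle_error(0.5, 0.5, 1.0, 1.0, 0.9),
}

# Keep only the detection with the smallest error.
best = min(candidates, key=candidates.get)
print(best, round(candidates[best], 3))
```

Both candidates have the same angular errors, so the choice is decided entirely by the distance terms: the window whose center is closest to the true rectangle center yields the most symmetric ρ values.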

Fig. 6: Results before and after error correction[2]

3 RESULTS

3.1 Edge Detection

Several edge detection techniques (Canny, Prewitt, LoG, Sobel, zero-cross and Roberts) were tested, and Canny and Prewitt were found to detect edges most precisely. A low-pass filter was used to smooth the image, the image was then dilated, and edges were detected for this particular image (Fig. 8) using the Canny edge detection method.

Fig. 7: Original Image

Fig. 8: Edges

3.2 Hough Transform implementation and rectangle detection

In this section, the Hough transform with the sliding window was applied, and the results are shown in Figures 9, 10 and 11.

As the images below show, there is a lot of error. The algorithm initially detects the regions correctly. In Fig. 10, a few rectangles are detected correctly on the left and right sides of the image, but many spurious squares are also detected because the edges are close together and multiple peaks are recognized in the process.

The results for the real-life images are shown in Figures 12, 13 and 14. The noise in the image was first reduced and the same algorithm was applied. Many other components were detected along with the rectangles, so there is still a lot of scope for improvement. Results for class 3 images are shown in Figures 15, 16 and 17; detection on these images was very poor.

Fig. 9: Potential regions in the image which can be rectangles, overlaid on the original image (binary)

Fig. 10: All the rectangles detected without error correction

Fig. 11: Rectangles detected after error correction

4 ADDITIONAL WORK AND IDEAS

The algorithm proposed by Jung and Schramm is efficient only when the rectangles in the image are sufficiently far apart. It fails for closely aligned rectangles. An example of this failure is discussed in the original paper itself.

The ring window has several disadvantages. Observe Fig. 18. Dmin and Dmax are the side of the smallest rectangle and the diagonal of the largest rectangle, respectively.

Fig. 18: Test Image

Here, the ring window is shown as the gray area. We observe that two other rectangles are also partly covered by the window in this case. The algorithm fails in such cases because the additional peaks detected satisfy the rectangle conditions more than once. This results in wrongly detected rectangles that have about the same orientation.

A solution to this problem would be to use an elliptical window instead of a ring and to calculate the Dmin and Dmax of each object in the image individually. This might be computationally expensive, but it is a more effective solution. A modified windowed transform is described below.

4.1 Modified Windowed Transform algorithm

4.1.1 Edge detection

Instead of using the Canny, Prewitt, Roberts, etc. techniques to detect edges, we can simply dilate the image, adjust the intensity, remove noise, convert the image to binary and perform a few morphological operations to identify clean edges. All of these edge detectors lose a lot of information if the threshold is not chosen properly. Results of my edge detection algorithm are shown in Figures 19, 20 and 21.
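The exact sequence of morphological operations is not spelled out above, so the sketch below shows only one common possibility, stated as an assumption: once the image is binary, the object boundary can be extracted as the set of foreground pixels removed by a 3x3 erosion. The function names are hypothetical and the image is a plain list of 0/1 rows.

```python
def erode(mask):
    """3x3 binary erosion; pixels on the image border become background."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(
                mask[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ))
    return out

def boundary(mask):
    """Foreground pixels that do not survive erosion, i.e. the object outline."""
    eroded = erode(mask)
    return [
        [int(m and not e) for m, e in zip(mask_row, eroded_row)]
        for mask_row, eroded_row in zip(mask, eroded)
    ]

# 7 x 7 binary image containing a filled 5 x 5 square.
mask = [[int(1 <= x <= 5 and 1 <= y <= 5) for x in range(7)] for y in range(7)]
edges = boundary(mask)

for row in edges:
    print("".join("#" if v else "." for v in row))
```

Because the outline comes from the binary mask itself, no gradient threshold is involved at this stage; the threshold sensitivity moves to the earlier binarization step.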

Fig. 19: Test Image

Fig. 20: Test Image

Fig. 21: Test Image

4.1.2 Windowed Hough Transform

Instead of using the ring-shaped window, where we manually input the Dmax and Dmin of the largest and smallest possible rectangles in the image, we can identify all the objects in an image and calculate the longest and shortest side of each object individually at run time using Dijkstra's algorithm. Also, to further reduce the error of the ring-shaped window, we can choose an elliptical window and compute the Hough transform of that.

This is not difficult, considering that a circle is a special case of an ellipse. The equation of an ellipse is:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \tag{3}$$

If a = b = r, the ellipse becomes a circle with the equation x² + y² = r².

Let us assume that our rectangle sides are Rw and Rh. We want an ellipse with the same proportions as the rectangle, so the ratio a/b of the ellipse must equal the ratio Rw/Rh of the rectangle sides. This gives us another equation:

$$\frac{a}{b} = \frac{R_w}{R_h} \tag{4}$$

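Solving (3) and (4): from (4), we have

$$a = b\,\frac{R_w}{R_h}.$$

Requiring the ellipse to pass through the rectangle corner (Rw/2, Rh/2) and substituting into (3) gives

$$\left(\frac{R_h}{2b}\right)^2 + \left(\frac{R_h}{2b}\right)^2 = 1, \qquad R_h = \sqrt{2}\,b.$$

Solving for both a and b, we have

$$a = \frac{R_w}{\sqrt{2}}, \qquad b = \frac{R_h}{\sqrt{2}}.$$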

Here is an example solution shown in Fig 22.

This window eliminates the detection of many undesired additional edges by making the search area smaller. Also, calculating Rw and Rh at run time rather than manually supplying Dmin and Dmax values greatly improves the efficiency of the algorithm.
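The elliptical window can be sketched as a point-in-ellipse test using the semi-axes a = Rw/√2 and b = Rh/√2. This is only an illustration for an axis-aligned rectangle centered at the origin; the function names and the sample rectangle are assumptions, and a rotated rectangle would require rotating the coordinates first.

```python
import math

def ellipse_axes(r_w, r_h):
    """Semi-axes of the ellipse through the corners of an r_w-by-r_h
    rectangle, with the same aspect ratio as the rectangle."""
    return r_w / math.sqrt(2), r_h / math.sqrt(2)

def inside_ellipse(x, y, a, b):
    """True if (x, y), relative to the ellipse center, lies in the window."""
    return (x / a) ** 2 + (y / b) ** 2 <= 1.0

r_w, r_h = 20.0, 10.0
a, b = ellipse_axes(r_w, r_h)

# The corner of the rectangle lies on the ellipse, up to rounding.
corner_value = (r_w / 2 / a) ** 2 + (r_h / 2 / b) ** 2

# Side midpoints fall inside the window; a pixel just above the corner does not.
print(a, b, corner_value)
print(inside_ellipse(10.0, 0.0, a, b), inside_ellipse(0.0, 5.0, a, b))
print(inside_ellipse(10.0, 5.5, a, b))
```

Since the ellipse is convex and passes through all four corners, every edge pixel of the rectangle is inside the window, while the window area outside the rectangle is smaller than that of the circumscribed circle used by the ring.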

The modified algorithm was implemented to a certain extent but has not been completed due to time constraints. Overall, with a little modification to Jung and Schramm's algorithm we can produce an algorithm for rectangle detection that might be computationally expensive but is very effective.

Fig. 22: Ellipse bounded rectangle

4.2 Drawbacks

A major drawback is that these algorithms do not account for overlapped rectangles. They can only detect individual rectangles that are spaced apart. Future work would be to implement an algorithm that can detect rectangles in overlapped images as well.

REFERENCES

[1] Wikipedia, Hough Transform.

[2] C. R. Jung and R. Schramm. Rectangle detection based on a windowed Hough transform. In SIBGRAPI '04: Proc. of the Computer Graphics and Image Processing, XVII Brazilian Symposium, pages 113-120, 2004.

[3] Y. Furukawa and Y. Shinagawa. Accurate and robust line segment extraction by analyzing distribution around peaks in Hough space. Computer Vision and Image Understanding, 92(1):1-25, October 2003.
