
The 19th Korea-Japan Joint Workshop on Frontiers of Computer Vision

Intensity Comparison Based Compact Descriptor for Mobile Visual Search

Sang-il Na, Keun-dong Lee, Seung-jae Lee, Sung-kwan Je, and Weon-geun Oh

Creative Content Research Lab., ETRI, Daejeon, Rep. of Korea
{sina, ..., seungjlee, skj, owg}@etri.re.kr

Abstract—In this paper we propose an intensity comparison based compact descriptor for mobile visual search. For practical mobile applications, low complexity and small descriptor size are preferable, and many algorithms such as SURF, CHoG, and PCA-SIFT have been proposed. However, these approaches focus not on the feature description itself but on the extraction time and the size of the feature. This paper suggests a feature description method based on simple intensity comparison that considers both descriptor size and extraction speed. Experimental results show that the proposed method has performance comparable to SURF with similar complexity and a roughly 20 times smaller size.

Keywords—feature descriptor; image matching

I. INTRODUCTION

Feature-based object recognition has received much more attention since the Scale Invariant Feature Transform (SIFT) was proposed, and its applications and performance have been reported in the literature [1]. After SIFT, many modifications were proposed, and efficient search structures and performance comparisons followed [2]-[13].

With the popularity of mobile networks and devices, compactness and low extraction complexity have become central considerations in feature descriptor design. For mobile applications, the feature descriptor algorithm should satisfy the following properties:

• Robustness: the visual descriptor is robust against different lighting conditions and partial occlusion by moving objects, e.g., pedestrians and cars.
• Discriminability: if two image patches belong to different parts of an object, their feature descriptors should be significantly different.
• Fast extraction: computing power is limited on a mobile device, so the algorithm must have low complexity.
• Compactness: when local features are sent over a network, system latency can be reduced by sending fewer bits; when the DB is stored on the device, more images can be stored by using a compact descriptor.

Figure 1 shows the scenarios for using a feature descriptor on a mobile device: (a) shows the server-side solution; in (b) extraction is done on the device and the local features are sent over the network; and (c) shows the on-device scenario [14].

This research was supported by the ICT Standardization program of MKE (The Ministry of Knowledge Economy).

978-1-4673-5621-3/13/$31.00 ©2013 IEEE

Fig. 1. Mobile visual search pipelines: (a) everything is done on the server; (b) the descriptor is extracted on the device and matching is done on the server; (c) everything is done on the mobile device.

In cases (b) and (c), extraction speed and compactness are the main concerns. In previous research, SURF achieves low complexity, but its descriptor size is too large for use on mobile devices. PCA-SIFT and CHoG have compact descriptor sizes, but these methods need post-processing after raw descriptor extraction [15]-[16]. In this paper, we present a local descriptor based on comparison. The proposed descriptor uses average intensity value comparisons that are robust to various conditions.


This paper is organized as follows. Section II explains the proposed feature descriptor. Section III describes the experimental conditions and results, and concludes the paper.

II. PROPOSED METHOD

This section describes the proposed comparison-based compact descriptor extraction method.

    Fig. 2. The proposed descriptor overview

A. Assumptions

Generally, the relative ordering of luminance values in a local region is maintained after modifications in the pixel domain. The modification of a pixel value can be approximated by equation (1):

    I_t(x, y) = a · I(x', y') + β    (1)

where I_t(x, y) is the modified pixel value at location (x, y), (x', y') is the corresponding location of (x, y) in the original image, a is the contrast change, and β is the brightness change.

In local feature extraction, geometric modifications such as scaling, rotation, and translation are compensated by the feature detector. Therefore, we design the feature descriptor without considering geometric modification; for this reason, (x, y) and (x', y') are the same. If the modification is linear, equation (1) can be approximated by the following equations:

    I_t(x, y) = a · I(x, y) + β    (2)

    I_t(x, y) = Σ_{i=-M/2..M/2} Σ_{j=-M/2..M/2} a_ij · I(x+i, y+j)    (3)

These equations represent brightness changes, contrast changes, and convolutional filtering, where a and β are the same as in equation (1), M is the filter size, and a_ij is a magnitude factor.

The above equations mean that the relationships between different positions do not change even if such modifications occur in the local region. However, some modifications, such as noise addition, do not follow this; in that case the block average is used and the rule is maintained.

In this paper, we propose a descriptor based on the above assumptions. The proposed method builds a binary descriptor by local intensity comparison. To increase discriminability, different types of comparison patterns are built. The detailed descriptor extraction is described as follows.

B. Descriptor Extraction

Fig. 2 shows the proposed feature descriptor extraction flow. The goal of a feature descriptor is to robustly capture salient information from a canonical image patch. The image patch is the local region extracted by the feature point extractor. When the feature point extractor extracts the region, the region is normalized to its scale and main orientation.

Fig. 3. The sub-block division
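The invariance that these equations rely on can be checked numerically. The following sketch (our illustration, not code from the paper) applies a linear photometric change a·I + β with a > 0 to a random patch and verifies that the ordering of block averages, and hence the comparison bit, is unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy 8x8 "image patch" with two sub-blocks (left half, right half).
patch = rng.uniform(0, 255, size=(8, 8))

def block_means(p):
    # Average intensity of the left and right halves of the patch.
    return p[:, :4].mean(), p[:, 4:].mean()

# Linear photometric modification as in equation (2): contrast a > 0, brightness beta.
a, beta = 1.7, 12.0
modified = a * patch + beta

m1, m2 = block_means(patch)
n1, n2 = block_means(modified)

# The comparison bit (which block is brighter) is unchanged,
# because a*m + beta preserves ordering whenever a > 0.
assert (m1 > m2) == (n1 > n2)
```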


The image patch is divided into 4x4 sub-blocks, and comparison patterns are generated as shown in Fig. 3. The comparison patterns use three sub-block values: the average, and the x- and y-direction difference values inside each sub-block. The x-direction difference is the difference between the left-half and right-half pixel values; the y-direction difference is calculated in the same way using the bottom-half and top-half pixel values.

We convert these values into a binary descriptor using comparisons. Each comparison is binarized independently, so if one block is corrupted by noise, it influences just 1 bit of the feature. The procedure for generating the bits is as follows:

• X-direction and y-direction differences: if the value is positive, the bit is set to 1, otherwise to 0. These bits capture local characteristics of the image patch.
• The comparison pairs make use of point symmetry about the patch center; for example, block 0's value is compared with block 15's value. For these comparisons, the average and the x- and y-direction differences are used.
• We also compare large regions, e.g., the sum of the left-side sub-blocks against the sum of the right-side sub-blocks.

Using the above method, we generate a 99-bit descriptor for each image patch.

C. Descriptor Matching

The Hamming distance is used to measure the similarity between a reference descriptor and a query descriptor; its value shows how many bits differ from the reference.
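A minimal Hamming-distance computation over such binary descriptors might look like:

```python
import numpy as np

def hamming(d1, d2):
    """Number of differing bits between two binary descriptors."""
    return int(np.count_nonzero(d1 != d2))

a = np.array([1, 0, 1, 1, 0], dtype=np.uint8)
b = np.array([1, 1, 1, 0, 0], dtype=np.uint8)
print(hamming(a, b))  # → 2
```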


The number of bit errors between descriptors extracted from different local regions follows a binomial distribution B(n, p), where n is the number of bits extracted and p is the probability that a 0 or 1 bit is extracted. If n is sufficiently large, the binomial distribution can be approximated by a normal distribution; its mean is np and its standard deviation is √(np(1−p)). From this it can be deduced that the bit error rate (BER) has a normal distribution with mean μ = p and standard deviation σ = √(p(1−p)/n). For the approximated normal distribution N(μ, σ), the false alarm rate P_FA for a BER threshold T is given in (4) [17].

    P_FA = (1/2) · erfc((μ − T) / (√2 · σ))    (4)

To decide whether two descriptors match, we use two criteria. First, if the distance value is lower than a predefined threshold, we assume the descriptors are matched; the threshold is set very strictly, so that P_FA is lower than 10^−12. The second criterion is the usual way of deciding matches on local features: the nearest neighbor distance ratio (NNDR) is a popular method for finding matching points, and we use it as well.
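Assuming p = 0.5 (a natural choice for bits of non-matching binary descriptors, though not stated explicitly here) and n = 99 bits, the strict BER threshold keeping P_FA below 10^−12 can be found numerically from equation (4):

```python
import math

n = 99          # descriptor length in bits (from the paper)
p = 0.5         # assumed bit probability for non-matching descriptors
mu = p
sigma = math.sqrt(p * (1 - p) / n)

def p_fa(T):
    # Equation (4): P_FA = 0.5 * erfc((mu - T) / (sqrt(2) * sigma))
    return 0.5 * math.erfc((mu - T) / (math.sqrt(2) * sigma))

# Largest BER threshold T (on a 0.001 grid) whose false alarm
# rate stays below 1e-12; p_fa is monotonically increasing in T.
T = max(t / 1000 for t in range(0, 500) if p_fa(t / 1000) < 1e-12)
print(T, p_fa(T) < 1e-12)
```

With these assumed values the admissible BER threshold comes out well below 0.5, i.e. only descriptors differing in a small fraction of their 99 bits are accepted as matches.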

III. EXPERIMENTAL RESULTS

To evaluate the proposed algorithm, SURF was used to detect feature points and obtain image patches. We designed two experiments as follows.

Fig. 4. Example images from the Stanford DB


Fig. 5. Example images from the DB used for retrieval

The first is a pair-wise matching experiment, in which the true positive rate (TPR) and false positive rate (FPR) were calculated to measure performance. Equations (5) and (6) define TPR and FPR. If an image pair successfully yields a homography matrix by RANSAC [18], we assume a match.

    TPR = (# of matches) / (total # of matching pairs)    (5)

    FPR = (# of matches) / (total # of non-matching pairs)    (6)
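Applied to counts, equations (5) and (6) are straightforward ratios. The match counts below are hypothetical; only the denominators (600 matching pairs per category, 6,000 non-matching pairs) come from the experimental setup:

```python
def tpr(num_matches, total_matching_pairs):
    # Equation (5): fraction of true matching pairs declared a match.
    return num_matches / total_matching_pairs

def fpr(num_matches, total_non_matching_pairs):
    # Equation (6): fraction of non-matching pairs declared a match.
    return num_matches / total_non_matching_pairs

print(tpr(540, 600))   # → 0.9
print(fpr(6, 6000))    # → 0.001
```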

For this test, we use 5 categories of the Stanford DB [19]: CD covers, DVD covers, Book covers, Business cards, and Text documents. We use 600 matching pairs for each category and 6,000 non-matching pairs over all categories. Fig. 4 shows example images from the DB.

The second experiment is retrieval. For this test, we built a DB of 1,500 reference images and 1,900 query images. The reference and query images were captured by DSLR and mobile devices, respectively. Example images of the DB are shown in Fig. 5.

For efficient search, a KD-tree was used as the search structure and best-bin-first (BBF) [20] was used to find approximate nearest neighbors. In this test, we checked the top result for each query; if it is the same object, we count the query as a success.

Tables 1 and 2 show the pair-wise matching and retrieval results, respectively. As the results show, the pair-wise matching performance (TPR, FPR) and the retrieval performance (success ratio) of the proposed method are comparable to SURF. The OpenSURF [21] implementation was used for this experiment.
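For illustration, top-1 retrieval over binary descriptors can be sketched as an exhaustive Hamming-distance scan; note this is only a stand-in for the KD-tree with best-bin-first search used in the experiment, which trades exactness for speed:

```python
import numpy as np

def top1(query, reference):
    """Index of the reference descriptor nearest to the query in Hamming
    distance (exhaustive scan; the paper uses a KD-tree with best-bin-first
    search for efficiency)."""
    dists = np.count_nonzero(reference != query, axis=1)
    return int(np.argmin(dists))

# Tiny hypothetical DB of 4-bit descriptors.
refs = np.array([[1, 0, 1, 0],
                 [0, 0, 0, 0],
                 [1, 1, 1, 1]], dtype=np.uint8)
q = np.array([1, 0, 1, 1], dtype=np.uint8)
print(top1(q, refs))  # → 0
```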
