Salient Object Detection Based on Background Feature...

Research ArticleSalient Object Detection Based on BackgroundFeature Clustering

Kan Huang12 Yong Zhang12 Bo Lv12 and Yongbiao Shi12

1University of Chinese Academy of Sciences Beijing 100049 China2Shanghai Institute of Technical Physics Chinese Academy of Sciences Shanghai 200083 China

Correspondence should be addressed to Kan Huang huangkan15mailsucasaccn

Received 26 September 2016 Revised 21 November 2016 Accepted 27 November 2016 Published 9 February 2017

Academic Editor Deepu Rajan

Copyright copy 2017 Kan Huang et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Automatic estimation of salient object without any prior knowledge tends to greatly enhance many computer vision tasks Thispaper proposes a novel bottom-up based framework for salient object detection by first modeling background and then separatingsalient objects from background We model the background distribution based on feature clustering algorithm which allows forfully exploiting statistical and structural information of the background Then a coarse saliency map is generated according tothe background distribution To be more discriminative the coarse saliency map is enhanced by a two-step refinement whichis composed of edge-preserving element-level filtering and upsampling based on geodesic distance We provide an extensiveevaluation and show that our proposed method performs favorably against other outstanding methods on two most commonlyused datasets Most importantly the proposed approach is demonstrated to be more effective in highlighting the salient objectuniformly and robust to background noise

1 Introduction

As an important preprocessing technique in computer visionto reduce computational complexity saliency has attractedmuch attention in recent years Saliency detection hasboosted a wide variety of applications in computer visiontasks such as image retrieval [1 2] image compression [3]content-aware image editing [4 5] and object detection andrecognition [6]

According to previous works saliency models can becategorized into bottom-up approaches [7ndash9] and top-downapproaches [10 11] Most top-down approaches make useof prior knowledge and are based on supervised learningWhile these supervised approaches can effectively detectsalient regions and perform overall better than bottom-up approaches it is still expensive to perform the train-ing process especially data collection Compared to top-down approaches bottom-up based approaches do not needdata collection and training process meanwhile requiringlittle prior knowledge These advantages make bottom-upapproaches more efficient and easy to implement in a widerange of real computer vision applications In this paper wefocus on bottom-up approach

In recent years most works deploy color contrast [78] as the measurement of saliency And some other worksutilize color rarity [12] to produce saliency map Algorithmsbased on these cues usually work well in situations whereforeground object appears to be distinctive in terms ofcontrast or rarity But they may fail in dealing with imagescontaining various background patterns Work in [13] tendsto focus on pattern distinctness in saliency detection Itdefines a pixel to be salient if the patch corresponding tothat pixel has less contribution to the whole image in PCAsystem So it will perform badly in cases where salient objectis overwhelming in size in an image By assumingmost of thenarrow border of the image as background region informa-tion of backgroundness priors can be exploited in calculatingsaliency maps as done in [14 15] Though backgroundnesspriors seem to play a role in locating salient objects treatingpatches on the border directly as background patches wouldincur serious problems especially when salient objects touchthe image border or present similar features to a certainpart of the image border Thus looking for an effective wayto construct background distribution from which salientobjects can be separated seems to be necessary

HindawiAdvances in MultimediaVolume 2017 Article ID 4183986 9 pageshttpsdoiorg10115520174183986

2 Advances in Multimedia

(a) (b) (c) (d) (e)

Figure 1 Main framework of the proposed approach (a) Input image (b) Coarse saliency map (c) Saliency map after edge-preservingelement-level filtering (d) Final saliency map after geodesic distance refinement (e) Ground truth

In this work we suggest that clustering features withinthe pseudobackground region (border region of the image)would give a suitable representation of the background dis-tributionThus we first construct background distribution inthis way and then calculate saliency value of every location asthe distance from background distribution In the followingwe employ a series of refinement methods to enhance thesalient region and make salient objects stand out uniformly

There are two main contributions underlying our pro-posed approach (1) we propose a novel way to constructbackground distribution of an image based on clusteringmethod In this way statistical and structural informationof background can be fully exploited to construct back-ground distribution which would allow for a subsequentlymore accurate saliency measurement and (2) we proposea new refinement framework which is composed of edge-preserving element-level filtering and upsampling based ongeodesic distance The new refinement framework has anexcellent ability to uniformly highlight the salient object andremoving background noise

2 Proposed Approach

Our proposed approach adopts a coarse-to-fine strategy inwhich every step fully utilizes the information from previousstep There are three main steps composing the proposedapproach First the patches from image border are usedto construct background distribution from which a coarsesaliency map is derived Second the coarse saliency mapis enhanced by an edge-preserving element-level filteringThird final saliency map is obtained by refinement basedon geodesic distance The main framework is depicted inFigure 1 Each step will be illustrated in detail in each sectionbelow

21 Background Distribution Since backgroundness priorhas shown its importance in salient object detection we aimto find an effective way to establish the background distribu-tion from which saliency map can be calculated Similar towork in [11] we define the pseudobackground region as theborder region of the image Some improved methods [15ndash17]separate the border region into several certain parts such astop bottom left and right and deal with them respectivelyTo our consideration this may fail to exploit as much aspossible the statistical information about backgroundnesspriors In our approach we construct the background distri-bution with the help of clustering algorithm To implement

clustering on all patch features within pseudobackgroundregion provide a way to construct background distributionwhich allows for automatically and substantially exploitingstatistical information of the backgroundness prior Work in[18] also uses clustering algorithm to establish backgrounddistribution and thus generate a coarse saliencymap based onthe distribution Different from ourmethod it only considerscolor information in superpixel level By contrast patchfeature used in our method represents not only color infor-mation but also structural information of a neighborhood ofa locationThe effectiveness of patch feature has already beendemonstrated in [13 19]

For the feature representation of each patch we chooseto use the feature in CIE LAB color space which possessesthe property of perceptual uniformity and has also provedits effectiveness in many saliency algorithms [20 21] Thusfor example if we take a 7 times 7 patch from an imagethe feature of this patch will be represented by a 147 times1 column vector (pixelNumber times channelNumber) Thiskind of representation can represent both color and patterninformation Here we denote this vectorized feature forimage patch 119894 to be 119891(119894)

After patch features have been drawn from the bor-der region of an image we treat them as a collection ofbackground feature samples and use 119870-means clusteringalgorithm [22] to cluster the samples Compared to separatingborder region into certain parts and dealing with themclustering the features within border region into differentgroups would provide a novel way to exploit statisticaland structural information of the backgroundness priorConsequently this would find clusters that best representbackground feature distribution Larger clusters have morecontribution to background distribution and smaller clustershave less contribution Through exercising with differentnumbers of clusters we found that clustering the samples intofour clusters could maintain relatively good performancethus we choose to cluster the features into four groups Fromthe above process we can get four feature clusters samplenumber of each cluster is denoted119873(119897) and centroid locationvalue of each cluster is denoted 119862(119897) 119897 isin 1 2 3 4

In most of the previous work a variety of measurementshave been used to measure image saliency The most com-monly usedmeasurement is rarity and contrast But as statedin Section 1 such measurement may not be suitable for moregeneral cases A more intuitive saliency measurement can bedefined as the difference frombackground distributionwhichdescribes the backgroundness prior In our approach we

Advances in Multimedia 3

Patchextraction

helliphelliphelliphelliphelliphelliphelliphelliphellip

Cluster 1 Cluster 2 Cluster 3 Cluster 4

Compute distance map

Linear combination

Clustering

Figure 2 Coarse saliency map generation process

consider a location to be salient if patch feature correspond-ing to that location is distant from background distributionFor the distance measurement we adopt Euclidean distanceas the distance metric

As we get four background feature clusters we thencalculate a distance map for each cluster In each distancemap value of every location is defined as the Euclideandistance between the patch feature of that location and thecentroid value of this feature cluster Distance map withrespect to each cluster is computed by

119863(119897) (119894) = 1003817100381710038171003817119891 (119894) minus 119862 (119897)10038171003817100381710038172 (1)

where 119894 denote patch location in image 119897 isin 1 2 3 4In this way four distance maps are derived And the four

maps are normalized to [0 1] In the following we create acoarse saliencymap through combination of the four distancemaps The combination procedure is defined as follows

119878coarse = 4sum119897=1

119908119897 times 119863(119897) (2)

where ldquotimesrdquo denotes element-wise multiplication and 119908119897denotes weight for each distancemap119908119897 is defined as follows

119908119897 = 119873 (119897)sum4119894=1119873(119894) (3)

where 119908119897 isin [0 1] Obviously the weight can be seen asthe ratio of sample number in each cluster to whole samplenumber In this way coarse saliency map is more similarto distance maps which are produced from larger clustersLarger clusters have a stronger representative power forrepresenting the real background So distancemaps producedfrom these larger clusters aremore different frombackground

and meanwhile are more likely to highlight salient regionsThe whole process of producing coarse saliency map isshown in Figure 2 (patches are zoomed in for observationpurpose)

22 Edge-Preserving Element-Level Filtering In the previoussection a coarse saliency map is generated We aim tofurther discriminate salient object from background andhighlight salient object uniformly in this section Beforestarting the substantial operation we decompose image intohomogeneous basic elements and define saliency on elementlevel hereafterThis is based on the observation that an imagecan be decomposed into basic structurally representative ele-ments that abstract away unnecessary detail and meanwhileallow for a clear and intuitive definition of saliency Eachelement should locally abstract the image into perceptualhomogenous regions For detailed implementation we use alinear spectral clustering method [23] to abstract the imageinto homogenous elements

In this section we adopt amethod called edge-preservingelement-level filtering to enhance salient region This step ismotivated by the observation that (1) salient object alwaystends to have sharp edges to background and (2) a regionis likely to be salient if it is surrounded by salient regionsThe operation of edge-preserving element-level filtering dealswith basic elements described above Its mathematical for-mula is described in (4) which we will explain in nextparagraph This operation possesses several properties suchas preserving sharp edges refining saliency value of anelement by combining saliency values of its surroundingelements

After decomposing the image into homogenous elementscoarse saliency value of each element (superpixel) is cal-culated by averaging the coarse saliency value 119878coarse ofall its pixels inside Then for 119895th superpixel if its coarse


(a) (b) (c) (d)

Figure 3 Edge-preserving element-level filtering (a) Input image (b) Coarse saliency map (c) Saliency map after edge-preserving element-level filtering (d) Ground truth

saliency value is labeled as 119878119888(119895) the saliency value of the 119902thsuperpixel is measured by

119878119891 (119902) = 119889 (119902) sdot 119878119888 (119902) + [1 minus 119889 (119902)] sdot sum119895isinΘ

119908119902119895119878119888 (119895) (4)

where 119895 isin Θ are a set of superpixels for which 119902th superpixelis the neighborwithin a certain distance119908119902119895 is a weight basedon Euclidean distance and 119889(119902) isin 0 1 indicate whether119902th superpixel is touching sharp edges in image Formula (4)can be interpreted as follows if 119902th superpixel is touchingsharp edges in image then saliency value of 119902th superpixelremains unchanged as its coarse saliency value but if 119902thsuperpixel is not directly connected to sharp edges thensaliency value of 119902th superpixel is determined by saliencyvalues of its surrounding superpixels

Given superpixel positions 119875(119902) and 119875(119895) weight 119908119902119895 in(4) is defined as Euclidean distance between positions of 119902thand 119895th superpixel

119908119902119895 = 1003817100381710038171003817119875 (119902) minus 119875 (119895)10038171003817100381710038172 (5)

The indicator 119889(119902) in (4) is determined according tothe following process all superpixels contiguous to 119902thsuperpixel are searched first then the largest variance amongsaliency values of these superpixels is computed and finallyif the variance is larger than a certain threshold the 119902thsuperpixel is regarded as touching sharp edges thus setting119889(119902) = 1 otherwise setting 119889(119902) = 0

Results of edge-preserving element-level filtering areillustrated in Figure 3 It is obvious that coarse saliency

maps are strongly enhanced after this procedure Sharp edgesof salient object have been well preserved and backgroundnoises have been largely removed Work in [14] adopted anoperation called ldquocontext-based error propagationrdquo whichis similar to edge-preserving element-level filtering in thissection It smooths region saliency by taking all other regionelements in the same feature cluster into consideration whileourmethod refines region saliency using its spatially adjacentregions Obviously our method is more intuitive and simpleand above all preserves sharp edges well

23 Geodesic Distance Refinement The final step of ourproposed approach is refinement with geodesic distance[24] The motivation underlying this step is based on theconsideration that determining saliency of an element asweighted sum of saliency of its surrounding elements whereweights are correspond to Euclidean distance has a limitedperformance in uniformly highlighting salient object Wetend to find some other solutions that could enhance regionsof salient object more uniformly From recent works [2526] we found that the weights may be sensitive to geodesicdistance Thus saliency value based on geodesic distance isdefined as follows

119878 (119901) = 119873sum119894=1

120575119901119894119878119891 (119894) (6)

where 119873 is the total number of superpixels and 120575119901119894 is theweight based on geodesic distance between 119901th and 119894thsuperpixels


(a) (b) (c) (d) (e) (f)

Figure 4 Geodesic distance refinement and comparison with Euclidean distance refinement (a) Input image (b) Coarse saliency map(c) Saliency map after edge-preserving element-level filtering (d) Saliency map using geodesic distance refinement (e) Saliency map usingEuclidean distance refinement (f) Ground truth

The weight values are produced similar to work in [25]First an undirected weight graph has been constructed byconnecting all adjacent superpixels (119886119896 119886119896+1) and assigningtheir weight 119889119888(119886119896 119886119896+1) as the Euclidean distance betweentheir saliency values which are derived in the previous Sec-tion 22Then the geodesic distance between two superpixels119889119892(119901 119894) can be defined as accumulated edge weights alongtheir shortest path on the graph

119889119892 (119901 119894) = min1198861=1199011198862 119886119899=119894

119899minus1sum119896=1

119889119888 (119886119896 119886119896+1) (7)

In this way we can get geodesic distance between any twosuperpixels in the image Then the weight 120575119901119894 is defined as

120575119901119894 = expminus1198892119892 (119901 119894)21205902119888 (8)

where 120590119888 is the deviation for all 119889119888 values From formula(8) we can easily conclude that when 119901 and 119894 are in a flatregion saliency value of 119894would have a higher contribution tosaliency value of 119901 and when 119901 and 119894 are in different regionsbetween which a steep slope existed saliency value of 119894 tendsto have a less contribution to saliency value of 119901

In order to demonstrate the effectiveness of refinementusing geodesic distance we compare geodesic distance andEuclidean distance when they are used as combinationweights Figure 4 shows the experiment results Refinementbased on geodesic distance performs much better thanrefinement based on Euclidean distance in that saliencyobjects are distinctively and uniformly highlighted

3 Experiment

We provide an extensive comparison of our approach toseveral state-of-the-art methods on two most common-used datasets The two benchmark datasets include ASD [7]and MSRA10K [8] The ASD dataset contains 1000 imageswhileMSRA10K contains 10000 images both with pixel-levelground truth

31 Qualitative Evaluation Figure 5 presents a qualitativecomparison of our approach with other six outstandingmethods The other methods include Itti [27] GBVS [28]FT [7] CAS [19] PCA [13] wCtr [25] and DRFI [11] Itcan be seen that results of Itti and GBVS only successfullydetect some fuzzy locations of salient object which makesit far from real application CAS highlights more on objectedgesWhile PCA performs overall better that thesemethodsjust mentioned before there still exist failing occasions whensalient objects are overwhelming in size FT has a simple andefficient implementation but problems occur when globalcues play a role in saliency detection as FT focuses onlocal cues Method of wCtr has a discriminative power tohighlight salient objects but would go unstable when somenonsalient regions are also clustered as salient objects and arehardly connected to image boundary Different from thesemethods which are all bottom-up based approaches DRFIincorporates high level priors and supervised learning whichmake it outstanding among other approaches Our method isbased on bottom-up processing it is easy for us to concludefrom the experiment results that our method provides morefavorable results compared with others and even performsbetter than supervised approach like DRFI An important tipis that our method is implemented on the original resolutionof the input image without any rescale preprocessing

32 Metric Evaluation Similar to [7 8] we deploy precisionand recall as the evaluation metric Precision correspondsto the percentage of salient pixels correctly assigned whilerecall corresponds to the fraction of assigned salient pixelsin relation to the ground truth number of salient pixelsPrecision-recall (PR) curve can be produced by comparingthe results with the ground truth through varying thresholdswithin the range [0 255] PR curve is an important metricin performance evaluation PR curve with respect to eachdataset is given in Figure 6

In addition to precision and recall we compute the 119865-measure which is defined as follows

119865120573 = (1 + 1205732) sdot precision sdot recall1205732 sdot precision + recall

(9)


Image OursFTItti GBVS CAS DRFIPCAGround

truthwCO

Figure 5 Visual comparison of state-of-the-art approaches to our method and ground truth

where we set 1205732 = 03 In Figure 7 we show the precisionrecall and 119865-measure values for adaptive threshold which isdefined as twice the mean saliency of the image Figures 6and 7 show that the proposed method achieves best amongall these methods on the two datasets

33 Performance In order to evaluate algorithms thoroughlywe also compare the average running time of our approach toother methods on the benchmark images Timing has beentaken on a 26GHz CPU with 8GB RAM The environmentis Matlab 2016a installed on Windows Running times ofalgorithms are listed in Table 1

CAS method is slower because of its exhaustive nearest-neighbor search Our running time is similar to that of PCAmethod Both methods involve location-wise patch featureextraction and comparing Our method spends most of the

running time on the first step of feature clustering (21) andsubsequent coarse saliency map computation (25) Time ofour method is moderate among these methods Consideringthat our method is implemented on the original resolutionof input image computational cost of our method is stillfavorable against other methods

4 Conclusion

In this paper an effective bottom-up salient object detec-tion method is presented In order to utilize the statisticaland structural information of backgroundness priors back-ground distribution based on feature clustering has beenconstructed That distribution allows for computing a coarsesaliency map Then two refinement steps including edge-preserving element-level filtering and upsampling based on


PR curve

OursIttiGBVSCAS

FTPCADRFIwCtr

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(a)

OursIttiGBVSCAS

FTPCADRFIwCtr

PR curve

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(b)

Figure 6 Precision-recall (PR) curves for all algorithms (a) PR curves on MSRA10K dataset (b) PR curves on ASD dataset

Ours CAS DRFI wCtr GBVS Itti PCA FT

PrecisionRecallF-measure

0

01

02

03

04

05

06

07

08

09

(a)


Ours CAS DRFI wCtr GBVS Itti PCA FT0

01

02

03

04

05

06

07

08

09

(b)

Figure 7 Precision recall and 119865-measure for adaptive thresholds (a) Measured on MSRA10K dataset (b) Measured on ASD dataset

Table 1 Comparison of running times

Method Ours GBVS CAS Itti FT PCA DRFI wCtrTime (s) 265 070 768 0895 044 268 3238 0723Code Matlab Matlab C++ Matlab Matlab Matlab Matlab Matlab Matlab


geodesic distance have been applied to obtain the finalsaliency map In the final map salient objects are uniformlyhighlighted and background noises are removed thoroughlyThe experimental evaluation exhibits the effectiveness ofthe proposed approach compared with the state-of-the-artmethods

Competing Interests

The authors declare that there are no competing interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National HighTechnology Research and Development Program of China(863 Program Program no 2011AA7031002G)

References

[1] T Chen M-M Cheng P Tan A Shamir and S-M HuldquoSketch2Photo internet image montagerdquo ACM Transactions onGraphics vol 28 no 5 pp 124ndash124 2009

[2] S-M Hu T Chen K Xu M-M Cheng and R R MartinldquoInternet visual media processing a survey with graphics andvision applicationsrdquo Visual Computer vol 29 no 5 pp 393ndash405 2013

[3] C Christopoulos A Skodras and T Ebrahimi ldquoThe JPEG2000still image coding system an overviewrdquo IEEE Transactions onConsumer Electronics vol 46 no 4 pp 1103ndash1127 2002

[4] G-X Zhang M-M Cheng S-M Hu and R R MartinldquoA shape-preserving approach to image resizingrdquo ComputerGraphics Forum vol 28 no 7 pp 1897ndash1906 2009

[5] M-M Cheng F-L Zhang N JMitra X Huang and S-MHuldquoRepfinder finding approximately repeated scene elements forimage editingrdquoACMTransactions onGraphics vol 29 no 4 pp831ndash838 2010

[6] Z Ren S Gao L-T Chia and I W-H Tsang ldquoRegion-basedsaliency detection and its application in object recognitionrdquoIEEE Transactions on Circuits and Systems for Video Technologyvol 24 no 5 pp 769ndash779 2014

[7] R Achanta S Hemami F Estrada and S Susstrunk ldquoFrequen-cy-tuned salient region detectionrdquo in Proceedings of the IEEEConference on Computer Vision and Pattern Recognition (CVPRrsquo09) pp 1597ndash1604 Miami Fla USA June 2009

[8] M-M Cheng N J Mitra X Huang P H S Torr and S-M Hu ldquoGlobal contrast based salient region detectionrdquo IEEETransactions on Pattern Analysis and Machine Intelligence vol37 no 3 pp 569ndash582 2015

[9] Z Wang and B Li ldquoA two-stage approach to saliency detectionin imagesrdquo in Proceedings of the IEEE International Conferenceon Acoustics Speech and Signal Processing (ICASSP rsquo08) pp965ndash968 Las Vegas Nev USA April 2008

[10] T Judd K Ehinger F Durand and A Torralba ldquoLearningto predict where humans lookrdquo in Proceedings of the 12thInternational Conference on Computer Vision (ICCV rsquo09) pp2106ndash2113 Kyoto Japan October 2009

[11] H Jiang J Wang Z Yuan Y Wu N Zheng and S Li ldquoSalientobject detection a discriminative regional feature integrationapproachrdquo in Proceedings of the 26th IEEE Conference on

Computer Vision and Pattern Recognition (CVPR rsquo13) pp 2083ndash2090 Portland Ore USA June 2013

[12] F Perazzi P Krahenhuhl Y Pritch and A Hornung ldquoSaliencyfilters contrast based filtering for salient region detectionrdquo inProceedings of the IEEE Conference on Computer Vision andPattern Recognition (CVPR rsquo12) pp 733ndash740 June 2012

[13] R Margolin A Tal and L Zelnik-Manor ldquoWhat makes apatch distinctrdquo in Proceedings of the 26th IEEE Conference onComputer Vision and Pattern Recognition (CVPR rsquo13) pp 1139ndash1146 Portland Ore USA June 2013

[14] X Li H Lu L Zhang X Ruan and M-H Yang ldquoSaliencydetection via dense and sparse reconstructionrdquo in Proceedingsof the 14th IEEE International Conference on Computer Vision(ICCV rsquo13) pp 2976ndash2983 Sydney Australia December 2013

[15] C Yang L Zhang H Lu X Ruan and M-H Yang ldquoSaliencydetection via graph-based manifold rankingrdquo in Proceedingsof the 26th IEEE Conference on Computer Vision and PatternRecognition (CVPR rsquo13) pp 3166ndash3173 Columbus Ohio USAJune 2013

[16] T Zhao L Li X H Ding Y Huang and D L Zeng ldquoSaliencydetection with spaces of background-based distributionrdquo IEEESignal Processing Letters vol 23 no 5 pp 683ndash687 2016

[17] J Han D Zhang X Hu L Guo J Ren and FWu ldquoBackgroundprior based salient object detection via deep reconstructionresidualrdquo IEEE Transactions on Circuits and Systems for VideoTechnology vol 25 no 8 pp 1309ndash1321 2015

[18] Y Qin H C Lu Y Q Xu and HWang ldquoSaliency detection viaCellular Automatardquo in Proceedings of the IEEE Conference onComputer Vision and Pattern Recognition (CVPRrsquo 15) pp 110ndash119 Boston Mass USA June 2015

[19] S Goferman L Zelnik-Manor and A Tal ldquoContext-awaresaliency detectionrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 34 no 10 pp 1915ndash1926 2012

[20] Y Xie and H Lu ldquoVisual saliency detection based on Bayesianmodelrdquo in Proceedings of the 18th IEEE International Conferenceon Image Processing (ICIP rsquo11) pp 645ndash648 Brussels BelgiumSeptember 2011

[21] S Li H C Lu Z Lin X Shen and B Price ldquoAdaptive metriclearning for saliency detectionrdquo IEEE Transactions on ImageProcessing vol 24 no 11 pp 3321ndash3331 2015

[22] T Kanungo D M Mount N S Netanyahu C D PiatkoR Silverman and A Y Wu ldquoAn efficient k-means clusteringalgorithms analysis and implementationrdquo IEEETransactions onPattern Analysis andMachine Intelligence vol 24 no 7 pp 881ndash892 2002

[23] Z Li and J Chen ldquoSuperpixel segmentation using LinearSpectral Clusteringrdquo in Proceedings of the IEEE Conference onComputer Vision and Pattern Recognition (CVPRrsquo 15) pp 1356ndash1363 Boston Mass USA June 2015

[24] J B Tenenbaum V De Silva and J C Langford ldquoA globalgeometric framework for nonlinear dimensionality reductionrdquoScience vol 290 no 5500 pp 2319ndash2323 2000

[25] WZhu S Liang YWei and J Sun ldquoSaliency optimization fromrobust background detectionrdquo in Proceedings of the 27th IEEEConference on Computer Vision and Pattern Recognition (CVPRrsquo14) pp 2814ndash2821 IEEE Columbus Ohio USA June 2014

[26] K Fu C Gong I Y H Gu and J Yang ldquoGeodesic saliencypropagation for image salient region detectionrdquo in Proceedingsof the 20th IEEE International Conference on Image Processing(ICIP rsquo13) pp 3278ndash3282 Melbourne Australia September2013


[27] L Itti C Koch and E Niebur ldquoAmodel of saliency-based visualattention for rapid scene analysisrdquo IEEE Transactions on PatternAnalysis and Machine Intelligence vol 20 no 11 pp 1254ndash12591998

[28] J Harel C Koch and P Perona ldquoGraph-based visual saliencyrdquoin Proceedings of the 20th Annual Conference on Neural Infor-mation Processing Systems (NIPS rsquo06) pp 545ndash552 VancouverCanada December 2006

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014


Active and Passive Electronic Components

Control Scienceand Engineering

Journal of



RotatingMachinery


Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpswwwhindawicom

VLSI Design



Shock and Vibration


Civil EngineeringAdvances in

Acoustics and VibrationAdvances in



Electrical and Computer Engineering

Journal of

Advances inOptoElectronics


Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of


Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014


Chemical EngineeringInternational Journal of Antennas and

Propagation




Navigation and Observation



DistributedSensor Networks



(a) (b) (c) (d) (e)

Figure 1 Main framework of the proposed approach (a) Input image (b) Coarse saliency map (c) Saliency map after edge-preservingelement-level filtering (d) Final saliency map after geodesic distance refinement (e) Ground truth

In this work we suggest that clustering features withinthe pseudobackground region (border region of the image)would give a suitable representation of the background dis-tributionThus we first construct background distribution inthis way and then calculate saliency value of every location asthe distance from background distribution In the followingwe employ a series of refinement methods to enhance thesalient region and make salient objects stand out uniformly

There are two main contributions underlying our pro-posed approach (1) we propose a novel way to constructbackground distribution of an image based on clusteringmethod In this way statistical and structural informationof background can be fully exploited to construct back-ground distribution which would allow for a subsequentlymore accurate saliency measurement and (2) we proposea new refinement framework which is composed of edge-preserving element-level filtering and upsampling based ongeodesic distance The new refinement framework has anexcellent ability to uniformly highlight the salient object andremoving background noise

2 Proposed Approach

Our proposed approach adopts a coarse-to-fine strategy inwhich every step fully utilizes the information from previousstep There are three main steps composing the proposedapproach First the patches from image border are usedto construct background distribution from which a coarsesaliency map is derived Second the coarse saliency mapis enhanced by an edge-preserving element-level filteringThird final saliency map is obtained by refinement basedon geodesic distance The main framework is depicted inFigure 1 Each step will be illustrated in detail in each sectionbelow

21 Background Distribution Since backgroundness priorhas shown its importance in salient object detection we aimto find an effective way to establish the background distribu-tion from which saliency map can be calculated Similar towork in [11] we define the pseudobackground region as theborder region of the image Some improved methods [15ndash17]separate the border region into several certain parts such astop bottom left and right and deal with them respectivelyTo our consideration this may fail to exploit as much aspossible the statistical information about backgroundnesspriors In our approach we construct the background distri-bution with the help of clustering algorithm To implement

clustering on all patch features within pseudobackgroundregion provide a way to construct background distributionwhich allows for automatically and substantially exploitingstatistical information of the backgroundness prior Work in[18] also uses clustering algorithm to establish backgrounddistribution and thus generate a coarse saliencymap based onthe distribution Different from ourmethod it only considerscolor information in superpixel level By contrast patchfeature used in our method represents not only color infor-mation but also structural information of a neighborhood ofa locationThe effectiveness of patch feature has already beendemonstrated in [13 19]

For the feature representation of each patch we chooseto use the feature in CIE LAB color space which possessesthe property of perceptual uniformity and has also provedits effectiveness in many saliency algorithms [20 21] Thusfor example if we take a 7 times 7 patch from an imagethe feature of this patch will be represented by a 147 times1 column vector (pixelNumber times channelNumber) Thiskind of representation can represent both color and patterninformation Here we denote this vectorized feature forimage patch 119894 to be 119891(119894)

After patch features have been drawn from the bor-der region of an image we treat them as a collection ofbackground feature samples and use 119870-means clusteringalgorithm [22] to cluster the samples Compared to separatingborder region into certain parts and dealing with themclustering the features within border region into differentgroups would provide a novel way to exploit statisticaland structural information of the backgroundness priorConsequently this would find clusters that best representbackground feature distribution Larger clusters have morecontribution to background distribution and smaller clustershave less contribution Through exercising with differentnumbers of clusters we found that clustering the samples intofour clusters could maintain relatively good performancethus we choose to cluster the features into four groups Fromthe above process we can get four feature clusters samplenumber of each cluster is denoted119873(119897) and centroid locationvalue of each cluster is denoted 119862(119897) 119897 isin 1 2 3 4

In most of the previous work a variety of measurementshave been used to measure image saliency The most com-monly usedmeasurement is rarity and contrast But as statedin Section 1 such measurement may not be suitable for moregeneral cases A more intuitive saliency measurement can bedefined as the difference frombackground distributionwhichdescribes the backgroundness prior In our approach we


Patchextraction




Linear combination

Clustering




119863(119897) (119894) = 1003817100381710038171003817119891 (119894) minus 119862 (119897)10038171003817100381710038172 (1)



119878coarse = 4sum119897=1

119908119897 times 119863(119897) (2)


119908119897 = 119873 (119897)sum4119894=1119873(119894) (3)







(a) (b) (c) (d)




119908119902119895119878119888 (119895) (4)



119908119902119895 = 1003817100381710038171003817119875 (119902) minus 119875 (119895)10038171003817100381710038172 (5)





119878 (119901) = 119873sum119894=1

120575119901119894119878119891 (119894) (6)



(a) (b) (c) (d) (e) (f)



119889119892 (119901 119894) = min1198861=1199011198862 119886119899=119894

119899minus1sum119896=1

119889119888 (119886119896 119886119896+1) (7)


120575119901119894 = expminus1198892119892 (119901 119894)21205902119888 (8)



3 Experiment






(9)



truthwCO






4 Conclusion



PR curve

OursIttiGBVSCAS

FTPCADRFIwCtr

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(a)

OursIttiGBVSCAS

FTPCADRFIwCtr

PR curve

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(b)




0

01

02

03

04

05

06

07

08

09

(a)



01

02

03

04

05

06

07

08

09

(b)






Competing Interests


Acknowledgments


References

































RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










Patchextraction




Linear combination

Clustering




119863(119897) (119894) = 1003817100381710038171003817119891 (119894) minus 119862 (119897)10038171003817100381710038172 (1)



119878coarse = 4sum119897=1

119908119897 times 119863(119897) (2)


119908119897 = 119873 (119897)sum4119894=1119873(119894) (3)







(a) (b) (c) (d)




119908119902119895119878119888 (119895) (4)



119908119902119895 = 1003817100381710038171003817119875 (119902) minus 119875 (119895)10038171003817100381710038172 (5)





119878 (119901) = 119873sum119894=1

120575119901119894119878119891 (119894) (6)



(a) (b) (c) (d) (e) (f)



119889119892 (119901 119894) = min1198861=1199011198862 119886119899=119894

119899minus1sum119896=1

119889119888 (119886119896 119886119896+1) (7)


120575119901119894 = expminus1198892119892 (119901 119894)21205902119888 (8)



3 Experiment






(9)



truthwCO






4 Conclusion



PR curve

OursIttiGBVSCAS

FTPCADRFIwCtr

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(a)

OursIttiGBVSCAS

FTPCADRFIwCtr

PR curve

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(b)




0

01

02

03

04

05

06

07

08

09

(a)



01

02

03

04

05

06

07

08

09

(b)






Competing Interests


Acknowledgments


References

































RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










(a) (b) (c) (d)




119908119902119895119878119888 (119895) (4)



119908119902119895 = 1003817100381710038171003817119875 (119902) minus 119875 (119895)10038171003817100381710038172 (5)





119878 (119901) = 119873sum119894=1

120575119901119894119878119891 (119894) (6)



(a) (b) (c) (d) (e) (f)



119889119892 (119901 119894) = min1198861=1199011198862 119886119899=119894

119899minus1sum119896=1

119889119888 (119886119896 119886119896+1) (7)


120575119901119894 = expminus1198892119892 (119901 119894)21205902119888 (8)



3 Experiment






(9)



truthwCO






4 Conclusion



PR curve

OursIttiGBVSCAS

FTPCADRFIwCtr

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(a)

OursIttiGBVSCAS

FTPCADRFIwCtr

PR curve

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(b)




0

01

02

03

04

05

06

07

08

09

(a)



01

02

03

04

05

06

07

08

09

(b)






Competing Interests


Acknowledgments


References

































RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










(a) (b) (c) (d) (e) (f)



119889119892 (119901 119894) = min1198861=1199011198862 119886119899=119894

119899minus1sum119896=1

119889119888 (119886119896 119886119896+1) (7)


120575119901119894 = expminus1198892119892 (119901 119894)21205902119888 (8)



3 Experiment






(9)



truthwCO






4 Conclusion



PR curve

OursIttiGBVSCAS

FTPCADRFIwCtr

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(a)

OursIttiGBVSCAS

FTPCADRFIwCtr

PR curve

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(b)




0

01

02

03

04

05

06

07

08

09

(a)



01

02

03

04

05

06

07

08

09

(b)






Competing Interests


Acknowledgments


References

































RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation











truthwCO






4 Conclusion



PR curve

OursIttiGBVSCAS

FTPCADRFIwCtr

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(a)

OursIttiGBVSCAS

FTPCADRFIwCtr

PR curve

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(b)




0

01

02

03

04

05

06

07

08

09

(a)



01

02

03

04

05

06

07

08

09

(b)






Competing Interests


Acknowledgments


References

































RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation










PR curve

OursIttiGBVSCAS

FTPCADRFIwCtr

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(a)

OursIttiGBVSCAS

FTPCADRFIwCtr

PR curve

0

01

02

03

04

05

06

07

08

09

1

Prec

ision

01 02 03 04 05 06 07 08 09 10Recall

(b)




0

01

02

03

04

05

06

07

08

09

(a)



01

02

03

04

05

06

07

08

09

(b)






Competing Interests


Acknowledgments


References

































RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation











Competing Interests


Acknowledgments


References

































RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation














RoboticsJournal of





Journal of



RotatingMachinery





VLSI Design



Shock and Vibration







Journal of



Volume 2014


SensorsJournal of





Propagation









Salient Object Detection Based on Background Feature...

Documents

Transcript of Salient Object Detection Based on Background Feature...