
Machine Vision and Applications (2017) 28:819–837
DOI 10.1007/s00138-017-0874-y

ORIGINAL PAPER

Optimal seamline detection in dynamic scenes via graph cuts for image mosaicking

Li Li1 · Jian Yao1 · Haoang Li1 · Menghan Xia1 · Wei Zhang2

Received: 3 October 2015 / Revised: 13 June 2017 / Accepted: 10 August 2017 / Published online: 19 September 2017
© Springer-Verlag GmbH Germany 2017

Abstract In this paper, we present a novel method for creating a seamless mosaic from a set of geometrically aligned images captured from a scene with dynamic objects at different times. The artifacts caused by dynamic objects and geometric misalignments can be effectively concealed by our proposed seamline detection algorithm. In addition, we simultaneously compensate the image regions of dynamic objects based on the optimal seamline detection in the graph cuts energy minimization framework and create the mosaic with a relatively clean background. To ensure the high quality of the optimal seamline, the energy functions adopted in graph cuts combine the pixel-level similarities of image characteristics, including intensity and gradient, with the texture complexity. To successfully compensate the image regions covered by dynamic objects for creating a mosaic with a relatively clean background, we initially detect them in overlap regions between images based on pixel-level and region-level similarities, then refine them based on segments, and determine their image source in probability based on contour matching. We finally integrate all of these into the energy minimization framework to detect optimal seamlines. Experimental results on different dynamic scenes demonstrate that our proposed method is capable of generating high-quality mosaics with relatively clean backgrounds based on the detected optimal seamlines.

B Jian Yao
[email protected]
http://cvrs.whu.edu.cn/

1 Computer Vision and Remote Sensing (CVRS) Lab, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, People’s Republic of China

2 School of Control Science and Engineering, Shandong University, Jinan, Shandong, People’s Republic of China

Keywords Image mosaicking · Seamline detection · Dynamic object compensation · Graph cuts · Image parallax

1 Introduction

Image mosaicking is an important and classical problem in the fields of image processing, photogrammetry, remote sensing and computer vision, which is used to blend two or multiple geometrically aligned images into a single composite image as seamlessly as possible, e.g., panorama stitching [18] and satellite imagery mosaicking [27].

In an ideal static scene, photometric inconsistencies and geometric misalignments either do not exist or are not obviously visible in the overlap regions between images, and the composite image mosaicked from multiple ones looks nearly perfect. However, due to illumination variation and differences in exposure, there always exist photometric inconsistencies to some extent between images captured from different viewpoints. The artifacts caused by photometric inconsistencies can be handled very well through color correction and smoothing transition [6,16,25,31], which try to conceal stitching artifacts by smoothing color differences between input images. In addition, because input images are captured without a precisely common projection center while there are large depth differences in the scene with respect to the camera, or are captured from a dynamic scene with moving objects at different times by a single camera mounted on a rotation platform, the geometric positions of corresponding pixels from different images may differ, especially for those pixels covered by dynamic objects. The smoothing transition technique can deal with the color differences along the seamline, but it cannot handle the obvious parallax caused by geometric misalignments or dynamic objects. One effective way to solve this problem is to detect


the optimal seamline between two images and ensure that the optimal seamline avoids crossing obvious dynamic objects and the regions with low image similarity and large object dislocation. If the stitching artifacts are still visible due to color differences, the smoothing transition technique can be further applied to remove them easily. In this paper, our work focuses on the optimal seamline detection despite the presence of dynamic objects and geometric misalignments. In addition, by compensating the image regions of dynamic objects existing in one image with the background regions in another image, we can generate visually pleasant image mosaics with relatively clean backgrounds.

Optimal seamlines are generally located in overlap regions between images where their intensity or gradient differences are minimal. Many methods regard the seamline detection as an energy optimization problem and solve it by minimizing specially designed energy functions which represent the differences along the seamline. For those methods, the key issues are how to define effective energy functions and how to guarantee the optimality of the solution. The energy functions are often defined by considering color, gradient, and texture and are optimized via different optimization algorithms, e.g., the snake model [13], Dijkstra’s algorithm [10], dynamic programming [2], and graph cuts [4,5]. Our proposed algorithm applies the graph cuts algorithm to solve the problem of optimal seamline detection.

However, most stitching algorithms are designed for static scenes, and they may fail in dynamic scenes. If the scene contains dynamic objects, ghosting effects may appear in the final mosaicked images. Some algorithms [3,9,12,14,19,20,24,29,30] have been designed for dynamic scenes to remove ghosting effects. Boutellier et al. [3] proposed an image stitching algorithm to generate high-quality image mosaics from both video and separate images on mobile phones. To avoid the appearance of ghosts caused by moving objects, it detected the moving objects by simply thresholding the difference map between two images. López et al. [19] also applied this strategy to avoid the moving objects when mosaicking the images captured by mobile devices. Davis [9] presented a system for creating pleasing mosaics in the presence of moving objects. It first generated the cost image by computing the intensity differences between corresponding pixels. Then, it applied Dijkstra’s algorithm [10] to search for an optimal seamline (minimum cost path) based on this cost image. However, if there are large exposure differences between source images, this method may fail to find the optimal seamline using intensity difference alone. To solve this problem, Mills and Dudek [20] proposed to first compensate for exposure differences and then find the optimal seamline via Dijkstra’s algorithm in a cost image defined by combining the image information of intensity and gradient. However, if the dynamic objects

are similar to the background, the optimal seamline found by this algorithm may cross the dynamic objects and the final composite image would contain ghosting effects. To improve the algorithm proposed by Mills and Dudek [20], Zeng et al. [29] proposed a new optimal searching criterion that combines gradient difference with edge-enhanced weighting intensity difference, which provides an effective mechanism for avoiding problems caused by moving objects. This algorithm then applied dynamic programming [2], which is more efficient than Dijkstra’s algorithm, to find the optimal seamline. These existing methods can avoid crossing dynamic objects and produce high-quality image mosaics for most scenes. However, these methods cannot remove dynamic objects from the finally mosaicked images to obtain background panoramic images, as shown in Fig. 1d–f; especially in (d), the moving object appears twice in the final panorama. To solve this problem, Zhi and Cooperstock [30] proposed an image stitching algorithm comprised of two stages. In the first stage, it discovers motions between input aligned images and extracts the moving region pairs through a multi-seed-based region growing algorithm. In the second stage, with prior information provided by the extracted regions, it detects the optimal seamline by performing the graph cut optimization in the gradient domain and simultaneously ensures that the pixels in each extracted region come from only one image. Since it uses information from only one image for each extracted region pair, it can prevent the moving object from appearing twice in the final composite image, making it appear as if the images were captured without the moving objects in the scene. In addition, Kim and Hong [14] proposed a background estimation algorithm and applied it to stitch the images captured from dynamic scenes. This algorithm can generate the background as completely as possible and also guarantee the completeness of moving objects if they move very slowly in the scene. The goal of our method is similar to this background estimation algorithm. However, this algorithm used multiple images for each region to estimate the background, whereas we only use two images for each overlap region.

In this paper, we propose a novel optimal seamline detection method to create a seamless composite image with a relatively clean background (i.e., with as few dynamic objects as possible) and free from artifacts, as shown in Fig. 1g. The motivations and applications of our proposed method are summarized as follows:

• First, our proposed method can construct the panoramic background of an image sequence and eliminate the moving objects from the dynamic scene. The panoramic images generated by our method have relatively clean backgrounds. In many cases, we only want to obtain the background of a scene, and the moving objects will occlude some important signs or information (e.g., instructions for safe passage, posters, and beautiful scenery).


Fig. 1 Two geometrically aligned input images A and B in a, b with a moving object in the overlap region as shown in c. The image mosaics with different seamlines are presented in d–g, which avoid crossing the moving object, like the works in [9,20,29]. The moving object is visible in d–f; especially in d, the person appears twice in the final composite image. However, it is invisible in g thanks to the compensation with the background image regions, which is the goal of our method

For example, we may want to take a panoramic photograph of a beautiful scene, but there are always several people walking through it; our method solves this problem well and generates a final panoramic photograph free from moving objects.

• Second, our proposed method prevents the same dynamic object from appearing twice or multiple times in the same scene. In many cases, particularly for humans in an indoor scene, the moving object is only allowed to appear once at most; otherwise, there will be artifacts in the image mosaic.

• Last, the panoramic images generated by our method are free from the artifacts caused by geometric misalignments and dynamic objects. The ultimate standard for judging whether a mosaic is good is whether it is free from artifacts. Our proposed method can eliminate the artifacts caused by geometric misalignments and dynamic objects, so it can be used to generate high-quality panoramic images for dynamic scenes.

We formulate the optimal seamline detection as an energy optimization problem and solve it in the graph cuts energy minimization framework [17]. In our proposed method, not only are the intensity and gradient differences considered in the cost of each pixel in the overlap regions between images, but the texture complexity inspired by the histogram of oriented gradients (HOG) [8] is also integrated into the cost function. In addition, to create an image mosaic with a relatively clean background, the dynamic objects are detected initially by computing the pixel-level similarity in the CIELAB color space and the region-level similarity between HOG feature descriptors, then refined by segments generated by the Mean Shift algorithm [7], and finally their source image is determined in probability based on contour matching. All

of these are integrated into the graph cuts framework for detecting the optimal seamline, concealing the parallax and compensating the image regions covered by dynamic objects for image mosaicking, as shown in Fig. 1g.

The remaining part of this paper is organized as follows. Section 2 introduces the graph cuts energy minimization framework for finding the optimal seamline between two images. The energy cost of each pixel, which considers intensity, gradient, and texture complexity, is defined in Sect. 3. Dynamic object detection between two images is described in Sect. 4. Experimental results on indoor and outdoor scenes with dynamic objects are presented in Sect. 5, followed by the conclusion drawn in Sect. 6.

2 Seamline detection via graph cuts

In the overlap region between two images, we regard each pixel as a node in the graph and assume that each node has four cardinal neighbors in the 4-neighborhood. The link between two adjacent nodes is regarded as an edge in the graph, and its weight cost is defined as the sum of the energy costs of these two nodes. The graph cuts algorithm is used to associate the label of each pixel to one of the input source images with the minimum energy cost.

2.1 Graph cuts

The graph cuts algorithm [5] is an efficient energy optimization algorithm for solving labeling problems. It has been widely used in many applications of image processing and computer vision, such as image matting, image segmentation, stereo matching, and image blending [21,24]. For example, Rother et al. [23] provided a powerful semiautomatic algorithm for foreground object extraction based on iterated graph cuts, named “GrabCut”.


Algorithm 1 Optimal Seamline Detection for Dynamic Scenes
Input: The aligned image pair I1 and I2.
Output: The final composite image I.

1. Energy cost definition
(a) Convert the color images into grayscale ones.
(b) Calculate gradient magnitudes via the Sobel operator.
(c) For each pixel x:
i. Calculate the intensity difference Cc(x) between I1 and I2 according to Eq. (6).
ii. Calculate the gradient difference Cg(x) between I1 and I2 according to Eq. (7).
iii. Extract the HOG descriptors H1 and H2 in I1 and I2, and calculate the texture complexities Ω1(x) and Ω2(x) based on H1 and H2 according to Eq. (8), respectively. The texture complexity energy Ct(x) is the sum of Ω1(x) and Ω2(x), as defined in Eq. (9).
iv. Calculate the final energy cost C(x) by combining Cc(x), Cg(x) and Ct(x), as defined in Eq. (5).

2. Dynamic object detection
(a) Convert the RGB color space into the CIELAB color space.
(b) For each pixel x:
i. Calculate the color distance Dc(x) in the CIELAB color space according to Eq. (10).
ii. Compute the HOG distance Dh(x) between the two extracted HOG descriptors H1 and H2 according to Eq. (11).
iii. Calculate the final distance D(x) by combining Dc(x) and Dh(x) according to Eq. (12), and generate the distance map Md.
(c) Find all dynamic regions O = {Ok}, k = 1..K, from Md.
(d) For each dynamic region Ok:
i. Segment the corresponding rectangular image regions in I1 and I2 via the Mean Shift algorithm.
ii. Check which image Ok comes from via contour matching.
iii. Refine the dynamic region and apply the contour matching again.
iv. Calculate the two probabilities P1(Ok) and P2(Ok) based on the results of contour matching.

3. Seamline detection via graph cuts
(a) Define the data energy term Edata(I). For each pixel x:
i. Check whether x lies on a dynamic-object region found in step 2.
ii. If not, define the data energy costs according to the C2 case of Eq. (3); otherwise, according to the C1 case.
(b) Define the smooth energy term Esmooth(I). For each pixel pair (x, y), the smooth energy Esmooth(x, y) is defined as the sum of C(x) and C(y) computed in step 1.
(c) Set the initial labels of all pixels to I1 or I2. Optimize the energy function defined in Eq. (2) via graph cuts, and extract the final optimal seamline.

Kolmogorov and Zabih [15] presented an energy minimization formulation of the correspondence problem with occlusions and provided a fast approximation algorithm based on graph cuts. The basic idea is to first construct a weighted graph where each edge weight represents the corresponding energy cost value, and then find the minimum cut in this graph with a max-flow/min-cut algorithm [4]. Let P be the set of all elements (i.e., vertexes in the graph), N be the set of all element pairs {p, q} in the neighborhood (i.e., edges in the graph), and L be the set of all labels. The objective is to find a labeling map f assigning a label f_p ∈ L to each element p ∈ P by minimizing the following cost energy function:

E(f) = \sum_{p \in P} D_p(f_p) + \sum_{(p,q) \in N} V_{p,q}(f_p, f_q),   (1)

where D_p(f_p) denotes the cost of assigning the label f_p to the element p, and V_{p,q}(f_p, f_q) defines the cost of assigning the labels f_p and f_q to the element pair p and q, respectively; these are often called the data energy term and the smooth energy term. If f_p and f_q are equal, the value of V_{p,q}(f_p, f_q) is 0, as in the Potts model. The graph cuts algorithm can guarantee the global optimality of the final solution, which means we can detect the optimal seamline globally. The time complexity of graph cuts is O(n^2 E), where n is the number of nodes and E is the number of edges.
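To make the labeling energy of Eq. (1) concrete, the following sketch (not the authors' code; a minimal C++ illustration) evaluates E(f) for a candidate labeling over a 4-connected pixel grid, with the data and smooth costs supplied as callables. The actual minimization is performed by a max-flow/min-cut solver such as the one described in [4], which is assumed to come from an existing library and is not shown here.

// Sketch: evaluating the labeling energy of Eq. (1) on a 4-connected pixel grid.
#include <vector>
#include <functional>

struct Grid { int width, height; };

double labelingEnergy(const Grid& g,
                      const std::vector<int>& f,                        // label per pixel (row-major)
                      const std::function<double(int, int)>& dataCost,  // D_p(f_p): pixel index, label
                      const std::function<double(int, int, int, int)>& smoothCost) // V_pq(p, q, f_p, f_q)
{
    double E = 0.0;
    for (int y = 0; y < g.height; ++y) {
        for (int x = 0; x < g.width; ++x) {
            int p = y * g.width + x;
            E += dataCost(p, f[p]);                    // data term
            if (x + 1 < g.width) {                     // right neighbor
                int q = p + 1;
                E += smoothCost(p, q, f[p], f[q]);     // smooth term (0 when f_p == f_q, Potts-like)
            }
            if (y + 1 < g.height) {                    // bottom neighbor
                int q = p + g.width;
                E += smoothCost(p, q, f[p], f[q]);
            }
        }
    }
    return E;
}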

2.2 Labeling via graph cuts

We formulate the optimal seamline detection as an energy minimization problem and use graph cuts to find the solution between pixels, as shown in Fig. 2. The major steps of the proposed algorithm are summarized in Algorithm 1. For a composite image I from an input pair (I1, I2) with an overlap, the energy cost E(I) comprises the data energy term Edata(I) and the smooth energy term Esmooth(I). The data energy term represents all energy costs for individual pixels with one of the input source images. The smooth energy term represents all energy costs between adjacent pixels. The total energy cost E(I) is defined as:

E(I) = E_{data}(I) + E_{smooth}(I),   (2)

where the data energy term E_{data}(I) is defined as:

E_{data}(I) = \sum_{x \in I} ( D^1_l(x) + D^2_l(x) ),   (3)

where D^1_l(x) and D^2_l(x) represent the costs of assigning the label of the pixel x to I1 and I2, respectively. Given a pixel point x in the composite image I, the data cost is calculated separately in the following two cases:


Fig. 2 An illustrative example of the optimal seamline detection method via graph cuts. The thickness of the lines between adjacent pixels represents the value of the energy cost, and the “cut” denotes the minimum cut, i.e., the optimal seamline

C1: x ∈ Ok: D^1_l(x) = P1(Ok) × T and D^2_l(x) = P2(Ok) × T, where T is the penalty coefficient (T = 100 was used in this paper).

C2: x ∉ O: D^1_l(x) = 0 if x ∈ I1, otherwise D^1_l(x) = ∞; similarly for D^2_l(x).

where O = {Ok}, k = 1..K, represents the set of dynamic objects, and P1(Ok) and P2(Ok) are the probabilities of the dynamic object Ok belonging to I1 and I2, respectively. A detailed description is given in Sect. 4.

The smooth energy term E_{smooth}(I) represents all energy costs of adjacent pixel pairs, and is defined as:

E_{smooth}(I) = \sum_{(x,y) \in N(I)} \sigma(x, y) \cdot E_{smooth}(x, y),   (4)

where N(I) denotes the set of all adjacent pixel pairs in I, and the coefficient σ(x, y) = 0 if the labels of the pixels x and y are the same; otherwise, σ(x, y) = 1. E_{smooth}(x, y) represents the smooth energy between the two adjacent pixels x and y, which is defined as E_{smooth}(x, y) = C(x) + C(y), where C(x) and C(y) are the energy costs of the pixels x and y, respectively, and will be defined in Sect. 3.
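As a concrete illustration of the data costs in the C1/C2 cases of Eq. (3) and the smooth cost of Eq. (4), the sketch below (not the authors' code) fills both terms for one pixel and one adjacent pair. It assumes a precomputed per-pixel cost C(x) from Sect. 3, a dynamic-object index map (-1 for background, k for object Ok), and per-object probabilities P1[k] and P2[k] from Sect. 4; all of these names are hypothetical.

// Sketch: data costs of Eq. (3) (cases C1/C2) and smooth cost of Eq. (4).
#include <vector>
#include <limits>

const double INF = std::numeric_limits<double>::infinity(); // in practice a large finite constant
const double T_PENALTY = 100.0;                              // penalty coefficient T from the paper

struct DataCost { double toI1, toI2; };                      // D^1_l(x), D^2_l(x)

DataCost dataCostAt(bool insideI1, bool insideI2, int objId,
                    const std::vector<double>& P1, const std::vector<double>& P2)
{
    DataCost d;
    if (objId >= 0) {                        // case C1: x lies on dynamic object O_k
        d.toI1 = P1[objId] * T_PENALTY;      // soft penalty instead of a hard constraint
        d.toI2 = P2[objId] * T_PENALTY;
    } else {                                 // case C2: outside all dynamic objects
        d.toI1 = insideI1 ? 0.0 : INF;       // forbid labels whose image does not cover x
        d.toI2 = insideI2 ? 0.0 : INF;
    }
    return d;
}

// Smooth cost between adjacent pixels x and y (Eq. 4): zero if they take the same
// label, otherwise the sum of their per-pixel energy costs C(x) + C(y) from Eq. (5).
double smoothCost(int labelX, int labelY, double Cx, double Cy)
{
    return (labelX == labelY) ? 0.0 : (Cx + Cy);
}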

3 Energy cost definition

The energy cost C(x) of the pixel point x = (x, y)^T in I comprises three terms: the intensity difference term C_c(x), the gradient difference term C_g(x), and the texture complexity term C_t(x), and is defined as:

C(x) = ( C_c(x) + C_g(x) ) \times C_t(x).   (5)

3.1 Intensity and gradient difference

The intensity difference of the pixel x between the two images is computed in the grayscale space and is defined as:

C_c(x) = |I^g_1(x) - I^g_2(x)|,   (6)

where I^g_1(x) and I^g_2(x) denote the intensity values of x in I1 and I2, respectively.

The gradient magnitudes of each pixel in the horizontal and vertical directions are calculated via the Sobel operator in the grayscale space. The gradient difference cost term C_g(x) of the pixel point x is defined as:

C_g(x) = |G^x_1(x) - G^x_2(x)| + |G^y_1(x) - G^y_2(x)|,   (7)

where G^x_1(x) and G^y_1(x) denote the horizontal and vertical gradient magnitudes of x in I1, respectively, with the same meanings for G^x_2(x) and G^y_2(x) in I2.
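A minimal sketch of Eqs. (6)–(7) follows (not the authors' code; it assumes row-major grayscale images of equal size with values in [0, 255]). It computes C_c(x) + C_g(x) for every interior pixel using hand-rolled 3 × 3 Sobel kernels; multiplying the result by the texture complexity term of Sect. 3.2 then yields the final cost C(x) of Eq. (5).

// Sketch: per-pixel intensity and gradient difference costs of Eqs. (6)-(7).
#include <vector>
#include <cmath>

struct Gray {
    int w, h;
    std::vector<float> px;                                   // row-major grayscale image
    float at(int x, int y) const { return px[y * w + x]; }
};

static void sobel(const Gray& I, int x, int y, float& gx, float& gy)
{
    // Standard 3x3 Sobel responses in the horizontal (gx) and vertical (gy) directions.
    gx = (I.at(x + 1, y - 1) + 2 * I.at(x + 1, y) + I.at(x + 1, y + 1))
       - (I.at(x - 1, y - 1) + 2 * I.at(x - 1, y) + I.at(x - 1, y + 1));
    gy = (I.at(x - 1, y + 1) + 2 * I.at(x, y + 1) + I.at(x + 1, y + 1))
       - (I.at(x - 1, y - 1) + 2 * I.at(x, y - 1) + I.at(x + 1, y - 1));
}

// Returns Cc(x) + Cg(x) for every interior pixel; border pixels are left at zero.
std::vector<float> intensityGradientCost(const Gray& I1, const Gray& I2)
{
    std::vector<float> cost(I1.w * I1.h, 0.f);
    for (int y = 1; y < I1.h - 1; ++y)
        for (int x = 1; x < I1.w - 1; ++x) {
            float cc = std::fabs(I1.at(x, y) - I2.at(x, y));           // Eq. (6)
            float gx1, gy1, gx2, gy2;
            sobel(I1, x, y, gx1, gy1);
            sobel(I2, x, y, gx2, gy2);
            float cg = std::fabs(gx1 - gx2) + std::fabs(gy1 - gy2);    // Eq. (7)
            cost[y * I1.w + x] = cc + cg;
        }
    return cost;
}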

3.2 Texture complexity

Based on the two energy costs defined above in terms of intensity and gradient differences, in theory, the optimal seamlines detected via graph cuts can be located in regions with high similarity and avoid passing through obvious foreground objects. As is well known, some specific regions, such as coarse floors and ceilings, and richly textured trees and flowers, are also suitable to be crossed by the seamline, because the image differences in these regions are not easy to observe. However, we find that the seamlines detected based on the above two differences sometimes fail to cross those regions. The major reason is that those regions can also have large intensity and gradient differences in some cases, yet we observe that the texture complexity of those regions is low. To solve this problem, we propose a new texture


Fig. 3 Some typical regions (top) existing in indoor/outdoor scenes and their corresponding normalized texture complexity maps (bottom), where lighter regions indicate higher texture complexities. a Indoor scenes. b Outdoor scenes

complexity measurement to distinguish those regions, which is inspired by the histogram of oriented gradients (HOG) feature descriptor. HOG has been widely and successfully used in computer vision and image processing for the purpose of object detection, such as human detection [32].

The texture complexity Ω(x) of a pixel x in the overlap region is calculated as follows. First, all gradient orientations are computed and converted into the range [0, 2π]. Then, for each pixel point x in an image I, we create a k × k (k = 11 was used in this paper) window region N_{k×k}(x) centered at this pixel and compute the histogram of oriented gradients H(x) comprised of B (B = 12 was used in this paper) bins over this window region. Based on this histogram, the texture complexity at x is defined as:

\Omega(x) = 1 - \frac{\sum_{b=1}^{B} \min(H_b(x), \bar{H}(x))}{\sum_{b=1}^{B} H_b(x)},   (8)

where H_b(x) denotes the frequency of the bth bin in H(x) and \bar{H}(x) represents the mean frequency over all bins, i.e., \bar{H}(x) = \frac{1}{B} \sum_{b=1}^{B} H_b(x). Obviously, if x lies in a local region with poor texture, like walls, or with strongly repetitive patterns, like coarse floors or trees, the frequencies of the different bins in the histogram are approximately equal, so Ω(x) is close to 0. In contrast, Ω(x) is close to 1 if the frequencies of a few bins are high and the remaining ones are low.

Based on the definition of the proposed texture complexity, the texture complexity term C_t(x) for the pixel point x is defined as:

C_t(x) = \Omega_1(x) + \Omega_2(x),   (9)

where Ω1(x) and Ω2(x) represent the texture complexities of x in I1 and I2, respectively. In this way, we can apply C_t(x) to constrain the intensity and gradient differences in regions with poor texture or strongly repetitive patterns without affecting other richly textured regions.


Fig. 4 a, b Two geometrically aligned input source images with a dynamic object (a pedestrian); c the color distance map in CIELAB; d the HOG distance map; e the final distance map; f the dynamic objects binary map after mathematical morphology; g, h the corresponding rectangle segmentation results in the second image of the Mean Shift algorithm for the regions masked by red and yellow rectangles in f; i the dynamic objects binary map refined by the Mean Shift algorithm; j, k the finally mosaicked images with optimal seamlines without/with dynamic object compensation (color figure online)

Figure 3 shows the normalized texture complexity maps of several typical regions existing in indoor and outdoor scenes.
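The following sketch (not the authors' code) illustrates Eqs. (8)–(9) under two stated assumptions: per-pixel gradient orientations are already available in [0, 2π), and the window histogram simply counts orientations rather than weighting them by gradient magnitude as a full HOG descriptor would.

// Sketch: texture complexity of Eq. (8) and the combined term of Eq. (9).
#include <vector>
#include <algorithm>

struct OrientImg {
    int w, h;
    std::vector<float> ang;                                   // orientation in [0, 2*pi) per pixel
    float at(int x, int y) const { return ang[y * w + x]; }
};

double textureComplexity(const OrientImg& I, int x, int y, int k = 11, int B = 12)
{
    const double kTwoPi = 6.283185307179586;
    std::vector<double> H(B, 0.0);
    for (int dy = -k / 2; dy <= k / 2; ++dy)
        for (int dx = -k / 2; dx <= k / 2; ++dx) {
            int xx = std::min(std::max(x + dx, 0), I.w - 1);  // clamp at image borders
            int yy = std::min(std::max(y + dy, 0), I.h - 1);
            int b = std::min(B - 1, (int)(I.at(xx, yy) / kTwoPi * B));
            H[b] += 1.0;                                      // unweighted orientation histogram
        }
    double sum = 0.0;
    for (double h : H) sum += h;
    double mean = sum / B;                                    // \bar{H}(x)
    double clipped = 0.0;
    for (double h : H) clipped += std::min(h, mean);          // sum of min(H_b, \bar{H})
    return 1.0 - clipped / sum;                               // Eq. (8): near 0 for flat histograms
}

// Eq. (9): Ct(x) = Omega_1(x) + Omega_2(x), evaluated on the two aligned images.
double textureTerm(const OrientImg& I1, const OrientImg& I2, int x, int y)
{
    return textureComplexity(I1, x, y) + textureComplexity(I2, x, y);
}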

4 Dynamic object detection

To compensate the image regions covered by dynamic objects in one image with the background regions in another image, we should first detect the dynamic objects based on the image difference in the overlap region and then determine, in probability, which image they come from. The image difference D(x) of the pixel point x between two images I1 and I2 is computed by combining the color distance D_c(x) and the HOG distance D_h(x). The color distance is computed in the CIELAB color space. The CIELAB color space has also been used to generate superpixels [1]. This color space includes all perceivable colors, which means that its gamut exceeds those

of the RGB and CMYK color models. In addition, this color space is widely considered to be perceptually uniform for small color distances. The color distance D_c(x) is defined as:

D_c(x) = (L_1(x) - L_2(x))^2 + (A_1(x) - A_2(x))^2 + (B_1(x) - B_2(x))^2,   (10)

where L_1(x), A_1(x), and B_1(x) denote the values of the L, A, and B channels of x in I1, respectively, with the same meanings for L_2(x), A_2(x), and B_2(x). However, it is very difficult to clearly detect the dynamic objects from the image distance map based only on the above defined color distances, as shown in Fig. 4c, because there are many noisy and erroneous regions caused by the geometric misalignments and the photometric inconsistencies. To overcome this shortcoming, we further modify the point distance based on the corresponding HOG distance. The HOG of each pixel in I1 or I2 has already been calculated for computing the texture complexity

123

Page 8: Optimal seamline detection in dynamic scenes via graph ... · objects and produce high-quality image mosaics for most scenes. However, these methods cannot remove dynamic objects

826 L. Li et al.

defined in Sect. 3.2. Let H_1 = {H^b_1}, b = 1..B, and H_2 = {H^b_2}, b = 1..B, be the HOGs of the pixel point x in I1 and I2, respectively. The distance between H_1 and H_2 is computed based on the Bhattacharyya distance as follows:

D_h(x) = \sqrt{ 1 - \frac{1}{\sqrt{\bar{H}_1(x) \bar{H}_2(x) B^2}} \sum_{b=1}^{B} \sqrt{H^b_1(x) H^b_2(x)} },   (11)

where \bar{H}_1(x) = \frac{1}{B} \sum_{b=1}^{B} H^b_1(x) and \bar{H}_2(x) = \frac{1}{B} \sum_{b=1}^{B} H^b_2(x). In this way, we obtain a HOG distance map, as shown in Fig. 4d. By combining D_c(x) with D_h(x), the final distance D(x) is defined as:

D(x) = -\frac{1}{\ln(D_h(x))} \times D_c(x).   (12)
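A compact sketch of Eqs. (10)–(12) is shown below (not the authors' code). It assumes the CIELAB conversion and the per-pixel HOG descriptors are computed elsewhere, and it clamps D_h(x) away from 0 and 1 so that the logarithm in Eq. (12) stays finite, a numerical safeguard not discussed in the paper.

// Sketch: combined per-pixel distance of Eq. (12) from Eqs. (10) and (11).
#include <vector>
#include <cmath>
#include <algorithm>

struct LabPixel { double L, A, B; };

double colorDistance(const LabPixel& p1, const LabPixel& p2)                    // Eq. (10)
{
    double dL = p1.L - p2.L, dA = p1.A - p2.A, dB = p1.B - p2.B;
    return dL * dL + dA * dA + dB * dB;
}

double hogDistance(const std::vector<double>& H1, const std::vector<double>& H2) // Eq. (11)
{
    const int B = (int)H1.size();
    double m1 = 0.0, m2 = 0.0, bc = 0.0;
    for (int b = 0; b < B; ++b) { m1 += H1[b]; m2 += H2[b]; bc += std::sqrt(H1[b] * H2[b]); }
    m1 /= B; m2 /= B;                                                            // \bar{H}_1, \bar{H}_2
    double denom = std::sqrt(m1 * m2) * B;                                       // sqrt(m1 * m2 * B^2)
    if (denom <= 0.0) return 1.0;                                                // empty histogram: maximal distance
    double s = 1.0 - bc / denom;
    return std::sqrt(std::max(0.0, s));                                          // clamp tiny negatives
}

double combinedDistance(const LabPixel& p1, const LabPixel& p2,
                        const std::vector<double>& H1, const std::vector<double>& H2)
{
    double dh = hogDistance(H1, H2);
    dh = std::min(std::max(dh, 1e-6), 1.0 - 1e-6);                               // keep ln(dh) finite
    return -1.0 / std::log(dh) * colorDistance(p1, p2);                          // Eq. (12)
}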

Based on the final distance map, as shown in Fig. 4e, we can simply obtain a binary map with a distance threshold. We further filter out small noise regions via mathematical morphology, as shown in Fig. 4f. Let O = {Ok}, k = 1..K, be all separated dynamic objects, where K is the number of dynamic objects. For each dynamic object Ok, to decide which image it comes from, we first segment the corresponding rectangular image regions of Ok in I1 and I2 by the Mean Shift algorithm [7]. We then find the contour of Ok, as shown in Fig. 4i. After that, we match the contours of the dynamic objects with the boundaries of the segments generated in I1 and I2, respectively. Furthermore, we refine each dynamic object Ok based on the segments of the image with the higher contour matching value. If more than half of the pixels of a segment belong to the foreground, we label the whole segment as part of the dynamic object; otherwise, we regard it as background. The contour matching values M1(Ok) and M2(Ok)

of the refined dynamic object Ok with respect to I1 and I2 are first re-computed. Then, we compute the probabilities P1(Ok) and P2(Ok) of Ok belonging to I1 and I2 as P1(Ok) = M1(Ok) / (M1(Ok) + M2(Ok)) and P2(Ok) = M2(Ok) / (M1(Ok) + M2(Ok)), respectively, with P1(Ok) + P2(Ok) = 1. For example, in Fig. 4f, there are two dynamic regions, masked by red and yellow rectangles, respectively. We observe that both come from the second image, because their contour matching values are higher for the second image than for the first one. So, we further refine them by the segments generated in the second image, as shown in Fig. 4g, h. In this way, we obtain the refined dynamic objects map shown in Fig. 4i.
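The segment-based refinement and the source probabilities described above reduce to a majority vote per segment followed by a simple normalization of the two contour matching scores. A hedged sketch follows (not the authors' code); the binary mask, the Mean Shift segment-label map, and the matching scores M1 and M2 are assumed to be computed elsewhere.

// Sketch: segment-based refinement of a dynamic region and the source probabilities
// P1(O_k), P2(O_k) used in the C1 data costs of Eq. (3).
#include <vector>
#include <map>
#include <utility>
#include <cstddef>

// Relabel every segment as foreground iff more than half of its pixels are
// currently marked dynamic, as described above.
std::vector<char> refineByMajority(const std::vector<char>& dynMask,
                                   const std::vector<int>& segLabel)
{
    std::map<int, std::pair<int, int>> count;            // segment -> (dynamic pixels, total pixels)
    for (std::size_t i = 0; i < dynMask.size(); ++i) {
        std::pair<int, int>& c = count[segLabel[i]];
        c.first += dynMask[i] ? 1 : 0;
        c.second += 1;
    }
    std::vector<char> refined(dynMask.size());
    for (std::size_t i = 0; i < dynMask.size(); ++i) {
        const std::pair<int, int>& c = count[segLabel[i]];
        refined[i] = (2 * c.first > c.second) ? 1 : 0;   // majority vote per segment
    }
    return refined;
}

// P1(O_k) = M1 / (M1 + M2), P2(O_k) = M2 / (M1 + M2).
std::pair<double, double> sourceProbabilities(double M1, double M2)
{
    double s = M1 + M2;
    if (s <= 0.0) return std::make_pair(0.5, 0.5);       // no evidence: stay neutral
    return std::make_pair(M1 / s, M2 / s);
}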

5 Experimental results and analysis

We tested our proposed method on images captured from several dynamic scenes with moving objects. The images

were captured by rotating a 24-megapixel Nikon D7100 camera with a wide-angle lens on a tripod platform at different times. The faces in some pictures presented in this paper have been blurred for privacy protection. Those images were first aligned and warped into a common coordinate system via the popular open-source library Panorama Tools,1 which also serves as the underlying core engine for many image stitching tools, such as PTGui 2

and Hugin.3 Here, we only used the image alignment and image projection modules of Panorama Tools to generate the aligned images; the color and other characteristics of the input fish-eye images are not changed. Of course, there always exist geometric misalignments between those images to different extents. We utilized those geometrically aligned images to test our proposed optimal seamline detection algorithm despite the presence of dynamic objects and geometric misalignments. In addition, we applied the structural similarity (SSIM) index [26] to evaluate the quality of the detected seamlines. Our algorithm was implemented in C++ under Windows and tested on a computer with an Intel Core i7-4770 at 3.4 GHz.

5.1 Quality assessment

After image mosaicking with the found optimal seamline, we need to evaluate the quality of the mosaicked image. In the last several decades, a great deal of effort has gone into the development of image quality assessment methods that take advantage of known characteristics of the human visual system (HVS). Wang et al. [26] proposed a notable quality metric named the structural similarity (SSIM) index. In the last several years, the SSIM index has been widely used to evaluate the quality of color correction [28] and image mosaicking methods [11,22]. In this paper, we also applied the SSIM index to evaluate the quality of image mosaicking along the detected seamline. Let S be the found optimal seamline between the image pair I1 and I2, and let I be the final mosaicked image with the seamline S. The quality measurement SSIM(I) used in this paper is defined as:

SSIM(I) = 2 \times \frac{1}{N} \sum_{i=1}^{N} \min_{k=1,2} SSIM2_k(A_i, B_i) - 1,   (13)

where SSIM2_k(A_i, B_i) denotes the normalized SSIM index between two local window blocks A_i and B_i, centered at the ith point of S, from the two images I_k and I, and N is the number of points on the seamline S. Since the SSIM index SSIM_k(A_i, B_i)

1 http://panotools.sourceforge.net/.
2 http://www.ptgui.com/.
3 http://hugin.sourceforge.net/.


Fig. 5 The seamline detection results with different combinations of intensity (A), gradient (G), and texture complexity (T): a A (6.4840 s); b G (8.3790 s); c T (7.0290 s); d A + G (8.3280 s); e A + T (6.5910 s); f G + T (7.3190 s); g A + G + T (7.3710 s). The computational times consist of all elapsed time in the energy cost computation and the graph cuts optimization

belongs to [−1, +1], we modify the classical SSIM index to SSIM2_k(A_i, B_i) = (SSIM_k(A_i, B_i) + 1) / 2. In addition, the SSIM index is computed independently in each channel of the image in the RGB color space, and the final SSIM index SSIM_k(A_i, B_i) is computed by averaging them. The SSIM index itself is defined as a combination of luminance, contrast, and structure components:

SSIM(A, B) = [l(A, B)]^{\alpha} \cdot [c(A, B)]^{\beta} \cdot [s(A, B)]^{\gamma},   (14)

where l(A, B) = \frac{2 \mu_a \mu_b + C_1}{\mu_a^2 + \mu_b^2 + C_1}, c(A, B) = \frac{2 \sigma_a \sigma_b + C_2}{\sigma_a^2 + \sigma_b^2 + C_2}, and s(A, B) = \frac{\sigma_{ab} + C_3}{\sigma_a \sigma_b + C_3}; \mu_a and \mu_b are the mean luminance values of the local window blocks A and B, respectively, \sigma_a and \sigma_b are the standard deviations of the blocks A and B, respectively, and \sigma_{ab} is the covariance between the blocks A and B. The small constants C_1, C_2, and C_3 are included to avoid division by zero, and \alpha, \beta, and \gamma are three parameters used to adjust the relative importance of the three


components. According to the method presented by Wang et al. [26], we used the default settings: C_1 = (0.01 × L)^2, C_2 = (0.03 × L)^2, C_3 = C_2 / 2, and L = 255 for images with the intensity range [0, 255], with α = β = γ = 1. A higher value of SSIM(I) indicates a higher quality of the detected seamline.
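The sketch below (not the authors' code) shows a single-channel version of Eq. (14) on two local blocks, with C_3 = C_2/2 folded in as usual, and the seamline score of Eq. (13) computed from blocks sampled along the seamline; block extraction and per-channel averaging are assumed to happen elsewhere.

// Sketch: SSIM of Eq. (14) on one channel and the seamline quality score of Eq. (13).
#include <vector>
#include <algorithm>
#include <cstddef>

double ssimBlock(const std::vector<double>& a, const std::vector<double>& b)  // Eq. (14), alpha = beta = gamma = 1
{
    const double L = 255.0, C1 = (0.01 * L) * (0.01 * L), C2 = (0.03 * L) * (0.03 * L);
    const std::size_t n = a.size();                       // blocks assumed non-empty and equally sized
    double ma = 0, mb = 0;
    for (std::size_t i = 0; i < n; ++i) { ma += a[i]; mb += b[i]; }
    ma /= n; mb /= n;
    double va = 0, vb = 0, cov = 0;
    for (std::size_t i = 0; i < n; ++i) {
        va += (a[i] - ma) * (a[i] - ma);
        vb += (b[i] - mb) * (b[i] - mb);
        cov += (a[i] - ma) * (b[i] - mb);
    }
    va /= n; vb /= n; cov /= n;
    // With C3 = C2/2 the contrast and structure terms collapse into the usual form.
    return ((2 * ma * mb + C1) * (2 * cov + C2)) / ((ma * ma + mb * mb + C1) * (va + vb + C2));
}

// Eq. (13): average over the seamline points of the worse (min over k = 1, 2) normalized
// SSIM between the block from I_k and the block from the mosaic, mapped back to [-1, 1].
double seamlineQuality(const std::vector<std::vector<double>>& blocksI1,
                       const std::vector<std::vector<double>>& blocksI2,
                       const std::vector<std::vector<double>>& blocksMosaic)
{
    const std::size_t N = blocksMosaic.size();
    double acc = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        double s1 = (ssimBlock(blocksI1[i], blocksMosaic[i]) + 1.0) / 2.0;     // SSIM2_1(A_i, B_i)
        double s2 = (ssimBlock(blocksI2[i], blocksMosaic[i]) + 1.0) / 2.0;     // SSIM2_2(A_i, B_i)
        acc += std::min(s1, s2);
    }
    return 2.0 * acc / N - 1.0;
}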

5.2 Results

To illustrate whether the energy criteria defined in Sect. 3 are effective, we compared the seamline detection results obtained with different combinations of intensity (A), gradient (G), and texture complexity (T), as shown in Fig. 5. There are seven combinations of features. We observed from Fig. 5 that the seamlines detected using A, G and A + G, without considering texture complexity (T), cannot avoid crossing the gray car. However, when the texture complexity (T) was considered, both G + T and A + G + T generated high-quality seamlines which avoided crossing the obvious objects and passed through the regions with high similarity, and only A + T failed. This is mainly because the intensity of the gray car is similar to the intensity of the road. However, we also found that the seamline detected using T alone cannot avoid crossing the building, as happened with A, but it can avoid crossing the gray car, as shown in Fig. 5c. We also observed that the seamlines produced by G + T and A + G + T are similar, but according to the previous works by Mills and Dudek [20] and Zeng et al. [29], the combination of intensity and gradient information is more robust. In terms of computational cost, when the texture complexity was considered, the times of G + T and A + G + T are 7.3190 and 7.3710 s, respectively, both less than the times of G and A + G, which are 8.3790 and 8.3280 s. The times of A + T and A are 6.5910 and 6.4840 s, respectively, which are almost the same, and both are less than the time of T, 7.0290 s. Although the energy computation time increases with the use of texture complexity, the optimization time of the seamline detection decreases. This is mainly because the optimal solution can be found more easily via graph cuts when the texture complexity is used.

Two geometrically aligned images captured in a shopping center with multiple dynamic objects, as shown in Fig. 6a, b, were used to test our algorithm. In this experiment, we compared our proposed algorithm with the popular commercial software PTGui and the Zeng et al.'s approach [29]. From Fig. 6a, b, we observed that there are many dynamic objects in the overlap region, including a girl wearing a hat and many people walking near the escalator. The problem of ghosts caused by those moving objects arose in the composite image generated by PTGui, as shown in Fig. 6c. This problem can be solved well via the optimal seamline detection algorithms, as shown in Fig. 6d, e.

However, the moving objects still appear in those final image mosaics; they disappear when the proposed dynamic object compensation is applied, as shown in Fig. 6f. From Fig. 6e, f, we also observed that the seamlines always pass through the regions with high image similarity and simultaneously avoid crossing dynamic objects; this is mainly due to the constraint of our proposed energy criteria defined in Sect. 3. Another example is shown in Fig. 7. The main dynamic object in this indoor scene is a walking man, and results similar to those in Fig. 6 can be observed. In addition, we present the results of the quantitative assessment with the SSIM index for the seamlines detected by different methods in Table 1. We found that the qualities of the seamlines detected by our proposed method without dynamic object compensation and by Zeng et al.'s approach are almost the same. In Fig. 7, the seamline detected by our proposed method without dynamic object compensation is the best, while in Fig. 6, the seamline detected by Zeng et al.'s approach is the best; the score of our seamline detected without dynamic object compensation is only slightly lower than that of Zeng et al.'s approach. With the use of dynamic object compensation, the qualities drop a little because the seamlines need to cross some regions with low image similarity to compensate the dynamic regions. From the visual comparison, we demonstrated the superiority of our proposed algorithm, which can generate image mosaics with relatively clean backgrounds and free from artifacts.

When an object moves only locally (i.e., only a small part of the object moves) or the texture of the dynamic objects is very similar to the background, these dynamic objects cannot be fully and accurately detected. If we applied a hard constraint with the data energy defined as D^1_l(x) = ∞ and D^2_l(x) = 0 if P1(x) > P2(x), otherwise D^1_l(x) = 0 and D^2_l(x) = ∞, the detected seamline might cross the dynamic objects, as shown in Fig. 8, from which we observe that only the head moved locally in this scene and was identified as a dynamic object region. However, with our proposed soft constraint defined in the C1 case of Eq. (3), the detected seamline successfully avoids crossing this dynamic object, as shown in Fig. 8d.

Figure 9 shows a representative example with unsuccessful compensation. From Fig. 9a, b, we find that there are two moving objects, denoted as A and B, in this scene. In Fig. 9c, we used the geometric center line to replace the seamline, and we can see that one moving object has been crossed. This problem can be solved well by detecting the optimal seamline, as shown in Fig. 9d–f. However, we observed that our algorithm failed to compensate the regions of the two moving objects, as shown in Fig. 9f. For the dynamic object A, the reason is that this man moved slowly and locally, so the background region cannot be compensated completely. For the dynamic object B, our algorithm should in theory compensate the corresponding dynamic region.


Fig. 6 A visual comparison of different algorithms stitching two input images captured in a shopping center: a, b two geometrically aligned input images; c, d the image mosaics generated by PTGui and the Zeng et al.'s approach, respectively; e, f the image mosaics generated by our proposed algorithm without and with dynamic object compensation, respectively. a Left image. b Right image. c PTGui. d Zeng et al.'s approach. e Without compensation. f With compensation


Fig. 7 A visual comparison of different algorithms stitching two input images captured in a small office: a, b two geometrically aligned input images; c, d the image mosaics generated by PTGui and the Zeng et al.'s approach, respectively; e, f the image mosaics generated by our algorithm without and with dynamic object compensation, respectively. a Left image. b Right image. c PTGui. d Zeng et al.'s approach. e Without compensation. f With compensation


Table 1 The quantitative quality assessment with the SSIM index of seamlines detected by our proposed method and the Zeng et al.'s approach

          Zeng et al.'s approach   Without compensation   With compensation
Figure 6  0.862721                 0.859037               0.80416
Figure 7  0.8737                   0.889709               0.860047

Fig. 8 An example of mosaicking two geometrically aligned images in a, b captured from a dynamic scene with a local motion of a human head, based on the seamlines detected with the hard constraint in c and the soft one in d, respectively. a Left image. b Right image. c Mosaicked image with hard constraint. d Mosaicked image with soft constraint

However, because the color of the background is similar to that of this dynamic object, the dynamic region is very difficult to detect, and our algorithm could not detect this dynamic object. Therefore, our algorithm failed to compensate the dynamic region covered by B. Although our algorithm failed to compensate the dynamic regions with the background information, it still avoids passing through all the dynamic objects, as the Zeng et al.'s approach does.

Figure 10 shows the 360°-view panoramas of an outdoor scene, which were generated by different algorithms. In many cases, the moving object is only allowed to appear once at most; otherwise, there will be artifacts in the mosaicked images. To mosaic a panoramic image, we first captured five original images with the size of 3200 × 4800 and then warped them into the same coordinate system with an image size of 9512 × 4756. From the original images, we can see that there is a moving man in this scene. The ghost caused by this moving man appears in the panorama generated by PTGui, and at the same time the man appears five times in this scene, as shown in Fig. 10b. In Fig. 10c, d, which are generated by the


Fig. 9 An example with unsuccessful compensation: a, b two geometrically aligned images with two moving objects A and B; c the image mosaicked based on the geometric center line; d the composite image generated by the Zeng et al.'s approach; e, f the image mosaics generated by our algorithm without and with dynamic object compensation, respectively. a Left image. b Right image. c Center line. d Zeng et al.'s approach. e Without compensation. f With compensation


Fig. 10 The 360°-view panorama of an outdoor dynamic scene with a moving man: a 5 original images (3200 × 4800); b the image mosaicked by PTGui; c the image mosaicked by the Zeng et al.'s approach; d the image mosaicked by our proposed method without dynamic object compensation; e the image mosaicked by our proposed method with dynamic object compensation. The computation times of c–e are 31.4820, 48.213 and 57.129 s, respectively

Table 2 The quantitative quality assessment with the SSIM index of seamlines detected by our proposed method and the Zeng et al.'s approach

           Zeng et al.'s approach   Without compensation   With compensation
Figure 10  0.590076                 0.685662               0.640019
Figure 11  0.674495                 0.695654               0.646024

Zeng et al.'s approach and our algorithm without dynamic object compensation (i.e., the data energy is always calculated in the second case C2), the ghost disappears, but the man still appears four times in this same scene. This problem can be solved well by our proposed dynamic object compensation strategy. As shown in Fig. 10e, this man only appears once when dynamic object compensation is applied. With the use of dynamic object compensation, the computation time increases to 57.129 s from the original 48.213 s without compensation. The additional time was mainly spent on detecting the dynamic objects and determining which image they come from. In addition, we found that the computation time of the result shown in Fig. 10c is only 31.4820 s, because that algorithm applies the dynamic programming method to find the seamline with the minimal cost, which is more efficient than graph cuts. But we found that the quality of the seamlines detected by that algorithm is lower than that of our proposed algorithm. In particular, the second seamline (from left


Fig. 11 The 360°-view panorama of an indoor dynamic scene with a moving woman: a 5 original images (3200 × 4800); b the image mosaicked based on the geometric center lines; c the image mosaicked by PTGui; d the image mosaicked by Enblend; e the image mosaicked by the Zeng et al.'s approach; f the image mosaicked by our proposed method without dynamic object compensation; g the image mosaicked by our proposed method with dynamic object compensation. The computation times of e–g are 31.5330, 41.910 and 55.385 s, respectively

to right) passes through many edges of the building, whereas the seamline detected by our proposed algorithm avoids crossing them. More importantly, the Zeng et al.'s approach cannot prevent this moving man from appearing multiple times in the final image mosaic. The results of quantitative assessment with the SSIM index for the seamlines are presented in the second row of Table 2. From Fig. 10, we can see that there are 5 seamlines on each stitched image.


For the group of seamlines in each stitched image, we computed the quality value of each seamline independently and then obtained the final quality value by averaging them. We found that our proposed method outperforms the Zeng et al.'s method, even with the use of dynamic object compensation.

The 360°-view panorama of an indoor scene is presented in Fig. 11, and conclusions similar to those for the outdoor scene in Fig. 10 can be drawn. In addition, to show that our algorithm can eliminate the artifacts caused by geometric misalignments and moving objects, we present the image mosaic generated based on the geometric center lines of the overlap regions in Fig. 11b. We also show the image mosaic generated by the open-source software Enblend 4 in Fig. 11d, which also applies graph cuts to find the optimal seamlines for image mosaicking. From Fig. 11b, d, and especially from the local regions highlighted by red rectangles, we observe that there are many geometric misalignments between those input aligned images to different extents. Our proposed algorithm can eliminate those artifacts caused by geometric misalignments by detecting the optimal seamlines, as shown in Fig. 11f, g. The results of the quantitative assessment with the SSIM index for this experiment are presented in the third row of Table 2, and we found that the quality of the seamlines detected by our proposed method without dynamic object compensation is the best. With the use of dynamic object compensation, the seamlines need to cross some regions with low image similarity to compensate the dynamic objects, so the quality decreases, as we expected.

From the experiments conducted above, we observed that our proposed algorithm can generate high-quality panoramic images in dynamic scenes. Simultaneously, our algorithm can compensate the regions occupied by dynamic objects based on our detected optimal seamlines. However, we also found that our algorithm is somewhat more time-consuming than the Zeng et al.'s approach. Our algorithm can be used to produce high-quality panoramic maps for indoor and outdoor scenes.

6 Conclusion

This paper proposed a novel optimal seamline detection algorithm for image mosaicking via graph cuts despite the presence of dynamic objects and geometric misalignments. One contribution of this paper is that we applied a new energy cost function combining the intensity difference, the gradient difference, and the texture complexity in the graph cuts energy minimization framework. Thus, our algorithm can ensure that the seamline optimally passes through regions with high image similarity and avoids crossing obvious objects. Another contribution is that we proposed a novel

4 http://enblend.sourceforge.net/.

approach to generate composite images with clean backgrounds. It first detects dynamic objects based on the image difference combining both the color distance in the CIELAB color space and the texture distance in HOG. Then, we determine in probability which image those dynamic objects come from, based on segments and contour matching. Finally, we integrate all of these into the seamline detection optimization process. Experimental results have shown that our algorithm can produce high-quality image mosaics free from artifacts and with relatively clean backgrounds (i.e., with as few dynamic objects as possible).

Nevertheless, the proposed algorithm can be improved in the following ways in the future. First, superpixel segmentation can be introduced to greatly improve the optimization efficiency by decreasing the number of elements in graph cuts. Second, scene understanding or parsing can be applied in our algorithm; for example, floors and people can be detected to guide the seamline.

Acknowledgements This work was partially supported by the National Natural Science Foundation of China (Project No. 41571436), the Hubei Province Science and Technology Support Program, China (Project No. 2015BAA027), the National Natural Science Foundation of China under Grant 91438203, LIESMARS Special Research Funding, and the South Wisdom Valley Innovative Research Team Program.

References

1. Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2274–2282 (2012)

2. Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)

3. Boutellier, J.J., Bordallo-Lopez, M., Silvén, O., Tico, M., Vehviläinen, M.: Creating panoramas on mobile phones. In: Proceedings of SPIE Electronic Imaging, vol. 6498, issue 7 (2007)

4. Boykov, Y., Kolmogorov, V.: An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans. Pattern Anal. Mach. Intell. 26(9), 1124–1137 (2004)

5. Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001)

6. Brown, M., Lowe, D.G.: Automatic panoramic image stitching using invariant features. Int. J. Comput. Vis. 74(1), 59–73 (2007)

7. Comaniciu, D., Meer, P.: Mean shift: a robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 603–619 (2002)

8. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) (2005)

9. Davis, J.: Mosaics of scenes with moving objects. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) (1998)

10. Dijkstra, E.W.: A note on two problems in connexion with graphs. Numer. Math. 1(1), 269–271 (1959)

11. Dissanayake, V., Herath, S., Rasnayaka, S., Seneviratne, S., Vidanaarachchi, R., Gamage, C.: Quantitative and qualitative evaluation of performance and robustness of image stitching algorithms. In: International Conference on Digital Image Computing: Techniques and Applications (DICTA) (2015)

12. Hsieh, J.-W.: Fast stitching algorithm for moving object detection and mosaic construction. Image Vis. Comput. 22(4), 291–306 (2004)

13. Kass, M., Witkin, A., Terzopoulos, D.: Snakes: active contour models. Int. J. Comput. Vis. 1(4), 321–331 (1988)

14. Kim, D.-W., Hong, K.-S.: Practical background estimation for mosaic blending with patch-based Markov random fields. Pattern Recognit. 41(7), 2145–2155 (2008)

15. Kolmogorov, V., Zabih, R.: Computing visual correspondence with occlusions using graph cuts. In: IEEE International Conference on Computer Vision (ICCV) (2001)

16. Levin, A., Zomet, A., Peleg, S., Weiss, Y.: Seamless image stitching in the gradient domain. In: European Conference on Computer Vision (ECCV) (2004)

17. Li, L., Yao, J., Liu, Y., Yuan, W., Shi, S., Yuan, S.: Optimal seamline detection for orthoimage mosaicking by combining deep convolutional neural network and graph cuts. Remote Sens. 9(7), 701 (2017a)

18. Li, L., Yao, J., Xie, R., Xia, M., Zhang, W.: A unified framework for street-view panorama stitching. Sensors 17(1), 1 (2017b)

19. López, M.B., Hannuksela, J., Silvén, O., Vehviläinen, M.: Interactive multi-frame reconstruction for mobile devices. Multimed. Tools Appl. 69(1), 31–51 (2014)

20. Mills, A., Dudek, G.: Image stitching with dynamic elements. Image Vis. Comput. 27(10), 1593–1602 (2009)

21. Philip, S., Summa, B., Tierny, J., Bremer, P.-T., Pascucci, V.: Distributed seams for gigapixel panoramas. IEEE Trans. Vis. Comput. Graph. 21(3), 350–362 (2015)

22. Qureshi, H., Khan, M., Hafiz, R., Cho, Y., Cha, J.: Quantitative quality assessment of stitched panoramic images. IET Image Process. 6(9), 1348–1358 (2012)

23. Rother, C., Kolmogorov, V., Blake, A.: GrabCut: interactive foreground extraction using iterated graph cuts. In: ACM Transactions on Graphics (TOG) (Proceedings of SIGGRAPH 2004) (2004)

24. Tao, C., Sun, H., Yang, C., Tian, J.: Efficient image stitching in the presence of dynamic objects and structure misalignment. J. Signal Inf. Process. 2(3), 205 (2011)

25. Tao, M.W., Johnson, M.K., Paris, S.: Error-tolerant image compositing. Int. J. Comput. Vis. 103(2), 178–189 (2013)

26. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)

27. Xia, M., Yao, J., Xie, R., Li, L., Zhang, W.: Globally consistent alignment for planar mosaicking via topology analysis. Pattern Recognit. 66, 239–252 (2017)

28. Xu, W., Mulligan, J.: Performance evaluation of color correction approaches for automatic multi-view image and video stitching. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)

29. Zeng, L., Zhang, S., Zhang, J., Zhang, Y.: Dynamic image mosaic via SIFT and dynamic programming. Mach. Vis. Appl. 25(5), 1271–1282 (2014)

30. Zhi, Q., Cooperstock, J.R.: Toward dynamic image mosaic generation with robustness to parallax. IEEE Trans. Image Process. 21(1), 366–378 (2012)

31. Zhou, Z., Li, S., Wang, B.: Multi-scale weighted gradient-based fusion for multi-focus images. Inf. Fusion 20(1), 60–72 (2014)

32. Zhu, Q., Yeh, M.-C., Cheng, K.-T., Avidan, S.: Fast human detection using a cascade of histograms of oriented gradients. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) (2006)

Li Li received B.E. and M.E. degrees from the School of Remote Sensing and Information Engineering at Wuhan University, China, in 2013 and 2016, respectively. He is pursuing a Ph.D. degree at the School of Remote Sensing and Information Engineering, Wuhan University, China. He has published several international conference and journal papers and is the inventor of several patents. Currently, he mainly works on Image Mosaicking, LiDAR Data Processing, Robotics, Machine Learning, etc.

Jian Yao received a B.Sc. degree in Automation in 1997 from Xiamen University, China, an M.Sc. degree in Computer Science from Wuhan University, China, and a Ph.D. degree in Electronic Engineering in 2006 from The Chinese University of Hong Kong. From 2001 to 2002, he worked as a Research Assistant at the Shenzhen R&D Centre of City University of Hong Kong. From 2006 to 2008, he worked as a Postdoctoral Fellow in the Computer Vision Group of the IDIAP Research Institute, Martigny, Switzerland. From 2009 to 2011, he worked as a Research Grantholder in the Institute for the Protection and Security of the Citizen, European Commission–Joint Research Centre (JRC), Ispra, Italy. From 2011 to 2012, he worked as a Professor at the Shenzhen Institutes of Advanced Technology (SIAT), Chinese Academy of Sciences, China. Since April 2012, he has been a Hubei "Chutian Scholar" Distinguished Professor with the School of Remote Sensing and Information Engineering, Wuhan University, China, and the director of the Computer Vision and Remote Sensing (CVRS) Lab (http://cvrs.whu.edu.cn/), Wuhan University, China. He is a member of IEEE, has published over 100 papers in international journals and proceedings of major conferences, and is the inventor of over 30 patents. His current research interests mainly include Computer Vision, Image Processing, Deep Learning, LiDAR Data Processing, Robotics, SLAM, etc.

Haoang Li received a B.E. degree from Wuhan University, China, in 2016. He is pursuing an M.E. degree at the School of Remote Sensing and Information Engineering, Wuhan University, China. He has published several international conference papers. His current research interests include simultaneous localization and mapping (SLAM), structure from motion (SfM), Image Processing, etc.


Menghan Xia received a B.E. degree in Remote Sensing Science and Technology in 2014 and an M.Sc. degree in Pattern Recognition and Intelligent System in 2017 from Wuhan University, China. He is pursuing a Ph.D. degree at the Department of Computer Science and Engineering, The Chinese University of Hong Kong, China. He has published several international conference and journal papers. Currently, he mainly works on Computer Vision, Computer Graphics, and Deep Learning.

Wei Zhang is with the School of Control Science and Engineering of Shandong University as an associate professor. He received his Ph.D. at The Chinese University of Hong Kong (CUHK). He previously worked as a Postdoctoral scholar at the University of California, Berkeley (UC Berkeley). He is a member of IEEE and has published over 50 papers in computer vision, image processing, and machine learning. He has received several international awards from IEEE and has served for many international conferences such as CVPR, ICCV, ECCV, ICIP, ROBIO, ICIA, and ICAL.
