Watershed Segmentation


Fundamenta Informaticae 41 (2001) 187–228

IOS Press

The Watershed Transform: Definitions, Algorithms and

Parallelization Strategies

Jos B.T.M. Roerdink and Arnold Meijster

Institute for Mathematics and Computing Science

University of Groningen

P.O. Box 800, 9700 AV Groningen, The Netherlands

Email: [email protected],[email protected]

Abstract. The watershed transform is the method of choice for image segmentation in the field of mathematical morphology. We present a critical review of several definitions of the watershed transform and the associated sequential algorithms, and discuss various issues which often cause confusion in the literature. The need to distinguish between definition, algorithm specification and algorithm implementation is pointed out. Various examples are given which illustrate differences between watershed transforms based on different definitions and/or implementations. The second part of the paper surveys approaches for parallel implementation of sequential watershed algorithms.

Keywords: Mathematical morphology, watershed transform, watershed definition, sequential algorithms, parallel implementation.

1. Introduction

In grey scale mathematical morphology the watershed transform, originally proposed by Digabel and Lantuéjoul [9, 20] and later improved by Beucher and Lantuéjoul [4], is the method of choice for image segmentation [5, 46, 52]. Generally speaking, image segmentation is the process of isolating objects in the image from the background, i.e., partitioning the image into disjoint regions, such that each region is homogeneous with respect to some property, such as grey value or texture [18].

The watershed transform can be classified as a region-based segmentation approach. The intuitive idea underlying this method comes from geography: it is that of a landscape or topographic relief which is flooded by water, watersheds being the divide lines of the domains of attraction of rain falling over the region [46]. An alternative approach is to imagine the landscape being immersed in a lake, with holes pierced in local minima. Basins (also called catchment basins) will fill up with water starting at these local minima, and, at points where water coming from different basins would meet, dams are built. When the water level has reached the highest peak in the landscape, the process is stopped. As a result, the landscape is partitioned into regions or basins separated by dams, called watershed lines or simply watersheds.

When simulating this process for image segmentation, two approaches may be used: either one first finds basins, then watersheds by taking a set complement; or one computes a complete partition of the image into basins, and subsequently finds the watersheds by boundary detection.

To be more explicit, we will use the expression watershed transform to denote a labelling of the image, such that all points of a given catchment basin have the same unique label, and a special label, distinct from all the labels of the catchment basins, is assigned to all points of the watersheds. An example of a simple image with its watershed transform is given in Fig. 1(a-b). We note in passing that in practice one often does not apply the watershed transform to the original image, but to its (morphological) gradient [26]. This produces watersheds at the points of grey value discontinuity, as is commonly desired in image segmentation.

One of the difficulties with this intuitive concept is that it leaves room for various formalizations. Different watershed definitions for continuous functions have been given, which will be briefly reviewed in Section 3.1. However, our main interest here is in digital images, for which there is even more freedom to define watersheds, since in the discrete case there is no unique definition of the path a drop of water would follow. Many sequential algorithms have been developed to compute watershed transforms, see e.g. [26, 51] for a survey. They can be divided into two classes, one based on the specification of a recursive algorithm by Vincent & Soille [52], and another based on distance functions by Meyer [25]. In the context of parallel implementations there exists a notable tendency for introducing other definitions of the watershed transform, enabling easier parallelization. Examples are presented in Section 5.

Figure 1. Examples of watershed segmentation by immersion (see Definition 3.2). (a): synthetic image; (b): watershed transform of (a); (c): natural image; (d): watershed transform of (c). Different basins are indicated by distinct grey values.

The impression which the current literature on watershed algorithms makes upon the uninitiated reader can only be one of great confusion. Often it is uncertain exactly which definition for the watershed transform is used. Sometimes the definition takes the form of the specification of an algorithm. A careful distinction between algorithm specification and implementation is in many cases lacking. Without such a separation, correctness assessment of proposed algorithms is impossible. Even when a specification is given, the implementation often does not adhere to it. Ad hoc modifications are made to eliminate undesirable consequences of a watershed definition, but such changes tend to create new problems by solving an old one. Or optimizations are introduced, for greater speed or memory reduction, which in the process change the outcome of the algorithm as well, although this may often go undetected in the case of natural images. These questions are not purely academic, since the algorithm is widely used in e.g. medical image processing where unwanted side effects should be avoided.

The purpose of this paper is twofold. In the first part we present a critical review of several definitions of the watershed transform and the associated sequential algorithms, emphasizing the distinction between definition, algorithm specification and algorithm implementation. The second part of the paper surveys the main current approaches towards parallel implementation of watershed algorithms. An essential difficulty lies in the fact that the watershed transform is not a local concept. The decision whether a pixel belongs to a basin cannot be based on purely local considerations. Another problem with some algorithms is that the result depends on the order in which pixels are treated during execution. In the sequential case, this can be resolved by fixing the scanning order (e.g. raster scan), so that a deterministic result is obtained. In a parallel implementation this is no longer true since the outcome depends on the relative time instants at which different processors treat the pixels, and this is unpredictable in the case of asynchronous processors. The emphasis in the second part is on methodology and trends in current research. We point out the difficulties in the design of parallel watershed algorithms. Efficiency results are quoted to some extent in order to give the reader an idea of what is currently achievable. However, an in-depth comparison of the large body of results which have been obtained for different watershed algorithms on different architectures with different programming methodologies is beyond the scope of this paper.

There are a number of issues concerning the watershed transform which are not discussed explicitly. We mention a few of them. First, there is the question of accuracy of watershed lines. Usually, one has in mind here that the result should be a close approximation of the continuous case. That is, the digital distances playing a role in the watershed calculation should approximate the Euclidean distance. Chamfer distances are an efficient way to achieve accurate watershed lines [25]. Second, the watershed method in its original form produces a severe oversegmentation of the image, i.e., many small basins are produced due to many local minima in the input image, see Fig. 1(c-d). Several approaches exist to remedy this, such as markers or hierarchical watersheds [3, 26]; parallelization of marker-based watershed algorithms has also been studied [27, 31]. Third, we do not consider dedicated hardware architectures for fast computation of watershed transforms and related operations, see e.g. [19, 37]. Such architectures tend to solve a very restricted class of image processing tasks, whereas our interest here is in medium level image processing on general purpose (parallel) architectures.

The organization of this paper is as follows. In Section 2 some preliminaries are given. Section 3 presents definitions of the watershed transform, both for the continuous and the discrete case. Sequential watershed algorithms are reviewed in Section 4. Section 5 contains a survey of parallelization strategies for the watershed transform. Conclusions are drawn in Section 6.




2. Preliminaries

This section contains some background material on graphs (see e.g. [8]) and digital images.

2.1. Graphs

A graph $G = (V, E)$ consists of a set $V$ of vertices (or nodes) and a set $E \subseteq V \times V$ of pairs of vertices. In a (un)directed graph the set $E$ consists of (un)ordered pairs $(v, w)$. Instead of directed graph we will also write digraph. An unordered pair $(v, w)$ is called an edge, an ordered pair $(v, w)$ an arc. If $e = (v, w)$ is an edge (arc), $e$ is said to be incident with (or adjacent to) its vertices $v$ and $w$; conversely, $v$ and $w$ are called incident with $e$. We also call $v$ and $w$ neighbours. The set of vertices which are neighbours of $v$ is denoted by $N_G(v)$. A path $\pi$ of length $\ell$ in a graph $G = (V, E)$ from vertex $p$ to vertex $q$ is a sequence of vertices $(p_0, p_1, \ldots, p_{\ell-1}, p_\ell)$ such that $p_0 = p$, $p_\ell = q$ and $(p_i, p_{i+1}) \in E$ for all $i \in [0, \ell)$. The length of a path $\pi$ is denoted by $\mathrm{length}(\pi)$. A path is called simple if all its vertices are distinct. If there exists a path from a vertex $p$ to a vertex $q$, then we say that $q$ is reachable from $p$, denoted as $p \leadsto q$.

An undirected graph is connected if every vertex is reachable from every other vertex. A graph $G' = (V', E')$ is called a subgraph of $G = (V, E)$ if $V' \subseteq V$, $E' \subseteq E$, and the elements of $E'$ are incident with vertices from $V'$ only. A connected component of a graph $G$ is a maximal connected subgraph of $G$. The connected components partition the vertices of $G$.

In a digraph, a path $(p_0, p_1, \ldots, p_{\ell-1}, p_\ell)$ forms a cycle if $p_0 = p_\ell$ and the path contains at least one edge. If all vertices of the cycle are distinct, we speak of a simple cycle. A self-loop is a cycle of length 1. In an undirected graph, a path $(p_0, p_1, \ldots, p_{\ell-1}, p_\ell)$ forms a cycle if $p_0 = p_\ell$ and $p_1, \ldots, p_\ell$ are distinct. A graph with no cycles is acyclic. A forest is an undirected acyclic graph, a tree is a connected undirected acyclic graph. A directed acyclic graph is abbreviated as DAG.

A weighted graph is a triple $G = (V, E, w)$ where $w : E \to \mathbb{R}$ is a weight function defined on the edges. A valued graph is a triple $G = (V, E, f)$ where $f : V \to \mathbb{R}$ is a weight function defined on the vertices. A level component at level $h$ of a valued graph is a connected component of the set of nodes $v$ with the same value $f(v) = h$. The boundary of a level component $P$ at level $h$ consists of all $p \in P$ which have neighbours with value different from $h$; the lower boundary of $P$ is the set of all $p \in P$ which have neighbours with value smaller than $h$; the interior of $P$ consists of all points of $P$ which are not on the boundary. A descending path is a path along which the value does not increase. By $\Pi_f^{\downarrow}(p)$ we denote the set of all descending paths starting in a node $p$ and ending in some node $q$ with $f(q) < f(p)$. A regional minimum (minimum, for short) at level $h$ is a level component $P$ of which no points have neighbours with value lower than $h$, i.e. $\Pi_f^{\downarrow}(p) = \emptyset$ for all $p \in P$. A valued graph is called lower complete when each node which is not in a minimum has a neighbouring node of lower value.
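The notions of level component, regional minimum and lower completeness can be made concrete with a small sketch. It assumes, for illustration only, that the valued graph is a rectangular grid with 4-connectivity stored as a list of rows; the function names `level_components`, `regional_minima` and `is_lower_complete` are hypothetical and not taken from any library.

```python
from collections import deque

def neighbours4(p, rows, cols):
    """Yield the 4-connected neighbours of pixel p = (row, col)."""
    r, c = p
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        if 0 <= r + dr < rows and 0 <= c + dc < cols:
            yield (r + dr, c + dc)

def level_components(f):
    """Return the level components of f as a list of pixel sets (raster order)."""
    rows, cols = len(f), len(f[0])
    seen, components = set(), []
    for start in ((r, c) for r in range(rows) for c in range(cols)):
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            p = queue.popleft()
            for q in neighbours4(p, rows, cols):
                if q not in comp and f[q[0]][q[1]] == f[p[0]][p[1]]:
                    comp.add(q)
                    queue.append(q)
        seen |= comp
        components.append(comp)
    return components

def has_lower_neighbour(f, p):
    """True if some neighbour of p has a strictly lower value."""
    rows, cols = len(f), len(f[0])
    return any(f[q[0]][q[1]] < f[p[0]][p[1]] for q in neighbours4(p, rows, cols))

def regional_minima(f):
    """Level components none of whose points has a lower neighbour."""
    return [comp for comp in level_components(f)
            if not any(has_lower_neighbour(f, p) for p in comp)]

def is_lower_complete(f):
    """True if every pixel outside the minima has a lower neighbour."""
    in_minimum = set().union(*regional_minima(f))
    rows, cols = len(f), len(f[0])
    return all(has_lower_neighbour(f, (r, c))
               for r in range(rows) for c in range(cols)
               if (r, c) not in in_minimum)
```

For instance, the one-row image `[[0, 2, 2, 2, 3]]` has a single minimum but is not lower complete, because the middle pixel of the plateau of value 2 has no lower neighbour.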

2.2. Digital grids

A digital grid is a special kind of graph. Usually one works with the square grid $D \subseteq \mathbb{Z}^2$, where the vertices are called pixels. When $D$ is finite, the size of $D$ is the number of points in $D$. The set of pixels $D$ can be endowed with a graph structure $G = (V, E)$ by taking for $V$ the domain $D$, and for $E$ a certain subset of $\mathbb{Z}^2 \times \mathbb{Z}^2$ defining the connectivity. Usual choices are 4-connectivity, i.e., each point has edges to its horizontal and vertical neighbours, or 8-connectivity where a point is connected to its horizontal, vertical and diagonal neighbours. Connected components of a set of pixels are defined by applying the definition for graphs.

Distances between neighbouring nodes in a digital grid are introduced by associating a nonnegative weight $d(p,q)$ to each edge $(p,q)$. In this way a weighted graph is obtained. The distance $d(p,q)$ between non-neighbouring pixels $p$ and $q$ is defined as the minimum path length among all paths from $p$ to $q$ (this depends on the graph structure of the grid, i.e., the connectivity).

2.3. Digital images

A digital grey scale image is a triple $G = (D, E, f)$, where $(D, E)$ is a graph (usually a digital grid) and $f : D \to \mathbb{N}$ is a function assigning an integer value to each $p \in D$. A binary image $f$ takes only two values, say 1 (foreground) and 0 (background). For $p \in D$, $f(p)$ is called the grey value or altitude (considering $f$ as a topographic relief). For the range of a grey scale image one often takes the set of integers from 0 to 255, but we do not make this assumption in this paper. A plateau or flat zone of grey value $h$ is a level component of the image, considered as a valued graph, i.e., a connected component of pixels of constant grey value $h$. The threshold set of $f$ at level $h$ is

$$T_h = \{p \in D \mid f(p) \le h\}. \tag{2.1}$$

2.4. Geodesic distance

Let $A \subseteq E$, with $E = \mathbb{R}^d$ or $E = \mathbb{Z}^d$, and $a, b$ two points in $A$. The geodesic distance $d_A(a, b)$ between $a$ and $b$ within $A$ is the minimum path length among all paths within $A$ from $a$ to $b$ (in the continuous case, read infimum instead of minimum). If $B$ is a subset of $A$, define $d_A(a, B) = \min_{b \in B} d_A(a, b)$. Let $B \subseteq A$ be partitioned in $k$ connected components $B_i$, $i = 1, \ldots, k$. The geodesic influence zone of the set $B_i$ within $A$ is defined as

$$iz_A(B_i) = \{p \in A \mid \forall j \in [1..k] \setminus \{i\} : d_A(p, B_i) < d_A(p, B_j)\}$$

Let $B \subseteq A$. The set $IZ_A(B)$ is the union of the geodesic influence zones of the connected components of $B$, i.e.,

$$IZ_A(B) = \bigcup_{i=1}^{k} iz_A(B_i)$$

The complement of the set $IZ_A(B)$ within $A$ is called the SKIZ (skeleton by influence zones):

$$SKIZ_A(B) = A \setminus IZ_A(B)$$

So the SKIZ consists of all points which are equidistant (in the sense of the geodesic distance) to at least two nearest connected components (for digital grids, there may be no such points). For a binary image $f$ with domain $A$, the SKIZ can be defined by identifying $B$ with the set of foreground pixels.
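The definitions of geodesic influence zone and SKIZ translate almost literally into code. The following is a minimal sketch, assuming a 4-connected grid with unit edge lengths and sets of `(row, col)` tuples for $A$ and $B$; the helper names are hypothetical. Breadth-first search is used here only because, with unit weights, it yields the minimum path length within $A$.

```python
from collections import deque

def neighbours4(p):
    r, c = p
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]

def components(B):
    """Split the pixel set B into 4-connected components (seeded in sorted order)."""
    remaining, comps = set(B), []
    while remaining:
        seed = min(remaining)
        comp, queue = {seed}, deque([seed])
        while queue:
            p = queue.popleft()
            for q in neighbours4(p):
                if q in remaining and q not in comp:
                    comp.add(q)
                    queue.append(q)
        remaining -= comp
        comps.append(comp)
    return comps

def geodesic_distance_map(sources, A):
    """Geodesic distance d_A(p, sources) for every p in A reachable within A."""
    dist = {p: 0 for p in sources}
    queue = deque(sources)
    while queue:
        p = queue.popleft()
        for q in neighbours4(p):
            if q in A and q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return dist

def influence_zones(A, B):
    """Return [iz_A(B_1), ..., iz_A(B_k)] for the connected components of B."""
    maps = [geodesic_distance_map(comp, A) for comp in components(B)]
    inf = float("inf")
    zones = [set() for _ in maps]
    for p in A:
        d = [m.get(p, inf) for m in maps]
        for i, di in enumerate(d):
            # p belongs to iz_A(B_i) if it is strictly closer to B_i than to any other B_j
            if all(di < dj for j, dj in enumerate(d) if j != i):
                zones[i].add(p)
    return zones

def skiz(A, B):
    """SKIZ_A(B) = A minus the union of all influence zones."""
    return set(A) - set().union(*influence_zones(A, B))
```

On a one-row domain of five pixels with the two end pixels as $B$, the middle pixel is equidistant to both components and forms the SKIZ; with four pixels the SKIZ is empty, which illustrates the remark that on digital grids there may be no equidistant points.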




3. Definitions of the watershed transform

In this section we introduce definitions of the watershed transform, which may be viewed as a generalization of the skeleton by influence zones (SKIZ) to grey value images. We start with the continuous case, followed by two definitions for the digital case, the algorithmic definition by Vincent & Soille [52], and the definition by topographical distance by Meyer [25]. A discussion of algorithms is postponed until Section 4.

3.1. Watershed definition: continuous case

A watershed definition for the continuous case can be based on distance functions. Depending on the distance function used one may arrive at different definitions. We restrict ourselves here to the one given in [25, 36], but other choices have been proposed as well [39].

Assume that the image $f$ is an element of the space $C(D)$ of real twice continuously differentiable functions on a connected domain $D$ with only isolated critical points (the class of Morse functions on $D$ forms an example [17, 35]). Then the topographical distance between points $p$ and $q$ in $D$ is defined by

$$T_f(p,q) = \inf_{\gamma} \int_{\gamma} \|\nabla f(\gamma(s))\| \, ds,$$

where the infimum is over all paths (smooth curves) $\gamma$ inside $D$ with $\gamma(0) = p$, $\gamma(1) = q$. The topographical distance between a point $p \in D$ and a set $A \subseteq D$ is defined as $T_f(p,A) = \min_{a \in A} T_f(p,a)$. The path with shortest $T_f$-distance between $p$ and $q$ is a path of steepest slope. This motivates the following rigorous definition of the watershed transform.

Definition 3.1. (Watershed transform) Let $f \in C(D)$ have minima $\{m_k\}_{k \in I}$, for some index set $I$. The catchment basin $CB(m_i)$ of a minimum $m_i$ is defined as the set of points $x \in D$ which are topographically closer to $m_i$ than to any other regional minimum $m_j$:

$$CB(m_i) = \{x \in D \mid \forall j \in I \setminus \{i\} : f(m_i) + T_f(x, m_i) < f(m_j) + T_f(x, m_j)\}$$

The watershed of $f$ is the set of points which do not belong to any catchment basin:

$$\mathrm{Wshed}(f) = D \cap \Big( \bigcup_{i \in I} CB(m_i) \Big)^{c}. \tag{3.1}$$

Let $W$ be some label, $W \notin I$. The watershed transform of $f$ is a mapping $\lambda : D \to I \cup \{W\}$, such that $\lambda(p) = i$ if $p \in CB(m_i)$, and $\lambda(p) = W$ if $p \in \mathrm{Wshed}(f)$.

So the watershed transform of $f$ assigns labels to the points of $D$, such that (i) different catchment basins are uniquely labelled, and (ii) a special label $W$ is assigned to all points of the watershed of $f$.
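To see how Definition 3.1 operates, it may help to work out a one-dimensional case. The setting below is an assumption made purely for illustration and is not part of the definition: $D$ is an interval of the real line, $m_i < x < c < m_j$, and $f$ is strictly increasing on $[m_i, c]$ and strictly decreasing on $[c, m_j]$, so that $c$ is a local maximum separating two minima. In one dimension every path between two points must traverse the segment joining them, so the infimum is attained by the direct path.

```latex
\begin{align*}
T_f(x, m_i) &= \int_{m_i}^{x} |f'(s)|\,ds = f(x) - f(m_i), \\
T_f(x, m_j) &= \int_{x}^{c} |f'(s)|\,ds + \int_{c}^{m_j} |f'(s)|\,ds
             = \bigl(f(c) - f(x)\bigr) + \bigl(f(c) - f(m_j)\bigr), \\
f(m_i) + T_f(x, m_i) &= f(x), \\
f(m_j) + T_f(x, m_j) &= 2 f(c) - f(x).
\end{align*}
```

Since $f(x) < f(c)$ for $x < c$, we get $f(x) < 2f(c) - f(x)$, so $x \in CB(m_i)$. At $x = c$ both expressions equal $f(c)$, the strict inequality fails in both directions, and $c$ belongs to the watershed. Under these assumptions the values of the minima themselves cancel out of the comparison, which is why the term $f(m_i)$ appears in the definition.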




3.2. Watershed definitions: discrete case

A problem which arises for digital images is the occurrence of plateaus, i.e., regions of constant grey value, which may extend over large image areas. Such plateaus form a difficulty when trying to extend the continuous watershed definition based on topographical distances to discrete images. This nonlocal effect is also a major obstacle for parallel implementation of watershed algorithms, see Section 5.

The next algorithmic definition automatically takes care of plateaus, because it computes a watershed transform level by level, where each level constitutes a binary image for which a SKIZ is computed.

(a) original   (b) 4-conn.   (c) 8-conn.   (d) 4-conn.   (e) 8-conn.
7 6 5 4        B B B B       B B B B       W W B B       B B B B
8 5 4 3        B B B B       B B B B       W W B B       A B B B
9 4 3 2        W B B B       W W B B       A W B B       A A B B
0 3 2 1        A W B B       A W B B       A A B B       A A B B

Figure 2. Watershed transform on the square grid, for different connectivity. (a): original image (minima indicated in bold); (b-c): results according to immersion (Definition 3.2); (d-e): results according to topographical distance (Definition 3.1, with $T_f$ as defined in (3.5)).

3.2.1. Algorithmic definition by immersion

An algorithmic definition of the watershed transform by simulated immersion was given by Vincent and Soille [51, 52] (see also [46, Ch. XI, H.5] for the binary case). Let $f : D \to \mathbb{N}$ be a digital grey value image, with $h_{\min}$ and $h_{\max}$ the minimum and maximum value of $f$. Define a recursion with the grey level $h$ increasing from $h_{\min}$ to $h_{\max}$, in which the basins associated with the minima of $f$ are successively expanded. Let $X_h$ denote the union of the set of basins computed at level $h$. A connected component of the threshold set $T_{h+1}$ at level $h + 1$ (cf. (2.1)) can be either a new minimum, or an extension of a basin in $X_h$: in the latter case one computes the geodesic influence zone of $X_h$ within $T_{h+1}$ (cf. Section 2.4), resulting in an update $X_{h+1}$. Let $\mathrm{min}_h$ denote the union of all regional minima at altitude $h$.

Definition 3.2. (Watershed by immersion) Define the following recursion:

$$\begin{cases} X_{h_{\min}} = \{p \in D \mid f(p) = h_{\min}\} = T_{h_{\min}} \\ X_{h+1} = \mathrm{min}_{h+1} \cup IZ_{T_{h+1}}(X_h), \quad h \in [h_{\min}, h_{\max}) \end{cases} \tag{3.2}$$

The watershed $\mathrm{Wshed}(f)$ of $f$ is the complement of $X_{h_{\max}}$ in $D$:

$$\mathrm{Wshed}(f) = D \setminus X_{h_{\max}}$$




For an example of the watershed transform according to the above recurrence, see Fig. 2(a-c), in which A and B are labels of basins, and W is used to denote watershed pixels (in this and other figures to follow, minima pixels in the input image are indicated in bold). Note the dependence on the connectivity.

(a)        (b) h = 0   (c) h = 1   (d) h = 2   (e) h = 3
3 2 2      3 2 2       3 2 2       3 B B       B B B
3 1 1      3 1 1       3 W B       3 B B       W B B
0 1 0      A 1 B       A W B       A W B       A W B

Figure 3. Watershed transform by immersion on the 4-connected grid, showing relabelling of watershed pixels. (a): Original image; (b-e): labelling steps based on (3.2).

According to the recursion (3.2), it is the case that at level $h + 1$ all non-basin pixels (i.e. all pixels in $T_{h+1}$ except those in $X_h$) are potential candidates to get assigned to a catchment basin in step $h + 1$. Therefore, the definition allows that pixels with grey value $h' \le h$ which are not yet part of a basin after processing level $h$, are merged with some basin at the higher level $h + 1$. Pixels which in a given iteration are equidistant to at least two nearest basins may be provisionally labelled as watershed pixels by assigning them the label W (we will refer to such pixels as W-pixels). However, in the next iteration this label may change again. A definitive labelling as watershed pixel can only happen after all levels have been processed. An example [42] is given in Fig. 3, for a $3 \times 3$ discrete image on the square grid with 4-connectivity. There are two local minima (the zeroes), so there will be two basins whose pixels are labelled A, B. The labelling according to (3.2) is shown in Fig. 3(b)-(e). This shows the phenomenon of relabelling of W-pixels: the pixel in the second row, second column, is first labelled W, then B.
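The recursion (3.2) can be evaluated directly on small images. The sketch below is a naive, literal evaluation and not the queue-based algorithm of Vincent & Soille; it assumes 4-connectivity with unit edge lengths, treats each labelled basin as one component of $X_h$ when computing influence zones (an assumption that coincides with the definition as long as distinct basins are not adjacent), and names basins `A`, `B`, ... in raster order of first appearance. All function names are hypothetical.

```python
from collections import deque

def neighbours4(p, shape):
    r, c = p
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        q = (r + dr, c + dc)
        if 0 <= q[0] < shape[0] and 0 <= q[1] < shape[1]:
            yield q

def geodesic_distances(sources, region, shape):
    """Breadth-first geodesic distances from `sources`, moving only inside `region`."""
    dist = {p: 0 for p in sources}
    queue = deque(sorted(sources))
    while queue:
        p = queue.popleft()
        for q in neighbours4(p, shape):
            if q in region and q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return dist

def watershed_by_immersion(f):
    """Label grid for f: basin names, or 'W' for pixels outside every basin."""
    shape = (len(f), len(f[0]))
    pixels = [(r, c) for r in range(shape[0]) for c in range(shape[1])]
    h_min, h_max = min(map(min, f)), max(map(max, f))
    basins = {}  # label -> set of pixels; the union plays the role of X_h
    for h in range(h_min, h_max + 1):  # every integer level, including empty ones
        T = {p for p in pixels if f[p[0]][p[1]] <= h}  # threshold set T_h
        dist = {lab: geodesic_distances(px, T, shape) for lab, px in basins.items()}
        free = T - set().union(*basins.values())
        grown = {lab: set() for lab in basins}
        unreached = set()
        for p in free:
            d = sorted((dm[p], lab) for lab, dm in dist.items() if p in dm)
            if not d:
                unreached.add(p)               # part of a new regional minimum
            elif len(d) == 1 or d[0][0] < d[1][0]:
                grown[d[0][1]].add(p)          # strictly closest to a single basin
        for lab, px in grown.items():
            basins[lab] |= px
        while unreached:                       # label new minima in raster order
            seed = min(unreached)
            comp = set(geodesic_distances([seed], unreached, shape))
            basins[chr(ord("A") + len(basins))] = comp
            unreached -= comp
    label = {p: "W" for p in pixels}
    for lab, px in basins.items():
        for p in px:
            label[p] = lab
    return [[label[(r, c)] for c in range(shape[1])] for r in range(shape[0])]

fig3 = [[3, 2, 2],
        [3, 1, 1],
        [0, 1, 0]]
for row in watershed_by_immersion(fig3):
    print(" ".join(row))
```

Applied to the image of Fig. 3(a), this reproduces the final labelling of Fig. 3(e). Note that the loop runs over every integer level, not only over the grey values that occur in the image: repeating the influence zone computation at an empty level can still change provisional W-pixels, so skipping such levels would evaluate a different recursion.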

The algorithm presented by Vincent & Soille in [52] as an implementation of (3.2) in fact does not adhere to this definition, see Section 4.1 below.

3.2.2. Watershed definition by topographical distance

We follow here the presentation in [25]. Let $f$ be a digital grey value image. Initially, we assume that $f$ is lower complete, that is, each pixel which is not in a minimum has a neighbour of lower grey value [26]. This assumption will be relaxed later.

The lower slope $LS(p)$ of $f$ at a pixel $p$ is defined as the maximal slope linking $p$ to any of its neighbours of lower altitude. Formally,

$$LS(p) = \max_{q \in N_G(p) \cup \{p\}} \left( \frac{f(p) - f(q)}{d(p,q)} \right), \tag{3.3}$$

where $N_G(p)$ is the set of neighbours of pixel $p$ on the grid $G = (V, E)$, and $d(p,q)$ is the distance associated to edge $(p,q)$ (for $q = p$ the expression following the max-operator in (3.3) is defined to be zero). Note that for pixels whose neighbours are all of higher grey value, the lower slope is zero. The cost for walking from pixel $p$ to a neighbouring pixel $q$ is defined as

$$\mathrm{cost}(p,q) = \begin{cases} LS(p) \cdot d(p,q) & \text{if } f(p) > f(q) \\ LS(q) \cdot d(p,q) & \text{if } f(p) < f(q) \\ \tfrac{1}{2}\,(LS(p) + LS(q)) \cdot d(p,q) & \text{if } f(p) = f(q) \end{cases} \tag{3.4}$$

Definition 3.3. The set of lower neighbours $q$ of $p$ for which the slope $(f(p) - f(q))/d(p,q)$ is maximal, i.e. equals the value $LS(p)$, is denoted by $\Gamma(p)$. The set of pixels $q$ for which $p \in \Gamma(q)$ is denoted by $\Gamma^{-1}(p)$.

The topographical distance along a path $\pi = (p_0, \ldots, p_\ell)$ between $p_0 = p$ and $p_\ell = q$ is defined as

$$T_f^{\pi}(p,q) = \sum_{i=0}^{\ell-1} d(p_i, p_{i+1}) \, \mathrm{cost}(p_i, p_{i+1}).$$

The topographical distance between $p$ and $q$ is the minimum of the topographical distances along all paths between $p$ and $q$:

$$T_f(p,q) = \min_{\pi \in [p \leadsto q]} T_f^{\pi}(p,q), \tag{3.5}$$

where the set of all paths from $p$ to $q$ is denoted by $[p \leadsto q]$. The topographical distance between a point $p \in D$ and a set $A \subseteq D$ is defined as $T_f(p,A) = \min_{a \in A} T_f(p,a)$.

We call $(p_0, p_1, \ldots, p_n)$ a path of steepest descent from $p_0 = p$ to $p_n = q$ if $p_{i+1} \in \Gamma(p_i)$ for each $i = 0, \ldots, n - 1$. A pixel $q$ is said to belong to the downstream of $p$ if there exists a path of steepest descent from $p$ to $q$. A pixel $q$ is said to belong to the upstream of $p$ if $p$ belongs to the downstream of $q$.

The topographical distance has the following property, on which the watershed definition crucially depends.

Proposition 3.1. Let $f(p) > f(q)$. A path $\pi$ from $p$ to $q$ is of steepest descent if and only if $T_f^{\pi}(p,q) = f(p) - f(q)$. If a path $\pi$ from $p$ to $q$ is not of steepest descent, $T_f^{\pi}(p,q) > f(p) - f(q)$.

This proposition implies that paths of steepest descent are the geodesics (shortest paths) of the topographical distance function. With the introduction of the topographical distance for digital images, the definition of catchment basins and watersheds is the same as for the continuous case, cf. Definition 3.1.

It is a consequence of Proposition 3.1 that $CB(m_i)$ is the set of points in the upstream of a single minimum $m_i$. The watershed consists of the points $p$ which are in the upstream of at least two minima, i.e., there are at least two paths of steepest descent starting from $p$ which lead to different minima. Also, the following corollary is obvious.

Corollary 3.1. Any pixel in the upstream of a watershed pixel is itself a watershed pixel.
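The upstream characterization and Corollary 3.1 suggest a simple way to label a lower complete image: visit the pixels in order of increasing grey value and let each pixel inherit the labels found in $\Gamma(p)$. The following sketch does exactly that; it is an illustration of the characterization, not one of the algorithms reviewed in Section 4. It assumes that the input is lower complete, that the grid is 4-connected with unit edge lengths (so that $LS(p)$ is simply the largest drop to a neighbour), and that basins are named `A`, `B`, ... in order of increasing value and raster position. The function name is hypothetical.

```python
def neighbours4(p, rows, cols):
    r, c = p
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        if 0 <= r + dr < rows and 0 <= c + dc < cols:
            yield (r + dr, c + dc)

def watershed_by_steepest_descent(f):
    """Label a lower complete image: basin name, or 'W' if upstream of >= 2 minima."""
    rows, cols = len(f), len(f[0])
    value = lambda p: f[p[0]][p[1]]
    pixels = sorted(((r, c) for r in range(rows) for c in range(cols)),
                    key=lambda p: (value(p), p))
    label, n_minima = {}, 0
    for p in pixels:
        if p in label:
            continue  # already labelled as part of a minimum plateau
        lower = [q for q in neighbours4(p, rows, cols) if value(q) < value(p)]
        if not lower:
            # In a lower complete image, p lies in a regional minimum:
            # flood its level component with a fresh basin name.
            name = chr(ord("A") + n_minima)
            n_minima += 1
            label[p] = name
            stack = [p]
            while stack:
                u = stack.pop()
                for q in neighbours4(u, rows, cols):
                    if q not in label and value(q) == value(u):
                        label[q] = name
                        stack.append(q)
        else:
            steepest = max(value(p) - value(q) for q in lower)  # LS(p) with d = 1
            gamma = [q for q in lower if value(p) - value(q) == steepest]
            reached = {label[q] for q in gamma}
            # One label (possibly 'W') is inherited; several labels give a watershed pixel.
            label[p] = reached.pop() if len(reached) == 1 else "W"
    return [[label[(r, c)] for c in range(cols)] for r in range(rows)]

fig2a = [[7, 6, 5, 4],
         [8, 5, 4, 3],
         [9, 4, 3, 2],
         [0, 3, 2, 1]]
for row in watershed_by_steepest_descent(fig2a):
    print(" ".join(row))
```

For the image of Fig. 2(a) this gives the labelling of Fig. 2(d). Because a pixel whose set $\Gamma(p)$ contains a W-pixel is itself labelled W, the propagation required by Corollary 3.1 is built in.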

An example of the watershed transform according to topographical distance is given in Fig. 2(d-e). Note that the result differs from that obtained by immersion according to Definition 3.2. A consequence of Definition 3.1 in the digital case is the occurrence of thick watersheds, meaning that the watershed pixels do not form one-pixel thick lines but extended areas. An example for the case of 4-connectivity is given in Fig. 4. The result according to simulated immersion is given for comparison; although thick watersheds also occur for this watershed definition, they tend to be less pronounced.




(a)               (b)               (c)
5 4 3 2 3 4 5     W W W B W W W     W B B B B B W
4 3 2 1 2 3 4     W W W B W W W     A W B B B W C
3 2 1 0 1 2 3     W W W B W W W     A A W B W C C
2 1 0 1 0 1 2     A A A W C C C     A A A W C C C
3 2 1 0 1 2 3     W W W D W W W     A A W D W C C
4 3 2 1 2 3 4     W W W D W W W     A W D D D W C
5 4 3 2 3 4 5     W W W D W W W     W D D D D D W

Figure 4. Watershed transform on the square grid with 4-connectivity, showing thick watersheds. (a): original image; (b): result according to topographical distance (Definition 3.1, with $T_f$ as defined in (3.5)); (c): result according to immersion (Definition 3.2).

Remark. A distance transform [43] on a digital grid (with unit distance values on the edges) of a binary image $b$ produces a grey value image $f$ whose cost function equals 1 on every edge outside the minima of $f$. The watershed of $f$ therefore equals the SKIZ of $b$ [25].

Next we consider images which are not lower complete.

Plateau problem

Problems arise when we try to extend the above approach to images which are not lower complete. In such images non-minima plateaus with nonempty interior occur. When we directly apply the above definitions, the topographical distance between interior pixels of a plateau turns out to be identically zero. Therefore an additional ordering relation between such pixels is required. The usual solution is to compute geodesic distances to the lower boundary of the plateau. This can be formalized by first transforming the image to a lower complete image, to which the definitions above then can be applied.

Recall that $\Pi_f^{\downarrow}(p)$ is the set of all descending paths starting in a pixel $p$ and ending in some pixel $q$ with $f(q) < f(p)$, and $\mathrm{length}(\pi)$ is the length of a path $\pi$.

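Definition 3.4. (Lower completion) Let $f$ be a digital grey value image with domain $D$. Define the function $d : D \to \mathbb{N}$ by

$$d(p) = \begin{cases} 0 & \text{if } \Pi_f^{\downarrow}(p) = \emptyset \\ \min_{\pi \in \Pi_f^{\downarrow}(p)} \mathrm{length}(\pi) & \text{otherwise} \end{cases}$$

Let $L_c = \max_{p \in D} d(p)$. Then the lower completion $f_{LC}$ of $f$ is defined by

$$f_{LC}(p) = \begin{cases} L_c \cdot f(p) & \text{if } d(p) = 0 \\ L_c \cdot f(p) + d(p) - 1 & \text{otherwise} \end{cases}$$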




Figure 5. Image (a), lower distance image (b) and lower complete image (c).

The process of lower completion transforms the image $f$ into a lower complete image $f_{LC}$. An example is given in Fig. 5. The function $d$ has the value zero for minima pixels, and for all other pixels $p$, $d(p)$ equals the length of the shortest path from $p$ to the set of pixels with grey value lower than that of $p$. We will refer to $d(p)$ as the lower distance of $p$. If $f$ is already lower complete, then $f_{LC} = f$. An algorithm for lower completion is given in the next section (Algorithm 4.5).
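Definition 3.4 can be evaluated with a single breadth-first sweep. The sketch below is not Algorithm 4.5 of the next section; it is a minimal illustration assuming a 4-connected grid with unit edge lengths, and the function name is hypothetical. It uses the observation that a shortest descending path from a plateau pixel stays on the plateau until it steps down, so $d(p)$ is one more than the geodesic distance, within the plateau, to the nearest pixel with a lower neighbour.

```python
from collections import deque

def lower_completion(f):
    """Return (d, f_LC): the lower distance image and the lower completion of f."""
    rows, cols = len(f), len(f[0])

    def neighbours4(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield r + dr, c + dc

    d = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Pixels with a lower neighbour have lower distance 1 and seed the sweep.
    for r in range(rows):
        for c in range(cols):
            if any(f[y][x] < f[r][c] for y, x in neighbours4(r, c)):
                d[r][c] = 1
                queue.append((r, c))
    # Propagate into plateau interiors; minima pixels are never reached and keep d = 0.
    while queue:
        r, c = queue.popleft()
        for y, x in neighbours4(r, c):
            if f[y][x] == f[r][c] and d[y][x] == 0:
                d[y][x] = d[r][c] + 1
                queue.append((y, x))
    Lc = max(map(max, d))
    f_lc = [[Lc * f[r][c] + (d[r][c] - 1 if d[r][c] else 0) for c in range(cols)]
            for r in range(rows)]
    return d, f_lc

d, f_lc = lower_completion([[0, 2, 2, 2, 3]])
print(d)     # [[0, 1, 2, 3, 1]]
print(f_lc)  # [[0, 6, 7, 8, 9]]
```

In this one-row example the plateau of value 2 is turned into the strictly increasing run 6, 7, 8, while the multiplication by $L_c = 3$ keeps the original ordering between different grey values intact.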

By lower completion, we can define an order relation between pixels:

$$x \prec y \iff f_{LC}(x) < f_{LC}(y). \tag{3.6}$$

After lower completion, the function $T_{f^*}$ with $f^* = f_{LC}$ is a proper distance function on $D^* \times D^*$, where $D^*$ equals the domain $D$ from which the minima are excluded.

The particular form of the lower slope and cost function was devised to ensure that steepest descent paths would realize the smallest topographical distance. The mapping $\Gamma(p)$ can be used to define a directed graph by arrowing [5, 25] as follows.

Definition 3.5. Let $G = (V, E, f)$ be a digital grey value image. The lower complete graph $G' = (V, E')$ is defined as follows. For points $p$ having a lower neighbour,

$$(p, p') \in E' \iff p' \in \Gamma(p) \tag{3.7}$$

On the interior of plateaus, an arc is created from $p$ to $p'$ if the geodesic distance to the lower boundary of the plateau is greater for $p$ than for $p'$, i.e. if $p' \prec p$.

The lower complete graph is acyclic (a DAG).

Definition 3.6. (Watershed transform by topographical distance) Let $f$ be a grey value image, with $f^* = f_{LC}$ the lower completion of $f$. Let $(m_i)_{i \in I}$ be the collection of minima of $f^*$. The basin $CB(m_i)$ of $f$ corresponding to a minimum $m_i$ is defined as the basin of the lower completion of $f$:

$$CB(m_i) = \{p \in D \mid \forall j \in I \setminus \{i\} : f^*(m_i) + T_{f^*}(p, m_i) < f^*(m_j) + T_{f^*}(p, m_j)\}, \tag{3.8}$$

and the watershed of $f$ is defined as in (3.1).




So, basically we define the watershed transform by topographical distance of an arbitrary digital grey value image as the watershed transform of its lower completion.

(a)                    (b)
16 15 14 13 12 13      A A A A C C
14 13 12 11 10 11      A A A A C C
15 14 13  9  8  9      A A A A C C
 1  2  3  7  6  8      A A A A C C
 0  7  5  4  5  0      A A A B C C

(c)                    (d)
16 15 14 13 12 13      B B B B W W
14 13 12 11 10 11      B B B B W W
15 14 13  9  8  9      A A A B W W
 1  2  3  7  6  8      A A A B W C
 0  7  5  0  5  0      A A B B W C

Figure 6. Watershed transform according to topographical distance on the square grid with 4-connectivity, showing effect of lowering minima. (a): original image; (b): watershed labelling of (a); (c): image (a) with all minima set to zero; (d): watershed labelling of (c).

In practice, algorithms to compute the watershed transform for images with plateaus often do not explicitly carry out the lower completion step, but assign plateau pixels to basins in another way. This is the case for algorithms based on so-called ordered queues. As a cautionary note we would like to point out that such algorithmic solutions lead to results which may differ to a varying degree from the result of Definition 3.6, depending on the precise implementation. This will be discussed in more detail in Section 4.2.

Lowering the minima values. Meyer states in [25] that the watershed lines will not change if one replaces the values of all minima of $f$ by the value of the deepest one. This statement is correct for Definition 3.2 of the watershed transform based on immersion, as is easy to verify. But for the definition based on topographical distance this property does in fact not hold, as already observed in [50]. An example illustrating this is given in Fig. 6, where there are three minima, two with value 0 and one with value 4. Replacing the value 4 by 0 does change the result. Moreover, the effect of lowering the value of this single minimum pixel propagates in a global way through the entire image (the image can be enlarged arbitrarily with the effect propagating accordingly).

Isolated regions. When computing the watershed transform, regions in the image may arise which are completely surrounded by watershed pixels. An example is given in Fig. 7. The center pixel with value 2 has four watershed neighbours, and is therefore itself a watershed pixel. In some implementations of watershed transforms by topographical distance, such regions may in fact become temporarily or permanently isolated, see [12, 50]. This is a defect of the particular implementation, since, according to Corollary 3.1, watershed pixels should be propagated. Such problems are often solved by ad hoc modifications of the implementation, which still do not correctly implement the definition.

(a) original   (b) labelled
0 1 0          A W B
1 2 1          W 2 W
0 1 0          C W D

Figure 7. Watershed according to topographical distance (4-connectivity). (a): original image; (b): output after labelling pixels with grey values 0 and 1.

3.2.3. Watersheds based on a local condition

Several watershed algorithms exist which do not construct watershed pixels, but instead assign to each pixel the label of some minimum, so that the set of basins tessellates the image plane. Various motivations for such an approach can be given. First of all, watershed lines may in fact comprise large areas (thick watersheds), see Fig. 4, although the use of a higher connectivity alleviates the problem. Next, some implementations of the watershed transform by topographical distance have problems with isolated regions caused by watershed pixels, see above. Another reason is efficiency, since a correct determination of watershed pixels generally requires more computation time and memory.

An explicit definition of a watershed transform based on topographical distance which does not construct watershed lines was given by Bieniek et al. [6, 7], by introducing a local condition.

Definition 3.7. For any image without plateaus, a function L assigning a label to each pixel is called a watershed segmentation if:

1. $L(m_i) \neq L(m_j)$ $\forall i \neq j$, with $\{m_k\}_{k \in I}$ the set of minima of f;

2. for each pixel p with $\Gamma(p) \neq \emptyset$, $\exists\, p' \in \Gamma(p)$ with $L(p) = L(p')$.

Here the condition $\Gamma(p) \neq \emptyset$ means that p has at least one lower neighbour (cf. Definition 3.3). The new element is that for a given input image, many labellings exist which qualify as a watershed segmentation. Pixels which would have been labelled as watershed points according to Definition 3.6 are now merged by random choice with a basin belonging to some minimum $m_k$. For an example, see Fig. 8.

The meaning of locality in this definition is that one may subdivide an image in blocks, do a labelling of basins in each block independently, and make the results globally consistent in a final merging step. Such increased locality is very advantageous for parallel implementation of the watershed transform, which is exactly the context in which this local condition was proposed. Note however that locality should not be misinterpreted as saying that the watershed transform now has become a purely local operation: in the merging step, basins in local blocks have to be made consistent, and the resulting global basins can again extend over large regions of the image (for a fuller discussion, see Section 5.2.3).

(a)         (b)         (c)         (d)

5 4 5       W W W       A A A       B B B
4 3 4       W W W       A A A       B B B
6 2 6       A W B       A A B       A B B
0 1 0       A W B       A A B       A B B

Figure 8. Watershed transform on the 4-connected square grid. (a): original image; (b): result according to topographical distance (Definition 3.6); (c-d): two watershed labellings consistent with the local condition (Definition 3.7).

For an input image which would contain watershed pixels according to Definition 3.6, the output of a watershed algorithm based on Definition 3.7 is no longer deterministic, but will depend on the order in which pixels are treated during execution of the algorithm. Whereas in the sequential case a deterministic result can be obtained by fixing the scanning order (e.g. raster scan), this is no longer true for parallel implementation, since in that case the outcome depends on the relative time instants at which different processors treat the pixels, and this is unpredictable in the case of asynchronous processors. Therefore, in principle considerable differences among watershed labellings computed in different runs of the same algorithm may occur, although the effect may be small for natural images.
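The local condition of Definition 3.7 can be checked mechanically. The following Python sketch is illustrative only: the function names are hypothetical, the grid is assumed to be 4-connected with unit edge lengths (so that the steepest lower neighbours are simply the lowest-valued lower neighbours), and only condition 2 of the definition is tested.

```python
def steepest_lower_neighbours(im, r, c):
    """Return the 4-connected neighbours of (r, c) with the lowest grey value,
    or an empty list when no neighbour is lower than im[r][c]."""
    rows, cols = len(im), len(im[0])
    nbrs = [(r + dr, c + dc)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
            if 0 <= r + dr < rows and 0 <= c + dc < cols]
    if not nbrs:
        return []
    lowest = min(im[i][j] for i, j in nbrs)
    if lowest >= im[r][c]:
        return []
    return [(i, j) for i, j in nbrs if im[i][j] == lowest]


def satisfies_local_condition(im, lab):
    """Condition 2 of Definition 3.7: every pixel with a lower neighbour shares
    its label with at least one of its steepest lower neighbours."""
    for r in range(len(im)):
        for c in range(len(im[0])):
            gamma = steepest_lower_neighbours(im, r, c)
            if gamma and all(lab[i][j] != lab[r][c] for i, j in gamma):
                return False
    return True


# Image and labellings (c) and (d) of Fig. 8.
fig8_image = [[5, 4, 5], [4, 3, 4], [6, 2, 6], [0, 1, 0]]
fig8_c = ["AAA", "AAA", "AAB", "AAB"]
fig8_d = ["BBB", "BBB", "ABB", "ABB"]
```

Both labellings of Fig. 8 pass this check, which illustrates that the definition admits more than one valid segmentation of the same image.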

4. Sequential watershed algorithms

Generally speaking, existing watershed algorithms either simulate the flooding process, or directly detect the watershed points. In some implementations, one computes basins which touch, i.e., no watershed pixels are generated at all.

4.1. Watershed algorithms by immersion

4.1.1. Vincent-Soille algorithm

An implementation of the watershed transform of Definition 3.2 was presented by Vincent & Soille [52]. Since we want to discuss this implementation in some detail, we reproduce their algorithm here in pseudocode, see Algorithm 4.1. In this algorithm there are two steps: (i) sorting the pixels w.r.t. increasing grey value, for direct access to pixels at a certain grey level; (ii) a flooding step, proceeding level by level and starting from the minima. The implementation uses a fifo queue of pixels, that is, a first-in-first-out data structure on which the following operations can be performed: fifo_add(p, queue) adds pixel p at the end of the queue, fifo_remove(queue) returns and removes the first element of the queue, fifo_init(queue) initializes an empty queue, and fifo_empty(queue) is a test which returns true if the queue is empty and false otherwise.

The algorithm assigns a distinct label lab[ ] to each minimum and its associated basin by iteratively flooding the graph using a breadth-first algorithm [8], as follows. In the flooding step, all nodes with grey level h are first given the label mask. Then those nodes which have labelled neighbours from the previous iteration are inserted in the queue, and from these pixels geodesic influence zones are propagated inside the set of masked pixels. If a pixel is adjacent to two or more different basins, it is marked as a watershed node by the label wshed. If the pixel can only be reached from nodes which have the same label, the node is merged with the corresponding basin. Pixels which at the end still have the value mask belong to a set of new minima at level h, whose connected components get a new label. As shown in [52], the time complexity of Algorithm 4.1 is linear in the number of pixels of the input image.

Algorithm 4.1 Vincent-Soille watershed algorithm [52].

 1: procedure Watershed-by-Immersion
 2: Input: digital grey scale image G = (D, E, im).
 3: Output: labelled watershed image lab on D.
 4: #define init −1   ( initial value of lab image )
 5: #define mask −2   ( initial value at each level )
 6: #define wshed 0   ( label of the watershed pixels )
 7: #define fictitious (−1, −1)   ( fictitious pixel ∉ D )
 8: curlab ← 0   ( curlab is the current label )
 9: fifo_init(queue)
10: for all p ∈ D do
11:   lab[p] ← init ; dist[p] ← 0   ( dist is a work image of distances )
12: end for
13: SORT pixels in increasing order of grey values (minimum h_min, maximum h_max)
14:
15: ( Start Flooding )
16: for h = h_min to h_max do   ( Geodesic SKIZ of level h − 1 inside level h )
17:   for all p ∈ D with im[p] = h do   ( mask all pixels at level h )
18:     ( these are directly accessible because of the sorting step )
19:     lab[p] ← mask
20:     if p has a neighbour q with (lab[q] > 0 or lab[q] = wshed) then
21:       ( Initialize queue with neighbours at level h of current basins or watersheds )
22:       dist[p] ← 1 ; fifo_add(p, queue)
23:     end if
24:   end for
25:   curdist ← 1 ; fifo_add(fictitious, queue)
26:   loop   ( extend basins )
27:     p ← fifo_remove(queue)
28:     if p = fictitious then
29:       if fifo_empty(queue) then
30:         break
31:       else
32:         fifo_add(fictitious, queue) ; curdist ← curdist + 1
33:         p ← fifo_remove(queue)
34:       end if
35:     end if
36:     for all q ∈ N_G(p) do   ( labelling p by inspecting neighbours )
37:       if dist[q] < curdist and (lab[q] > 0 or lab[q] = wshed) then
38:         ( q belongs to an existing basin or to watersheds )
39:         if lab[q] > 0 then
40:           if lab[p] = mask or lab[p] = wshed then
41:             lab[p] ← lab[q]
42:           else if lab[p] ≠ lab[q] then
43:             lab[p] ← wshed
44:           end if
45:         else if lab[p] = mask then
46:           lab[p] ← wshed
47:         end if
48:       else if lab[q] = mask and dist[q] = 0 then   ( q is plateau pixel )
49:         dist[q] ← curdist + 1 ; fifo_add(q, queue)
50:       end if
51:     end for
52:   end loop
53:   ( detect and process new minima at level h )
54:   for all p ∈ D with im[p] = h do
55:     dist[p] ← 0   ( reset distance to zero )
56:     if lab[p] = mask then   ( p is inside a new minimum )
57:       curlab ← curlab + 1   ( create new label )
58:       fifo_add(p, queue) ; lab[p] ← curlab
59:       while not fifo_empty(queue) do
60:         q ← fifo_remove(queue)
61:         for all r ∈ N_G(q) do   ( inspect neighbours of q )
62:           if lab[r] = mask then
63:             fifo_add(r, queue) ; lab[r] ← curlab
64:           end if
65:         end for
66:       end while
67:     end if
68:   end for
69: end for
70: ( End Flooding )

The Vincent-Soille algorithm in fact does not implement the recursion (3.2), for the following reasons (the line numbers mentioned refer to the pseudocode of Algorithm 4.1).

1. At level h only pixels with grey value h are masked for flooding (line 17), instead of all non-basin pixels of level $\leq h$, as the definition would require (see the discussion in Section 3.2.1).

2. Not only labels of catchment basins are propagated, but also labels of wshed-pixels (line 20). The need for this is a consequence of the previous point. Since the algorithm tries to classify pixels as wshed-pixels at the current grey level, watershed labels have to be propagated, because it may be the case that pixels with grey value h only have wshed-pixels in their neighbourhood.

3. A pixel which is adjacent to two different basins, and therefore initially gets labelled wshed, is allowed to be overwritten at the current grey level by the label of another neighbouring pixel, if that pixel is part of a basin (lines 40-41). The motivation given in [52] is that otherwise deviated watershed lines may result. This statement is probably based on an intuitive expectation for the case of functions in continuous space. From our point of view, an assessment of the correctness of the implementation should be based solely on agreement with the definition.

It is not very difficult to modify Algorithm 4.1 in order to implement the recursion (3.2) exactly. In line 17 all pixels with $im[p] \leq h$ have to be masked, the queue has to be initialized with basin pixels only (drop the disjunct lab[q] = wshed in line 20), the resetting of distances (line 55) has to be done in line 19, and the propagation rules in lines 36-51 have to be slightly changed. Note, however, that the theoretical time complexity would change from linear to quadratic in the number of pixels of the input image, due to repeated processing of watershed pixels, although in practice the number of such pixels may actually be rather small.

[Figure 9: grey levels h = 10, 20, 30, 40; level components L0 to L6.]

Figure 9. (a) input image. (b) labelled level components. (c) components graph, with grey values of the nodes indicated.

Remark. In [42] we tried to formalize what the Vincent-Soille algorithm computes by defining a modified recursion as follows:

$$X_{h_{\min}} = \{p \in D \mid f(p) = h_{\min}\}$$

$$X_{h+1} = X_h \cup \mathrm{min}_{h+1} \cup \left(IZ_{T_{h+1}}(X_h) \setminus T_h\right), \quad h \in [h_{\min}, h_{\max}) \qquad (4.1)$$

The $\setminus T_h$ term in (4.1) was introduced to ensure that at level h + 1 only pixels with grey value h + 1 are added to existing basins. In the example of Fig. 3, the pixel in the second row, second column remains labelled as wshed according to (4.1). However, this modified recursion also does not always correctly represent the implementation of Algorithm 4.1: it is possible that a catchment basin becomes disconnected by the $\setminus T_h$ term. In fact, we have been unable to find a recursion which formalizes what actually is computed by Algorithm 4.1.




4.1.2. Components graph algorithm

A straightforward parallel implementation of the Vincent-Soille algorithm is difficult when plateaus occur. Therefore, an alternative approach was proposed in [21], in which the image is first transformed to a directed valued graph with distinct neighbour values, called the components graph of f. On this graph the watershed transform can be computed by a simplified version of the Vincent-Soille algorithm, where fifo queues are no longer necessary, since there are no plateaus in the graph. The steps are as follows.

1. Consider the input image as a valued graph (V, E, f), where f(p) denotes the grey value of pixel p, $p \in V$. Transform this to the components graph $(V^*, E^*, f^*)$ defined as follows. All pixels of a level component $C_h$ at level h are represented by a single node $v \in V^*$: $v = \{p \in V \mid p \in C_h\}$, with $f^*(v) = h$. A pair (v, w) of level components is an element of $E^*$ if and only if $(\exists\, p \in v, q \in w : (p, q) \in E \land f(p) < f(q))$, cf. Fig. 9.

2. Compute the watershed transform of the directed graph.

3. Transform the labelled graph back to an image. Pixels corresponding to a watershed node are coloured white, the other pixels black. This yields a binary image with plateaus representing watersheds of the original image. Thin watersheds can be obtained by computing a skeleton of this image, for which different skeleton algorithms can be used.

Algorithm 4.2 Watershed transform w.r.t. topographical distance based on image integration via the Dijkstra-Moore shortest paths algorithm.

 1: procedure ShortestPathWatershed;
 2: Input: lower complete digital grey scale image G = (V, E, im) with cost function cost.
 3: Output: labelled image lab on V.
 4: #define wshed 0   ( label of the watershed pixels )
 5: ( Uses distance image dist. On output, dist[v] = im[v], for all v ∈ V. )
 6:
 7: for all v ∈ V do   ( Initialize )
 8:   lab[v] ← 0 ; dist[v] ← ∞
 9: end for
10: for all local minima m_i do
11:   for all v ∈ m_i do
12:     lab[v] ← i ; dist[v] ← im[v]   ( initialize distance with values of minima )
13:   end for
14: end for
15: while V ≠ ∅ do
16:   u ← GetMinDist(V)   ( find u ∈ V with smallest distance value dist[u] )
17:   V ← V \ {u}
18:   for all v ∈ V with (u, v) ∈ E do
19:     if dist[u] + cost[u, v] < dist[v] then
20:       dist[v] ← dist[u] + cost[u, v]
21:       lab[v] ← lab[u]
22:     else if lab[v] ≠ wshed and dist[u] + cost[u, v] = dist[v] and lab[v] ≠ lab[u] then
23:       lab[v] ← wshed
24:     end if
25:   end for
26: end while
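Step 1 of the components graph construction in Section 4.1.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of [21]: the function name and the return format are hypothetical, the grid is 4-connected, and level components are found by breadth-first flooding.

```python
from collections import deque


def components_graph(im):
    """Return (comp, values, edges): comp maps each pixel to a node index,
    values[n] is the grey value of node n, and edges holds the arcs (v, w)
    between adjacent level components with values[v] < values[w]."""
    rows, cols = len(im), len(im[0])

    def neighbours(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield r + dr, c + dc

    comp = [[-1] * cols for _ in range(rows)]
    values = []
    for r in range(rows):
        for c in range(cols):
            if comp[r][c] != -1:
                continue
            node = len(values)            # new level component
            values.append(im[r][c])
            comp[r][c] = node
            queue = deque([(r, c)])
            while queue:                  # flood pixels of equal grey value
                pr, pc = queue.popleft()
                for qr, qc in neighbours(pr, pc):
                    if comp[qr][qc] == -1 and im[qr][qc] == im[r][c]:
                        comp[qr][qc] = node
                        queue.append((qr, qc))

    edges = set()
    for r in range(rows):
        for c in range(cols):
            for qr, qc in neighbours(r, c):
                if im[r][c] < im[qr][qc]:  # arc from lower to higher component
                    edges.add((comp[r][c], comp[qr][qc]))
    return comp, values, edges
```

Since adjacent level components always differ in grey value, the resulting graph has no plateaus, which is the property the simplified flooding relies on.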




Algorithm 4.3 Watershed transform w.r.t. topographical distance by hill climbing.

 1: procedure Hill Climbing
 2: Input: lower complete digital grey scale image (V, E, im).
 3: Output: labelled image lab on V.
 4: #define wshed 0   ( label of the watershed pixels )
 5:
 6: LabelInit   ( initialize image lab with distinct labels for minima )
 7:             ( and special label mask for all other pixels )
 8: S ← {p ∈ V | ∃ q ∈ N_G(p) : im[p] ≠ im[q]}   ( interior pixels of minima excluded )
 9: while not empty(S) do
10:   select point p ∈ S with minimal grey value
11:   remove p from S
12:   for all q ∈ Γ⁻¹(p) ∩ S do   ( label steepest upper neighbours of p )
13:     if lab[q] = mask then
14:       lab[q] ← lab[p]
15:     else if lab[q] ≠ wshed and lab[q] ≠ lab[p] then
16:       lab[q] ← wshed
17:     end if
18:   end for
19: end while

4.2. Watershed algorithms by topographical distance

Several shortest paths algorithms for the watershed transform with respect to topographical distance can be found in the literature [5,25,26].

Ordered algorithms. The nodes for which the shortest topographical distance is known are ordered w.r.t. their distance. These methods are based upon the shortest paths algorithm associated with the names of Dijkstra [10] and Moore [34].

a. integration: this algorithm is based on integration of the lower slope of the image, by propagating distances starting from the regional minima. The distances are related to the lower slope of the image through the cost function (3.4). On output, the distance value of a pixel p equals f(p), where f is the input image. The pseudocode is given in Algorithm 4.2, which is described in more detail in Section 4.2.2.

b. hill climbing: The geodesics between points of a basin and the corresponding minimum are paths of steepest descent. This relation may be inverted as follows. Label all minima with distinct labels. Starting from the boundary pixels of the minima, label all pixels q in the set $\Gamma^{-1}(p)$ of all steepest upper neighbours of the current pixel p by the label of p, unless q is already labelled and the label differs from that of p, in which case q is classified as a watershed pixel. The pseudocode is given in Algorithm 4.3, see Section 4.2.3 for details.

Unordered algorithms. The shortest path algorithm of Berge [2] assumes no order on the treatment of pixels, so that classical raster scanning modes can be used. This algorithm can be adapted for flooding from the minima and solving the eikonal equation [49]. The implementation is based on an iterative algorithm [25] which integrates the lower slope of the input image, see Algorithm 4.4. In [25] a variant is mentioned based on propagation of labelled pixels to steepest upper neighbours, as in hill climbing.

Algorithm 4.4 Watershed transform w.r.t. topographical distance by sequential scanning based on image integration.

 1: procedure Sequential scanning
 2: Input: lower complete image im on a digital grid G = (D, E) with cost function cost.
 3: Output: labelled image lab on D.
 4: #define wshed 0   ( label of the watershed pixels )
 5: ( Uses distance image dist. On output, dist[v] = im[v], for all v ∈ D. )
 6:
 7: for all v ∈ D do   ( Initialize )
 8:   lab[v] ← 0 ; dist[v] ← ∞
 9: end for
10: for all local minima m_i do
11:   for all v ∈ m_i do
12:     lab[v] ← i ; dist[v] ← im[v]   ( initialize distance with values of minima )
13:   end for
14: end for
15: stable ← true   ( stable is a boolean variable )
16: repeat
17:   for all pixels u in forward raster scan order do
18:     Propagate(u)
19:   end for
20:   for all pixels u in backward raster scan order do
21:     Propagate(u)
22:   end for
23: until stable
24:
25: procedure Propagate(u)
26: for all v ∈ N_G(u) in the future (w.r.t. scan order) of u do
27:   if dist[u] + cost[u, v] < dist[v] then
28:     dist[v] ← dist[u] + cost[u, v]
29:     lab[v] ← lab[u]
30:     stable ← false
31:   else if lab[v] ≠ wshed and dist[u] + cost[u, v] = dist[v] and lab[v] ≠ lab[u] then
32:     lab[v] ← wshed
33:     stable ← false
34:   end if
35: end for

In [25] slightly different versions of the above algorithms are presented which do not produce watershed labels (lines 21-22 in Algorithm 4.2, lines 14-15 in Algorithm 4.3 and lines 30-32 in Algorithm 4.4 are omitted), and therefore are not exact implementations of Definition 3.6. All pixels are merged with some basin, so that, dependent on the order in which pixels are treated, different results may be produced. Unfortunately, a discussion of this point is missing in [25]. In fact, those algorithms are in agreement with the local definition of the watershed transform, as discussed in Section 3.2.3.




Algorithm 4.5 Algorithm for lower completion using a fifo queue.

 1: procedure LowerCompletion
 2: Input: digital grey scale image G = (D, E, im).
 3: Output: lower complete image G = (D, E, lc).
 4:
 5: fifo_init(queue)
 6: for all p ∈ D do   ( Initialize queue with pixels that have a lower neighbour )
 7:   lc[p] ← 0
 8:   if p has a lower neighbour then
 9:     fifo_add(p, queue)
10:     lc[p] ← −1
11:   end if
12: end for
13: dist ← 1   ( dist is an integer variable )
14: fifo_add(fictitious, queue)   ( insert fictitious pixel )
15: while not fifo_empty(queue) do
16:   p ← fifo_remove(queue)
17:   if p = fictitious then
18:     if not fifo_empty(queue) then
19:       fifo_add(fictitious, queue)
20:       dist ← dist + 1
21:     end if
22:   else
23:     lc[p] ← dist
24:     for all q ∈ N_G(p) with (im[q] = im[p] and lc[q] = 0) do
25:       fifo_add(q, queue)
26:       lc[q] ← −1   ( to prevent from queueing twice )
27:     end for
28:   end if
29: end while
30:
31: for all p ∈ D do   ( Put the lower complete values in the output image )
32:   if lc[p] ≠ 0 then
33:     lc[p] ← dist ∗ im[p] + lc[p] − 1
34:   else
35:     lc[p] ← dist ∗ im[p]
36:   end if
37: end for




To solve the plateau problem, the image may first be made lower complete. This can be done by a linear-time breadth-first algorithm using a fifo queue [8] to propagate distances, cf. Algorithm 4.5. In the case of the ordered algorithms, an alternative to lower completion as preprocessing is to use ordered queues. This will be discussed in more detail below. But first we consider the initial step which is necessary in these algorithms, i.e., detection of the minima.
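Lower completion as specified in Algorithm 4.5 can be transcribed into Python roughly as follows. This is a sketch under stated assumptions: the image is a list of rows of integers on a 4-connected grid, the function name is hypothetical, and `None` plays the role of the fictitious pixel.

```python
from collections import deque


def lower_completion(im):
    """Return a lower complete version of im, following Algorithm 4.5."""
    rows, cols = len(im), len(im[0])

    def neighbours(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield r + dr, c + dc

    lc = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed the queue with all pixels that have a lower neighbour.
    for r in range(rows):
        for c in range(cols):
            if any(im[qr][qc] < im[r][c] for qr, qc in neighbours(r, c)):
                queue.append((r, c))
                lc[r][c] = -1
    dist = 1
    queue.append(None)                    # fictitious pixel marks a distance step
    while queue:
        p = queue.popleft()
        if p is None:
            if queue:
                queue.append(None)
                dist += 1
        else:
            r, c = p
            lc[r][c] = dist
            for qr, qc in neighbours(r, c):
                if im[qr][qc] == im[r][c] and lc[qr][qc] == 0:
                    queue.append((qr, qc))
                    lc[qr][qc] = -1       # prevents queueing twice
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if lc[r][c] != 0:
                out[r][c] = dist * im[r][c] + lc[r][c] - 1
            else:
                out[r][c] = dist * im[r][c]   # pixels inside minima
    return out
```

For the one-row image `[[0, 1, 1, 1, 0]]` the plateau of value 1 is split into the values 2, 3, 2, so that every non-minimum pixel obtains a strictly lower neighbour, while an image without non-minimum plateaus is returned unchanged.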

4.2.1. Minima detection

Usually a flooding algorithm based on fifo queues is used for minima detection [22, 29, 32]. However, the union-find algorithm for implementing disjoint sets [48], see also [8, 47], can be used for computing connected components, and therefore for minima detection, as well. In practice the union-find algorithm outperforms the flooding algorithm.

Algorithm 4.6 Computing level components by breadth-first search using a fifo queue.

 1: procedure LevelComponents
 2: Input: digital grey scale image G = (V, E, im).
 3: Output: image lab on V, with labelled level components.
 4: #define init −1   ( initial value of lab image )
 5:
 6: for all p ∈ V do
 7:   lab[p] ← init
 8: end for
 9: curlab ← 1   ( curlab is the current label )
10: fifo_init(queue)
11:
12: for all p ∈ V with lab[p] = init do
13:   lab[p] ← curlab
14:   fifo_add(p, queue)
15:   while not fifo_empty(queue) do
16:     s ← fifo_remove(queue)
17:     for all q ∈ N_G(s) with im[s] = im[q] do
18:       if lab[q] = init then
19:         lab[q] ← curlab
20:         fifo_add(q, queue)
21:       end if
22:     end for
23:   end while
24:   curlab ← curlab + 1
25: end for

FIFO algorithm. Standard flooding (breadth-first) implementations use a fifo queue to find the level components, i.e. the connected components of pixels of constant grey value, cf. Algorithm 4.6. For each component a pixel is stored in an empty fifo queue, followed by a flooding process which runs until the queue is empty. The flooding process consists of removing a pixel from the queue, and inserting into the queue its neighbours with the same grey value that have not been labelled yet. The time complexity is linear in the number of edges of the graph. In practice, the image is a graph with a fixed connectivity k, so that the complexity of the algorithm is linear in the number of pixels of the image. We cannot construct an algorithm with a better time complexity. However, the minimally required size of the queue is not known in advance, and memory is addressed in a very unstructured manner, causing performance degradation on virtual memory and especially on parallel computers, since it requires a lot of synchronization.
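The flooding of level components extends directly to minima detection: a level component is a minimum exactly when none of its pixels has a lower neighbour. The following Python sketch illustrates this under the assumption of a 4-connected grid; the function name and the convention of returning 0 for non-minimum pixels are hypothetical choices made here.

```python
from collections import deque


def label_minima(im):
    """Label the regional minima of im with 1, 2, ... in raster order of their
    first pixel; all other pixels get 0."""
    rows, cols = len(im), len(im[0])
    lab = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    curlab = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            seen[r][c] = True
            component = [(r, c)]
            queue = deque(component)
            is_minimum = True
            while queue:                  # flood one level component
                pr, pc = queue.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    qr, qc = pr + dr, pc + dc
                    if not (0 <= qr < rows and 0 <= qc < cols):
                        continue
                    if im[qr][qc] < im[r][c]:
                        is_minimum = False    # component has a lower neighbour
                    elif im[qr][qc] == im[r][c] and not seen[qr][qc]:
                        seen[qr][qc] = True
                        component.append((qr, qc))
                        queue.append((qr, qc))
            if is_minimum:
                curlab += 1
                for pr, pc in component:
                    lab[pr][pc] = curlab
    return lab
```

Applied to the image of Fig. 7, this labels the four corner pixels as distinct minima and leaves all other pixels unlabelled.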

Figure 10. Disjoint set forest of sets of integers {1, 2, 3, 4, 5}, {6}, {7, 8, 9}.

UNION-FIND algorithm. In the union-find algorithm, disjoint sets are stored in trees, forming a disjoint-set forest, in which each node p is pointing to its parent parent[p]; Fig. 10 gives an example where sets of integers are stored. A node p of a tree is called the root of the tree if parent[p] = p. For each tree, the root is chosen as the representative of the set stored in the tree.

If two sets are merged (united), it is sufficient to change the root of one of the trees such that it points to the root of the other tree. To prevent the height of the tree from increasing too drastically, resulting in longer search times to find representatives, path compression is applied. This means that not only the root, but all nodes on the path from an arbitrary node p to the root, are set to point directly to the root. By this technique the length of paths to roots rarely exceeds 3 in practical cases.

In [47] Tarjan uses a second technique, called union by rank, which also prevents the height of the trees from growing too drastically, by keeping the resulting tree reasonably balanced when merging two trees. In [47] it is shown that the time complexity of the algorithm using both techniques, for an input of size N, is $O(N\,\alpha(N, N))$, where $\alpha(N, N)$ is the inverse of the Ackermann function, whose value is smaller than 5 if N is of the order $10^{80}$. So, in practice, this algorithm can be regarded to run in linear time with respect to its input. When using the algorithm for computing connected components in images, it turns out that only the path compression technique really pays off, and therefore the ranking technique is omitted.

Using the disjoint-set technique, the labelling of connected components can easily be performed in a scan-line fashion, cf. Algorithm 4.7. In this case, the nodes of the trees are pixels. Let $\prec$ denote the lexicographical order between pixels. E.g., in a 2-D image with pixels p = (i, j), q = (k, l), $p \prec q$ denotes that $(i < k) \lor ((i = k) \land (j < l))$; also, $p \preceq q \equiv (p \prec q \lor p = q)$. In the scan-line algorithm, pixels are visited in lexicographical order. Let $p_0$ denote the first pixel, and curpix the current pixel, during scanning. Then the following order on the array parent is maintained: $(\forall p : p_0 \preceq p \preceq curpix : p_0 \preceq parent[p] \preceq p)$. Since this order prevents cycles, we can iteratively evaluate parent to find the root of the tree containing p, denoted by FindRoot(p).

Let p be the current pixel. If p has no neighbours q (with $q \prec p$) with the same image value, a new set is created by setting parent[p] to p. If there exist neighbouring pixels q (with $q \prec p$) that have the same grey value as p, the representatives of these neighbours are computed and the (lexicographically) smallest of them is chosen as the representative of the union of the sets containing these neighbours. Then the paths of these neighbours are compressed using PathCompress( ), and p is merged with this set. In a second pass through the input image, the output image lab is created. All root pixels get a distinct label; for any other pixel p its path is compressed, making explicit use of the order imposed on parent (see line 29 in Algorithm 4.7), and p gets the label of its representative.

This algorithm can be used for the computation of connected components in images of any dimension, size and connectivity, in contrast to the algorithm of Rosenfeld-Pfaltz [44], which works only for 2-dimensional images using 4-connectivity. The same restriction holds for the union-find algorithm in [14] which performs in exact linear time by post-processing each scan line. Also an in-situ variation of the algorithm is possible in which the array parent has been removed. In this case the image lab plays the role of output image and parent array at the same time.
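The two-pass scan-line labelling can be sketched in Python as follows. This is an illustrative transcription of Algorithm 4.7 for a 2-D image with 4-connectivity, not a tuned implementation: pixels are (row, column) tuples, whose built-in tuple ordering coincides with the lexicographical order, and the dictionary `parent` stands in for the array of pointers.

```python
def union_find_labelling(im):
    """Label the level components of im by the scan-line union-find method."""
    rows, cols = len(im), len(im[0])
    parent = {}

    def find_root(p):
        while parent[p] != p:
            p = parent[p]
        return p

    def path_compress(p, r):
        while parent[p] != r:
            nxt = parent[p]
            parent[p] = r
            p = nxt

    pixels = [(r, c) for r in range(rows) for c in range(cols)]

    # First pass: merge each pixel with its already visited equal-valued neighbours.
    for r, c in pixels:
        prev = [(qr, qc) for qr, qc in ((r - 1, c), (r, c - 1))
                if qr >= 0 and qc >= 0 and im[qr][qc] == im[r][c]]
        root = (r, c)
        for q in prev:
            root = min(root, find_root(q))   # lexicographically smallest root
        parent[(r, c)] = root
        for q in prev:
            path_compress(q, root)

    # Second pass: roots get new labels, other pixels inherit from their root.
    lab = [[0] * cols for _ in range(rows)]
    curlab = 1
    for p in pixels:
        if parent[p] == p:
            lab[p[0]][p[1]] = curlab
            curlab += 1
        else:
            parent[p] = parent[parent[p]]    # parent of p now points to a root
            root = parent[p]
            lab[p[0]][p[1]] = lab[root[0]][root[1]]
    return lab
```

The single pointer jump in the second pass suffices because parent[p] precedes p in scan order and has therefore already been resolved to its root.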

We now resume our discussion of watershed algorithms based on topographical distance.

4.2.2. Image integration by the Dijkstra-Moore shortest paths algorithm.

Given a directed weighted graph G = (V, E, w), with $w : E \to \mathbb{N}$ a nonnegative weight function on the arcs, the Dijkstra-Moore algorithm computes the length of the shortest path from a source node s to every other node v [8, 10]. This algorithm can easily be adapted for computing the watershed transform. First, an edge (p, p') in the image is considered as a pair of arcs (p, p') and (p', p) with the same weight. Next, a label image lab and a distance image dist are introduced, just as in the case of Algorithm 4.4, where lab[v] is the index of the minimum nearest to v, and dist[v] is the distance to this minimum [22]. From each minimum a wavefront is started, labelled by the index of the minimum it started in, and the distance is initialized with the value of the minimum, cf. (3.8). If wavefront i reaches a node v after it has propagated over a distance d, and d is less than dist[v], the value d is placed in dist[v], while lab[v] is set to i. If a node v is reached by another wavefront that has propagated over the same distance but originated from a different minimum (if it already carries the label wshed this is also the case), lab[v] is set to the artificial value wshed, designating that v is a watershed pixel. For the pseudo-code, see Algorithm 4.2.

If the input image has non-minima plateaus, it may be first lower completed. An alternative is to keep track of distances to the lower border of plateaus during execution of the algorithm. This can be achieved by the use of ordered queues.
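The wavefront propagation of Algorithm 4.2 can be sketched in Python with a binary heap in place of GetMinDist. Several things here are assumptions rather than part of the algorithm as stated: the grid is 4-connected with unit edge lengths, the minima are supplied as a marker image with positive labels (0 elsewhere), the input is lower complete, and the cost function is the lower-slope form assumed for Eq. (3.4), as transcribed in the comments.

```python
import heapq

WSHED = 0


def shortest_path_watershed(im, markers):
    """Return (lab, dist) for a lower complete image im whose minima are
    labelled with positive integers in markers."""
    rows, cols = len(im), len(im[0])

    def neighbours(r, c):
        return [(r + dr, c + dc) for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols]

    def lower_slope(r, c):
        # Largest drop to a neighbour (0 inside minima); unit edge lengths assumed.
        return max([im[r][c] - im[qr][qc] for qr, qc in neighbours(r, c)] + [0])

    def cost(p, q):
        # Assumed form of Eq. (3.4): slope at the higher pixel, or the mean
        # of both slopes when the two grey values are equal.
        fp, fq = im[p[0]][p[1]], im[q[0]][q[1]]
        if fp > fq:
            return lower_slope(*p)
        if fp < fq:
            return lower_slope(*q)
        return (lower_slope(*p) + lower_slope(*q)) / 2

    lab = [row[:] for row in markers]
    dist = [[im[r][c] if markers[r][c] > 0 else float("inf")
             for c in range(cols)] for r in range(rows)]
    heap = [(dist[r][c], (r, c)) for r in range(rows) for c in range(cols)
            if markers[r][c] > 0]
    heapq.heapify(heap)
    done = set()                          # complement of the set V in Algorithm 4.2
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                      # stale heap entry
        done.add(u)
        for v in neighbours(*u):
            if v in done:
                continue
            new = d + cost(u, v)
            if new < dist[v[0]][v[1]]:
                dist[v[0]][v[1]] = new
                lab[v[0]][v[1]] = lab[u[0]][u[1]]
                heapq.heappush(heap, (new, v))
            elif (lab[v[0]][v[1]] != WSHED and new == dist[v[0]][v[1]]
                  and lab[v[0]][v[1]] != lab[u[0]][u[1]]):
                lab[v[0]][v[1]] = WSHED
    return lab, dist
```

On the image of Fig. 8, with the two minima marked 1 and 2, this sketch reproduces panel (b) of that figure, and the final distance image equals the input image, as stated in the comment of Algorithm 4.2.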

Algorithm 4.7 Scan-line algorithm for labelling level components based on disjoint sets.

 1: procedure union-find-ComponentLabelling
 2: Input: grey scale image im on digital grid G = (D, E).
 3: Output: image lab on D, with labelled level components.
 4: ( Uses array parent of pointers. )
 5:
 6: ( First pass )
 7: for all p ∈ D in lexicographical order do
 8:   r ← p
 9:   for all q ∈ N_G(p) with q ≺ p do
10:     if im[q] = im[p] then
11:       r ← r min FindRoot(q)   ( min denotes minimum w.r.t. lexicographical order )
12:     end if
13:   end for
14:   parent[p] ← r
15:   for all q ∈ N_G(p) with q ≺ p do   ( compress paths )
16:     if im[q] = im[p] then
17:       PathCompress(q, r)
18:     end if
19:   end for
20: end for
21:
22: ( Second pass )
23: curlab ← 1   ( curlab is the current label )
24: for all p ∈ D in lexicographical order do
25:   if parent[p] = p then   ( p is a root pixel )
26:     lab[p] ← curlab
27:     curlab ← curlab + 1
28:   else
29:     parent[p] ← parent[parent[p]]   ( Resolve unresolved equivalences )
30:     lab[p] ← lab[parent[p]]
31:   end if
32: end for
33:
34: function FindRoot(p : pixel)
35:   while parent[p] ≠ p do
36:     r ← parent[p] ; p ← r
37:   end while
38:   return p
39:
40: procedure PathCompress(p : pixel, r : pixel)
41:   while parent[p] ≠ r do
42:     h ← parent[p] ; parent[p] ← r ; p ← h
43:   end while

Implementation by ordered queues. The function GetMinDist in Algorithm 4.2 can be implemented such that it has a time complexity which is linear in the number of pixels of the image. This can be realized with a data structure called hierarchical or ordered queue (OQ), which is a priority queue of N fifo queues, one queue for each of the N grey values in the image, such that the lower grey values have higher priority [5,24]. The OQ processes lower grey levels before higher ones, and is initialized with the labelled border pixels of minima. Pixels with grey value h are inserted in the fifo queue with priority level h of the OQ. Pixels are removed from the OQ by priority, and propagate their labels to (i) non-labelled neighbouring pixels, which are inserted in the OQ, or to (ii) neighbouring labelled pixels still in the OQ, which change to watershed pixels if the propagated label differs from the current label. By using the priority order of grey values, pixels always propagate labels to steepest upper neighbours, except on plateaus, where synchronous breadth-first propagation of labels coming from different pixels of the lower border takes place. Thus an OQ automatically implements a hierarchical order relation between pixels, so that preprocessing to make the input image lower complete can be avoided.
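The ordered queue itself is a small data structure. A minimal Python sketch is given below; the class and method names are hypothetical, grey values are assumed to be integers in the range 0 to N − 1, and a pixel pushed at a level below the one currently being served simply re-opens that level, which is one possible design choice among several.

```python
from collections import deque


class OrderedQueue:
    """Priority queue made of one fifo queue per grey level; lower levels
    are served first, and pixels within a level in arrival order."""

    def __init__(self, num_levels):
        self.queues = [deque() for _ in range(num_levels)]
        self.current = 0          # lowest level that may still hold pixels
        self.size = 0

    def push(self, pixel, level):
        self.queues[level].append(pixel)
        self.current = min(self.current, level)
        self.size += 1

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty ordered queue")
        while not self.queues[self.current]:
            self.current += 1     # skip exhausted levels
        self.size -= 1
        return self.queues[self.current].popleft()

    def __len__(self):
        return self.size


oq = OrderedQueue(4)
for pixel, level in [("a", 2), ("b", 0), ("c", 2), ("d", 0)]:
    oq.push(pixel, level)
order = [oq.pop() for _ in range(4)]   # level 0 first, fifo within each level
```

As long as pixels are only pushed at levels not below the current one, the level pointer moves monotonically upwards, so the total cost of all pop operations is linear in the number of pixels plus the number of grey levels.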

It should be noted however, that the OQ does not always give exactly the same result as when the input image is first lower completed. For example, when the image has a plateau whose pixels, after lower completion, are assigned to different basins without any pixel being labelled as watershed pixel (no pixel is equidistant to two or more minima), the OQ algorithm may nevertheless introduce a watershed line at points where wavefronts coming from different parts of the boundary meet. The exact location of this watershed line is dependent on the processing order, and is biased towards that part of the lower boundary from which the propagation proceeded last.

Remark. Algorithm 4.2 requires updating of the set V: distances and labels are only propagated to pixels which are still in V. In the ordered queue implementation, V is the set of pixels which have not yet entered the OQ, or are still in it. In the case of a lower complete image, one may instead propagate from a pixel u to all neighbours v of u: since the cost function is positive (except on minima plateaus), the computed distance to an already processed pixel $v \notin V$ will always increase, so the algorithm will not change anything for such a pixel v. This entails redundant computation, but has the advantage that no memory is needed to encode the set V. However, when the OQ implementation is used for an image which is not lower complete, and the set V is not properly encoded, a broadening of the watershed line may occur on the interior of plateaus, where the cost function is identically zero.

4.2.3. Hill climbing

Compared to image integration, hill climbing is much simpler since no distances have to be computed: labels are simply propagated to all steepest upper neighbours, see Algorithm 4.3. For a lower complete image, determination of the upstream set $\Gamma^{-1}(p)$ of a pixel p only requires local computation. Again, if an image contains non-minima plateaus, it may first be lower completed. Alternatively, just as above, ordered queues can be used.

If the version of the algorithm is used which does not compute watershed pixels, and the distance values on the edges of the underlying grid are equal to 1 (d(p,q) = 1 in Eq. (3.3)), such as is the case for the 4-connected and 8-connected neighbourhoods mostly used in practice, one may simply replace the upstream set $\Gamma^{-1}(p)$ by all unlabelled neighbours q of p. Because the algorithm processes pixels with lowest grey value first, an unlabelled neighbour of a pixel p is necessarily in the upstream of p, and a labelled pixel never has to be inspected again, since no watershed labels are assigned. This implies that the initial computation of lower distances and cost function can be avoided, leading to a time and memory efficient implementation. But, of course, the result is not exact and dependent on the processing order.
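The simplified variant without watershed pixels can be sketched in a few lines of Python. The sketch assumes a lower complete image on a 4-connected grid whose minima are already labelled with positive integers in a marker image (0 elsewhere); the function name is hypothetical, and a stable sort by grey value replaces the repeated selection of the minimal pixel.

```python
def flood_without_watersheds(im, markers):
    """Propagate the labels of the minima to all pixels, lowest grey value
    first, without producing watershed pixels."""
    rows, cols = len(im), len(im[0])
    lab = [row[:] for row in markers]
    # Stable sort: pixels of equal grey value keep their raster order.
    order = sorted(((r, c) for r in range(rows) for c in range(cols)),
                   key=lambda p: im[p[0]][p[1]])
    for r, c in order:
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            qr, qc = r + dr, c + dc
            if 0 <= qr < rows and 0 <= qc < cols and lab[qr][qc] == 0:
                lab[qr][qc] = lab[r][c]   # unlabelled neighbour joins the basin of p
    return lab
```

With the image of Fig. 8 and raster order as tie-break, this produces labelling (c) of that figure; a different tie-break among pixels of equal grey value can produce labelling (d) instead, which illustrates the dependence on processing order.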

0 1 2 1 0
1 2 3 2 1
2 3 4 3 2
1 2 3 2 1
0 1 2 1 0

Figure 11. Left: image and its corresponding DAG; right: graph after resolving (watershed pixels are surrounded by a box).

4.2.4. Watershed transform by UNION-FIND algorithm

The union-find algorithm described in Section 4.2.1 can be modified to compute the watershed transform itself [23], by the following steps.

1. First, plateaus have to be removed from the image f by computing the lower completion $f_{LC}$ of f, see Algorithm 4.5. The last loop in the algorithm can be slightly adapted to label the minima pixels of f (i.e., pixels p with lc[p] = 0) as well.

2. From the lower complete image, the lower complete graph G = (V, E) is constructed (see Definition 3.5), which is a directed acyclic graph (DAG). See Fig. 11 for an example. The DAG is stored in an array sln, where sln[p,i] is a pointer to the ith steepest lower neighbour of pixel p (the number of steepest lower neighbours is at most the connectivity). For each minimum m, one pixel $r \in m$ is chosen as the representative of this minimum, and a pointer is created from r to itself. The array sln plays the role of parent in the level components algorithm, but note that a node can now have more than one parent (steepest lower neighbour). Therefore the graph G is not a disjoint set forest, as in the case of connected components. The DAG can be constructed in a single pass scan-line algorithm, in which for each pixel only its neighbours are referenced.

3. The last step is to apply the union-find algorithm to the DAG. The first pass is similar to that of Algorithm 4.7. The resolving step has to be modified so that watershed pixels can be detected, which are points having paths in the DAG to distinct roots. For the pseudo-code of the resolving algorithm, which closely resembles Tarjan's FindRoot operation [47], see Algorithm 4.8.

This technique computes the exact watershed transform by topographical distance [25]. A similar approach was developed by Bieniek et al. [7], based on earlier work [6] on parallel implementation of the watershed transform. However, these authors use the local condition in which no watershed pixels are computed (see Sections 3.2.3, 5.2.3); when several steepest lower neighbours exist, one of them is arbitrarily chosen. Therefore, that algorithm, sometimes referred to as rainfalling, is a variant of the watershed transform by union-find, where the graph is not a DAG, but a disjoint-set forest.




Algorithm 4.8 Watershed transform w.r.t. topographical distance based on disjoint sets.

procedure union-find-Watershed
Input: lower complete graph G = (V, E).
Output: labelled image lab on V.
#define wshed 0        (* label of the watershed pixels *)
#define W (-1,-1)      (* fictitious coordinates of the watershed pixels *)

LabelInit              (* initialize image lab with distinct labels for minima *)

for all p ∈ V do       (* give p the label of its representative *)
    rep ← Resolve (p)
    if rep ≠ W then
        lab[p] ← lab[rep]
    else
        lab[p] ← wshed
    end if
end for

function Resolve (p : pixel)
(* Recursive function for resolving the downstream paths of the lower complete graph. *)
(* Returns representative element of pixel p, or W if p is a watershed pixel *)
    i ← 1 ; rep ← (0, 0)                       (* some value such that rep ≠ W *)
    while (i ≤ CON) and (rep ≠ W) do           (* CON indicates the connectivity *)
        if (sln[p,i] ≠ p) and (sln[p,i] ≠ W) then
            sln[p,i] ← Resolve (sln[p,i])
        end if
        if i = 1 then
            rep ← sln[p,1]
        else if sln[p,i] ≠ rep then
            rep ← W
            for j ← 1 to CON do
                sln[p,j] ← W
            end for
        end if
        i ← i + 1
    end while
    return rep
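To make the resolving step concrete, the following Python sketch mirrors Algorithm 4.8 on the example image of Fig. 11. It is a sketch under stated assumptions rather than the authors' implementation: the image is assumed to be lower complete with single-pixel minima (so each minimum is its own representative), 4-connectivity with unit edge costs is used, and `sln` is a dictionary of lists instead of a fixed-size array. The helper names `build_sln` and `union_find_watershed` are hypothetical.

```python
W = (-1, -1)   # fictitious coordinates marking a watershed pixel
WSHED = 0      # label given to watershed pixels


def build_sln(image):
    """Map each pixel to the list of its steepest lower neighbours.

    A pixel without lower neighbours is taken to be a single-pixel minimum
    and points to itself (assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    sln = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= r + dr < rows and 0 <= c + dc < cols]
            lowest = min(image[y][x] for y, x in nbrs)
            if lowest < image[r][c]:
                sln[(r, c)] = [q for q in nbrs if image[q[0]][q[1]] == lowest]
            else:
                sln[(r, c)] = [(r, c)]
    return sln


def resolve(p, sln):
    """Return the representative of p, or W if p reaches distinct roots."""
    targets = sln[p]
    rep = None
    for i, q in enumerate(targets):
        if q != p and q != W:
            targets[i] = resolve(q, sln)     # recursive path compression
        if i == 0:
            rep = targets[0]
        elif targets[i] != rep:
            rep = W                          # two different roots reached
        if rep == W:
            targets[:] = [W] * len(targets)  # mark p as a watershed pixel
            break
    return rep


def union_find_watershed(image):
    sln = build_sln(image)
    minima = [p for p, t in sln.items() if t == [p]]
    label_of = {m: k for k, m in enumerate(minima, start=1)}   # LabelInit
    lab = {}
    for p in sln:
        rep = resolve(p, sln)
        lab[p] = label_of[rep] if rep != W else WSHED
    return lab


image = [
    [0, 1, 2, 1, 0],
    [1, 2, 3, 2, 1],
    [2, 3, 4, 3, 2],
    [1, 2, 3, 2, 1],
    [0, 1, 2, 1, 0],
]
lab = union_find_watershed(image)
```

For this image the pixels of the middle row and the middle column each have downstream paths to two different corner minima, so they receive the watershed label, while each 2x2 corner block takes the label of its own minimum.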




5. Parallelization

In this section we first make some general remarks about parallel computer systems and parallel programming. Then a review of parallelization strategies for the watershed transform is given, for both distributed and shared memory architectures.

5.1. General considerations

5.1.1. Parallel computer systems

A standard classification of parallel computer systems into four types is due to Flynn, see [15,41] for details. The two types most often encountered in practice are SIMD (Single Instruction, Multiple Data), and MIMD (Multiple Instruction, Multiple Data). In a SIMD computer all processor elements simultaneously execute the same operation on different data items, whereas in a MIMD machine the processors may execute different operations on their own data. MIMD computers are more flexible, but are in general more difficult to program. Both SIMD and MIMD computers can be either of the shared memory or distributed memory type. In a shared memory parallel computer, there are a number of processors and a single (large) memory which is accessible to all processors. In contrast, in a distributed memory architecture, each processor has its own local memory and a processor can retrieve data in the memory of another processor by messages over a communication network.

The performance of a parallel computer is very much dependent on the bandwidth of the connection of the processors to the memory, that is, the maximum number of simultaneous load or store operations per time unit. Shared memory systems typically have a bandwidth problem since there is only a single memory, so that conflicts may arise when many processors try to access the same memory locations. On the other hand, distributed memory MIMD machines have the disadvantage that the communication between processors is much slower than for shared memory machines, so that the synchronization overhead is much higher when tasks have to communicate. This mismatch between communication vs. computational speed often makes communication the speed-limiting factor on distributed memory MIMD architectures, while memory congestion is usually the speed-limiting factor on shared memory systems. The maximum amount of work a process can perform before communication with other processors becomes necessary is called the granularity or grain size. Load balancing, i.e. ensuring equal work load of different processors during program execution, is an important requirement of parallel program design. In this context, an important issue is that of mapping, i.e. the assignment of tasks to processors. This may be done statically at initialization, or dynamically during execution of the program.

5.1.2. Parallel programming models

Various parallel programming models exist. In message-passing programming, tasks are created, which interact by sending and receiving messages. The approach most often used is called SPMD (single program multiple data), meaning that every processor runs the same program, performing operations on its own data space. In the shared-memory programming model, tasks share a common address space. Mechanisms such as locks and semaphores [11] may be used to control access to the shared memory. Below we will compare implementations of the watershed transform on distributed memory machines making use of message passing, and on shared memory architectures where synchronization takes place through shared variables.

5.1.3. Classification of parallel watershed algorithms

The following classification of current parallel implementations of the watershed transform can be made:

domain decomposition: distribute the image over the processors in a regular way (static mapping) and use a sequential algorithm for the subimage in each subdomain. Insert synchronization and communication points where the result depends on neighbouring subdomains. Merge subresults to obtain the final solution.

functional decomposition: when simulating flooding from local minima, distribute the local minima over the processors. In this case, the efficiency depends crucially on the number of local minima, and the sizes of the corresponding basins. Load imbalance may arise when the sizes of basins differ significantly.

5.1.4. Speed versus scalability

Let N be the number of processors used. Define T(N) to be the running time between the moment that the first processor starts and the moment that the last processor finishes. Speedup of the parallel algorithm is measured by:

$$SP(N) = \frac{T_1}{T(N)},$$

where $T_1$ is the execution time of the fastest serial algorithm on one processor. Often, $T_1$ is replaced by the time needed to execute the algorithm, which formed the starting point for parallelization, on one processor; then one speaks about relative speedup. Efficiency is defined as

$$E(N) = SP(N)/N.$$

A quality measure for the efficiency of a parallel algorithm is how close the efficiency is to unity, i.e., how well the speedup curve approximates the linear function SP(N) = N. Speedup depends critically upon the amount of sequential computation. If f is the fraction of such sequential operations, then Amdahl's law states that the maximum speedup achievable obeys [41]:

$$SP(N) \le \frac{1}{f + \frac{1-f}{N}}.$$

This implies that a small number of sequential operations can drastically limit the achievable speedup, since $SP(N) \le 1/f$, no matter how many processors are used.
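As a small numerical illustration of these definitions, the Python sketch below evaluates speedup, efficiency and the Amdahl bound. The function names and the sample values (a sequential fraction of 5% and the chosen processor counts) are assumptions made for the example only and are not taken from the measurements discussed in this paper.

```python
def speedup(t_serial, t_parallel):
    """SP(N) = T1 / T(N)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n):
    """E(N) = SP(N) / N."""
    return speedup(t_serial, t_parallel) / n


def amdahl_bound(f, n):
    """Upper bound on SP(N) when a fraction f of the work is sequential."""
    return 1.0 / (f + (1.0 - f) / n)


# Hypothetical sequential fraction of 5 percent.
for n in (1, 16, 128, 1024):
    print(n, round(amdahl_bound(0.05, n), 2))
```

With f = 0.05 the bound can never exceed 1/f = 20, however large N becomes, so at N = 128 the efficiency is already well below one sixth.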

Usually, speedup is an increasing function of the problem size, since overhead costs, such as creating processes, input/output and process synchronization, are constant or increase more slowly than grain size. Note that an algorithm can be slow but at the same time have good scaling properties.




5.2. Watershed implementation on distributed memory architectures

In the case of the watershed algorithm, usually domain decomposition is used on distributed memory architectures. Granularity depends on the distribution of data among processors, the number of processors and the image content. When many subimages are used, the grains are small with a relatively large number of pixels on the boundary between subdomains, requiring more communication.

In all algorithms discussed in this subsection, the image is distributed in stripes or blocks $D_i$, $i = 1, \ldots, N$ over all processors. Each processor has access to an overlap region between domains, determined by the neighbourhood $N_G(p)$ of each boundary pixel. By $D_i^+ = \bigcup_{p \in D_i} N_G(p)$ is denoted the extension of subdomain $D_i$, and by $D_i^- = D_i \cap \bigl(\bigcup_{j \neq i} D_j^+\bigr)$ the pixels of $D_i$ which have outside neighbours. Pixels in boundary regions are written only by the process to which the subdomain is assigned, but are available for reading by processors of neighbouring subdomains. An approach where a division of the image in rectangular blocks is used naturally leads to an implementation where the processors are connected in a rectangular mesh topology, which for example is easily realizable by a transputer system (each transputer having four communication links).
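The two sets just defined, the extension of a subdomain and its pixels with outside neighbours, can be computed as in the following Python sketch. It assumes a decomposition into horizontal stripes and 4-connectivity; the function names `stripes`, `extension` and `boundary` are hypothetical and chosen for this illustration only.

```python
def neighbours(p, rows, cols):
    """4-connected neighbours of pixel p inside a rows x cols image."""
    r, c = p
    return {(r + dr, c + dc)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
            if 0 <= r + dr < rows and 0 <= c + dc < cols}


def stripes(rows, cols, n):
    """Split the image into n horizontal stripes of nearly equal height."""
    bounds = [rows * i // n for i in range(n + 1)]
    return [{(r, c) for r in range(bounds[i], bounds[i + 1]) for c in range(cols)}
            for i in range(n)]


def extension(domain, rows, cols):
    """Union of the neighbourhoods of all pixels of the subdomain."""
    ext = set()
    for p in domain:
        ext |= neighbours(p, rows, cols)
    return ext


def boundary(i, domains, rows, cols):
    """Pixels of subdomain i lying in the extension of another subdomain."""
    others = set()
    for j, d in enumerate(domains):
        if j != i:
            others |= extension(d, rows, cols)
    return domains[i] & others


rows, cols = 6, 4
domains = stripes(rows, cols, 2)           # rows 0-2 and rows 3-5
ext0 = extension(domains[0], rows, cols)   # rows 0-3
bnd0 = boundary(0, domains, rows, cols)    # row 2
```

In this small example the extension of the first stripe adds one row of the second stripe, which is the overlap region each processor would read from its neighbour.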

Speedups are usually measured excluding the time needed for image loading, distribution, retrieval and saving.

5.2.1. Hill climbing by ordered queues

Parallelization of the watershed transform by ordered queues is discussed in [27]. The algorithm does not construct watershed lines. The program uses the SPMD approach with synchronization by messages from and to a master process. The image is distributed in blocks. The steps in the watershed computation are:

1. Minima detection: plateaus are examined by breadth-first scans in each subimage using a fifo queue. If a plateau is spread over different subdomains, communication between processors is necessary during which merging of parts in different subdomains takes place. This may require repeated communication until stabilization (i.e. no more changes occur).

2. Flooding by local OQs: each processor performs flooding in its own subdomain based on ordered queues, as in the sequential algorithm. To allow flooding to propagate to neighbouring subdomains, two approaches have been considered. In the first one [32] processors are tightly synchronized at each grey level by analyzing border pixels of subdomains whose steepest lower neighbours (or, when these do not exist, neighbours of the same grey value) are in the extension area of the subdomains. When a processor reaches synchronization level h, labels and values in its extension area are exchanged with neighbouring processors. Communication and reflooding take place until the label propagation stabilizes, as detected by the master process. Due to this tight synchronization considerable idle times are introduced, since processors do not execute the same code at approximately the same time. A second approach [27] first performs local flooding at all grey levels in the subdomain, followed by communication and reflooding until the label propagation stabilizes. This reduces the amount of communication necessary for reflooding.




Performance measurements. Speedups for both schemes are reported in [27, Ch. 2]. The tight synchronization scheme was implemented on a Parsytec Supercluster 128, which is a massively parallel reconfigurable network of transputers, under PIPS (Parallel Image Processing System) [38]. (An initial implementation on a loosely coupled cluster of workstations using the PVM (Parallel Virtual Machine) package [40] resulted in marginal speedups with efficiency deteriorating quickly as the number of processes was increased [32]). The second scheme was implemented on a Cray T3D MIMD distributed memory architecture with 256 nodes using MPI (Message Passing Interface) [16]. The experimental results show a moderate increase of speedup with number of processors for some images, the speedup for the second scheme being almost twice as high as that of the first scheme. (Note however, that these schemes were implemented on different architectures.) For natural images, efficiency ranges from 25-50% at 16 processors, to 10% or less at 128 processors. However, both stages of the algorithm (minima detection and flooding) are very data dependent, leading to load imbalance. For artificial images with large or snake-like plateaus spread over different subdomains, speedup may be marginal or even decrease with number of processors, due to extensive relabelling. Also, better performance is not always obtained for larger images.

5.2.2. Hill climbing and rainfalling after lower completion

Hill climbing, with lower completion as preprocessing, was considered by Moga et al. [27, 28], effectively using, but not explicitly introducing, the local condition of Definition 3.7. In addition, the rainfalling algorithm was studied, see Section 4.2.4. For both algorithms, the steps are: (i) minima detection, (ii) lower completion, (iii) flooding (by hill climbing and rainfalling, respectively).

Minima detection with lower completion on non-minima plateaus again requires repeated communication until stabilization to achieve global consistency. The flooding step is considered as labelling each vertex in the lower complete graph by the label of the minimum to which it is connected by a path. The procedure of choosing arbitrarily one of the steepest lower neighbours of a given pixel, in case several exist, turns the DAG into a disjoint-set forest. This reduces the amount of non-locality, but introduces scanning order dependence (cf. Section 4.2.4).

For the rainfalling algorithm, the forest is labelled inside subdomains as described above, using a fifo queue to store root pixels of not yet resolved paths. Processors perform communication with neighbours as long as there are unresolved paths in their subdomain. But, since a processor can decide locally when to terminate its calculation, no global reduction operation is necessary; also no relabelling or synchronization between paths is needed. In the case of hill climbing, each processor initializes by a raster scan a fifo queue with border pixels of minima in its subdomain. A pixel p removed from the queue propagates its label to all pixels q for which there is an arc from q to p in the lower complete graph. Labels are repeatedly exchanged with neighbouring processors through the extension area, initiating new labelling. A processor becomes inactive as soon as all pixels in its subdomain have been labelled. Summarizing, plateaus are treated in breadth-first order, while labelling proceeds along paths generated by depth-first search for rainfalling and by breadth-first search for hill climbing.
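The effect of choosing a single steepest lower neighbour per pixel can be illustrated by a sequential, single-domain Python sketch of rainfalling. It leaves out the fifo queue and all inter-processor communication, and it assumes a lower complete image with single-pixel minima and 4-connectivity; the function name `rainfalling` and the tie-breaking rule (first neighbour in a fixed scan order) are assumptions of this sketch.

```python
def rainfalling(image):
    """Label every pixel with the minimum reached by following one
    steepest lower neighbour per pixel (a disjoint-set forest)."""
    rows, cols = len(image), len(image[0])
    parent = {}
    for r in range(rows):
        for c in range(cols):
            best = (r, c)                    # minima remain their own root
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and image[nr][nc] < image[best[0]][best[1]]):
                    best = (nr, nc)          # strict '<' keeps the first of equals
            parent[(r, c)] = best

    def find_root(p):
        path = []
        while parent[p] != p:
            path.append(p)
            p = parent[p]
        for q in path:
            parent[q] = p                    # path compression
        return p

    return {p: find_root(p) for p in list(parent)}


image = [
    [0, 1, 2, 1, 0],
    [1, 2, 3, 2, 1],
    [2, 3, 4, 3, 2],
    [1, 2, 3, 2, 1],
    [0, 1, 2, 1, 0],
]
roots = rainfalling(image)
```

Every pixel ends up in exactly one basin and no watershed pixels occur; which basin a pixel on the divide joins depends only on the order in which neighbours are inspected.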




Performance measurements. Implementations were carried out on a Parsytec Supercluster 128 under PIPS [38] and on Cray T3D under MPI. Speedup curves are rather similar for rainfalling and hill climbing, with rainfalling having shorter running times but somewhat lower speedup. Efficiency decreases with increasing number of processors, and is very data dependent in the case of artificial scenes ($E(128) \approx 25\%$ on the Parsytec system, $E(128) \approx 12.5\%$ on the Cray T3D). Compared to the implementation using ordered queues (cf. Section 5.2.1) the time spent for flooding has been reduced, but the time of the first stages has increased due to the lower distance computation. Overall execution time has not improved significantly. An advantage may be that ordinary queues are easier to implement correctly than ordered queues.

5.2.3. Hill climbing by ordered queues combined with a connected component operator

Parallelization of the hill climbing algorithm combined with a connected component operator has been considered by Bieniek et al. [6] using the local condition of Definition 3.7, and by Moga et al. [27,30]. We first describe the former approach [6].

The main idea is to solve the watershed problem independently on all subdomains without synchronization. Instead temporary labels are assigned to pixels which will be flooded from adjacent subdomains. The boundary connectivity information is stored in a graph or equivalence table. Global labels are computed by a reduction operation using the resolving step as in the union-find algorithm (cf. Section 4.2.1). If N is the number of processors, computation of the global labels then takes $\log_2 N$ steps, independent of the complexity of the data. The latter problem is strongly related to the connected component labelling problem [1,13,45].

The algorithm for images without plateaus is as follows:

1. Give all local minima in each domain $D_i$ a globally unique label, using information from $D_i^+$.

2. Give all pixels p of $D_i$ a temporary label (globally unique) if the downstream neighbours of p are in another subdomain. The set of boundary pixels requiring a temporary label is thus

$$D_i^{temp} = \{\, p \in D_i \mid \Gamma(p) \cap (D_i^+ \setminus D_i) \neq \emptyset \,\}. \qquad (5.1)$$

3. Produce a watershed segmentation consistent with Definition 3.7, independently on each subdomain, using the minima and temporary labels as seeds for basins. By using ordered queues, non-minima plateaus which are completely within a subdomain will be flooded in accordance with Definition 3.7.

4. Merge subdomains pairwise: give all labels of the subdomains globally consistent values by linking basins, which have grown from pixels p with a temporary label, to basins in the downstream of p.

An efficient implementation of step 4 in this algorithm can be based upon the union-find algorithm, as discussed in Section 4.2.1.
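A minimal sketch of how step 4 might use union-find is given below in Python. It only shows the resolution of an equivalence table between temporary and global labels; the class name `LabelUnionFind`, the method `link` and the label values are hypothetical, and the sketch does not model the pairwise exchange of boundary information between processors.

```python
class LabelUnionFind:
    """Disjoint sets over basin labels, with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:        # path compression
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def link(self, temp_label, downstream_label):
        """Attach the basin grown from a temporary label to the basin
        found downstream, so that the downstream root survives."""
        a = self.find(temp_label)
        b = self.find(downstream_label)
        if a != b:
            self.parent[a] = b


uf = LabelUnionFind()
# Hypothetical equivalence table: temporary labels 101 and 102 were created
# on subdomain borders; 102 drains into 101, and 101 drains into basin 1.
uf.link(102, 101)
uf.link(101, 1)

local_labels = [1, 101, 102, 2]
global_labels = [uf.find(label) for label in local_labels]
```

Because temporary labels are always linked towards their downstream basin, the surviving root of each set is the label of a genuine minimum, provided the links contain no cycles.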

As an example, consider Fig. 12. Figure 12(b) shows a watershed segmentation of the image in Fig. 12(a) consistent with D
