arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the...

16
Ricci Curvature Based Volumetric Segmentation of the Auditory Ossicles Na Lei 1 , Jisui Huang 2 ,Yuxue Ren 3,* ,Emil Saucan 4 ,Zhenchang Wang 5 1 Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian University of Technology, Dalian, 116600 China (e-mail: [email protected]). 2 Beijing Advanced Innovation Center for Imaging Theory and Technology, Capital Normal University, Beijing, 100084 China (e-mail: [email protected]). 3,* (corresponding author) Beijing Advanced Innovation Center for Imaging Theory and Technology, Capital Normal University, Beijing, 100084 China (e-mail: snow [email protected]) 4 the Applied Mathematics Department, ORT Braude College of Engineering, Karmiel 2161002, Israel (e-mail: [email protected]). 5 Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China (e-mail: [email protected]). Abstract The auditory ossicles that are located in the middle ear are the smallest bones in the human body. Their damage will result in hearing loss. It is therefore important to be able to automatically diagnose ossicles’ diseases based on Computed Tomography (CT) 3D imaging. However CT images usually include the whole head area, which is much larger than the bones of interest, thus the localization of the ossicles, followed by segmentation, both play a significant role in automatic diagnosis. The commonly employed local segmentation methods require manually selected initial points, which is a highly time consuming process. We therefore propose a completely automatic method to locate the ossicles which requires neither templates, nor manual labels. It relies solely on the connective properties of the auditory ossicles themselves, and their relationship with the surrounding tissue fluid. For the segmentation task, we define a novel energy function and obtain the shape of the ossicles from the 3D CT image by minimizing this new energy. Compared to the state-of-the-art methods which usually use the gradient operator and some normalization terms, we propose to add a Ricci curvature term to the commonly employed energy function. We compare our proposed method with the state-of-the-art methods and show that the performance of discrete Forman-Ricci curvature is superior to the others. Keywords— ossicles, automatic location, automatic segmentation, Ricci curvature, Forman curvature 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in the human body. Figure 1 [1] shows the anatomy of the ear, including the ossicles located in middle ear. The function of the ossicles is to transmit sounds from the air to the cochlea and their damage produces hearing loss. It is therefore important to automatically diagnose ossicles disease. Computed tomography (CT), which produces 3D image of areas inside the body, is an extensively employed and efficient technology for disease diagnosis. The ossicles localization followed by segmentation from 3D CT image play a significant role in auto- matically diagnose. Once the ossicles be segmented one can exactly construct surface triangular mesh for further analysis. The 3D CT image for the auditory ossicles always includes the whole head area, which is pretty large compared to the small bones we consider. So the automatic localization of the ossicles can represent a labor saving progress relative to the currently employed method. On the other hand, the automatic segmentation of the ossicles can simplify the reconstruction of the ossicles’ shapes, allowing for the easy comparison with that of the healthy ossicles. In general, location is a prerequisite for segmentation. The commonly employed local segmentation methods require manually selected initial points, rendering the procedure highly time consuming. To raise the automation level and help accurately segmenting the ossicles from the 3D CT image, we first propose a new method to locate the ossicles which requires neither template, nor manual labels. It relies on some of the connectivity properties of the auditory ossicles themselves and their relationship with the surrounding tissue fluid. Then arXiv:2006.14788v2 [cs.CV] 16 Aug 2020

Transcript of arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the...

Page 1: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

Ricci Curvature Based Volumetric Segmentation of the Auditory

Ossicles

Na Lei1, Jisui Huang2,Yuxue Ren3,∗,Emil Saucan4,Zhenchang Wang51Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province,

Dalian University of Technology, Dalian, 116600 China (e-mail: [email protected]).2Beijing Advanced Innovation Center for Imaging Theory and Technology,

Capital Normal University, Beijing, 100084 China (e-mail: [email protected]).3,∗(corresponding author) Beijing Advanced Innovation Center for Imaging Theory and Technology,

Capital Normal University, Beijing, 100084 China (e-mail: snow [email protected])4the Applied Mathematics Department, ORT Braude College of Engineering,

Karmiel 2161002, Israel (e-mail: [email protected]).5Beijing Friendship Hospital, Capital Medical University,

Beijing 100050, China (e-mail: [email protected]).

Abstract

The auditory ossicles that are located in the middle ear are the smallest bones in the human body.Their damage will result in hearing loss. It is therefore important to be able to automatically diagnoseossicles’ diseases based on Computed Tomography (CT) 3D imaging. However CT images usuallyinclude the whole head area, which is much larger than the bones of interest, thus the localizationof the ossicles, followed by segmentation, both play a significant role in automatic diagnosis. Thecommonly employed local segmentation methods require manually selected initial points, which isa highly time consuming process. We therefore propose a completely automatic method to locatethe ossicles which requires neither templates, nor manual labels. It relies solely on the connectiveproperties of the auditory ossicles themselves, and their relationship with the surrounding tissue fluid.For the segmentation task, we define a novel energy function and obtain the shape of the ossiclesfrom the 3D CT image by minimizing this new energy. Compared to the state-of-the-art methodswhich usually use the gradient operator and some normalization terms, we propose to add a Riccicurvature term to the commonly employed energy function. We compare our proposed method withthe state-of-the-art methods and show that the performance of discrete Forman-Ricci curvature issuperior to the others.

Keywords— ossicles, automatic location, automatic segmentation, Ricci curvature, Forman curvature

1 Introduction

The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallestbones in the human body. Figure 1 [1] shows the anatomy of the ear, including the ossicles located in middleear. The function of the ossicles is to transmit sounds from the air to the cochlea and their damage produceshearing loss. It is therefore important to automatically diagnose ossicles disease. Computed tomography (CT),which produces 3D image of areas inside the body, is an extensively employed and efficient technology for diseasediagnosis. The ossicles localization followed by segmentation from 3D CT image play a significant role in auto-matically diagnose. Once the ossicles be segmented one can exactly construct surface triangular mesh for furtheranalysis.

The 3D CT image for the auditory ossicles always includes the whole head area, which is pretty large comparedto the small bones we consider. So the automatic localization of the ossicles can represent a labor saving progressrelative to the currently employed method. On the other hand, the automatic segmentation of the ossicles cansimplify the reconstruction of the ossicles’ shapes, allowing for the easy comparison with that of the healthyossicles. In general, location is a prerequisite for segmentation. The commonly employed local segmentationmethods require manually selected initial points, rendering the procedure highly time consuming. To raise theautomation level and help accurately segmenting the ossicles from the 3D CT image, we first propose a new methodto locate the ossicles which requires neither template, nor manual labels. It relies on some of the connectivityproperties of the auditory ossicles themselves and their relationship with the surrounding tissue fluid. Then

arX

iv:2

006.

1478

8v2

[cs

.CV

] 1

6 A

ug 2

020

Page 2: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

Figure 1: Anatomy of the Ear [1]

we define a new energy function and segment the ossicles by minimize the energy. Most of the state-of-the-artmethods use the gradient operator and some normalization terms. However, the widely adopted gradient operatoronly takes into account the first order derivative of the energy function. It is obviously that the second orderderivative will provide more information. More precisely, curvature – i.e. the shape operator – represents thesecond derivative depending geometric tool that we shall prove efficient in the achievement of our proposed goal.

To this end we add a Ricci curvature [2] term to the energy function [3]. The energy function usually constructsa triangular grid according to the initial point to approximate the object, so the accuracy of the solution highlydepends on the selection of the initial point. If there are too few initial points, it is difficult for the triangularmesh to approximate the original shape of the object. If there are too many initial points, it will consume toomuch labor. We use the localization of the ossicles position as the initial point of the local segmentation method.Since it is already a three-dimensional connected component, a natural mesh can be constructed based on theboundary. So we no longer need to guess the shape of the mesh based on a few initial points as before. We nextproceed to the segmentation of the exact shape of the ossicles. Our results show that the Ricci curvature is indeedsuperior to the original energy function method based on gradient operator in the localization and segmentationtasks at hand.

2 Related Work

2.1 3D Image Segmentation

Image segmentation is the most important step in medical imaging. There exists a vast amount of literaturededicated to segmentation algorithms. Here we only review some energy based methods which are related toour method. The use of deformable models for image segmentation has been pioneered by Kass et al. [4] withthe introduction of the so called snakes: A snake is an energy-minimizing spline guided by external constraintforces and influenced by image forces that pull it toward features such as lines and edges. Terzopoulos et al. [5]expanded to three dimensions this approach which was initially used only in the two dimensional setting. In [6],Delingette introduced physically-based approach, resting on the geometry of simplicial meshes. McInerney andTerzopoulos [7] proposed a class of deformable models by formulating deformable surfaces in terms of an affinecell image decomposition.

Level-sets were introduced by Osher and Sethian [8] and further popularized in computer vision and imageanalysis by Malladi et al. [9]. The level sets based method has the ability of splitting freely to represent each

Page 3: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

object when the considered image contains more than one object of interest. Leventon et al. [10] extended theoriginal energy by introducing a representation for deformable shapes and defining a probability distribution overthe variances of a set of training shapes. A frequent criticism of this approach is that the signed distance functionwhich the shape model assumes, does not form a linear space, which can lead to the generation of irregular shapesif the shape of the studied object changes abruptly. Nevertheless, the approach quickly became popular andwas extended in several directions, among others by Tsai et al. [11] who employed Leventon’s modeling methodwith a region-based energy functional. Pohl et al. [12] utilized LogOdds to obtain a shape representation thatencodes the shape of an anatomical structure as well as the variations within, in a manner suited for medicalimaging. Recently, Chen et al. [3] presented a 3D selective segmentation model which is inherently suited forefficient implementation.

2.2 Ricci Curvature

Curvature plays an important role in geometry since it represents a measure to quantify the deviation of ageometrical object from being intuitively flat. Amongst the various, classical existing types of curvature, sectionalcurvature is the most expressive one, and as such it has which has wide range of practical applications, in particularin complex networks. In particular, Eckmann et al. [13] used curvature to define a World Wide Web landscapewhose connected regions of high curvature characterize a common topic. Shavitt et al. [14] used curvaturecalled internet geometric curvature to captures the property of Internet. Saucan and Appleboim [15] introduceda new clustering method for vertex- and edge-weighted networks which is based upon a generalization of thecombinatorial curvature.

Ricci curvature is a more abstract and less intuitive concept. It became widely known to larger audiences,beyond the circle of differential geometers, after Perelman’s seminal work on the Ricci flow [16, 17], and itsconsequent use in solving the Poincare Conjecture. Spurred partly by the renewed interest in the Ricci flow,Chow and Luo [18] showed that the circle-packing based combinatorial analogue of Hamilton’s Ricci flow forsurfaces [19], behaves similarly to the classical one.

This approach proved to be extremely malleable and useful in a variety of applications in graphics, imagingand complex networks, as demonstrated by the work of Gu et al. [20–23]. A quite different and elegant approachto discretize the Ricci curvature, in particular to networks, is due to Ollivier [24]. However, by its very defi-nition it requires solving an optimal transport problem that reduced to a linear programming problem, whichrenders computations slow and laborious, especially on large scale networks. Yet another viewpoint towards thediscretization of Ricci curvature, based upon the connection between curvature and the Laplacian operator, wasadopted by Forman [25]. Forman’s method applies to weighted cell complexes and it was adapted to the case ofundirected (as well as directed) networks by Sreejith et al. [2]. It is this last type of Ricci curvature that we makeappeal to in the present paper, due to its simplicity and effectiveness in computations, even on very large scaleCT image.

3 Discrete Ricci Curvature

We begin by briefly recalling the classical definition of Ricci curvature of smooth manifolds (for further detailssee, e.g. [26]).

Suppose that (M, g) is an n-dimensional Riemannian manifold, let p be a point in M , let TpM denotethe tangent space of M at a point p, and let ζ, ξ, η ∈ TpM . Define g : TpM × TpM × TpM → TpM , asg(ζ, η, ξ) = R (ζ, η) ξ, where R denotes the Riemannian curvature tensor.

Then the Ricci curvature tensor at p Ricp is the billiniear map Ricp : TpM × TpM → R define as:

Ricp (ξ, η) = trace (g) ; (1)

for any ξ, η ∈ TpM × TpM . If ξ is a unit tangent vector, Ricci curvature of ξ, Ric (ξ) is defined naturally asfollows:

Ric (ξ) = Ric (ξ, ξ) . (2)

If eini=1 is an orthonormal basis of TpM , then the scalar curvature (at p) is defined as the average of the Riccicurvatures of the basis elements, that is

scal (p) =1

n

n∑i=1

Ric (ei) . (3)

Note that Ricci curvature is a tensorial measure, attached to a direction, while the scalar curvature is (as thename stands to wit), as a scalar measure, attached to a point.

While the tensorial approach to Ricci curvature presented above is important in classical Riemannian geometryas well as in its applications to generalized relativity and cosmology, data arising in computer related applications,and in particular in medical imaging, is discrete (obtained, usually, by sampling of the analogous signal/image).In consequence, a discrete version of Ricci curvature is required.

Page 4: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

The discrete Forman-Ricci curvature, as it was extended and adapted to undirected graphs in [25], is anedge-based measure defined as follows (See Figure 2):

F (e) =we

(wx1we

+wx2we

−∑

ex1∼e,ex2

∼e

(wx1√wewex1

+wx2√wewex2

) (4)

• e denotes the edge under consideration between two vertexes x1 and x2.

• we denotes the weight of the edge e under consideration.

• wx1 and wx2 denote the weights associated with the vertexes x1 and x2, respectively.

• ex1 ∼ x1 and ex2 ∼ x2 denote the set of edges incident with vertexes x1 and x2, respectively, after excludingthe edge e = (x1, x2) under consideration.

v1 v2

Figure 2: The elements appearing in the Forman’s curvature of an edge.

Before proceeding further, let us expand somewhat on the formula above and the manner it was obtained.Firstly, let us note that it represents a reduction, to the limiting, 1-dimensional case of Forman’s discretization [25]of the a classical result in Riemannian geometry, namely the so called Bochner-Weitzenbock-Lichnerowicz formula(or just the Bochner formula, in short):

− 1

2∆(||df ||2

)= ||Hessf ||2− < df,∆df > +Ric(df, df) . (5)

where f is a real function on a given Riemannian manifold and Hessf denotes the Hessian of f : Hessf = ∇df =∇2f and < ·, · > as usual, the inner product.

Unfortunately, it is hard to translate to the discrete settings of meshes and graphs the formula above, sincerelevant functions are not easy to extract from the given data. However, differential forms do have as discreteanalogies cells (e.d. triangles, edges, etc.), and a fitting form of the Bochner formula by writing a differentialform ω as the differential of real-valued function f , that is by writing ω = df . The desired version of the Bochnerformula obtained is

1

2∆(||ω||2

)= ||Dω||2− < ω,∆ω > +Ric(ω, ω) . (6)

(Before proceeding further, let us note here that viewing cells as differential forms is by now a well known andpowerful route towards Discrete Differential Geometry – see [20].) Formula (6) can be succinctly written, in anydimension as

p = dd∗ + d∗d = ∇∗p∇p + Curv(R) , (7)

where Curv(R) is a quite complicated curvature expression, depending on the curvature tensor R and the dimen-sion of the manifold and which, for dimension p = 1, reduces to the Ricci curvature. It is precisely this compressedform of the Bochner formula that Forman used to derive his discretization.

Page 5: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

The reason we presented in some detail the development of Forman’s discretization of Ricci curvature is thatit is pertinent to our goals and, as we shall see, it explains the quality of the results we obtained. To this end, weconcentrate first on Formula (5) and note that the Ricci curvature is, by its very definition, dependent only on thesecond order derivatives. However, the term < df,∆df > in the left side of the formula represents the derivativeof ∆f along the gradient lines of f , thus it contains order 3 derivatives of f . Thus the Bochner formula expressesthe second order operator Ric, in terms of order 3 derivatives, hence captures a higher order of smoothness, hencepotential detail, than expected. Amazingly enough, this behaviour is inherent in the differential form renderingof the formula and, even more, it is captured even by the Forman discretization. This is a phenomenon thatwe already observed in imaging applications of the original version of the discrete Forman-Ricci curvature –see [27], [28]. Moreover, as our results below clearly indicate, it appears to permeate to the graph version as well.

We work with the standard model for 3D CT image, where each voxel is viewed as a (small) cube, thus the3D picture is viewed as cube lattice. The graph we consider is the dual graph of this lattice, i.e. each vertexrepresents the center of a cube (voxel), and an edge connects adjacent voxels (cubes), where adjacency is alongfaces only.

Thus each vertex has six incident edges except the boundary vertexes. The gray value of a CT vertex x canbe regarded as wx and we be calculated by equation:

we = exp(|wx1−wx2 |) (8)

where x1 and x2 are vertexes incident with edge e.A discrete definition of the scalar curvature of a vertex (which is frequently named in the literature as the

Ricci curvature of a vertex) can also be defined in a similar manner to the one in the classical case introduced inFormula (3) above:

F (x) =1

deg (x)

∑ex∼x

F (ex) ; (9)

where ex denotes the edges incident with the vertex x and deg (x) denotes (as usually) the total number of ex.

4 Acquisition of initial point

The commonly employed local segmentation methods require manually selected initial points, a procedure thatrepresents a highly time consuming process. In this section we provide a fully automatic method to localize theossicles which requires neither templates, nor manual labels. We mainly use the connectivity to obtain connectedcomponent of ossicles. The proposed method is suitable for any scaling, rotation and affine transforms, and it isalso designed to fit the abnormal ossicles.

4.1 Concepts and Notation

The 3D CT image considered consist of three components: air, tissue fluid and bone which are rendered, accordingto their density, as black, gray and white respectively.• Background B: Background is a 3D connected component of air that contains infinite air vertex. This

implies that background is a subset of air vertexes.• Foreground F : This is defined, naturally, as the complementary domain of the background. Note that the

foreground will also contain some air areas which are not connected with infinite vertex.

4.2 Coarse Localization

We first roughly determine two 3D cubic domains surrounding left and right ossicles based on empirical observationand physician’s experience by determining the centroid of skull. This is easy to achieve since the skull is the largestthree-dimensional connected bone in the entire CT image. The obtained cubic area here is only a first, coarseapproximation. Figure 3 shows the two cubic domain of CT images.

We compute all the 3D connected components of bones in the cubic domains considered above. Because someabnormal ossicles may be connected to the surrounding components, so an eroding operation is needed on thebones to obtain the set of 3D bone connected components containing ossicles.

4.3 Different Situations in Location

From now on our discussion will be restricted on the small cubic domain Ω containning right ossicles.Case 1: Normal OssiclesWe begin by noting that the foreground in which ossicles are located is isolated by background in two di-

mensions. Figure 4 shows 12 sequential slices in the 3D CT image. The red color demonstrates the air domainconnecting to outside air vertex, i.e. the background, and the green color demonstrates the 3D ossicles boneconnected component. Noting that we perform the eroding operator to bones to prevent ossicles from connectingto skull so this green area is in fact smaller than the real ossicles. The first image shows the air surrounding the

Page 6: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

Figure 3: Left and right cubic domains containning osscicles restricted in one slice of the 3D image

Page 7: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

ossicles can connect to outside. The tenth to twelfth images show that the ossicles are surrounded by the redarea, look like “floating” in the air.

Figure 4: 12 sequential slices in the 3D CT image. This Figure show the properties of normal ossicles:The ossicles “float” in the air (background) and the air surronding ossicles is conneted to the air outsidethe skull. Here red color shows the air and green color indicates the ossicles.

So a 3D bone connected component O is ossicles if and only if:

• existing a slice on which O is surrounded by background.

The property above is only suitable for normal case. If the ossicles can not be located using the methoddescribed in Case 1, then it indicates that there exists a tissue blockage between the outer ear and the ossicles(cf. Case 2 below), or the tissue fluid connected component where the ossicles are located in is connected to theother tissue fluid (Case 3).

Case 2: Non-connected External auditory canalAs shown in Figure 5 (a), the outer ear and middle ear are separated by a thin tissue. In fact, in the entire

3D CT image, this tissue spans many partial 2D images, resulting in the nonexistence of path between outsideand middle ear. We can remove the tissue within a predefined distance to background and recalculate the newbackground in this case. The volume of new background will increase. The red area in Figure 5 (b) indicates aslice of background. Then we can use the same method as in Case 1 to locate the ossicles.

Figure 5: Non-connected external auditory canal in Case 2: (a) is a CT image in Case 2. There is ablockage between the outer ear and the middle ear. (b) shows the connected air area after we remove thethin tissue layer.

Case 3: Middle ear abnormal density shadow

Page 8: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

Sometimes there are thicker tissue around the ossicles and the foreground including the ossicles will be con-nected to other part of foreground in some slices. We can remove this connectivity by erasing some tissue aroundossicles, i.e. by expanding the background. There is a difference between case 3 and case 2 that we don’t need torecalculate the background in this case. Figure 6 shows this process: after remove tissue around ossicles, ossiclescan still “float” in background.

Figure 6: Middle ear abnormal density shadow in case 3: (a) is the original CT ,(b) is the image afterremoving tissue around ossicles.

Case 4: External auditory canal obstacleThere are some patients whose external auditory canals are blocked seriously, the whole external auditory

canal is full of tissue, as shown in Figure 7.Intuitively, the external auditory canal is the jutting strongly concave domain on the bone of skull in cubical

domain, and it is the biggest “hole” of the skull bone connected component. So we utilize these properties tolocate the opening of the external auditory canal, by using a radius 9 dilatation kernel to the bone of skull. Theonly resuming groove is the opening of the external auditory canal.

Start observing from the last slice, after we locate the opening of the external auditory canal, we implementa dilatation kernel function to this area towards the inside of the skull dredge it. Since the auditory canal is thedeeper the narrower, we need to decrease the dilatation kernel during the operation. In Figure 7 we show theresult using three different size kernels, whose radii are 9, 7 and 5 voxels respectively, and indicated by yellow,magenta and cyan color respectively.

Figure 7: External auditory canal obstacle: the color area, which is the air in normal case, is filledwith tissue fluid. Yellow domain is the auditory canal determined using kernel with radius 9. Similarly,magenta and cyan represent the domains determined using kernel radius 7 and 5 respectively. A line inthe sixth image is the major axis of the ellipse that has the same second-moments as color area in theslice, then we extend it to outside to make sure middle ear can connect to outside air.

After the three dilatating operations, we extend major axis of the ellipse that has the same second-momentsas the color region in one slice to outside, as the sixth image shown in figure 7. Last remove all the obstacle tissueto reach the ossicles and so convert this case to Case 1.

Page 9: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

5 Segmentation

5.1 Energy Function

In this section we introduce our novel method for the volumetric segmentation of the ossicles. The level setmethod [8] [3] is a classical segmentation method. It minimizes an energy function to get the object boundary.This method usually needs a manually labeled initial approximation of the boundary and the accuracy is highlyrelated to it. Here we use the automatic coarse approximation of ossicles obtained in the previous section asthe initialization. We redefine the energy function by adding a new item which is based on Ricci curvature,in a manner that is also inspired by the elegant framework introduced by Mumford and Sha in their classicalpaper [29].

Given a two-dimensional closed surface Γ which can be embedded in Ω ⊂ R3, and a function f on Ω (here inour problem f is the gray value of voxel). We define the energy function E(Γ) as follows:

minΓ

E (Γ) = αEG (Γ) + EF (Γ) . (10)

where

EG (Γ) =

∫Γ

Gω, (11)

ω is the Riemannian area element of Γ and G is usually a boundary detector function. One of the most popularboundary detectors is

G (x) =1

1 + |∇Gσ ∗ f |2(12)

which is used in Chen’s method [3], utilizes the gradient operator. Where Gσ is a Gaussian kernel with a standarddeviation, convolution ∗ is used to smooth the image to reduce the noise, f is our 3D CT image [30]. Howeverin this work we propose to use discrete Ricci curvature (9) of the vertex given by the intensity function f asboundary detector, since it adopts the higher order derivative.

EF (Γ) = λ1Df (S1) + λ2Df (S2) (13)

where S1 is the domain inside Γ and with distance to boundary Γ less than a given threshold γin, and similarly,S2 is the domain outside Γ with distance to boundary Γ less than another pre-established threshold γout. Df (Si)represents the variance of f in domain Si.

We first write this flow as

Γt = (α∇ (G~n) + λ2Df (S2)− λ1Df (S1)) ~n (14)

and then rewrite it using signed distance function (SDF) φ and dirac delta function δ:

φt = δ(φ)

(α∇

(G∇φ|∇φ|

)+ λ2Df (S2)− λ1Df (S1)

)(15)

where φ(v) represent the distance from point v to the surface Γ. Then φ is positive if v is inside Γ, and negativeif v is outside it. The property of SDF will be destroyed during iterating so we need to restore it after some steps.There are two accepted manners to restore it. The first is to add a penalty to E (Γ), thus introducing the gradientdescent to the iterative formula of the level set method [30]. The second is to iterate φ without any restrictionsand re-initialize it after each iteration to restore the signed distance property. The first method is more intuitiveand apparently efficient. However, the total number of volume occupied by the ossicles is approximately only 200voxels. Therefore, a little fluctuation of distance function will result in huge error. Numeric experiments showthat the first method has some fluctuation especially around φ−1 (0). The error introduced by this method is notobvious for big objects such as the whole brain, but it affects significantly results obtained on small object such asthe ossicles. This conducts us to adopt the second method. To this end we make appeal to a multigrid method [31]which needs to solve this question in a coarser grid for this purpose. But it is worth noting that the ossicles arevery small (approximate 200 voxels) so we cannot implement any formula on the coarser grid. In consequence,we use gradient descent to iterate. During the re-initialization process we use the specific initialization methodwhich described in Section 5.2.

5.2 Re-initialization and Narrow Band

The SDF φ has the following property:

|∇φ (v)| = 1, ∀ v ∈ Ω ⊂ Rn ;φ (v) = q (v) , ∀ v ∈ ∂Ω .

(16)

We use fast marching method [32,33] to calculate the distance function, the method in [34] to update grid valueand the binary heap method to maximize the speed of this algorithm.

Page 10: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

Note that our aim is to locate the boundary, i.e. the set φ−1 (0), thus we only need to focus on the valuesaround it. Therefore during the algorithm the fast marching procedure can be terminated when the obtaineddistance is greater than a certain value γ. That is to say, we actually calculate all the points x such that |φ(x)| < γ.Therefore it follows that we can implement the gradient descent method only around φ−1 (0).

6 Algorithm

In this subsection we give three algorithms. Algorithm 1 is used to localize the ossicles in a bone connectedcomponent if they are included in this component. Algorithm 2 is used to deal with the 4 cases we discussed insection 4.3. Algorithm 3 is our main algorithm, which outputs the 3D surface of the ossicles.

Algorithm 1 Localization

Input: bone Ω1, background B, foreground FOutput: ossicles1: for 3D connected component O ∈ Ω1 do2: if O conforms to property in case 1 then3: return O4: end if5: end for6: return ∅

Algorithm 2 Searching Ossicles

Input: cubic domain ΩOutput: ossicles1: Calculate an air point outside the convex hull of skull2: for case i from case 1 to case 4 do3: Calculate 3D bone connected component Ω1, background B, and foreground F .4: a = Localization(Ω1, B, F )5: if a 6= ∅ then6: return a7: end if8: end for9: return ∅

Algorithm 3 Segmentation

Input: An entire 3D CT image CTOutput: The 3D surface of the ossicles1: Calculate each cubic area Ω of 3D CT image according to section 4.22: O = SearchingOssicles(Ω)3: if O == ∅ then4: There are no ossicles in this cubic5: return6: end if7: Calculate the SDF φ0, such that if x ∈ ∂O, φ0(x) = 0, if x ∈ O− ∂O, φ0(x) > 0, if x /∈ O, φ0(x) < 08: i = 09: repeat

10: a = φ−1i (0)

11: Update a using gradient descent of equation (14)12: Construct SDF φi+1 in |φi+1| < γ using a13: i = i+ 114: until φi = φi−1

15: Construct the zero isosurface of φi+1

Page 11: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

7 Experiments

The data used in our experiments is courtesy of the Beijing Friendship Hospital and consists of 100 CT images.In section 4.2, we only need a linear time to find largest three dimensional connected bone in the entire CT

image. In case 1, we need to judge whether existing a slice of 3D bone connected component surrounded bybackground, this property can be pre-calculated in linear time and stored in array. So the remaining calculationsin only to iterate every 3D bone connected component which also spend a linear time. Case 2 and case 3 are someoperations that remove tissue around background and then convert to case 1, So they are also a linear process. Incase4, the dilatation is in fact a linear operator, and calculation of ellipse that has the same second-moments needto calculate covariance matrix and use singular value decomposition method to calculate eigenvector. Becausethe color area in figure 7 has a few pixels in one slice, so this calculation can be completed in a very short time.The most important is that the process may return in case 1 because most ossicles are normal. So we point outthat the process in 4 is a linear time method.

After obtaining the initial 3D connected component of ossicles, we use five different methods to segmentit: equation(9) embedded in equation(10), gradient operator (equation (12)) embedded in equation(10), activecontours (snakes) [35], Lazy Snapping [36] and grabcut [37]. The first three method are based on energy functionand the last two method are based on machine learning.

The energy methods take SDF as input and machine learning methods use 3D binary mask of connectedcomponent of ossicles as input which all need the initial points obtained from section 4. Compared to othermethods which use very rough initial boundary, our coarse localization step has given a very closed initializationalready. The initial SDF function φ(x) = 0 can be constructed in time O(n), where n is number of the voxels.

In Figure 8 we show the segmentation results of a normal ossicles. The first four rows are the CT images aroundossicles, the middle row shows the precision and recall of five method, and the last row shows the reconstructed 3Dossicles demonstrated by triangulation meshes. The different color contours in the CT images are the segmentationresults restricted on each slices by different methods. The red color indicates the proposed method, which usesRicci curvature in the energy function. The green color shows the result of Chen’s method, which uses gradientoperation. We can see clearly the energy based methods (Ricci, gradient and snake methods) get much betterresults than the machine learning based method (lazy snapping and grabcut). And the Ricci curvature methodcaptures the most details.

Figure 8: The comparison experiment results of the normal ossicles (case 1). The first four rows are theCT images around ossicles, the middle row shows the precision and recall of five method, and the last rowshows the reconstructed 3D ossicles demonstrated by triangulation meshes. The different color contoursin the CT images are the segmentation results restricted on each slices by different methods. The redcolor indicates the proposed method, which uses Ricci curvature in the energy function. The green colorshows the result of Chen’s method, which uses gradient operation. The blue color expresses the snakemethod.

Figure 9 is the comparison results for an abnormal ossicles surrounded by thicker tissue which results inconnecting to other tissue components (case 3). The color and content is same as that in figure 8. The LazySnapping method and grabcut method fail in detecting a large part of the ossicles. The snake method reachesthe skull so that it’s not very stable in this case.

In Figure 10 we show the comparison experiment results for the defected ossicles. From the CT images wecan see that almost a half of the ossicles is defected. In this case the remaining ossicles is still one bone connected

Page 12: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

Figure 9: The comparison experiment results of the abnormal ossicles (case 3): The ossicles surroundedby thicker tissue which results in connecting to other tissue components. The color and content is sameas figure 8.

component and isolated “floating” in the surrounding air, which can also be located though method in section4. Just like the previous two examples, the performance of Lazy Snapping method and grabcut method aresimilar, miss quite a large part, and the Ricci curvature method still capture the most details of ossicles thanother methods.

Figure 10: The comparison experiment results of a case of defected ossicles. A half of the ossicles isdefected and can still be located though method in section 4

To quantitatively compare the proposed method with the state-of-the-art methods, we use the manuallylabeled boundary as the ground truth and compare it with all the reconstructed results. The precision, recalland the F1 score in Figure 11. Here we count the number of the voxels which belong to the ossicles according tothe manually labeled result, denoted as Ng. The number of the voxels belonging to the ossicles according to thecomputational methods is denoted as Nm, and the number of the voxels belonging to the ossicles both by groundtruth and by compatational method as Nt. So the precesion is defined as Nt/Nm, recall is defined as Nt/Ng andF1 score is defined as 2

1/precision+1/recall.

From the figure 11 we can see the curves of Ricci curvature based method are always above the others. Eventhough the Lazy Snapping method and grabcut method usually have a very high precision, but they show lowrecall, the Ricci curvature method has a best performance on F1 score. This is because the two “legs” are two2D separated regions and approximate one or two voxels width, just like two lines if not amplifying the image.

Page 13: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

Each method has more or less defects in dealing with this kind of problem, expect the simple search methodwhich can only work in the absolutely ideal situation. The method of Ricci curvature shows a best performanceby capturing the most details.

Figure 11: Comparison of the precision, recall and the F1 score on our data set. We simply draw the linechart. The coordinate of the horizatal axis is the index of our data and that of the vertical axis is thescore of the corresponding value.

At last we construct the SDF of ground truth φtrue as well as the SDF of the above five methods denoted byφi. We calculate the `2 norm di of φtrue − φi. We normalize these distances in all the CT data independently.

For ease of expression in the diagram: di =√∫

(φtrue − φi)2 ω, where ω is the Riemannian volumetric element.

Then let di = di∑di

and show them as a line chart in Figure 12. We can see that the Ricci curvature method has

a best performance.

8 Conclusion

This paper has two major contributions: First, a method for automatically positioning the ossicles is proposed,and secondly, the energy function with Ricci curvature is used to segmente the 3D ossicles which be representedas triangular mesh. While curvature-based level sets represent the very core of Osher and Sethian’s original

Page 14: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

Figure 12: Normalized `2 errors of the SDFs for each method. The coordinate of the horizatal axis is theindex of our data and that of the vertical axis is the error value.

Page 15: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

method [8], this paper shows that the the discrete Forman-Ricci curvature is highly suitable for segmentingsmall objects such as the ossicles, due to intrinsic edge detection capability of the curvature component. Thisedge detection capability of Forman’s Ricci curvature has already been demonstrated (see e.g. [38]). However, theForman curvature used in the previous work cited above is the “full” (2-dimensional) version of the this curvature,which renders the computations more difficult than the extremely simple ones that makes appeal to the graphversion employed in the present paper. More importantly, the Forman-Ricci curvature of graphs allows us toconcentrate on the main information contained in each pixel, without any averaging effect that may occur in thepreviously used 2-dimensional version. This is extremely important when dealing, like in the case of ossicles, withsmall groups of voxels (about 200 in our case). Furthermore, the Forman-Ricci curvature uses the voxel structuredirectly, without further processing, thus further enhancing its edge detection capabilities, especially on 3D imageconsisting of a small number of voxels. These advantages are further enhanced by the extreme computationalsimplicity and efficiency of the graph Forman-Ricci curvature. The efficiency of the Forman-Ricci curvature, asdemonstrated by the results in the present paper is augmented by its intrinsic capability, inherited via its verydefinition from the Bochner formula, to capture high order smoothness, hence to ability to identify small details.

References

[1] E. Britannica. (2020) Human ear anatomy. [Online]. Available: https://www.britannica.com/science/ear#/media/1/175622/530

[2] R. Sreejith, K. Mohanraj, J. Jost, E. Saucan, and A. Samal, “Forman curvature for complex networks,”Journal of Statistical Mechanics: Theory and Experiment, vol. 2016, no. 6, p. 063206, 2016.

[3] J. Zhang, K. Chen, and D. A. Gould, “A fast algorithm for automatic segmentation and extraction of a singleobject by active surfaces,” International Journal of Computer Mathematics, vol. 92, no. 6, pp. 1251–1274,2015.

[4] M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: Active contour models,” International journal of computervision, vol. 1, no. 4, pp. 321–331, 1988.

[5] D. Terzopoulos, A. Witkin, and M. Kass, “Constraints on deformable models: Recovering 3d shape andnonrigid motion,” Artificial intelligence, vol. 36, no. 1, pp. 91–123, 1988.

[6] H. Delingette, Simplex meshes: a general representation for 3D shape reconstruction. INRIA Sophia An-tipolis, France, 1994.

[7] T. McInemey and D. Terzopoulos, “Topology adaptive deformable surfaces for medical image volume seg-mentation,” IEEE transactions on medical imaging, vol. 18, no. 10, pp. 840–850, 1999.

[8] S. Osher and J. A. Sethian, “Fronts propagating with curvature-dependent speed: algorithms based onhamilton-jacobi formulations,” Journal of computational physics, vol. 79, no. 1, pp. 12–49, 1988.

[9] R. Malladi, J. A. Sethian, and B. C. Vemuri, “Shape modeling with front propagation: A level set approach,”IEEE transactions on pattern analysis and machine intelligence, vol. 17, no. 2, pp. 158–175, 1995.

[10] M. E. Leventon, W. E. L. Grimson, and O. Faugeras, “Statistical shape influence in geodesic active contours,”in 5th IEEE EMBS International Summer School on Biomedical Imaging, 2002. IEEE, 2002, pp. 8–pp.

[11] A. Tsai et al., “A shape-based approach to the segmentation of medical imagery using level sets,” IEEEtransactions on medical imaging, vol. 22, no. 2, pp. 137–154, 2003.

[12] K. M. Pohl et al., “Logarithm odds maps for shape representation,” in International Conference on MedicalImage Computing and Computer-assisted Intervention. Springer, 2006, pp. 955–963.

[13] J.-P. Eckmann and E. Moses, “Curvature of co-links uncovers hidden thematic layers in the world wide web,”Proceedings of the national academy of sciences, vol. 99, no. 9, pp. 5825–5829, 2002.

[14] Y. Shavitt and T. Tankel, “On the curvature of the internet and its usage for overlay construction anddistance estimation,” in IEEE INFOCOM 2004, vol. 1. IEEE, 2004.

[15] E. Saucan and E. Appleboim, “Curvature based clustering for dna microarray data analysis,” in IberianConference on Pattern Recognition and Image Analysis. Springer, 2005, pp. 405–412.

[16] G. Perelman, “The entropy formula for the ricci flow and its geometric applications,” arXiv preprintmath/0211159, 2002.

[17] ——, “Ricci flow with surgery on three manifolds,” arXiv preprint math/0303109, 2003.

[18] B. Chow and F. Luo, “Combinatorial ricci flows on surfaces,” Journal of Differential Geometry, vol. 63, no. 1,pp. 97–129, 2003.

[19] R. S. Hamilton, “The ricci flow on surfaces,” in Mathematics and general relativity, Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference in the Mathematical Sciences on Mathematics in GeneralRelativity, Univ. of California, Santa Cruz, California, 1986. Amer. Math. Soc., 1988, pp. 237–262.

Page 16: arXiv:2006.14788v2 [cs.CV] 16 Aug 2020 · 1 Introduction The auditory ossicles, situated in the middle ear, comprise the malleus, incus and stapes, which are the smallest bones in

[20] X. D. Gu and S.-T. Yau, Computational conformal geometry. International Pressof Boston Incorporated,2008, vol. 3.

[21] M. Zhang, W. Zeng, R. Guo, F. Luo, and X. D. Gu, “Survey on discrete surface ricci flow,” Journal ofComputer Science and Technology, vol. 30, no. 3, pp. 598–613, 2015.

[22] M.-Y. Kao, Encyclopedia of algorithms. Springer Science & Business Media, 2008.

[23] J. Gao, X. David Gu, and F. Luo, “Discrete ricci flow for geometric routing,” in Encyclopedia of Algorithms,M.-Y. Kao, Ed. New York, NY: Springer New York, 2016, pp. 556–563.

[24] Y. Ollivier, “Ricci curvature of markov chains on metric spaces,” Journal of Functional Analysis, vol. 256,no. 3, pp. 810–864, 2009.

[25] R. Forman, “Bochner’s method for cell complexes and combinatorial ricci curvature,” Discrete and Compu-tational Geometry, vol. 29, no. 3, pp. 323–374, 2003.

[26] M. Berger, A panoramic view of Riemannian geometry. Springer Science & Business Media, 2012.

[27] E. Appleboim, E. Saucan, and Y. Y. Zeevi, “Ricci curvature and flow for image denoising and superesolution,”in Proceedings of EUSIPCO 2012. IEEE, 2012, pp. 2743–2747.

[28] E. Sonn, E. Saucan, E. Appleboim, and Y. Y. Zeevi, “Ricci flow for image processing,” in Proceedings ofIEEEI 2014. IEEEI, 2014, pp. 1–6.

[29] D. Mumford and J. Shah, “Optimal approximations by piecewise smooth functions and associated variationalproblems,” Comm. Pure Appl. Math., vol. 42, no. 5, pp. 577–685, 1989.

[30] C. Li, C. Xu, C. Gui, and M. D. Fox, “Distance regularized level set evolution and its application to imagesegmentation,” IEEE transactions on image processing, vol. 19, no. 12, pp. 3243–3254, 2010.

[31] N. Badshah and K. Chen, “Multigrid method for the chan-vese model in variational segmentation,” Com-munications in Computational Physics, vol. 4, no. 2, pp. 294–316, 2008.

[32] J. N. Tsitsiklis, “Efficient algorithms for globally optimal trajectories,” IEEE Transactions on AutomaticControl, vol. 40, no. 9, pp. 1528–1538, 1995.

[33] J. A. Bærentzen, “On the implementation of fast marching methods for 3d lattices,” Informatics and Math-ematical Modelling, 2001.

[34] A. Chacon and A. Vladimirsky, “Fast two-scale methods for eikonal equations,” SIAM Journal on ScientificComputing, vol. 34, no. 2, pp. A547–A578, 2012.

[35] T. F. Chan and L. A. Vese, “Active contours without edges,” IEEE Transactions on image processing, vol. 10,no. 2, pp. 266–277, 2001.

[36] Y. Li, J. Sun, C.-K. Tang, and H.-Y. Shum, “Lazy snapping,” ACM Transactions on Graphics (ToG), vol. 23,no. 3, pp. 303–308, 2004.

[37] C. Rother, V. Kolmogorov, and A. Blake, “” grabcut” interactive foreground extraction using iterated graphcuts,” ACM transactions on graphics (TOG), vol. 23, no. 3, pp. 309–314, 2004.

[38] E. Saucan, E. Appleboim, G. Wolansky, and Y. Y. Zeevi, “Combinatorial ricci curvature and laplacians forimage processing,” in Proceedings of CISP’09. IEEE, 2009, pp. 992–997.