Nonlinear diffusion based image segmentation using two fast algorithms

In this paper, a new variational model is proposed for image segmentation based on active contours, nonlinear diffusion and level sets. It includes a Chan-Vese model-based data fitting term and a regularized term that uses the potential functions (PF) of nonlinear diffusion. The former term can segment the image by region partition instead of having to rely on the edge information. The latter term is capable of automatically preserving image edges as well as smoothing noisy regions. To improve computational efficiency, the implementation of the proposed model does not directly solve the high order nonlinear partial differential equations and instead exploit the efficient alternating direction method of multipliers (ADMM), which allows the use of fast Fourier transform (FFT), analytical generalized soft thresholding equation, and projection formula. In particular, we creatively propose a new fast algorithm, normal vector projection method (NVPM), based on alternating optimization method and normal vector projection. Its stability can be the same as ADMM and it has faster convergence ability. Extensive numerical experiments on grey and colour images validate the effectiveness of the proposed model and the efficiency of the algorithms.

Abstract. In this paper, a new variational model is proposed for image segmentation based on active contours, nonlinear diffusion and level sets. It includes a Chan-Vese model-based data fitting term and a regularized term that uses the potential functions (PF) of nonlinear diffusion. The former term can segment the image by region partition instead of having to rely on the edge information. The latter term is capable of automatically preserving image edges as well as smoothing noisy regions. To improve computational efficiency, the implementation of the proposed model does not directly solve the high order nonlinear partial differential equations and instead exploit the efficient alternating direction method of multipliers (ADMM), which allows the use of fast Fourier transform (FFT), analytical generalized soft thresholding equation, and projection formula. In particular, we creatively propose a new fast algorithm, normal vector projection method (NVPM), based on alternating optimization method and normal vector projection. Its stability can be the same as ADMM and it has faster convergence ability. Extensive numerical experiments on grey and colour images validate the effectiveness of the proposed model and the efficiency of the algorithms.
1. Introduction. Mumford-Shah model [20] is a milestone in the development process of variational partial differential equations (PDEs) based image segmentation. Much of the subsequent research progresses based on it. One important class, the region-based geodesic active contour model, is derived from it. However, the Mumford-Shah model is an ideal one which includes two energy terms defined in two-dimensional image space and one-dimensional contour space respectively; there are inevitable difficulties in the course of realization. For this, a series of approximations are presented which might beroughlyclassified intothree. In 1990, Ambrosi and Tortorilli proposed an equivalent model [5] based on Gamma convergence theory by introducing auxiliary variables and approximating the construction of elliptic function. This forms the first Gamma-convergence family for variational image segmentation. The second family is the variational level set method [32,28,33] based piece-wise constant approximation of the Mumford-Shah model. In 2001, Chan and Vese proposed the two-phase Chan-Vese model [11], which is especially applicable to the circumstance of blurring or discontinuous boundaries. Then it became the most famous model of this family. The third family based on variational label function method is an extended version of the second one. The typical representative is the global convex segmentation model, which was transformed from the Chan-Vese model with constraints and the relaxation-based method by Bresson et al. [25,7]. Nonlinear diffusion technology is a widely used method for image denoising, which has given rise to several significant applications [24,18,36]. During operations, different strategies can be adopted to get better diffusion effects according to the difference between the edge region and the non-edge region. Thus it can adaptively smooth images regions and enhance edges. In 1990, Perona and Malik creatively introduced the thermal balance equation based anisotropic diffusion model in their famous research [23]. In their research, φ was defined as a function. The final result of φ obtained through iterative calculation represented the denoised target image. As its corresponding regularization term is λ 2 ln(1+|∇φ| 2 /λ 2 ) (where λ > 0 is a threshold value), we can know: (a) If |∇φ| > λ, the pixels are enhanced; (b) If |∇φ| < λ, the pixels are reduced; (c) If |∇φ| = λ, the pixels are invariant. In this way, it is effective for both smoothing and edge-preserving in the filtering process. In 1992, Catte et al. [9] developed Perona and Maliks algorithm by applying Gaussian smoothing to diminish the effect of noise and improve the models adaptability. Then many authors [21,31,1,15,19,13,22,12,17] designed diffusion terms to control diffusion effects by considering the neighbourhood features based on their theories. It should be noticed that the regularization term must be reasonably designed in order to preserve useful details such as edges in the image as much as possible.
Our purpose is to propose a general variational model for image segmentation based on nonlinear diffusion technology and Chan-Vese model. The proposed model adopts diffusion terms as the regularization term and therefore, can obtain a higher quality of segmentation results. Instead of solving high order nonlinear PDEs, alternating direction method of multipliers (ADMM) [28,16,27] (i. e. augmented Lagrangian method (ALM) [2,6,35]) is applied to transform the energy minimization problem of proposed model into three subproblems, which are then solved by fast Fourier transform (FFT) [14], projection formula [27], analytical soft thresholding equation [26,35] and threshold method [25,35]. Moreover, we creatively propose a new fast algorithm (NVPM) based on normal vector projection and alternating optimization method to solve our model. There are still three subproblems that need gradient descent method, normal vector projection, projection formula and the same threshold method to solve. To further improve the computing efficiency, Nesterovs optimal first-order methods [16] is applied into NVPM. The new model and algorithms are validated through extensive experiments.
The remaining of the paper is organized as follows. In Section 2, some related works are briefly introduced. In section 3, we propose the general nonlinear segmentation model and design two fast algorithms. Implementations are also put forward in this section. In section 4, the results of the proposed model using these two algorithms are given, and comparison experiments on different kinds of images are presented to validate the effectiveness and efficiency of the proposed algorithms. Concluding remarks are drawn in Section 5.
2. Image segmentation model and nonlinear diffusion technology.
2.1. The variational formulation of the Chan-Vese model. The Chan-Vese model [11] proposed for two-phase image segmentation can be solved as a minimization problem of Munford-Shah functional [20] which has been transformed into an equivalent variational level set formulations. Based on the piece-wise constant approximation, it divides one image into two heterogeneous regions by using the zero level set. Combined with the binary level set function, the Chan-Vese model can be written as the following form: where f (x) is a given image. α 1 , α 2 are positive parameters. γ is also a positive parameter which controls the trade-off between the fitting data term and the regularization [3]. u 1 , u 2 ∈ u are piece-wise constants which represent the mean intensity values of the regions. φ(x) is the binary level set representation supposed to take on either 0 or 1, which is defined as: where Ω 1 is a closed region inside Ω. This method does not have to reinitialize the level sets during the evolution procedure. Thus the convergence speed can be accelerated. Then the Chan-Vese model (1) can be transformed into the following minimization problem mathematically: where r(u1, u2 Note that the binary constraint for φ will also cause non-convexity in sub problems (3). [7] demonstrated that certain non-convex minimization problems could be equivalent to the following convex minimization problems by adopting the convex relaxation and the thresholding method. Minimization problem (3) can be rewritten as: Recently the Chan-Vese model has been an important tool for image segmentation. In this model, active contours can automatically detect interior objects in a given image whose boundaries are not necessarily defined using techniques of curve evolution. Moreover, the initial curve can be placed anywhere in the image. It can detect and preserve the locations of boundaries well without smoothing the initial image, even if the image is very noisy.

2.2.
Nonlinear diffusion regularization. In this part, let us review the nonlinear diffusion technology. The main idea is that the diffusion coefficient depends on the local image property in the diffusion procedure. Specifically, the diffusion coefficient can automatically increase in smooth regions so that the noise can be suppressed. Similarly, the coefficient can automatically decrease in edge regions so that the edges can be preserved. After Perona and Malik proposed the PM model [23], many other authors presented the extensions. In 1997, its general variational model [4] was proposed by Aubert and Vese. Here several customarily used potential functions (PF) ϕ are given in Table 1.
Nonlinear diffusion can transform the image estimation into a temporal evolution whose stable solution is the favourable result of image processing. It can produce new filtering methods by simply recombining some other filtering methods. Due to its strong robustness, it can be extended into many other aspects in image processing.
3. Image segmentation based nonlinear diffusion and the corresponding fast algorithms. Considering the advantages of using nonlinear diffusion as the regularization and good effects of active contour model mentioned above, we attempt to propose a novel model which combines these two technologies to get better effects for image segmentation. Thus the variational formulation of our nonlinear segmentation model can be obtained as follows: Before solving this functional, let us introduce the discretization which needs to be used in the calculation of ADMM and NVPM. Let Ω = {(i, j)|1 ≤ i ≤ M, 1 ≤ j ≤ N } denote the two-dimensional image space and each point (i, j) is called a pixel point. The coordinates x and y are oriented along rows and columns, respectively. We first give the first order backward and forward differences of φ at point (i, j) along x-direction and y-direction with periodic boundary condition as follows: Then the gradient operators, divergence operators and Laplace operators are defined as: Where p = [p 1 , p 2 ] r and the second order differences are just the corresponding compositions of the first order differences. Here only necessary operators are presented, more details can be found in [14].
3.1. Design of ADMM for the proposed model. ADMM is widely used to minimize the energy functionals of image denoising [27] and inpainting models [34] and solve some other complicated high-order models [29,26] fast and effectively. Though a good result could be obtained by traditional gradient descent flow algorithm, the computation cost utilizing the method of finite difference associated with high-order variables becomes very expensive. Thus ADMM which not only reduces the order of the variables well but also needs even fewer steps is selected in this paper.
Our proposed algorithm is using the idea of alternating direction optimization method under the ADMM frame [28,16,27]. The destination of ADMM is to simplify the energy functional including high-order variables by using the introduction of low-order variables which can be computed by the easier iterative algorithm and avoid the subproblem which occurs in φ associating with no convergence successfully. Here we introduce auxiliary vector w = [w 1 , w 2 ] r such that w ≈ ∇φ = [∂ x φ, ∂ y φ] r to transform the foremost energy functional (5) into a constrained problem as follows: where λ = [λ 1 , λ 2 ] r is the Lagrange multiplier and µ is the penalty parameter which ought to be positive. In practice, we use an iterative algorithm to obtain the minimizers of variables in (6). It is very difficult to directly solve the following minimization problem: Thus an alternating optimization method can be used. The procedure of solving (8) is carried out as follows. We firstly initialize the unknowns u 0 , φ 0 , w 0 , λ 0 at k = 0, then realize minimization problems about only a kind of unknowns fixed other ones temporarily at each step from kth to (k + 1)th until convergence. It exactly reflects the property of alternating direction optimization approach. Using this method, (8) is divided into three separate subproblems for u, φ and w, each of which can be solved quickly. Detailed implementation of ADMM is shown in Algorithm 1.
u, φ and w in subproblems, 1-3 can be obtained by minimizing the following energy functionals: Next, (15) to (17) will be solved respectively by different iterative methods.
3.1.1. Estimations of piecewise constant parameters. Unknown intensities u 1 and u 2 can be obtained by the following conditions: According to the given procedure, we can get optimal results in order.
3.1.2. The computing of the binary level set function. When dealing with the iterations of φ, we attempt to apply the fast Fourier transform (FFT) [14], which is used for fast calculating linear equations to obtain the minimizer of this variable. When φ is being computed, the (k + 1)th value of u k+1 and the kth auxiliary variable w k are fixed. This concept is also applied in the following paragraphs. We can directly derive the Euler-Lagrange equation concerning φ, which leads to the fourth order linear PDE as follows: Where the second formula of (20) is the boundary condition. Depends on the linearity, it is obvious that (20) can be transformed into the following discretization: Here instrumental variable g is introduced to help obtain the result. The discrete form of (21) gives: . Then we bring the introduction of the identity operator If (i, j) = f (i, j) and shifting operators . The equations mentioned in (22) can be rewritten as: Now it is time to explain how to apply Fourier transform F . By using the properties of discrete Fourier transform, corresponding relations between the Fourier transform types of the shifting operators and those of original functions f (y i , y j ) can be given as: where y i and y j are discrete frequencies. Z i = 2π N1 y i , y i = 1, 2, ...N 1 and Z j = 2π N2 y j , y j = 1, 2, ...N 2 . Algebraic equations can be obtained in this way: −2µ(cos Z i + cos Z j − 2)F (u) = F (g).
(26) provides us with a closed-form solution of φ as follows: where F − 1(·) denotes the discrete inverse Fourier transform, (·) represents the real part of a complex number. After obtaining the value of φ k+1 obtained by (27), we use a simple projection [25,7] to make sure φ k+1 ∈ [0, 1] as in Equation (11), that is: 3.1.3. Calculation of the auxiliary variable. The soft thresholding formula [26] can be used in this part to calculate the variable w. It is one of the most classical algorithms which has been widely used to obtain minimizers of the variables in this kind of equation. Here we extend it to the general form. The variables u k+1 and φ k+1 are fixed to calculate w. The Euler Lagrange equation of (16) is given by: The calculation result is shown as: After the minimizers of subproblems are found, Lagrange multipliers should be updated according to (13). At the end of the implementation, we can work out the threshold α for binarization of φ k+1 based on the histogram of its result (as described in (14)). The stopping criterions mentioned in Algorithm 1 will be clearly stated in section 4. ADMM can reduce the possibility of ill-conditioning and make the numerical computation stable through iterative Lagrangian multiplier during the process of the minimization. No doubt it is an effective way to solve complex energy functional for image processing based on PDE.

3.2.
Design of NVPM for the proposed model. As noted earlier, we can reduce the order of energy functional (5) and use an alternating optimization method to solve the corresponding subproblems. It has been proved that an alternating optimization method can improve calculation efficiency and push the complexity into small. In this section, NVPM is proposed exactly based on this idea. Then we optimize NVPM by using Nesterovs optimal first-order methods [16] for accelerating the iterations. First, we introduce the computational procedure of NVPM. From the Euler-Lagrange equation of the functional (5): we can find it very difficult to calculate the normal vector ∇φ |∇φ| . And its resultant high-order term ∇ · ϕ (|∇φ|) ∇φ |∇φ| is believed to result in slow computation speed. Here we introduce auxiliary vector p = [p 1 , p 2 ] r such that p ≈ ∇φ |∇φ| according to the concept of replacing high-order terms with low-order terms to reduce the order of the variables. Thus the solvingprocess can be simplified and then alternating optimization method is selected to calculate variables as mentioned before. We also need to face three separate subproblems for u, φ and p. Detailed implementation of NVPM is shown in Algorithm 2.
2.4. φ k+1 needs to be processed by thresholding: 3. The overall loop will be terminated if the stopping criterions (described in section 4) are satisfied.
The normal vector projection Π p≈ ∇φ |∇φ| (·) in equation (35) is used to constrain the instrumental vector p. For k = 0, 1, ... the minimizers of variables u, φ and p in subproblems 1-3 can be obtained by minimization of the energy functionals and the projection method: In the following part, (37) to (39) will be solved, respectively. (37). u 1 and u 2 can be obtained as in Equation (18) and (19).

3.2.2.
Minimization of ε 2 (u) in (38). The variables u k+1 and p k are fixed to calculate φ. By introducing the instrumental vector p, the Euler-Lagrange equation (31) can be simplified as follows: If this Euler-Lagrange equation is used to obtain directly φ k+1 , the problem of convergence difficulty or un-convergence will come into the energy curve. Thus, the gradient descent method is added in order to realize the absolute convergence.
where the second formula of (41) is the boundary condition. After obtaining the value of φ k+1 , we also use the same projection method to make sure φ k+1 ∈ [0, 1] as in (28).

Projection on p in (39).
It is easy to calculate p k+1 according to (35) under the condition of obtaining u k+1 and φ k+1 . Here the norm vector projection can be directly used as: Thus a conclusion about the range of p k+1 by the nature of the norm vector can be got as follows: According to [27], we can use the constraint condition | p k+1 | ≤ 1 and get the following variant.
which is a simple re-projection method, seems to perform better. At the end of the implementation, the binarization of φ k+1 is required as well. The stopping criteria presented in Algorithm 2 will also be clearly stated in section 4.

3.2.4.
Optimization of the normal vector projection method. In this part, we will discuss how to optimize the normal vector projection method. From Algorithm 2, it can be found that the key factor which influences the rate of convergence is the use of gradient descent method. As described in [16], the solution process of Subproblem 2 in Algorithm 2 can be accelerated through the over-relaxation iterative method. After obtaining piecewise constant parameters u k+1 1 and u k+1 2 as in Equation (18) and (19), we introduce the over-relaxation parameter α and intermediate variableφ. φ should be carefully calculated by the intermediate variable as: Then update the over-relaxation parameter α (we set α 0 = 1): The (k + 1)th and kth values of φ, α is used to update the intermediate variableφ: The rest of the steps will not be repeated because they are similar steps to (42)-(44) in Algorithm 2. We can use the same stopping criterions as presented for NVPM to terminate the overall loop. NVPM has more concise algorithm framework and guarantees fast convergence of the energy functional in theory. Accordingly, it can greatly enhance the calculation efficiency for our segmentation models. Moreover, it could be another good way to solve variational problems. 4. Numerical experiments. In this section, the numerical results of our proposed model are applied to some grey cases (synthetic and real images) and colour cases (real images). We choose the multi-channel Chan-Vese model [10] as the variational framework by means of nonlinear diffusion. And we draw a comparison with both edge-based [8] and region-based [30] active contour segmentation models by presenting their segmentation results, respectively. Through experiments, the threshold method shown in (14) and (36)  As described in [27,8,30] the iterations need to be terminated when the following criterions are satisfied. In this paper, stopping criterions used for the proposed methods might be different.
i We need to monitor the constraints error in iterations: with where · L1 denotes the L 1 norm on image domain Ω. If R k < ε (ε is a small enough parameter), iteration of outer repeat k will be stopped. The good numerical indicator (49) are also used to determinate the value of µ, which can be the basis of penalty parameter adjustment for ADMM.
ii In iterations, the relative errors of Lagrange multipliers and the solution φ k should be noticed. They should reduce to a sufficiently small level: For ADMM, these two conditions should be considered. However, for NVPM (or optimized NVPM), only (52) should be considered.
iii The convergence of energy functional E(φ) needs to be guaranteed. That is, ≤ ε should be satisfied. This is for both of the proposed methods. In this experiment, our proposed model is applied to images with spatially varying background and textures. In Fig. 1, some results of our model are firstly presented, respectively. Geodesic active contour model (GAC model [8]): moreover, piece-wise smooth, active contour model (PSAC model [30]): are used to compare with our model. The edge function is defined as g(x) = 1 1+|∇(Gσ * f (x))| 2 . Fig.1 (a1)-(d1) show the initial curves. The segmentation results of our model are shown in Fig.1 (a2)-(d2). Fig.1 (a2) and (b2) are obtained by ADMM and Fig.1 (c2) and (d2) by NVPM. In practice, the results of our model using all the PFs shown in Table 1 are very close. Thus we do not discuss it too much in the following experiments. The experiments using ADMM and NVPM are all conducted on the same initializations, and the performances of these two algorithms are similar. The difference lies in the efficiency, which is shown in Table  2. The difference lied in efficiency. Then initial curves and results of GAC model and PSAC model are shown in Fig.2. We present the parameters of each model under the image. From Fig.2 (e2) and (f2), we observe that the GAC model cannot detect interior contours of the ring and the box. In contrast, the PSAC model cannot detect objects with varying features from Fig.2 (e3) and (f3). For regions with textures, both GAC and PSAC fail in segmenting the complete regions ( Fig.2 (g2)  From left to right, we illustrate: (I) for ADMM, relative residual (48), relative error of Lagrange multiplier (51), relative error of φ k (52) and energy curve along the outer repeat k, (II) for NVPM, relative residual (48), relative error of φ k (52) and energy curve in Fig.3. The graphs come from Fig.1 (a2). Other graphs from Fig.1 have similar profiles in Fig.2. All graphs related to our algorithms are drawn up to 100 iterations. From these plots, it can be observed that the algorithms have converged long before 100 iterations. They also give important information about how to choose the penalty parameter µ of ADMM. R k will converge to zero with the same speed as the iteration proceeds, and the energy will decrease to a steady constant value when µ is chosen properly. Moreover, these two plots show that NVPM owns better convergence ability.  Table 2. The number of iterations is the number of total outer iterations in Fig. 1. The computational time is measured in seconds. NVPM* is the optimized representation of NVPM. In this experiment, we choose PF (i), (iii), (v) and (vii) from Table 1 as examples. It is easy to see that NVPM has a faster convergence rate and higher efficiency with or without optimization.  with the segmentation results obtained by our model, GAC model and PSAC model subsequently and then discuss how to choose a proper threshold value for (14) and (36). The same initial contours are used in Fig.4 (a1) and (b1). Then the results of our model, the GAC model and PSAC model are given from left to right. Fig.4 (a2) is obtained by ADMM and (b2) is obtained by NVPM. The parameters of each model are also presented under the corresponding image. In contrast, it can be found that our proposed model has better segmentation ability and precision on real images. When we deal with the convex optimization problem, we have to use a threshold method (14) and (36) to realize the binarization of φ k+1 . It is an important way to help find accurate results. Non-threshold solutions of the proposed methods are shown as follows. It can be observed that non-threshold often results in fuzzy regions (red circles in (d2)). In practice, we find the threshold of φ k+1 could be set an applicable value 0.5.
In Table 3, comparisons of iterations and computation time using different methods are shown. The introduction of this table is as we have described in Table 2.
Here PF (ii) and (iv) from Table 1 are chosen as examples. We can find that the convergence ability of NVPM is nothing less than that of ADMM.
which uses multi-channel Chan-Vese model [10] as the variational framework. The same initial contours for each image using different methods are shown in Fig. 6 (a1), (b1) and (c1). Then the results of our model, the GAC model and PSAC model are given subsequently. The result obtained by our model using ADMM is presented in Fig. 6 (a2), (b2) is under the NVPM algorithm framework and (c2) gives the NVPM* result. All the potential functions presented in Table 1 can get very similar results. As described before, the parameters of each model are under the corresponding image. By comparing the results of these models, it is found that the GAC model only detects the major objects, which cant capture the detailed object structures (e.g. the birds plumage, the top of the leaf and the insects legs) and often results in fuzzy edges. The calculation for obtaining the parameters value u1 and u2 of the PSAC model were given in (54) and (55). Parameters value u 1 and u 2 of our multi-channel model (57) can be got through a similar computation in (18) and (19). It can be observed that the values from (54) and (55) are in the form of function while the ones from (18) and (19) are piece-wise constants. PSAC model transforms the piece-wise constant parameters u 1 and u 2 into variables that need to be calculated. Through experiments we know that the performance of PSAC can surpass GAC in general, while inconstant values of u 1 and u 2 included in PSAC may give rise to detected objects with excessive details and some tiny structures such as the leaf stem and the insects legs will also be smoothed by the regularizers |∇u 1 | 2 and |∇u 2 | 2 . In Fig.7, plots of relative errors and energy versus iteration numbers of ADMM and NVPM* are shown, for example, Fig. 6 (a2). As we mentioned above, all graphs related to our algorithms are drawn up to 50 iterations. By doing so, we can easily observe the convergence of the algorithms. From this test, we can see that NVPM takes fewer iterations to converge compared with ADMM. The two algorithms produce similar results.
At last, we compare the iterations and computational time of these algorithms used in this experiment. Here PF (vi), (viii), (ix) and (x) from Table 1

5.
Conclusion. In this paper, by using the relevant concepts of nonlinear diffusion and Chan-Vese model, we propose a new general model and apply it into grey and colour image segmentation. Our model has properties of detecting regions with different features and has been successfully extended to colour images. Then we design its fast algorithm ADMM and creatively propose a new fast algorithm NVPM Figure 7. Plots of parametric errors and energy curves. The first row is obtained by our model using ADMM. The second row is obtained by our model using NVPM*. based on the alternating optimization method and normal vector projection. Also, an accelerated method is applied to optimize NVPM. The good effects of our model and the efficiency of proposed algorithms has been validated by several numerical experiments. Comparisons of results and computational time obtained by ADMM and our proposed approach indicate that NVPM yields shorter runtime while the quality of results is identical. The idea presented in our paper can be extended to help solve many other mathematical problems in image processing in the future work.