A NONCONVEX TRUNCATED REGULARIZATION AND BOX-CONSTRAINED MODEL FOR CT RECONSTRUCTION

Abstract. X-ray computed tomography (CT) has become a useful technology in cancer detection and radiation therapy. However, the high radiation dose delivered during CT scans may increase the underlying risk to healthy organs. Sparse-view X-ray projection is an effective way to reduce this radiation. In this paper, we propose a constrained nonconvex truncated regularization model for low-dose CT reconstruction, which preserves sharp edges very well. Although this model is quite complicated to analyze, we establish two useful theoretical results for its minimizers. Motivated by them, an iterative support shrinking algorithm is introduced. To handle the nondifferentiable points of the regularization function other than zero, we use a general proximal linearization technique, which simplifies the implementation of our algorithm. For this algorithm, we prove the convergence of the objective function and give a lower bound theory for the iterative sequence. Numerical experiments and comparisons demonstrate that our model with the proposed algorithm performs well for low-dose CT reconstruction.


1. Introduction. In the past several decades, X-ray computed tomography (CT) has been one of the most important nondestructive testing technologies [5], and is widely used for disease detection and radiation therapy. Mathematically, the imaging model of a CT system can be approximated as a discrete linear inverse problem

g = Au + ε,

where A ∈ R^{N×N} is the projection matrix, g ∈ R^N is the observed projection data, u ∈ R^N is the vector form of an n × n clean image (N = n²), and ε ∈ R^N is random Gaussian noise. Many methods such as filtered back projection (FBP), algebraic reconstruction techniques (ART) and simultaneous algebraic reconstruction techniques (SART) [26,19,1,23] usually need high doses of radiation to produce good-quality diagnostic images. However, a high radiation dose may increase the underlying risk of cancer [15]. To reduce the radiation, a direct strategy is to impose fewer projections on the patient, i.e., to adopt low-dose CT. For this low-dose CT reconstruction, the main contributions of this paper are as follows.
• We propose a new constrained nonconvex truncated model for structure and contrast preservation in CT reconstruction.
• We show two useful theoretical results for the local minimizers of our model. They help us to design an effective algorithm, which adopts a general linearization technique to handle nondifferentiable points of the truncated regularization function.
• The convergence of the objective function and a lower bound theory for the nonzero entries of gradients of the iterative sequence are provided.
Experimental results demonstrate that our model with the proposed algorithm achieves better CT reconstruction than some existing regularization methods. The paper is organized as follows. In section 2, we propose our model with some basic assumptions, definitions and remarks. In subsection 2.3, we prove two theoretical results for local minimizers of our model. In section 3, we present an effective numerical algorithm with some theoretical properties. Section 4 lists some numerical experiments and comparisons. Finally, we conclude the paper in section 5.
2. The proposed model and theoretical analyses.
2.1. Basic notation. For an image, let J = {(i_x, i_y) : 1 ≤ i_x, i_y ≤ n} and J̄ = {1, 2, . . . , N} respectively denote the index sets of its matrix form U ∈ R^{n×n} and vector form u ∈ R^N (N = n²). Obviously, there is a one-to-one mapping J → J̄. At a pixel i, we define the discrete gradient operator

d_i u = ((d_i^x u), (d_i^y u))^T,

where d_i^x, d_i^y ∈ R^{1×N} respectively denote the discrete horizontal and vertical forward difference operators [43]. The Euclidean norm is denoted by ∥·∥. For a real-valued matrix A, A^T is its transpose and ∥A∥ is its spectral norm.
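As a concrete sketch of these difference operators, forward differences with periodic boundary conditions on the matrix form U can be written as follows (an illustrative implementation; the exact boundary handling of [43] may differ):

```python
import numpy as np

def gradient(U):
    """Forward differences of an n-by-n image U with periodic
    boundary conditions: a sketch of the operators d_i^x (horizontal)
    and d_i^y (vertical)."""
    dx = np.roll(U, -1, axis=1) - U  # horizontal forward difference
    dy = np.roll(U, -1, axis=0) - U  # vertical forward difference
    return dx, dy

def gradient_magnitude(U):
    """Pixelwise Euclidean norm ||d_i u|| used by the regularizer."""
    dx, dy = gradient(U)
    return np.sqrt(dx ** 2 + dy ** 2)
```

On a constant image the magnitude vanishes everywhere, while a vertical step of height h yields magnitude h along the edge.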
We assume that ϕ(t) satisfies Assumption 1. Assumption 1(c) clearly shows that 0 is a strict minimizer of ϕ(t). Three different choices of ϕ(t) are shown in Fig. 1, where the set M = {0.5}. Note that Assumption 1(c) also shows that the potential function ϕ(t) is flat on (max{M}, +∞). Here, the value τ := max{M} is related to the edge contrasts of images: if ∥d_i u∥ at an edge is equal to or larger than τ, our regularization imposes no further penalization on ∥d_i u∥. Although the exact relationship between image contrasts and the truncation scheme cannot easily be obtained in general, truncated regularizers give perfect restorations with contrast preservation for some interesting examples; see [30,42,18].
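As a concrete (hypothetical) instance satisfying these assumptions, one may take the truncated power function ϕ(t) = min(t^p, τ^p) with 0 < p < 1: it vanishes only at 0, is non-Lipschitz at 0, is nondifferentiable at t = τ (so M = {τ}), and is flat on (τ, +∞):

```python
import numpy as np

def phi(t, p=0.5, tau=0.5):
    """Illustrative truncated potential phi(t) = min(|t|^p, tau^p).
    Flat for |t| > tau, so edges with contrast above tau incur no
    extra penalty; the power p < 1 promotes gradient sparsity."""
    return np.minimum(np.abs(t) ** p, tau ** p)
```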
To conveniently analyze our box-constrained model (P), we give the following index sets, some of which are inspired by [9,40].
Definition 2.1. For any u ∈ U , we define some index sets. To aid understanding, we explain them in Appendix 6.1 using a simple example. Moreover, according to the introduced index sets, we use (d_i^x)_{I(u)} and (d_i^y)_{I(u)} to respectively denote the subvectors of d_i^x and d_i^y whose entries lie in the positions indexed by I(u). Similarly, we define (d_i^x)_{B(u)}, (d_i^y)_{B(u)}, u_{I(u)} and u_{B(u)}. Then, we have

d_i^x u = (d_i^x)_{I(u)} u_{I(u)} + (d_i^x)_{B(u)} u_{B(u)},   d_i^y u = (d_i^y)_{I(u)} u_{I(u)} + (d_i^y)_{B(u)} u_{B(u)}.

These representations of d_i^x u and d_i^y u will be used in the following.

2.3. Theoretical analyses of our model. Since F(u) in (1) is C⁰ and defined on a compact set, there exists a solution of (P). In general, a global minimizer is difficult to obtain, so we assume that u* is a local minimizer. Denote I* = I(u*), B* = B(u*), Ω_1* = Ω_1(u*) and Ω_0* = Ω_0(u*) as the simplified index sets in Definition 2.1. Let A_{I*} = (a_k)_{k∈I*} and A_{B*} = (a_k)_{k∈B*} be submatrices of A = [a_1, a_2, · · · , a_N]. Moreover, u* is rearranged as (u*_{I*}, u*_{B*}). Then, we construct a new minimization problem (R). We can verify that z* = u*_{I*} is a local minimizer of the problem (R) by contradiction, as in [40].
Based on the problem (R), we give an important result about ∥d_i u*∥. Although the proof is motivated by [31], no constraint was considered there.
For readability, we put the detailed proof of Theorem 2.2 in Appendix 6.2. In our model (1), ϕ(t) can be either Lipschitz or non-Lipschitz; however, a non-Lipschitz ϕ(t) is more helpful for keeping image edges. In the following, we focus on non-Lipschitz ϕ(t) satisfying ϕ'(0+) = +∞. As discussed in [32,10], it is usually difficult to design algorithms for non-Lipschitz models. Thus, we give a theorem useful for the design of our algorithm. Theorem 2.3. If a local minimizer u* is sufficiently near a given point û, then Ω_1(u*) ⊆ Ω_1(û).
Proof. Theorem 2.2 shows ∥d_i u*∥ ∉ M, which implies that R(z) is C¹ on a neighborhood of z*. The remainder of the proof is similar to that of Theorem 3.6 in [40]. Theorem 2.3 implies that if a local minimizer u* is sufficiently near a given point û, then the gradient support does not expand from û to u*. This observation is important and inspires us to design the algorithm.

3. Numerical algorithm and its properties.
3.1. The algorithm. Motivated by Theorem 2.3, we propose an iterative support shrinking strategy to solve (1), as in [46,40] for other optimization models. Based on an approximate solution u^k at the kth iteration, u^{k+1} is computed from the objective function with a constraint restricting the gradient support of the optimization variable to a subset of the gradient support of u^k, where Ω_0^k = Ω_0(u^k) and Ω_1^k = Ω_1(u^k). To handle the concave function ϕ(∥d_i u∥) in (F_k), a practical approach is to construct a surrogate function from its first-order Taylor approximation. Indeed, existing references such as [46,50] applied this strategy to linearize concave penalties that are differentiable over (0, +∞). Here, however, our truncated regularization function has multiple nondifferentiable points, namely those in M. Fortunately, we can adopt a general linearization method at the nondifferentiable points belonging to M; see the weight w_i^{k+1} in (5). This guarantees that the approximating linear function lies above the original truncated regularization function and hence serves as a surrogate. Together with a proximal term, we obtain the following iterative support shrinking algorithm with proximal general linearization (ISSAPGL).

For each k, with Ω_0^k = Ω_0(u^k) and Ω_1^k = Ω_1(u^k), ISSAPGL computes u^{k+1} by solving the subproblem (H_k), and then updates Ω_0^{k+1} = Ω_0(u^{k+1}) and Ω_1^{k+1} = Ω_1(u^{k+1}).

3.2. Some properties of ISSAPGL. In this subsection, some properties of the proposed ISSAPGL are given. We first verify that the sequence {F(u^k)} satisfies a sufficient decrease property, which implies the convergence of the objective function sequence. Then, we establish a lower bound theory for ∥d_i u^k∥.
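To illustrate the general linearization, consider again the hypothetical potential ϕ(t) = min(t^p, τ^p). Since this ϕ is concave on (0, +∞), any slope chosen from the superdifferential at the kink t = τ yields a linear surrogate lying above ϕ; a sketch of the resulting weight (our naming, standing in for w_i^{k+1} in (5)) is:

```python
def weight(t, p=0.5, tau=0.5, theta=0.0):
    """Linearization weight for phi(t) = min(t^p, tau^p) at t > 0.
    For t != tau the weight is phi'(t); at the kink t = tau any
    theta in [0, p * tau**(p - 1)] keeps the tangent above phi,
    because phi is concave on (0, +inf)."""
    if t < tau:
        return p * t ** (p - 1)  # derivative of t^p
    if t > tau:
        return 0.0               # phi is flat beyond tau
    return theta                 # general choice at the kink
```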
Proof. It follows directly from Proposition 1 that {F(u^k)} is both nonincreasing and bounded; thus it converges as k → ∞. Accordingly, lim_{k→∞} ∥u^{k+1} − u^k∥ = 0 from (6). Now, we establish a lower bound theory for ∥d_i u^k∥, which shows that ISSAPGL has the ability to keep image edges.
From the monotonicity and boundedness of {Ω_0^k}_{k=0}^∞ in ISSAPGL, this sequence converges in a finite number of steps, as in [46,40]; i.e., there exists K such that Ω_0^k = Ω_0^K for all k ≥ K. Thus, the set Ω_0^k in ISSAPGL is unchanged when k is sufficiently large, and when k ≥ K, (H_k) can be represented with the fixed support Ω_0^K. By constructing a matrix E_ω and a reduced variable u_ω such that u = E_ω u_ω, as in [46], we can eliminate these constraints of (H_k). Thus, once u^{k+1} solves (H_k), u_ω^{k+1} solves (H_k^ω). For u_ω^k, we introduce some index sets, which will be used in the following proof. Definition 3.1. For any u_ω^k, we define the following index sets. (a) The index set of u_ω^k is denoted as J_ω^k.
Theorem 3.2. There exist an integer K̄ ≥ K and a constant η > 0 such that ∥d_i u^k∥ > η for all k ≥ K̄ and all i ∈ Ω_1^K. Proof. To overcome the difficulties caused by the last term in (H_k^ω) (the box constraints), we first construct a new optimization problem as in [40].
For u^{k+1} being a solution of (H_k), the index sets in Definitions 2.1, 3.1 and 6.1 are simplified accordingly. Because each row of E_ω has only one nonzero element 1 and I^{k+1} ∩ B^{k+1} = ∅, we can take two submatrices of E_ω; together with (2), d_i^x u^{k+1} and d_i^y u^{k+1} can then be represented in the reduced variables. Define the following problem (R_k^ω). By an argument similar to that for (3), we have t_i^{x,k+1} = t_i^{y,k+1} = 0 for all i ∈ Ω_0^K, and we can also verify that (u_ω^{k+1})_{I_ω^{k+1}} is a local minimizer of (9), as in the discussion for (4).
Compared to (H_k^ω) in (8), the first-order necessary condition of (R_k^ω) is easier to obtain, because the indicator function in R_k^ω(r_ω) is defined on an open set. Choosing r_ω ∈ K(I_0^{k+1}), we use the first-order necessary condition at the local minimizer and let z = (E_ω)_{I^{k+1}} r_ω, so that the resulting equation can be reformulated. Since {u^k} is bounded, there exists ζ > 0, independent of k, such that (10) holds. Based on (10), we prove the conclusion by induction. For all i ∈ Ω_1^K, we sort the values ∥d_i u^{k+1}∥ as ∥d_{i_1} u^{k+1}∥ ≥ ∥d_{i_2} u^{k+1}∥ ≥ · · · ≥ ∥d_{i_r̄} u^{k+1}∥ > 0 with r̄ = #(Ω_1^K) ≤ N. We define L^{k+1} as a set consisting of the leaping pixels [47,40] with respect to u^{k+1}. First, we establish a lower bound for ∥d_{i_1} u^{k+1}∥.
Similar to [46], we can find a z_1 satisfying the required conditions; only one of two cases holds, and in either case we obtain a bound η_1 independent of k. Assume there exist K_{s−1} and η_{s−1} such that ∥d_{i_j} u^{k+1}∥ > η_{s−1} and ∥d_{i_j} u^k∥ > η_{s−1} for any i_j ∈ {i_1, i_2, · · · , i_{s−1}} with 1 < s ≤ r̄. Note that the constant η_{s−1} is independent of k. Next, our task is to find a lower bound for ∥d_{i_s} u^{k+1}∥. We can similarly select a z_s satisfying the required conditions; again only one of two cases, Case 3a or Case 3b, holds, each yielding a bound η̃_s independent of k. Thus, there exists K_s > K_{s−1} such that ∥d_{i_s} u^{k+1}∥ > η̃_s for all k > K_s. From Cases 3a and 3b, by choosing η_s = min{η̃_s, η_{s−1}}, we obtain ∥d_{i_s} u^{k+1}∥ > η_s.
According to the induction above, there exist K̄ > K_{r̄} and η = η_{r̄} such that ∥d_i u^{k+1}∥ > η for all k ≥ K̄ and all i ∈ Ω_1^K. This completes the proof. 3.3. The implementation details of ISSAPGL. Since (H_k) in ISSAPGL is a strongly convex optimization problem, it has a unique optimal solution. We use the idea of ADMM [43,7] to solve it. First, we introduce an auxiliary variable v; consequently, (H_k) is equivalent to the splitting formulation (13). The augmented Lagrangian functional for the optimization problem (13) is formed with a Lagrange multiplier λ and a positive penalty constant r. The corresponding ADMM procedure for solving (13) is shown in the following algorithm.
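For reference, with w_i denoting the linearization weights and v the splitting variable (v_i ≈ d_i u), an augmented Lagrangian of such a splitting typically takes the standard ADMM form below. This is a sketch in our notation; the paper's exact functional for (13) may contain additional indicator terms for the box and support constraints:

```latex
\mathcal{L}_r(u, v; \lambda)
  = \frac{1}{2}\|Au - g\|^{2}
  + \sum_{i} w_i \|v_i\|
  + \frac{\rho}{2}\|u - u^{k}\|^{2}
  + \langle \lambda,\, du - v \rangle
  + \frac{r}{2}\|du - v\|^{2},
```

where du stacks the pixelwise gradients d_i u and the ρ-term is the proximal regularization of ISSAPGL.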
ADMM: the alternating direction method of multipliers for solving (13). If a termination criterion is not met, go to Step 1; otherwise, output the final inner iterate as u^{k+1}. We now solve the two subproblems in the above algorithm.
1. The u-subproblem (15): we can exploit the structure of (15), which is a box-constrained quadratic programming problem. To solve problems of this type, we apply the nonmonotone projected gradient method proposed in [4,51]. 2. The v-subproblem (16): (16) can be simplified pixelwise, and its optimal value is given in closed form by shrinkage operators as follows.
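The closed-form v-update is the standard vector soft-thresholding (shrinkage) operator: for each pixel it solves min_v t∥v∥ + (r/2)∥v − w∥² with threshold t/r, where w would collect d_i u plus the scaled multiplier (a generic sketch; the exact thresholds come from the linearized regularizer in (H_k)):

```python
import numpy as np

def shrink(w, threshold):
    """Vector soft-thresholding: the minimizer of
    t*||v|| + (r/2)*||v - w||^2, with threshold = t/r."""
    norm = np.linalg.norm(w)
    if norm <= threshold:
        return np.zeros_like(w)  # small gradients are set to zero
    return (1.0 - threshold / norm) * w
```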

4. Experimental results.
In this section, we show the performance of our model in CT reconstruction compared with two existing popular methods: the TV-based model [22] and the tight frame wavelet-based model [51], which we denote by TV and TW-0, respectively. We implemented the TV model by ADMM, and downloaded the code of TW-0 from the authors' homepage. In all numerical examples, the quality of image recovery is measured by the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [41]. All experiments are conducted in MATLAB R2016a on a desktop with an Intel Core i5 CPU at 3.3 GHz and 8 GB of memory. We use the well-known "Shepp-Logan" phantom and the NURBS-based cardiac-torso (NCAT) phantom [37] as our test images. They are shown in Figure 2, where the range of image values is [0, 1]. Throughout our experiments, the projection matrix A is based on parallel-beam scanning with different numbers of angles, and ε is generated from a zero-mean Gaussian distribution with three noise levels: σ = 0.005∥g∥_∞, 0.01∥g∥_∞ and 0.02∥g∥_∞.
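For reference, PSNR for images scaled to [0, 1], as here, can be computed as follows (SSIM is more involved and is typically taken from a standard implementation such as [41]):

```python
import numpy as np

def psnr(u, u0, peak=1.0):
    """Peak signal-to-noise ratio in dB between a reconstruction u
    and the ground truth u0, both with values in [0, peak]."""
    mse = np.mean((np.asarray(u) - np.asarray(u0)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```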
In the proposed algorithm, we set ρ = 0.1 and r = 40, with a suitable stopping condition. In terms of a, p and τ in our model, many experiments on the "Shepp-Logan" image with 36 projections corrupted by 0.01∥g∥_∞ Gaussian noise show that the reasonable range of a is [3, 6], of p is [0.3, 0.7] and of τ is [0.4, 0.7]; see Figure 3. For convenience, we set a = 5, p = 0.5 and τ = 0.5 in all our experiments. Thus, we only need to tune the model parameter α in our method, which depends on the noise level. In the following, we conduct experiments with various projection numbers N_p = 18, 36 and 72 under the noise levels 0.005∥g∥_∞, 0.01∥g∥_∞ and 0.02∥g∥_∞. For a fair comparison, the parameters of the compared methods are also tuned to achieve the best PSNR and SSIM. In Figures 4 and 5, we exhibit the reconstructions under the three projection numbers with noise level σ = 0.01∥g∥_∞. In terms of visual inspection, our proposed model recovers piecewise constant images better than the other two models. In contrast, the TV-based model produces false edges in homogeneous parts. Although the nonconvex TW-0 and our method have a stronger tendency than TV to connect the three different ellipses (located at the bottom of the Logan phantom) when N_p = 18 and 36, our truncp model gives results similar to TV in recovering them when N_p = 72. Moreover, from the zoom-in views in Figure 6, the edges reconstructed by TV are not well preserved and appear unclear. Although the TW-0 model reconstructs neat edges and behaves well in homogeneous regions, it yields some abnormal points on edges; see the results in Figure 5. This phenomenon can also be observed in Figure 6. In contrast, our model preserves this edge structure very well, and obtains the best results qualitatively and quantitatively.
More comparisons in terms of PSNR and SSIM with different numbers of projection angles and noise levels are shown in Tables 1 and 2. It is clear that the proposed method outperforms the TV-based and tight wavelet frame-based models.
To further visualize the differences among these methods, Figure 7 shows the residual errors of the reconstruction results in Figures 4 and 5, i.e., |u − u_0|, where u is the reconstructed image and u_0 is the ground truth. We observe that our model with the proposed algorithm attains the smallest errors among the compared methods. Moreover, Figure 8 plots the 60th and 80th rows of the results restored by the different methods for "Shepp-Logan". One can also see that our non-Lipschitz truncp model matches the ground truth very well. Figure 9 shows the values F(u^k) of trunc-LN and truncp versus the iteration number for the "Shepp-Logan" image with 36 projections. We see that once the iteration number exceeds 10, F(u^k) decreases and converges, which verifies the theoretical results in Proposition 1. Figure 10 shows the reconstructions for real data of the chest (case courtesy of Dr Andrew Dixon, Radiopaedia.org, rID: 36676). Note that this is a noiseless test with projection number N_p = 60. To compare the differences, we enlarge two regions of interest (the red and green squares). One can see that the TV-based model achieves the highest PSNR among them, but it produces some artifacts in the homogeneous region, as shown in the amplifications of the red square. Considering the textured part of the vessel (the green square), our proposed trunc-LN and truncp models preserve this structure as well as TV does, and better than TW-0.

5. Conclusion.

In this paper, we proposed a nonconvex truncated regularization and box-constrained model for low-dose CT reconstruction. It combines the sparse recovery property of nonconvex potential functions with the contrast preservation property of the truncation strategy. By showing two theoretical results for the local minimizers of our model, an iterative support shrinking algorithm was presented, and a general linearization technique was introduced to handle the nondifferentiable points. For the proposed algorithm, we verified the sufficient decrease property of the objective function and provided a lower bound theory for the iterative sequence. Among all compared approaches, our model with the proposed algorithm performs very well in visual and quantitative results. Since ϕ(∥d_i u∥) in our model is not subdifferentially regular at the nondifferentiable points in M, it is difficult to analyze the convergence of the iterative sequence; our future work is to improve the current algorithm to guarantee this convergence. 6. Appendix.
6.1. More index sets and some facts. We further introduce the following index sets used in our proof and give a simple example to explain them.
Definition 6.1. For any u ∈ U , we define more index sets. Here, the subset B_1(u) of Ω_1(u) consists of pixels equal to l_1 or l_2 whose left and right neighbors take the value l_1 or l_2, and the subset B_0(u) of Ω_0(u) consists of pixels equal to l_1 or l_2 whose left and right neighbors are the same as the pixel itself. A simple example with the periodic boundary condition is shown in Fig. 11 to explain the index sets in Definitions 2.1 and 6.1; here, l_1 = 0 and l_2 = 1. From Definitions 2.1 and 6.1, we can obtain the following facts, similarly to [40].

6.2. Proof of Theorem 2.2. Proof. Recall the right-side and left-side directional derivatives of R(·) at z in the direction v. If R(·) is differentiable at z, then d_v^+ R(z) = d_v^- R(z) = ⟨∇R(z), v⟩. By the first-order necessary condition at the local minimizer z* of (R), we have d_v^+ R(z*) ≥ 0 for all v ∈ K(I_0*) = {z ∈ R^{|I*|} : (d_i)_{I*} z = 0, ∀i ∈ I_0*}.
Based on the discussion above, taking v = z* in the first-order necessary condition (18) produces an inequality which, by Assumption 1(b), cannot hold. Thus, we have Ω_{1,M}* = ∅, which completes the proof.