Edge-adaptive l2 regularization image reconstruction from non-uniform Fourier data

Total variation regularization based on the l1 norm is ubiquitous in image reconstruction. However, the resulting reconstructions are not always as sparse in the edge domain as desired. Iteratively reweighted methods provide some improvement in accuracy, but at the cost of extended runtime. In this paper we examine these methods for the case of data acquired as non-uniform Fourier samples. We then develop a non-iterative weighted regularization method that uses a pre-processing edge detection to find exactly where the sparsity should be in the edge domain. We show that its performance in terms of both accuracy and speed has the potential to outperform reweighted TV regularization methods.


Introduction
Data for reconstruction of piecewise smooth functions and images are sometimes acquired as nonuniform Fourier samples.This is the case in magnetic resonance imaging (MRI) and synthetic aperture radar (SAR).Since sparsity is inherent in the edge domain of piecewise smooth functions and images, 1 based total variation (TV) regularization, [28], is commonly employed for reconstruction.The development of compressed sensing, [4,5,6,11], has provided theoretical justification for using 1 regularization to promote sparsity in the appropiate domain.In some instances, however, these reconstructions are not as sparse in the edge domain as desired.This may be due to non-uniform sampling, noise, or the fact that the TV transform is not actually sparsifying with respect to a particular function, and is likely a combination of these factors.Consequently, the overall accuracy is reduced.One popular approach for correcting this problem is to use an iterative reweighting scheme, [7,9,10,23,26,33,34], which employs multiple runs of weighted p minimization (typically p = 1 but p = 2 is also an option).The main idea is to find the locations of non-zero entries in the edge domain and then apply regularization away from those locations.Iterative reweighting methods are shown to be more accurate than single-run TV methods, [7,9], and have been applied to problems where data are acquired as uniform Fourier samples, [7].The extension of these algorithms is straightforward for non-uniform Fourier data acquisition, although the implementation requires a non-uniform fast Fourier transform (NFFT), [13,24,25], and there are additional errors corresponding to the resulting fidelity term. 1 It is important to note that in this investigation we are considering that the data acquired are noisy continuous Fourier samples, which means that using the discrete NFFT generates additional model mismatch, [1].
This paper provides an alternative approach to this problem.We propose an algorithm for image reconstruction from non-uniform Fourier data that, rather than using iterative weighting, uses edge detection to indicate regions of sparsity in the edge domain and targets weighted 2 regularization appropriately.Unlike iterative reweighting, which requires multiple iterations of 1 minimization, there are only two steps to our method.The first step uses an 1 regularization based edge detection to create a mask, i.e. a weighting matrix, which dictates where non-zero entries are expected in the edge domain.The second step uses this mask to target 2 regularization only in non-edge regions, that is, regions of the function or image that are truly sparse in the edge domain.Put another way, our method uses regularization on targeted areas that we actually expect to be zero in the edge domain.Therefore, with a properly chosen mask, it is appropriate to regularize using the 2 norm, making the algorithm much more cost efficient.Moreover, in regions containing edges, our method relies solely on the fidelity term.This approach is particularly advantageous when noise is added since we can weigh the fidelity term lightly against the regularization term, which encourages noise reduction in non-edge regions.We call this method edge-adaptive 2 regularization.
There are several benefits to our proposed algorithm.First, it compares favorably in terms of accuracy (pointwise error) to iteratively weighted 1 regularization methods.It also provides better resolution around jumps than these methods.It is also more efficient to implement.Reweighted 1 methods take multiple iterations to identify the sparse regions.Our method uses only a single 1 minimization in the pre-processing edge detection step followed by a single 2 minimization in the main reconstruction step.Further, we are able to use faster conjugate gradient descent optimization methods available for 2 regularized problems.Finally, there is a closed form solution to our problem, which may be valuable in some settings.
The rest of the paper is organized as follows: Section 2 covers the necessary background in image reconstruction from non-uniform Fourier data.Section 3 applies an iteratively reweighted 1 regularization method to this problem.Section 4 describes the edge-adaptive approach and its benefits over the iterative method.Section 5 looks at numerical results.Conclusions and future work are in Section 6.

Preliminaries
In the one-dimensional case, we consider a piecewise smooth function f : [−1, 1] → R. Suppose we are given a finite sequence of non-uniform Fourier samples of f , where λ k ∈ R and k = −M, . . ., M .Specifically, we look at jittered sampling, defined by where ξ k ∼ U ([0, 1]).We will also consider the case where the underlying Fourier data in (1) are noisy, given by f , meaning η k is a complex Gaussian random variable with mean 0 and variance σ 2 .
In two dimensions, we analogously consider the piecewise smooth function f : [−1, 1] 2 → R. Suppose we are given a finite sequence of non-uniform Fourier samples of f , where The non-uniform jittered sampling pattern for λ k is given by where The sampling patterns in ( 2) and ( 5), displayed in Figure 1, simulate Cartesian grid samples with slight deviations that sometimes occur in real world measurement systems.We will also consider noisy two-dimensional Fourier data, f η (λ k ), defined analogously to (3).
For ease of presentation, we begin by describing some known techniques for piecewise smooth function reconstruction in the one-dimensional case, where the acquired data are given in (1).These methods are easily extended to reconstruct two-dimensional images, which will be demonstrated in Section 3.
Let f = {f (x j ) : j = −J, . . ., J} and f = { f (λ k ) : k = −M, . . ., M }.Since the underying function f is piecewise smooth, it is sparse in the edge domain.Hence 1 regularization provides an effective means for its reconstruction.In particular, f can be determined on a set of discrete grid points by solving the unconstrained optimization problem given by Here F N is the NFFT matrix (see e.g.[24,25] for details about NFFT solvers), λ > 0 is the 1 regularization parameter, and T is a transformation to the edge domain.The choice for regularization parameter, λ, is typically problem dependent, [27].We note that in this investigation we used single digit accuracy for the NFFT algorithm.If f is a piecewise constant, for example a cross section of the Shepp Logan phantom (with increased contrast for visual perception) seen in Figure 13, then a standard choice for reconstruction is the solution to the TV-regularized optimization problem which is frequently written as Here D is simply the matrix that encodes the entry information from the sum in (7).Note that the sparsifying transformation in ( 8) is an approximation to the first derivative, effectively penalizing high gradients in the function and therefore encouraging sparsity in the edge domain.Applying TV regularization causes the well known staircase effect, whereby the solution is held to be piecewise constant regardless of the smoothness of the underlying function.If f is a sparse signal, e.g. a spike train, then a standard reconstruction is the solution to the 1 regularized optimization problem If f can be modeled as a piecewise polynomial, then a suitable regularization choice is high order total variation (HOTV), [1,8], yielding Here L m is the m th order polynomial annihilation (PA) transform, [1,2]. 2 For example, when m = 3 we have In general, L m can be viewed as a normalized appproximation of the m th derivative.In particular, L 0 = I, i.e. ( 10) is equivalent to (9), while using L 1 yields (7).
For two-dimensional images, we regularize in the x and y directions separately and solve where the regularization term ||L m g|| 1 penalizes gradients in the x direction and the regularization term ||g(L m ) T || 1 penalizes gradients in the y direction. 3s noted previously, using (10) is effective in reconstructing piecewise smooth functions and images in a large number of applications.For a variety of reasons, however, the assumption that the function or image is sparse in the edge domain is often flawed.One reason is noise, which will immediately degrade the edge sparsity of the solution to (10).For TV regularization, the assumption that the transformed image is sparse is often inadequate due to smooth variation away from jumps.This is somewhat mitigated by HOTV regularization.However, if due to lack of resolution the image has variation not accounted for away from discontinuities, even high order transformations will not produce the desired sparsity.Another source of error is non-uniform sampling, since some compromise in accuracy is necessary to maintain the efficiency of the NFFT.Finally, all 1 based methods suffer from the fact that the 1 norm penalizes large magnitudes more heavily.
A popular approach to mitigating error from issues such as noise, lack of resolution, and magnitude dependence is to use a scheme that employs iteratively reweighted (IR) regularization, [7,9,10,23,26,33,34].In these methods, multiple passes of TV or HOTV regularization with weighted p norms are used to "narrow in" on spikes in the edge domain.The weight at each point on the spatial grid is typically inversely proportional to the magnitude of that point in the edge domain of the previous iteration.That is, the regularization is more strongly enforced at points deemed as non-jumps and more weakly enforced at those identified as jumps.In this way, iterative reweighting more democratically penalizes high gradients and regularizes based on the spatial distribution of the sparsity.These iterative methods are typically more accurate than single pass methods, [7,9].We will use the next section to discuss IR methods in more detail.

Iteratively reweighted regularization methods
As explained in [4], reconstructing an image or function via solving where || • || 0 counts non-zero values, promotes the most sparsity in the edge domain of the image.However, this combinatorial problem is NP-hard.As in (10), the 1 term acts as a convex surrogate for the 0 term, making the problem easier to solve.But it does not encourage sparsity in the edge domain as much.Naturally, this begs the question of whether there are better surrogates that generate solvable optimization problems.The approach of [7] is to regularize using the log-sum function, a concave penalty function that more closely resembles the 0 norm and is therefore more sparsity-inducing.That reconstruction would be the solution to the optimization problem where > 0 is a parameter to stay within the domain of the logarithm.Since the log-sum function is nonconvex, ( 14) is difficult solve.Instead, we can approximate it with a series of weighted 1 based minimizations of the form where W is a diagonal matrix of weights.The main idea is that large weights can be used to discourage non-zero entries in the edge domain, while small weights can be used to encourage non-zero entries.Hence this method will penalize non-zero edge domain magnitudes more fairly, removing the magnitude dependence of unweighted 1 regularization.To achieve this, weights inversely proportional to the edge domain magnitudes of the previous iteration are used.In this way, IR methods encourage sparsity in the edge domain by regularizing less in areas with jumps and more in areas without jumps.Algorithms 1 and 2 are modified versions of the iterative weighting method used in [7] for one-and two-dimensional functions, respectively.They include the PA transform L m so that the staircasing effect from standard TV can be avoided.When m = 1, we will refer to Algorithms 1 and 2 as reweighted total variation (RWTV) methods.For m ≥ 2, they will be called reweighted high order total variation (RWHOTV) methods.
As explained in [7], the main advantages of this method are increased accuracy using the same number of Fourier coefficients and removal of magnitude dependence of the unweighted 1 norm.A chief example of when this method works very well is found in Section 3.6, in particular Figure 10, of [7].These advantages are balanced with some disadvantages.The runtime is increased max times for this method, since each iteration requires an 1 minimization step.In addition, this method introduces another parameter .No comprehensive method for choosing this parameter is provided in [7] and the success of this algorithm depends on an appropriate choice.Finally, there still appear to be clear sources of error generated by this method.
As prototype examples to test the IR method, we consider Algorithm 2 Iteratively reweighted (IR) 1 regularization reconstruction in two dimensions i,j = 1 for i, j = −J . . .J and w (0) i,j = 1 for i, j = −J . . .J. Fix the regularization parameter ρ > 0, the weighting parameter > 0, and an appropriate total variation order m.2: Solve the weighted regularization minimization problem 3: Update the weights.For each (i, j) such that i, j = −J, . . ., J, v 4: Terminate on convergence or when attains a pre-specified maximum number of iterations max .Otherwise, increment and go to step 2.
Example 1 Example 2 and Example 3 One issue with Algorithms 1 and 2 is that when noise is present there are non-zero weights being applied in areas of smooth variation (and no variation), causing false jump identifications and ultimately oscillation in the reconstruction.Effectively, the oscillations caused by noise in the initial TV solution are propagated through all the iterations via the weighting matrix.The algorithm has no way to validate whether these oscillations are from actual variation, a jump, or noise.Figure 2 demonstrates the use of Algorithm 1 for f 1 (x) and f 2 (x) when the given Fourier data (1) is noise free and when complex zero-mean Gaussian noise is added.Notice how the oscillations in the reconstruction increase where the function has more smooth variation.
The source of this error is from points weighted between 0 and 1 .These weights can indicate either a small jump or variation in a smooth region that is beyond the resolution of the problem, which could be attributable to noise, or more simply the variation of the function itself.If there is a small jump, the iterative reweighting strategy still regularizes at that point, albeit relatively less.If it is just noise or normal variation, then the algorithm regularizes less at that point for no reason.This will automatically reduce the algorithm's ability to separate the true scales of the underlying image by causing false jump identifications, leading to an overall less accurate reconstruction.A two-dimensional example of this phenomenon can be seen in Figure 3, which shows a reconstruction via Algorithm 2 of the 257 × 257 pixel function f 3 (x, y).cause of these inaccuracies.The left image shows the ideal weights w i,j as in (19).By ideal, we mean computed using (19) directly from the exact function f 3 (x, y).The goal of Algorithm 2 is to converge to these weights, since that would mean the algorithm generated f ( ) very close to the true f 3 (x, y).On the right we display the actual weights w i,j computed by Algorithm 2. The black or gray points in the right weighting matrix different from those obtained by the ideal left weighting matrix are being identified by Algorithm 2 as either a jump or variation beyond the resolution of the problem.Since there are no jumps outside of the black area in this function, this weighting matrix shows that the algorithm is falsely identifying many jumps.In this context, that means that this algorithm is regularizing less in areas it shouldn't be, which leads to an overall less accurate reconstruction.
In what follows we demonstrate how Algorithms 1 and 2 can be improved upon in terms of accuracy, simplicity, efficiency and robustness.

Edge-adaptive 2 regularization
As discussed in [7], without prior information about the non-zero elements in a sparse signal (or similarly the locations of edges in an image), it is effective to choose the 1 regularization weights iteratively.However, when starting with Fourier data the weighting scheme adopted by Algorithms 1 and 2 is likely not the most direct way to penalize non-zero locations in the edge domain.Therefore we take a more direct approach.Specifically we locate the edges directly from the Fourier data, as in [17], and then construct the regularization weights to be zero-valued anywhere an edge is detected and non-zero at all other points.Unlike iterative reweighting, which requires multiple 1 minimizations, there are only two minimization steps in our new method.First, we perform an 1 regularization based edge detection to determine where the support is in the sparsity (edge) domain.We then create a mask, i.e. a weighting matrix, based on these regions.This mask allows us to target 2 regularization only to non-edge regions of the function, which is the second minimization step.In this way, our method uses regularization on regions that are assumed to be zero.Therefore the usual compressed sensing arguments for using the 1 norm as a surrogate for the 0 norm are not needed.In particular, it is just as appropriate to use the 2 norm to minimize something that is supposed to be zero, and it is much more cost efficient than using the 1 norm.We note that the method relies solely on the fidelity term in the (localized) support regions.

Edge detection from non-uniform Fourier data
The edge adaptive 2 regularization image reconstruction technique depends heavily on the selection of a weighting mask, which is explicitly determined by the edges recovered from the given non-uniform Fourier data.While there have been a number of algorithms designed to extract edges from (non-uniform) Fourier data, we will use the method introduced in [17].It is briefly described below.
Let us first consider a one-dimensional piecewise smooth function f : [−1, 1] → R. We define the jump function, [f ], as the difference between the left-and right-hand limits of the function: In smooth regions, [f ](x) = 0.At a discontinuity, [f ](x) is the value of the jump.Suppose we are given 2J + 1 grid points, x j = j J , j = −J, . . ., J. Assuming that the discontinuities of f are separated such that there is at most a single jump per cell, I j = [x j , x j+1 ), we can write where the coefficients [f ](x j ) is the value of the jump that occurs within the cell I j and δx j (x) is the indicator function with The concentration factor (CF) edge detection method, [15,18,19,20], approximates (24) from the first 2M + 1 uniform Fourier coefficients given in (1) where Here σ = σ(k) M k=−M , coined the concentration factor in [18], satisfies certain admissibility conditions.The convergence of (25) depends on the particular choice of σ.
The CF edge detection method cannot be extended directly to non-uniform Fourier coefficients because {e πiλ k x } M k=−M is not an orthogonal basis.It is also not as effective when bands of data may be missing or corrupted, [32].Therefore, as in [17] (see Algorithm 4), we approximate [f ] as the solution to the optimization problem That is, we fit the given Fourier data (here diag(σ) f ) using a forward NFFT operator A of the jump function vector g = {[f ](x j )} J j=−J and regularize with the sparsity of the jump function vector g, with µ > 0 being the regularization parameter.
As is discussed in [16,17,18], one way to develop the CF edge detection method is to first observe that for a set of jump discontinuities {ξ l } L l=1 , we have the first order approximation where and r ξ (x) = r(x − ξ) for ξ ∈ (−1, 1).Here a l is the corresponding jump value for ξ l .While ( 27) is not a very good approximation of f (x), it is perfectly reasonable to use to compute [f ](x).Specifically, from (27) we have This yields the j th row of A in ( 26) as e −iπx j k r where r are the Fourier coefficients of (28).However, as stated above, to improve efficiency we use the NFFT algorithm, [24,25].The CF vector σ is dependent on how δx j in ( 24) is regularized, which is necessary since δx j (x) has nontrivial values on a set of measure zero and therefore does not have a non-trivial Fourier expansion.As discussed in [17], the concentration factors can be determined as where hx j (x) ≈ δx j (x) and ĥx j (λ k ) are the corresponding Fourier coefficients at each λ k .For simplicity, in our experiments we choose each element σ(k) = 2iπλ k 2M +1 corresponding to regularization We note that other options may provide better convergence in some examples.In particular, the Gaussian function can provide smoothing in the presence of noise.In our testing with jittered data, however, we found no difference in performance, and in general used (31).We refer readers to [17] for a detailed analysis of the terms in (26).Figure 5 demonstrates the use of ( 26) on f 1 (x) and f 2 (x).In two dimensions, the jump functions in the x and y directions may be approximated by the respective solutions to the optimization problems where σx(k) = for each point (xt, ys) in the grid.Figure 6 shows the results for approximating [f 3 ](x, y) in (22).We note that the definition of jump value in (23) does not carry over to two dimensions.Indeed the approximation in ( 34) can be replaced as the average values from (33) or something else entirely.
The actual value of the jump is not critical for our purposes since we need only identify where the jump function is non-zero.Edge detection in and of itself can be an important tool in identifying physical structures in images or signals.In MRI, edge detection helps tissue boundary identification.In SAR, it can improve target identification.Unlike other algorithms, a potentially useful byproduct of our new algorithm is that we can also output an accurate edge map that we compute en route to the reconstruction.In particular, it can act as a cross-validation for the reconstruction.For the purposes of our reconstruction, we need only a binary edge map.In one dimension, this is easily generated as Here g * is determined from (26).The resolution-dependent threshold τ is typically inversely proportional to the number of grid points, meaning that with better resolution we should be able to detect jumps of smaller magnitude.When noise is present, the threshold is also dependent on the SNR.In two dimensions, we generate the binary edge maps as and where g * x and g * y are determined from (33).We note that another potential benefit of utilizing the edge map is that it only ever needs to be computed once per image.This means that if we have an edge map from another experiment, no matter how it was obtained, it can be used here too.
As will be described below, edge detection and the generation of the binary edge map are critical in determining the weighting mask for the regularization term, and is even more important when the given data are noisy.In particular, edge information enables us to adapt the regularization parameter to weight relatively lightly on fidelity in non-edge areas and heavily on fidelity in edge areas where we are in general more confident of non-zero values in the sparsity domain.

Edge-adaptive 2 regularization image reconstruction algorithm
Once the binary edge map in (35) or ( 37) is formed, the next step of the edge-adaptive 2 regularization method is to create a weighting matrix, or mask.This process is detailed in Algorithm 3 for the one-dimensional case and Algorithm 4 for the two-dimensional case.When the PA transform is used in the regularization term for the reconstruction procedure, the corresponding regularization mask must include the stencil corresponding to the degree of the PA transform used.This is because the PA transform forms an oscillatory response in the stencil surrounding the jump.For example, when using L 3 in ( 11), there is an oscillatory response in 4 points surrounding a detected jump (including the pixel value associated with the jump).We note that the stencil size of any weighting mask would directly correspond to the degree of the corresponding sparsifying transform operator (e.g.wavelets).
The mask is M = diag(z).

Algorithm 4 Mask creation in two dimensions
1: Starting from Fourier data as in (4), reconstruct the jump functions in the x and y directions using equations (33) as g * x and g * y and combine into a single edge map, g * , using (34).2: For each index i, j = −J, . . ., J such that |(g * x ) i,j | > τ , set x i,j = 1.Else, x i,j = 0. 3: For each index i, j = −J, . . ., J such that |(g * y ) i,j | > τ , set y i,j = 1.Else, y i,j = 0. 4: For each index i, j = −J, . . ., J such that |(L m x) i,j | > τ , set M x i,j = 0. Else, M y i,j = 1.The x direction mask is the matrix M x .5: For each index i, j = −J, . . ., J such that |(y(L m ) T ) i,j | > τ , set M y i,j = 0. Else, M y i,j = 1.The y direction mask is the matrix M y .
Observe that unlike the iterative weights described in Section 3, the weighting masks generated by Algorithms 3 and 4 are binary.There are three advantages to our technique: (i) accuracy is improved since in general we do not falsely identify jumps; (ii) our method is more efficient since we do not have to perform expensive iterations to locate the edges; and (iii) we improve the efficiency even further since we do not need to use 1 regularization in the reconstruction step.This is because we have reframed the optimization problem in (10) as: where δ > 0 is a threshold on the fidelity term.Observe that minimizing ||M L m g|| 2 in (38) is equivalent to setting up the usual sparsity constraint, which requires L m g to have only a few nonzero values, or more precisely, values above a chosen threshold.Algorithm 5 demonstrates how the Algorithm 5 Edge-adaptive image reconstruction in one dimension 1: Construct the mask, M , using Algorithm 3.

2:
The edge-adaptive 2 regularization image reconstruction is the solution to the optimization problem, where λ > 0 is the regularization parameter.
constrained optimization problem (38) can be solved by converting it into an equivalent unconstrained optimization problem.Algorithm 6 details the algorithm for two-dimensional functions and images.We note that (39) has a closed form solution This closed form may be valuable in some contexts.As the size of the problem increases, however, the inversion in (40) becomes more computationally expensive, so in Section 5 we take advantage of the conjugate gradient descent method [22].
Algorithm 6 Edge-adaptive image reconstruction in two dimensions 1: Construct the mask, M , using Algorithm 4.

2:
The edge-adaptive 2 regularization image reconstruction is the solution to the optimization problem, where λ > 0 is the regularization parameter.

Numerical results
In the numerical experiments that follow we compare the edge-adaptive 2 regularization image reconstruction given by Algorithms 5 and 6 to the iteratively reweighted method of Algorithm 1 and 2. We use the Split Bregman method, [21,35] to implement the minimization step in Algorithms 1 and 2. We follow the recommendation in [7] in choosing the parameter to be slightly smaller than the expected nonzero magnitudes of L m f , since this will provide the necessary stability to correct for inaccurate coefficient estimates while still improving upon the unweighted TV algorithm.We note that in [9] it is shown that updating in each iteration yields superior results for the problem of sparse signal recovery.However, since this is not applicable for functions with more variation, we did not consider this adaptive approach.Algorithms 5 and 6, which only require 2 -regularized minimization, are performed using conjugate gradient descent, [22].In what follows we look at examples in both one and two dimensions and the results using these algorithms for different regularization parameter.We also vary the PA order m, add noise to the initial data, and limit the amount of initial data.In all cases we compare the accuracy and efficiency of each algorithm.Finally, we demonstrate the success of our new algorithm on synthetic aperture radar (SAR) data, [12].

One-dimensional test case
Figure 7 compares the results of Algorithm 1 and Algorithm 5 for f 1 (x) in (20), where the acquired data are 257 noise-free jittered Fourier samples given by (1).We computed the relative error, for each algorithm, resulting in RE = .0446using Algorithm 1 and RE = .0155using Algorithm 5.In addition to improving the overall accuracy, it is evident that due to the precise jump identification yielded using (35), there is improved resolution and reduced error in the neighborhood of the jump.Similar results are obtained in two dimensions, as is confirmed in Figures 8 and 9, which compare the results using Algorithms 2 and 6 for f 3 (x, y) given in (22).The data acquired are 257 2 noise-free jittered Fourier samples given by (4).It is evident that Algorithm 6 yields both better overall accuracy in terms of relative error, RE = .0414for Algorithm 6 versus RE = .0616for Algorithm 2, as well as improved resolution near the edges of the image, exhibited by the smaller white ring in its error plot.

Robustness of regularization parameter
Choosing the regularization parameter for the minimization step of optimization-based reconstruction methods, e.g. as in (10), is typically difficult and problem-dependent, [27], yet crucial to the success of the algorithm.Using the edge-adaptive 2 method, we observe a robustness with respect to the choice of this parameter.This is shown in Figure 10 (right), which displays the pointwise error plots comparing Algorithm 1 (RWTV) and Algorithm 5 for reconstructing f 1 (x) in (20) for various values of λ.Observe that our edge-adaptive 2 method outperforms the RWTV reconstruction for a wide range of λ.Such robustness is critical since in many applications reliable ground truth information is not available.In terms of relative error, Algorithm 1 yielded RE = .0446,while even in the worst case, λ = 100, Algorithm 5 produced RE = .0478.Moreover, it is evident that for all choices of λ, there is improved resolution using our algorithm in neighborhoods of the jump.These results are particularly impressive when compared with the robustness of Algorithm 1 with respect to the regularization parameter ρ as seen in Figure 10 (left).There we see that the accuracy varies strongly with ρ, in particular around the jump.Regardless of the choice of λ, Algorithm 5 outperformed Algorithm 1, especially near the jump.

Comparison of PA order
We now consider f 2 (x) in (21).Because of the increased variation between the jumps, we test different values of m, the order of the PA method.Recall that the polynomial annihilation (PA) method annihilates polynomials of degree m − 1 in smooth regions.Figure 11   using m = 1, 2, 3. We see improved overall accuracy and lower error around jumps using Algorithm 5 in each case.

One-dimensional examples with noise
Noise in the data acquisition process of imaging systems has the potential to seriously degrade the quality of a reconstruction.Figure 12 compares each algorithm for both f 1 (x) and f 2 (x) when our data consists of 257 noisy jittered Fourier samples, (3).Here the noise is assumed to be zero-mean complex Gaussian.While the edge-adaptive 2 no longer provides significant improvement in the overall error, it is still evident that the functions are resolved better in the neighborhoods of the jumps.

Limited data (compressed sensing) example
To test our algorithm's performance when starting from limited data, we consider the Shepp-Logan phantom, [31], shown in Figure 13.In this experiment, we randomly select just part of the initial Fourier data to use.Starting from 257 × 257 jittered Fourier modes as in (4), Figures 14, 15 and 16 respectively show the results using roughly a fourth, half, and three fourths of these modes.It is evident that the edge adaptive algorithm is particularly effective when the edges are close together, that is, it appears in general to have better resolution properties.Further theoretical and numerical study is needed to determine precisely the maximum compression ratio achievable by this method. 4ynthetic Aperture Radar (SAR) example As a final example, we consider the synthetic aperture radar (SAR) phase history data of a vehicle given in [12].SAR is an all weather, night or day imaging modality whereby an image is reconstructed from electromagnetic scattering data.In SAR we assume only a sparse number of isotropic point scatterers, so the standard 1 regularized reconstruction solves where Θ * is a diagonal phase extraction matrix yielding Θ * f ≈ |f |.This is needed because the phase of f is not sparse.(See e.g.[29] for details on the construction of Θ.)The edge-adaptive 2 method for this application is then where M = M x + M y is the mask found through Algorithm 4. Since we are now using 2 regularization, the phase extraction matrix Θ * is unnecessary.SAR data have a significant amount of noise, [14].Nevertheless we are able to locate the edges with relatively high confidence.Algorithm

Efficiency of Edge-Adaptive 2 minimization
Our new edge-adaptive 2 method was shown to be more efficient than Algorithm 2 in all experiments.This is to be expected since in general 2 regularized problems are much easier to solve than 1 minimizations.1: Run time comparison between Algorithms 2 and 6 for reconstructing f 3 (x, y).We used max = 5.The run time includes the time to perform Algorithm 4.
Table 1 shows a comparison of the runtimes 5 for Algorithm 2 using max = 5 and Algorithm 6 including the mask generation in Algorithm 4 for f 3 (x, y).For smaller images, e.g.those reconstructed on a 129 × 129 pixel grid given 129 × 129 Fourier samples, the runtime for Algorithm 6 is in seconds, compared to minutes for Algorithm 2. Note that this means that Algorithm 6, including edge detection, is faster than even a single iteration of Algorithm 2. These gains are even more significant as the images increase in size.For example, given 1025 × 1025 Fourier samples reconstructed on 1025 × 1025 grid points, our new algorithm computes the results in about 6 and a half minutes, while Algorithm 2 took over 3 hours, which is over 30 minutes per iteration.We note that we did not implement accelerated homotopy-based algorithms for reweighted 1 methods as in [3], which may increase computational speed.In addition, the iteratively reweighted least squares method developed in [9] would also run more efficiently, since it also uses an 2 norm in the regularization.However, we would expect this method to also suffer from the same inaccuracies that arise from iteratively finding edges.

Conclusion
The edge-adaptive 2 regularization image reconstruction method introduced in this paper compares favorably in terms of image quality, sharpness around jumps, and noise reduction to 1 based and iteratively reweighted 1 based regularization reconstructions.It is also more efficient, requiring just a single 1 minimization solution that only needs to be performed once for an image.In fact, if the edges are already known from some other experiment, they can directly be used in our algorithm.After the edge detection, we can rely on faster conjugate gradient descent methods to solve the easier 2 minimization problem.For some applications it may be useful to use the edge map produced in Algorithm 4 as a cross-validation of the image.The results for compressed imaging are promising, although more work is needed to determine how much compression is possible.Future investigations will also include a variable (rather than binary) map, which may be important when the intensity of the images vary widely in scale.Moreover, this will allow us to generalize our technique to any problems for which separation of scales may be advantageous, that is, not just for identifying edges.To this end, recent work in [30] on weighted p regularization methods might be useful.We also will extend our new algorithm to multi-measurement vector (MMV) applications, as the efficiency gained by our method would be even more significant in this case.Finally, we will test our method on other types of acquired data as well as other sparsifying transform operators, such as wavelets, which may be advantageous in some applications.
Figure 4  elucidates the

Figure 4 :
Figure 4: (Left) Ideal weighting matrix for the y-direction edges computed from (19) using the exact 257 × 257 two-dimensional function f 3 (x, y); (Right) final weighting matrix for the y-direction edges produced by Algorithm 2. The minimum weight of 0 is indicated by black while maximum weight of 1 ≈ 1.11 is indicated by white, while gray indicates a weight in between 0 and 1 .

Figure 9 :
Figure 9: Comparison of errors using Algorithms 2 (left) and 6 (right) for reconstructing f 3 (x, y), using the same parameters as Figure 8.
compares the results

Figure 17 :
Figure 17: (Top) Reconstruction comparison of SAR vehicle data using (left) equation (43) and (right) Algorithm 6. (Bottom) A close up of the lower right tire.