LOCAL BLOCK OPERATORS AND TV REGULARIZATION BASED IMAGE INPAINTING

1. Introduction. Image inpainting is an important and active topic in image processing and computer vision. The purpose of image inpainting is to fill in the information for damaged or occluded regions of an image and to restore the integrity of the image by using the observed information. Applications include the protection of cultural relics, the restoration of aged or damaged photographs and films, text removal, scratch removal, and so on.
Most inpainting techniques which have been proposed in recent years can be classified into three groups: partial differential equation (PDE) based methods, block based methods and sparsity based methods.
PDE based methods try to fill in the missing region by propagating information from the known region into the missing region. In [4], the first inpainting model (known as the BSCB inpainting model) based on the PDE method was presented by Bertalmio et al., who aim to smoothly propagate information from the surrounding areas in the direction of the isophotes (i.e., level lines of equal gray values). This encouraged Chan and Shen [11] to develop a total variation (TV) inpainting model for cartoon images, which is closely connected to the classical TV denoising model of Rudin, Osher and Fatemi [25]. In [12], Chan and Shen proposed the Curvature-Driven Diffusions (CDD) inpainting model, which uses the curvature of the isophotes to realize the Connectivity Principle. In [13], Chan, Kang and Shen presented a variational inpainting model based on Euler's elastica energy, which combines CDD and BSCB. Other works include the Mumford-Shah-Euler inpainting model [17], inpainting methods based on the Cahn-Hilliard equation [5, 7], and inpainting with the TV-Stokes equation [26]. In general, variational and PDE based methods show good performance on images with strong geometric structure (cartoon-like images) and small gaps, but fail in the presence of texture, since diffusion oversmooths the missing region and causes a blurring effect.
Block based methods basically fill in the missing regions by copying blocks from the known regions. Block based methods are non-local in the sense that the whole image may be scanned in search of a matching block. They provide impressive results in texture inpainting. In [14], Criminisi et al. presented an exemplar based inpainting algorithm in which the filling order is critical to propagating both texture and structure information. Aujol et al. [3] illustrated the ability of exemplar based methods to reconstruct geometric features in a global variational framework. In [10], a geometrically guided exemplar based inpainting method for the joint restoration of texture and geometry was presented by Cao et al. In [15], Demanet et al. pointed out that the problem of block based inpainting can be regarded as finding a correspondence map F from the inpainting region to the available region. Then, each pixel value x in the inpainting region can be represented by the map F and the available region. Thus the map F should be chosen such that the block centered at x is as similar as possible to the one centered at F(x). In [21], Kawai et al. proposed an image inpainting model based on block similarity that accounts for brightness changes of sample textures and introduces spatial locality as an additional constraint. Meanwhile, Wexler et al. [27] presented a method for space-time video completion and synthesis using various space-time blocks. This method is capable of reconstructing large space-time regions of missing information containing complex dynamic scenes. In [1], Arias et al. proposed a variational framework for non-local image inpainting. In order to avoid local minima, they use several different patch sizes and incorporate a multi-scale approach in the algorithm. However, such block based methods may introduce redundant information and lead to undesirable artificial texture.
In recent years, sparsity based methods have become popular for image inpainting. The main idea of sparsity based methods is to find a sparse representation of an image using appropriate basis functions, such as the discrete cosine transform (DCT), wavelets, or the SVD. Then the missing pixels can be estimated by adaptively shrinking the sparse coefficients. Elad et al. [16] developed an image inpainting method based on morphological component analysis (MCA), which can fill in the inpainting regions with both cartoon and texture layers effectively. In [20], an adaptive inpainting algorithm using sparse reconstructions was proposed by Guleryuz et al.
In [18], Fadili et al. introduced an expectation maximization (EM) algorithm for image inpainting based on sparse representations. Xu et al. [29] proposed two types of patch sparsity for natural images and applied them to a block based inpainting method. The approach in [22] uses low-rank matrix completion for a non-local image inpainting model. In [23], Li et al. proposed a universal variational framework for sparsity based image inpainting, and proved the convergence of the proposed algorithm by using the properties of nonexpansive operators. However, fixed basis functions are adopted in most of the mentioned sparsity methods, which limits the sparsity of the representation for some kinds of image blocks. Liu et al. [24] developed an image denoising method based on group sparsity and basis updating; together with TV regularization, the quality of the restorations can be greatly improved. Here, we extend our previous work [24] to image inpainting with matrix completion.
Inspired by matrix completion, we propose a novel inpainting model which combines sparsity based methods with block based methods. By using the SVD of patch matrices, we obtain local basis vectors, called local SVD operators. Updating these local SVD operators effectively promotes the sparsity of the representation, and the TV regularization reduces the ringing artifacts caused by stacking image blocks. We prove that the proposed model is well-defined. To solve the proposed model efficiently, a splitting method is applied, and we establish convergence of the splitting algorithm by a variational inequality method.
The rest of the paper is organized as follows. In Section 2, the related inpainting work is presented. In Section 3, we outline our proposed inpainting model. The existence of a minimizer for the proposed model is proved in Section 4. Section 5 contains the algorithm and some details about the implementation. In Section 6, we show some convergence results of our proposed algorithm. Experimental results are summarized in Section 7. Finally, we conclude this work in Section 8.
2. Related work. Throughout this paper, an image is denoted by a matrix, which can be rewritten as a vector f ∈ R^N by stacking its columns; vec denotes this column-stacking operator. We denote the latent clean image by f, while g is the observed image. Let Ω = {1, 2, · · · , N} denote the index set of the image pixels, let O ⊂ Ω be the hole or inpainting region, and let O^c := Ω \ O stand for the available region. The size of an image patch is √n × √n. We take P_ij ∈ R^{n×N} as binary extraction matrices, so that P_ij g ∈ R^n stands for the vectorized i-th most similar image block to the image patch centered at j ∈ Ω.
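To make the role of the binary matrices P_ij concrete, the following sketch builds one such extractor and checks that applying it to the vectorized image returns the vectorized patch. The function name, the column-major vectorization convention, and all sizes are illustrative assumptions; the paper only specifies the 0/1 structure of P_ij.

```python
import numpy as np

def patch_extractor(top, left, patch, side):
    """Build the n x N binary matrix that extracts a patch x patch block
    from a side x side image vectorized in column-major (Fortran) order."""
    n, N = patch * patch, side * side
    P = np.zeros((n, N))
    rows = np.arange(top, top + patch)
    for k, col in enumerate(range(left, left + patch)):
        # pixel (r, col) sits at index col*side + r in column-major order
        P[k * patch:(k + 1) * patch, col * side + rows] = np.eye(patch)
    return P

side, patch = 8, 3
img = np.arange(side * side, dtype=float).reshape(side, side)
f = img.flatten(order="F")                 # column-stacked image vector
P = patch_extractor(2, 4, patch, side)
patch_vec = P @ f                          # vectorized 3x3 patch at (2, 4)
```

Each row of P contains exactly one 1, so P @ f simply copies n pixels out of the image vector.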
2.1. Matrix completion. Matrix completion is a good inpainting tool for low-rank images and textures. In [9], Candès et al. demonstrated that a low-rank matrix can be perfectly recovered from an incomplete set of sampled entries. For an incomplete low-rank matrix g, one can recover the desired matrix f by solving the following constrained optimization problem:

    min_f rank(f)   s.t.   f_i = g_i, i ∈ O^c,

where f, g ∈ R^{√N×√N}, and the constraint requires that f agree with g on the known region O^c.
Since this optimization problem is non-convex and NP-hard, it cannot easily be solved by known algorithms. Thus, the nuclear norm is usually considered as a convex surrogate for the rank of the matrix. The related optimization problem is given by

    min_f ||f||_*   s.t.   f_i = g_i, i ∈ O^c,

where ||·||_* is the nuclear norm, defined as the sum of the singular values of f. This convex problem can be efficiently solved by singular value thresholding [8].
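The singular-value-shrinkage idea behind [8] can be sketched as follows. This is a soft-impute style simplification, not the exact SVT iteration of [8]; the function name, the threshold `tau`, and the iteration count are illustrative assumptions.

```python
import numpy as np

def svt_complete(g, mask, tau=0.05, n_iter=200):
    """Minimal sketch of nuclear-norm inpainting: keep the observed entries
    (mask == True) fixed and repeatedly shrink the singular values of the
    current estimate. tau and n_iter are illustrative choices."""
    f = np.where(mask, g, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(f, full_matrices=False)
        f = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt  # singular value shrinkage
        f = np.where(mask, g, f)                        # re-impose known pixels
    return f

# Rank-1 test matrix with roughly 30% of its entries hidden
rng = np.random.default_rng(0)
g = np.outer(rng.standard_normal(20), rng.standard_normal(20))
mask = rng.random(g.shape) > 0.3
rec = svt_complete(np.where(mask, g, 0.0), mask)
```

The shrinkage step is the proximal operator of the nuclear norm, which is why the iteration drives the estimate toward a low-rank matrix consistent with the observed pixels.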

2.2. Variational framework for image group and block based method. In [24], Liu et al. developed an image denoising method based on group sparsity and basis updating. By putting all the similar patches together, the restoration problem for a local image group centered at j can be written as

    min_{X_j} (1/2) ||M_j − X_j||_F^2 + μ_j ||Σ_{X_j}||_p,

where M_j = [P_{1j} g, P_{2j} g, ..., P_{I_j j} g], X_j = [P_{1j} f, P_{2j} f, ..., P_{I_j j} f], and I_j is the number of similar patches, obtained by a block matching method. Applying the SVD M_j = U_j Σ_{M_j} (V_j)^T, one can get the following equivalent problem

    min_{Σ_{X_j}} (1/2) ||Σ_{M_j} − Σ_{X_j}||_F^2 + μ_j ||Σ_{X_j}||_p,      (2)

where Σ_{M_j} ∈ R^{n×I_j} is a diagonal matrix, U_j ∈ R^{n×n} and V_j ∈ R^{I_j×I_j} are orthogonal matrices, Σ_{X_j} = (U_j)^T X_j V_j, and the coefficient matrix Σ_{X_j} need not be diagonal in the basis U_j and V_j. Using the definition of Σ_{X_j} and the properties of the Kronecker product, a local SVD operator can be denoted as

    T_j f := vec(Σ_{X_j}) = ((V_j)^T ⊗ (U_j)^T) vec(X_j) = [((U_j)^T X_j v_1^j)^T, ..., ((U_j)^T X_j v_{I_j}^j)^T]^T,

where v_i^j is the i-th column of V_j. Liu et al. proposed the following denoising variational model

    min_f Σ_j { (1/2) ||M_j − X_j||_F^2 + μ_j ||T_j f||_p },

where p can be chosen as 0 or 1. Moreover, in order to enforce the sparsity of the representation, the basis matrices U_j and V_j are updated iteratively during the implementation. It has been shown that the quality of the restorations can be greatly improved.
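One restoration step of this group framework can be sketched as follows for p = 1: stack the similar patches into M_j, take its SVD to obtain the local basis (U_j, V_j), shrink the transform coefficients, and synthesize the group again. The group and the threshold value are toy choices, not taken from [24].

```python
import numpy as np

def denoise_group(M, mu):
    """l1 shrinkage of the coefficients of a patch group M in its own SVD
    basis: M = U diag(s) Vt, coefficients Sigma = diag(s)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Sigma = np.diag(s)                                   # coefficients in (U_j, V_j)
    Sigma_hat = np.sign(Sigma) * np.maximum(np.abs(Sigma) - mu, 0.0)  # soft shrinkage
    return U @ Sigma_hat @ Vt                            # synthesize the group

# A group of 4 nearly identical 3x3 patches (columns) plus small noise
M = np.ones((9, 4)) + 0.01 * np.random.default_rng(1).standard_normal((9, 4))
X = denoise_group(M, mu=0.2)
```

Shrinking in the group's own SVD basis concentrates the signal in very few coefficients, which is exactly the sparsity the basis-updating scheme exploits.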
3. The proposed method. Inspired by matrix completion and the variational framework for group and block based methods, we construct a new inpainting variational model. The inpainting problem differs from the aforementioned Gaussian noise removal since it is more ill-posed; in fact, it is closely related to salt-and-pepper noise removal. To apply Gaussian noise removal methods, we split the inpainting problem into denoising and synthesis subproblems. In the first iteration of the denoising subproblem, we use cubic interpolation to fill in the intensities of the missing regions to get an initial guess for the inpainting result, regard this initial result as a noisy image, and then use a block matching method to obtain a better denoised image. In block matching, we use the l2 distance to measure the similarity between the reference patch and all the patches in a search window, and collect the patches whose distance from the reference is smaller than a threshold. In our current implementation, we set this threshold to 10σ²n, where σ² is the estimated noise variance in the denoising subproblem and n is the number of pixels in a patch. A given number of the closest patches are then selected as the most similar patches for the reference patch. After obtaining the denoising result, the synthesis of the known and missing pixels is guided by the second subproblem. These two steps are iterated, which yields good inpainting results.
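The block matching step described above can be sketched as follows, using the squared l2 distance and the threshold 10σ²n from the text. For simplicity the sketch scans the whole image rather than a search window, and the function name and parameter values are illustrative.

```python
import numpy as np

def block_match(img, ref_top, ref_left, patch, sigma, max_blocks=8):
    """Return up to max_blocks (distance, row, col) tuples for the patches
    whose squared l2 distance to the reference is below 10 * sigma^2 * n,
    sorted from most to least similar."""
    n = patch * patch
    ref = img[ref_top:ref_top + patch, ref_left:ref_left + patch]
    thresh = 10.0 * sigma ** 2 * n
    candidates = []
    H, W = img.shape
    for i in range(H - patch + 1):
        for j in range(W - patch + 1):
            blk = img[i:i + patch, j:j + patch]
            d = np.sum((blk - ref) ** 2)
            if d <= thresh:
                candidates.append((d, i, j))
    candidates.sort(key=lambda t: t[0])      # most similar blocks first
    return candidates[:max_blocks]

img = np.tile(np.arange(8.0), (8, 1))        # identical rows -> many exact matches
matches = block_match(img, 0, 0, 3, sigma=0.1)
```

On this toy image every vertical shift of the reference patch is an exact match, so all accepted candidates have distance zero.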
Next, we present the proposed model. Let g ∈ R^N be the vectorized observed image and P_ij ∈ R^{n×N} a patch extractor. The entries of P_ij are 0 and 1, and there is exactly one 1 in each row. Suppose there are I_j blocks close to the j-th reference patch; we form a patch matrix as

    M_j = [P_{1j} g, P_{2j} g, ..., P_{I_j j} g].

Then, the inpainting problem for each image block centered at j can be written as

    min_{X_j} (1/2) ||W_j ∘ (M_j − X_j)||_F^2 + μ_j ||Σ_{X_j}||_p,

where W_j is the matrix whose entries are 1 for the indices of P_ij g in O^c and 0 for those in O. In order to get a well-defined inpainting model, we replace 0 with a very small positive number ε in O, which yields better theoretical properties. In the numerical implementation, ε can be set to a very small value such as 10^{−20}. Here X_j = [P_{1j} f, P_{2j} f, ..., P_{I_j j} f], μ_j is a regularization parameter, and the symbol "∘" stands for elementwise (Hadamard) multiplication.
This convex problem does not have a closed-form solution due to the existence of W_j. To solve it efficiently, we add a new auxiliary variable X̃_j to substitute for X_j in the fidelity term, and derive the following inpainting model by penalty techniques:

    min_{X_j, X̃_j} (1/2) ||W_j ∘ (M_j − X̃_j)||_F^2 + (σ/2) ||X̃_j − X_j||_F^2 + μ_j ||Σ_{X_j}||_p,

where σ is a penalty parameter and X̃_j = [P_{1j} f̃, P_{2j} f̃, ..., P_{I_j j} f̃]. The SVD of X_j is given by

    X_j = U_j Σ_{X_j} (V_j)^T,

where Σ_{X_j} ∈ R^{n×I_j} is a diagonal matrix and U_j ∈ R^{n×n}, V_j ∈ R^{I_j×I_j} are orthogonal matrices. If we fix U_j and V_j as a basis, then we get

    Σ_{X_j} = (U_j)^T X_j V_j

for any X_j ∈ R^{n×I_j}; note that the coefficient matrix Σ_{X_j} is then in general not diagonal. Inspired by problem (2), we can rewrite the model as

    min_{f, f̃} (1/2) ||W_j ∘ (M_j − X̃_j)||_F^2 + (σ/2) ||X̃_j − X_j||_F^2 + μ_j ||(U_j)^T X_j V_j||_p.      (3)

Next, we give a global variational formulation for image inpainting based on the above local image block restoration problem (3).
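The basis-change relation Σ_X = (U_j)^T X V_j and its Kronecker-product vectorization can be verified numerically. The sketch below fixes (U_j, V_j) from the SVD of one matrix and checks the vec identity for an arbitrary second matrix; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
Xtil = rng.standard_normal((5, 3))
U, _, Vt = np.linalg.svd(Xtil, full_matrices=True)   # fixed basis: U 5x5, V 3x3
V = Vt.T

X = rng.standard_normal((5, 3))                      # an arbitrary other matrix
Sigma_X = U.T @ X @ V                                # coefficients; generally not diagonal
lhs = Sigma_X.flatten(order="F")                     # column-stacking vec
rhs = np.kron(V.T, U.T) @ X.flatten(order="F")       # (V^T kron U^T) vec(X)
```

This is the standard identity vec(A X B) = (B^T ⊗ A) vec(X) with A = U^T and B = V, which is what turns the per-group coefficient map into the linear operator T_j acting on the image vector.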
For an image f̃ ∈ R^N, recall that X̃_j = [P_{1j} f̃, P_{2j} f̃, ..., P_{I_j j} f̃] and Σ_{X̃_j} = (U_j)^T X̃_j V_j, with the vectorized form

    vec(Σ_{X̃_j}) = ((V_j)^T ⊗ (U_j)^T) vec(X̃_j).

Here vec is the column-stacking operator defined at the beginning of Section 2. Now, we define the local SVD operator as

    T_j := ((V_j)^T ⊗ (U_j)^T) P_j,   with   P_j := [(P_{1j})^T, (P_{2j})^T, ..., (P_{I_j j})^T]^T,

so that T_j f = vec(Σ_{X_j}). In addition, we define T_j^w as

    T_j^w := diag(vec(W_j)) P_j.

Based on the above analysis, we propose the following image inpainting model:

    min_{f, f̃} Σ_{j=1}^{J} { (1/2) ||T_j^w f̃ − T_j^w g||_2^2 + (σ/2) ||P_j f̃ − P_j f||_2^2 + μ_j ||T_j f||_p } + μ TV(f),      (5)

where μ, μ_j are positive regularization parameters, p can be set to 0 or 1, and TV is the discrete isotropic TV operator with the discrete expression

    TV(f) = Σ_{i=1}^{N} sqrt( ((D_x f)_i)^2 + ((D_y f)_i)^2 ),

where D_x, D_y ∈ R^{N×N} are two 1D difference matrices with respect to the x and y directions.
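The discrete isotropic TV term and its 1D difference matrices can be sketched as follows. The forward-difference boundary convention (a zero last row) is one common choice and is an assumption here, as are the function names.

```python
import numpy as np

def diff_matrix(m):
    """1D forward-difference matrix with a zero last row (Neumann-type boundary)."""
    D = np.zeros((m, m))
    idx = np.arange(m - 1)
    D[idx, idx], D[idx, idx + 1] = -1.0, 1.0
    return D

def isotropic_tv(f_vec, h, w):
    """Isotropic TV of a column-major vectorized h x w image."""
    Dy = np.kron(np.eye(w), diff_matrix(h))   # differences along each column
    Dx = np.kron(diff_matrix(w), np.eye(h))   # differences along each row
    gx, gy = Dx @ f_vec, Dy @ f_vec
    return np.sum(np.sqrt(gx ** 2 + gy ** 2)) # sum of per-pixel gradient lengths

img = np.zeros((4, 4)); img[:, 2:] = 1.0      # a single vertical edge
tv = isotropic_tv(img.flatten(order="F"), 4, 4)
```

For the single unit-height edge above, exactly one horizontal difference per row is 1, so the TV value equals the edge length, 4.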
Here T_j^w is an image patch based operator for the fidelity term, and T_j is a patch based transform operator which makes T_j f sparse. Let us consider the properties of T_j^w and T_j; the main properties are summarized in the following propositions.
These three propositions can be obtained by directly calculating the matrix tensor products; we omit the proofs here.
4. Existence of a minimizer. In this section we prove the existence of a minimizer for the proposed model. Let us consider the energy functional for p = 1; in this case, the model is convex. For p = 0 the model becomes non-convex, and the problem is more difficult. Denote the resulting energy functional by E(f); it is easy to see that problem (5) is equivalent to the minimization of E(f). We can show the existence of a minimizer for this minimization problem in the BV space by standard arguments, since it is a generalized ROF model.
The proof of this theorem is standard and can be found in many references, such as [2]. For completeness, we list the key steps in the Appendix.

5. Algorithms. To solve model (5), we introduce auxiliary variables α = [α_1, α_2, · · · , α_J] with α_j ∈ R^{nI_j}, and obtain the constrained minimization problem

    min_{f̃, f, α} Σ_{j=1}^{J} { (1/2) ||T_j^w f̃ − T_j^w g||_2^2 + μ_j ||α_j||_p } + μ TV(f)
    s.t. α_j = T_j f,  P_j f̃ = P_j f,  j = 1, ..., J.      (6)

We use the standard augmented Lagrangian method to produce the following scheme:

    (α^k, f̃^k, f^k) = argmin_{α, f̃, f} L(α, f̃, f; λ^{k−1}, γ^{k−1}),
    λ_j^k = λ_j^{k−1} + η_1 (T_j f^k − α_j^k),
    γ_j^k = γ_j^{k−1} + η_2 (P_j f̃^k − P_j f^k),      (7)

where the augmented Lagrangian is

    L(α, f̃, f; λ, γ) = Σ_{j=1}^{J} { (1/2) ||T_j^w f̃ − T_j^w g||_2^2 + μ_j ||α_j||_p + ⟨λ_j, T_j f − α_j⟩ + (η_1/2) ||T_j f − α_j||_2^2 + ⟨γ_j, P_j f̃ − P_j f⟩ + (η_2/2) ||P_j f̃ − P_j f||_2^2 } + μ TV(f),

with penalty parameters η_1, η_2 > 0. By applying an alternating minimization algorithm, (7) becomes a sequence of f̃-, α- and f-subproblems. These subproblems can be solved as follows.

1) f̃-subproblem: Minimizing L with respect to f̃ is a quadratic problem, which has the closed-form solution

    f̃^k = ( Σ_j (T_j^w)^T T_j^w + η_2 Σ_j (P_j)^T P_j )^{−1} ( Σ_j (T_j^w)^T T_j^w g + Σ_j (P_j)^T ( η_2 P_j f^{k−1} − γ_j^{k−1} ) ),      (9)

where the matrix to be inverted is diagonal, so (9) can be evaluated pixelwise.
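The structure of this closed-form synthesis step can be illustrated with a pixelwise stand-in: assume a diagonal mask fidelity (weight 1 on observed pixels, ε on the hole) and a quadratic coupling (η_2/2)||f̃ − f||². The function and all parameter values are illustrative simplifications, not the paper's exact patch-based operators.

```python
import numpy as np

def synthesis_step(g, f, mask, eta2, eps=1e-20):
    """Pixelwise minimizer of w/2 * (x - g)^2 + eta2/2 * (x - f)^2, where
    w = 1 on observed pixels and w = eps on the hole: a weighted average."""
    w = np.where(mask, 1.0, eps)             # fidelity weights per pixel
    return (w * g + eta2 * f) / (w + eta2)

g = np.array([1.0, 0.0, 3.0])                # second pixel is missing (0 placeholder)
f = np.array([1.2, 2.0, 2.8])                # current denoised estimate
mask = np.array([True, False, True])
ftil = synthesis_step(g, f, mask, eta2=1e-3)
```

On observed pixels the weight 1 dominates eta2, so f̃ stays close to g; on the hole the weight ε is negligible, so f̃ essentially copies the denoised estimate f, matching the "weighted average of g and f" description in the parameter-selection discussion.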
2) α-subproblem: In the case p = 1, it can be solved by soft thresholding. The optimal result is given by

    α_j^k = S( T_j f^{k−1} + λ_j^{k−1}/η_1 , μ_j/η_1 ),      (10)

where S is the shrinkage operator S(f, μ) = (f/|f|) max{|f| − μ, 0}, applied componentwise. In the case p = 0, it can be solved by hard thresholding. Similarly,

    α_j^k = H( T_j f^{k−1} + λ_j^{k−1}/η_1 , sqrt(2μ_j/η_1) ),      (11)

where H(f, μ) = 0 if |f| ≤ μ, and H(f, μ) = f if |f| > μ.
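The two thresholding rules can be implemented componentwise as below; `np.sign` replaces the factor f/|f| so the soft rule is well defined at f = 0, and the function names are illustrative.

```python
import numpy as np

def soft_shrink(f, mu):
    """Soft thresholding S: used for the p = 1 case."""
    return np.sign(f) * np.maximum(np.abs(f) - mu, 0.0)

def hard_shrink(f, mu):
    """Hard thresholding H: keep f where |f| > mu, zero elsewhere (p = 0 case)."""
    return np.where(np.abs(f) > mu, f, 0.0)

x = np.array([-2.0, -0.5, 0.0, 0.3, 1.5])
soft = soft_shrink(x, 1.0)   # shrinks magnitudes by 1, zeroes small entries
hard = hard_shrink(x, 1.0)   # keeps large entries unchanged
```

Soft thresholding biases the surviving coefficients toward zero, while hard thresholding keeps them intact; this is the practical difference between the convex and non-convex variants compared in the experiments.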
3) f-subproblem: This can be solved easily by split Bregman iterations. We can rewrite it as a ROF model with a local SVD operator,

    min_f μ TV(f) + Σ_j (η_1/2) ||T_j f − α_j^k + λ_j^{k−1}/η_1||_2^2 + Σ_j (η_2/2) ||P_j f̃^k − P_j f + γ_j^{k−1}/η_2||_2^2.      (12)

We summarize the proposed algorithm as follows.

Algorithm 1 (local SVD operator based image inpainting). Input: g and the inpainting mask. Output: inpainting results f, f̃. Set the initial values f^0 = g, γ_j^0 = λ_j^0 = 0, and let k = 1.
Step 1. Block matching and basis updating: find I_j for each block centered at j according to f^{k−1} with the block matching method, which yields new operators T_j and T_j^w.
Step 2. Image inpainting: restore f̃^k by (9).
Step 3. Sparsity regularization: compute α^k according to (10) or (11).
Step 4. TV regularization: update f^k by solving (12).
Step 5. Update the Lagrange multipliers: renew λ_j^k and γ_j^k by the formulas in (7). If the relative error falls below a tolerance or the maximum number of iterations is reached, terminate; otherwise, set k = k + 1 and go to Step 1.
6. Convergence analysis. In this section, we prove the convergence of our algorithm for p = 1 when the local block operators T_j, T_j^w are fixed. For p = 0 the model becomes non-convex, but our numerical experiments show that the algorithm still converges in practical computation. We have the following convergence results.
Then the sequence (α^k, f̃^k, f^k) generated by the iteration scheme (7) is convergent, and lim_{k→∞} (α^k, f̃^k, f^k) = (α^*, f̃^*, f^*). The details of the proof can also be found in the Appendix.
7. Experimental results. In this section, we numerically demonstrate the performance of our proposed inpainting model on natural image inpainting problems. For comparison, we consider five representative and related inpainting methods: cubic interpolation (the MATLAB built-in function, short for Cubic), the classical TV inpainting model [11], the coherence transport method [6] (short for CTM; the source code is available at https://github.com/maerztom/inpaintBCT), the Beta-Bernoulli process factor analysis method [30] (short for BPFA), and the sparsity based method IDI-BM3D [23]. All experiments are run under Windows 7 and MATLAB R2012b with an Intel Core i5-5200U CPU@2.80GHz and 8GB memory.
7.1. Parameter selection. In general, we set the penalty parameters η_1 = 9, η_2 = 1 × 10^{−3}, and the time steps δ_1 = 1 × 10^{−6}, δ_2 = 9 × 10^{−3}, which satisfy the convergence condition. The TV regularization parameter μ is set to 0.1. Besides, in order to speed up the algorithm, we adopt the same strategy as IDI-BM3D and initialize the proposed model with the cubic interpolation result. One may note that we choose very different values for the penalty parameters η_1, η_2. Both are penalty parameters, but they play quite different roles in our inpainting problem. η_1 is a penalty parameter that forces T_j f to equal α_j, and this term together with the sparsity constraint makes f smooth (denoising). η_2 is a penalty parameter that controls the fidelity term between f and f̃. In the f-subproblem (denoising), sparsity plays a dominant role, and thus we take a relatively large value for η_1. On the other hand, in the f̃-subproblem (synthesis), f̃^k can be regarded as a weighted average of g and f^{k−1}, and η_2 controls the weight between them. We found numerically that when η_2 is small, the weight of f^{k−1} is small and the results are good and stable.

7.2. Uniformly distributed missing pixels. In this section, we mainly consider an inpainting problem with 80% of the pixels missing in a random manner. This problem is equivalent to removing salt-and-pepper noise at an 80% noise level; only 20% of the pixels are available. We compare our proposed-ℓ1 and proposed-ℓ0 methods, corresponding to p = 1 and p = 0, with Cubic, CTM, BPFA and IDI-BM3D. The first row of Fig. 1 shows the test images: Monarch, Lena and Barbara. The contaminated images are shown in the second row, with the missing pixels indicated in black. We observe that Cubic, CTM and BPFA keep image edges well, but their results contain obvious artifacts near edges.
The results of IDI-BM3D are comparable, but our proposed-ℓ0 method gives better PSNR (see Table 1), especially for images containing more texture information (see Fig. 2 for the local details of Barbara). Moreover, the proposed-ℓ0 method outperforms the proposed-ℓ1 method, which confirms that the non-convex ℓ0 model can generate much better restoration results than the convex model. Our proposed-ℓ0 method shows better performance than all the tested methods, both visually and in terms of the quality index PSNR.

7.3. Text removal and scratch removal. In this section, we demonstrate the performance of our proposed method on text and scratch removal. We mainly compare the proposed method with related methods: the cubic interpolation method, the TV based PDE method [11], the dictionary learning based BPFA [30] (a parameter-free Bayesian algorithm that learns dictionaries for inpainting from corrupted data), and the block matching and sparsity based IDI-BM3D method [23]. The test images include Barbara, Hill and Baboon. The inpainting region is shown in white in the second row of Fig. 3. The results of Cubic, BPFA and TV restore smooth areas well; however, they cannot preserve details in texture regions. We observe that IDI-BM3D and our proposed-ℓ0 inpainting model have better visual quality, but our proposed-ℓ0 method recovers more texture information, such as the kerchief of Barbara and the windows of Hill (see more details in Fig. 4), and achieves higher PSNRs (see Table 2).
In Fig. 5, we show the relative error for the proposed-ℓ1 model. The first column, i.e. Fig. 5 (a), (c), (e), illustrates the relative error curves of Monarch, Lena and Barbara from Section 7.2, respectively. In the second column, the relative error curves of Barbara, Hill and Baboon from Section 7.3 are displayed in Fig. 5 (b), (d), (f), respectively. Similar results are demonstrated in Fig. 6 for the proposed-ℓ0 method.

8. Conclusion. In this paper we proposed a novel inpainting model based on block sparsity and TV regularization. The local SVD operator is effective in promoting a sparse representation. To solve the proposed model efficiently, we presented a splitting inpainting algorithm. We mathematically proved the existence of a minimizer for the proposed inpainting model, and a convergence analysis was given by a variational inequality method. The numerical experiments demonstrate that our proposed inpainting model performs well compared to several state-of-the-art inpainting methods.
However, our method is not very fast due to the block matching and the large number of SVD decompositions. Generally speaking, for an image of size 256 × 256, our proposed model takes about 16 seconds per iteration with our unoptimized MATLAB code and a CPU implementation. A possible research direction is to employ a training method to produce a good basis and thereby greatly accelerate the algorithm.

Appendix.
Proof of Theorem 1.
Proof. Let {f_n} be a minimizing sequence for E, i.e. a sequence such that E(f_n) → inf E. Since E(f) is coercive, we get that {f_n} is bounded in BV(Ω). Thus, there exist a subsequence (which we still relabel by {f_n}) and f in BV(Ω) such that f_n converges to f weakly* in BV(Ω).

Figure 6. The relative error curves as functions of the iteration number in our experiments for the proposed-ℓ0 method.

By the weak* lower semicontinuity of E, we have E(f) ≤ lim inf_{n→∞} E(f_n); that is, E(f) = inf E. Therefore, f is a minimizer of E(f). The uniqueness can be easily obtained from the fact that the functional is strictly convex.

Proof of Theorem 2.
Taking norms, squaring both sides of these two equations, and summing with respect to j from 1 to J, we obtain (13). By the second inequality in (13), (α*, f̃*, f*) is a minimizer of L(·, ·, ·; β*, ξ*). According to the well-known characterization of minimizers, (α*, f̃*, f*) satisfies the corresponding variational inequality. Similarly, the first equation in (14) is characterized by a variational inequality; combining these two inequalities completes the proof.