NUMERICAL METHOD FOR IMAGE REGISTRATION MODEL BASED ON OPTIMAL MASS TRANSPORT

. This paper proposes a numerical method for solving a non-rigid image registration model based on optimal mass transport. The main contri- bution of this paper is to address two issues. One is that we impose a proper periodic boundary condition, such that when the reference and template im- ages are related by translation, or a combination of translation and non-rigid deformation, the numerical solution gives the underlying transformation. The other is that we design a numerical scheme that converges to the optimal transformation between the two images. As an additional beneﬁt, our approach can decompose the transformation into translation and non-rigid deformation. Our numerical results show that the numerical solutions yield good-quality trans- formations for non-rigid image registration problems.

1. Introduction. In many applications, one has to compare two images T (template) and R (reference) which display the same object, but the object inside the images is not spatially aligned, or the devices that record the two images are different. The image registration problem is to find a coordinate transformation φ which transforms the image T to another image T φ , such that T φ is similar and thus comparable to the image R.
One important application of image registration is to compare medical images of the same patient, such as CT (computed tomography) and MRI (magnetic resonance imaging) images of a damaged brain, which gives guidance for diagnosis and surgery [35,29]. Image registration can also be used for image fusion [30]. Multiple images of the same object are taken, registered and then merged together, such that the integrated image is clearer than the original ones. We refer readers to [26] for more discussion on applications.
This paper considers a non-rigid image registration method based on Monge-Kantorovich mass transport [27,28,23,44,12,39]. The approach was first proposed in [27,28]. This image registration model treats two images R and T as two mass densities. The goal is to find a mapping which transforms one mass density T to the other R with mass conservation. Such transformation is non-unique. By defining a transformation-dependent cost function and minimizing it, we can obtain a unique optimal transformation. This optimal transformation has desirable properties. For instance, it is usually diffeomorphic and does not introduce foldings and crossings.
The primary advantage of this image registration model is that, unlike many other non-rigid methods that are only applicable for images with small deformations, this model can be applied to images with large deformations. See Figures 1-2 in [27] for an example of images with large deformations. Indeed, given any R and T , the transformed image T φ under the mass transport formulation can be equal to R [39].
Numerical methods have been developed for solving the image registration model based on optimal mass transport. In [27] and [28], the authors construct an initial mass-preserving mapping φ 0 by solving a nonlinear partial differential equation (PDE), and obtain a second mass-preserving mapping φ s by solving another nonlinear PDE system, such that φ 0 •φ s is the optimal transformation. The entire process involves many intermediate steps. Also, in general, a nonlinear PDE (or PDE system) has multiple solutions. An immediate challenge is that the nonlinear PDE system in [27] and [28] can give multiple transformations between R and T , which may not be the optimal transformation 1 . To the best of our knowledge, it is unclear whether the numerical scheme in [27] and [28] gives the optimal transformation. An alternative approach for solving this image registration model is to solve an equivalent Monge-Ampère equation, which is also a nonlinear PDE and thus has multiple solutions. Among the multiple solutions, there exists a unique globally convex solution. The gradient of this solution corresponds to the optimal transformation between R and T [27,31]. The convex solution itself is usually called a scalar potential that generates the optimal transformation. Some literature has investigated numerical schemes for the Monge-Ampère equation arising from the image registration model [23,44,12]. In [23], the author proposes a finite difference scheme for the Monge-Ampère equation, and proves that the resulting transformation between R and T is optimal. However, in order to guarantee the optimality, the computational cost per pixel must increase to infinity as the image size increases [19]. Hence the method is not practical for large images. Other numerical schemes for the image registration model via the Monge-Ampère equation can be found in [44] and [12]. We remark that although the methodology in [12] is gradient descent, it ends up begin equivalent to solving the Monge-Ampère equation using explicit pseudo-timestepping. Whether the transformations given in [44] and [12] are optimal remains an open question.
Another issue in the existing literature is boundary conditions. We note that the transformation on the boundary of R and T cannot be determined from the mass transport model and thus needs to be specified by model users. Since the transformation on the boundary influences the transformation inside the images, it is crucial to specify a proper boundary condition, such that the quality of the resulting transformation is good. Here the transformation of good quality means that the resulting transformation should reflect the physical movement of the object inside R and T . For instance, if the object inside R and T is related by translation, then the resulting transformation should be able to recover the underlying translation.
A common boundary condition considered in the literature, such as in [23], is Neumann boundary condition, where the pixels on the boundary of one image stay on the boundary of the other under the transformation. Alternatively, assuming that R and T are periodic images, the authors in [44] use a periodic boundary condition on the solution of the Monge-Ampère equation (namely, on the scalar potential). However, under either of these boundary conditions, when R and T are related by translation, or by a combination of translation and non-rigid deformation, the resulting transformation may not be the underlying transformation. To the best of our knowledge, no existing literature using the mass transport model has registered translated images.
In this paper, we develop a numerical scheme for the image registration model based on optimal mass transport by solving the equivalent Monge-Ampère equation. We resolve the two major issues in the existing literature -imposing a proper boundary condition, and ensuring that the resulting transformation by a numerical method is the optimal transformation.
More specifically, we propose to use another periodic boundary condition, where we assume that the transformation (or the displacement of the pixels) on the boundary of the images is periodic. Thus, for the Monge-Ampère equation, it means that the gradient of the solution is periodic. This is different from [44] where periodicity is imposed on the solution itself. We will show that, in contrast to the commonly used Neumann boundary condition, our periodic boundary condition will recover the underlying transformation for the images that are related by translation or by a combination of translation and non-rigid deformation.
In order to ensure that our numerical scheme yields the optimal transformation between R and T , we adopt the theory of viscosity solution [16,15]. Viscosity solution of the Monge-Ampère equation is the solution that corresponds to the optimal transformation between R and T [23,24]. In this paper, we design a finite difference scheme based on our previous work [13] and the theoretical framework in [1]. Notably, we prove that our numerical solution converges to the viscosity solution of the Monge-Ampère equation. Consequentially, the resulting transformation computed by our numerical scheme is the optimal one.
In addition to these two contributions, our numerical scheme can automatically decompose the transformation between R and T into translation and non-rigid deformation components. This extra information is useful in visualizing and understanding the underlying transformation.
This paper is organized as follows. In Section 2, we describe the image registration model based on optimal mass transport. In particular, we have a detailed discussion on boundary conditions. Section 3 describes a finite difference discretization for the Monge-Ampère equation arising from the mass transport image registration model. In Section 4, we propose a modified Levenberg-Marquardt algorithm to solve the discretized system. In Section 5, we prove that our numerical scheme converges to the optimal transformation between R and T . Experimental results in Section 6 show that our numerical scheme gives the optimal transformations, and the results under our periodic boundary condition outperform those under the commonly used Neumann boundary condition. Section 7 concludes the paper.
2. Image registration model based on optimal mass transport.
2.1. Image registration. In this paper, we use the following notation: The objective of image registration problem is to find a coordinate transformation φ that minimizes the difference between ρ T φ and ρ R , which is usually measured by some function such as sum of squared differences . For simplicity, we restrict our discussion on two square images of the same size, or more specifically, Ω T = Ω R = [0, 1] × [0, 1]. In this paper, unless specified, we will drop the superscripts in Ω T and Ω R and denote them as Ω.

2.2.
Mass transport formulation. The mass transport problem considers transforming one pile of soils T to another pile R with mass preservation. Mathematically, let ρ T and ρ R be two bounded positive density functions on Ω T and Ω R respectively. Suppose ρ T and ρ R have the same total mass The goal is to find a coordinate transformation φ : Ω R → Ω T , or φ(x) =x, such that ρ T is transformed to ρ R while the total mass is preserved: or equivalently, where "det" is the determinant and Dφ(x) ∈ R 2×2 is the Jacobian of the transformation φ(x). Back to image registration, if we view the two images T and R as two piles of soils with the densities ρ T and ρ R , then the image registration problem can be interpreted as a mass transport problem. This idea motivates people to formulate the image registration problem as follows [27,28,39]: solve the mass transport problem (3) for the transformation φ(x), such that the template image ρ T (x) is transformed to a new image, defined by which equals to the reference image ρ R (x). There are two remarks regarding the mass transport registration model. One is that ρ T φ and ρ R are always equal for any T and R. In other words, the error measure, such as sum of squared differences (1), is zero. As a result, this method can transform any template image T to any reference image R [39].
The other remark is that, according to (4), the image registration based on mass transport consists of two components. One component is the movement of pixels from x to φ(x), which transforms the image to ρ T (φ(x)). The other component, called morphing effect, changes the intensity at each moved pixel φ(x) by a factor det[Dφ(x)], or more specifically, changes the intensity from ρ T (φ(x)) to det[Dφ(x)]ρ T (φ(x)). We note that morphing effect is an essential part of the mass transport model for two reasons. One is that if pixels can be treated as masses, then after a movement of masses, the accumulation/dissipation of masses in certain regions will inevitably cause an increase/decrease of the intensity. The other reason is that, although the movement of pixels alone registers T and R, it is the morphing effect that further corrects the registration error (e.g. sum of squared differences) to zero and makes T φ equals to R.

2.3.
Optimal mass transport. The mass transport registration in Section 2.2 is ill-posed. More specifically, there exist multiple transformations that move the soil ρ T to the same ρ T φ . Among all possible transformations, one of them requires "least cost", which is desirable. Following [27,28,2], we aim at finding the optimal transformation φ * (x) that minimizes the following cost function In other words, the optimal transformation φ * (x) minimizes the total squared displacements weighted by the masses, where the transport cost of heavier soil particles is higher. In essence, (5) regularizes the mass transport registration and makes the transformation between ρ T and ρ T φ unique.

2.4.
Monge-Ampère equation with convexity constraint. It has been proved in [31] that the optimal transformation that minimizes the cost function (5) can be written as Here u ∈ C(Ω R ) is a convex scalar potential field, where its gradient ∇u generates the optimal transformation φ * . Substituting (6) into (3), we have u is convex, (8) where D 2 u(x) ∈ R 2×2 is the Hessian matrix of u(x). Equation (7) is called Monge-Ampère equation.
We note that since the Monge-Ampère equation (7) is nonlinear, the equation itself without the convexity constraint (8) can have multiple solutions [19,3]. However, the solution of (7) that satisfies the convexity constraint (8) is unique [23], which we will denote as u * whenever we need to distinguish it from the other solutions. We emphasize that the convexity of u * is equivalent to the optimality of the transformation φ * = ∇u * [27,23].
The convexity of u * implies the following desirable properties for the optimal transformation φ * : (i) φ * does not introduce foldings or invert the order of pixels. For simplicity, we consider one-dimensional images. The solution u * being convex means φ * x = u * xx > 0 on the entire computational domain. Then for any x 1 < x 2 , their corresponding new coordinates satisfy φ * (x 1 ) < φ * (x 2 ). This implies no foldings or inversion of pixel order. We note that this property is a necessary condition for φ * to be bijective.
(ii) φ * is diffeomorphic under the assumption that ρ T , ρ R are continuous and φ * is bijective. Since u * is convex (or more precisely, strictly convex), the Jacobian of the transformation Dφ * = D 2 u * is positive definite and thus non-singular on the entire computational domain. Hence, φ * is invertible. Meanwhile, if ρ T , ρ R are α-Hölder continuous, then u * ∈ C 2 (Ω) [23,10], which implies that φ * = ∇u * is differentiable. As a result, φ * is diffeomorphic. Diffeomorphism is observed in all of our experiments.
2.5. Boundary conditions. Up to this point, the image registration model is still incomplete. The transformation on the boundary cannot be derived from the mass transport formulation, and it needs to be specified by model users. In terms of the Monge-Ampère equation (7), it means that a boundary condition for the PDE is yet to be specified.
A natural attempt is to specify the value of the solution u on the boundary, called Dirichlet boundary condition. However, in the image registration context, u is a scalar potential, and it is not clear what value of scalar potential to specify on the boundary.
Alternatively, one can specify the value of the transformation φ * on the boundary. For instance, [23] assumes that for any x ∈ ∂Ω R , the new coordinate φ * (x) ∈ ∂Ω T , where ∂Ω R and ∂Ω T are the boundaries of Ω R and Ω T , respectively. This means that any pixels that are on the boundary of R must stay on the boundary of T under the mapping. Since φ * = ∇u, and we have assumed that Ω R = Ω T = [0, 1] × [0, 1], this boundary condition can be rewritten as (9) u x | x=0 = 0, u x | x=1 = 1, u y | y=0 = 0, u y | y=1 = 1.
Equation (9) is called Neumann boundary condition. However, under the Neumann boundary condition (9), the quality of the resulting transformation φ * may not be good. More specifically, assuming that R is a translation of T , we expect the resulting φ * to be the underlying translation, but in numerical experiments, the resulting φ * is not a translation; see Figure 1. The reason is that under a translation, the boundary of Ω R = [0, 1] × [0, 1] cannot be the boundary of Ω T = [0, 1] × [0, 1], which violates the assumption of the Neumann boundary condition (9). More generally, when R and T are related by a combination of translation and non-rigid deformation, the resulting φ * may not reflect the underlying transformation, which will be shown in Section 6.
It seems that specifying non-zero values in (9) might recover the underlying transformation between T and R. However, feeding correct values for (9) requires knowing in advance what the underlying transformation is, and normally we do not know the answer (this is exactly what image registration tries to find), especially when the two images are related by both translation and non-rigid deformation. (c) Underlying transformation between T and R, which is a pure translation.
(d) Transformation given by the Neumann boundary condition.
We remark that [44] considers another boundary condition "u(x) − 1 2 |x| 2 is periodic on ∂Ω". Similar to (9), this boundary condition does not work for the registration problems that are mentioned in the previous paragraphs.
Our goal is to impose a proper boundary condition that will recover the underlying transformation when R and T are related by translation or by a combination of translation and non-rigid deformation. We note that the backgrounds of many images, especially medical images, display only one single color (black or white). Such images can be viewed as periodic density functions, if the domain of the images are extended to R 2 . Motivated by this fact and a very brief discussion in [27] and [28], we assume that the displacement of a pixel, φ * (x) − x, is periodic on ∂Ω.
To take one step further, we substitute φ * = ∇u and obtain (10) ( Equation (10) is a periodic boundary condition with respect to the gradient of u.
To the best of our knowledge, this paper is the first application of the gradientlike periodic boundary condition (10) in the context of solving the Monge-Ampère equation (7) for the image registration model.

3.
Finite difference discretization. Next, we develop a numerical scheme for the Monge-Ampère equation (7) under the convexity constraint (8), which yields the optimal transformation φ * = ∇u. The convex solution of (7) is indeed a type of solution that has been studied extensively, called viscosity solution [23,24]. The theory of viscosity solution can be found in [16,15]. Our goal is to develop a numerical scheme that converges to the viscosity solution of the Monge-Ampère equation (7). Different numerical approaches have been developed to solve Monge-Ampère equations det(D 2 u) = f . Some methods, such as [3,41,40,24,4,5,17,20], consider the case where f = f (x) does not depend on the gradient ∇u. Other methods, including [8,9,23,44], consider the case where f = f (∇u). However, it is not obvious whether many of these methods can converge to the viscosity solution.
Very little work has been done for developing numerical schemes that are convergent to the viscosity solution of the Monge-Ampère equation. A finite difference method that can converge to the viscosity solution is proposed in [40,24,23]. However, the computational cost per grid point must increase to infinity as the grid size increases in order to achieve convergence [19].
In our previous work [13], we propose a numerical scheme that converges to the viscosity solution of the Monge-Ampère equation det( does not depend on ∇u and the boundary condition is Dirichlet. In this paper, we plan to extend our numerical method in [13] to the Monge-Ampère equation (7), where f = f (∇u) and the boundary condition is periodic on ∇u. The basic idea is to (i) convert the Monge-Ampère equation (7) with the convexity constraint (8) into an equivalent Hamilton-Jacobi-Bellman (HJB) equation; (ii) perform a mixed standard 7-point stencil and wide stencil finite difference discretization on the equivalent HJB equation; and (iii) solve the discretized system. Steps (i) and (ii) follow our previous work [13]. However, Step (iii) requires a significant change, which will be discussed separately in Section 4.

HJB formulation of the Monge-Ampère equation.
Step (i) is to convert the Monge-Ampère equation (7) with the constraint (8) into an equivalent HJB equation [32,34]: Let u ∈ C 2 (Ω R ) be convex, and let ρ T ∈ C(Ω T ) and ρ R ∈ C(Ω R ) be two positive functions. Then the Monge-Ampère equation (7) with the convexity constraint (8) is equivalent to the following HJB equation ) is a set of admissible controls, and is a second order differential operator with the coefficients We refer interested readers to [13,34].
The HJB equation (11)-(12) turns out to be more manageable for numerical computation than the Monge-Ampère equation (7). The reason is that if we assume that the control pair (a, θ) is given, then the differential operator (12) becomes linear, which is less difficult than the nonlinear operator det(D 2 u) in (7) in terms of finite difference discretization.
3.2. Standard 7-point stencil and wide stencil discretizations. Next, we discretize the HJB equation (11)-(12) using finite difference. Interested readers can find details in our previous work [13]. To sketch the idea, we denote the N ×N grid points by respectively. In addition, we introduce the forward/backward difference operators: According to the theory in [1], in order to converge to the viscosity solution, we intend to design a finite difference scheme that is monotone [1,48]. We approx- Regarding the monotone discretization of the cross derivative u xy (x i,j ), one can verify the following: (i) If the following condition is satisfied: then u xy (x i,j ) can be discretized monotonically by the standard 7-point stencil Then the discretization of L a(xi,j ),θ(xi,j ) u(x i,j ), namely the differential operator (12) at x i,j , reads The stencil points of (15) are shown in Figure 2(a).
(ii) If the following condition is satisfied: then u xy (x i,j ) can be discretized monotonically by another standard 7-point stencil The stencil points of (17) are shown in Figure 2 In general, however, neither Condition (14) nor (16) is satisfied. In order to obtain a monotone scheme, we consider performing a local coordinate rotation on L a(xi,j ),θ(xi,j ) u(x i,j ) to eliminate the cross derivative. As shown in Figure 2(c), we apply a clock-wise local coordinate rotation by the angle θ i,j , which rotates the local axes ((e x ) i,j , (e y ) i,j ) to new axes ((e z ) i,j , (e w ) i,j ). Then the differential operator (12) is transformed into where u zz and u ww are the directional derivatives along ((e z ) i,j , (e w ) i,j ) axes. Then u zz (x i,j ) can be approximated by central differencing: Inverse Problems and Imaging Volume 12, No. 2 (2018), 401-432 The grey stars are the stencil points xi,j ± √ h(ez)i,j and xi,j ± √ h(ew)i,j. The unknowns at these stencil points are approximated by the bilinear interpolation from the neighboring points (black dots).
Here the stencil length is chosen as √ h to maintain consistency, which will be proved in Section 5. The stencil points x i,j ± √ h(e z ) i,j may not coincide with any grid point, so we approximate the unknown values at these stencil points using bilinear interpolation, denoted as I h U | xi,j ± √ h(ez)i,j ; see Figure 2(c) for details. We note that u ww (x i,j ) can be discretized in a similar fashion as D + w D − w U i,j . As a result, L a(xi,j ),θ(xi,j ) u(x i,j ) can be discretized monotonically by Since the stencil lengths of h, which is greater than h, we call (18) wide stencil discretization.
The advantage of the wide stencil discretization (18) is that it is unconditionally monotone. One may apply the wide stencil discretization at every grid point. However, we will prove that (18) is only first order accurate, while the standard 7-point stencil discretizations (15) and (17) are second order accurate. Considering this, we will only apply the wide stencil discretization (18) at the grid points where neither (14) nor (16) is satisfied. For the other grid points where (14) and (16) are fulfilled, we will apply the standard 7-point stencil discretizations (15) and (17), such that monotonicity still holds and the numerical scheme is as accurate as possible.
3.3. The complete discretized system. A complete finite difference discretization of (11)-(12) based on (15), (17) and (18) gives rise to a discrete system. Define a vector of the unknowns U ≡ {U i,j | 1 ≤ i, j ≤ N } ∈ R N 2 ×1 , and correspondingly, vectors of the controls a ∈ R N 2 ×1 , θ ∈ R N 2 ×1 . Similar to [13], we can write the discretized system into the following matrix form: Here A[a, θ] ∈ R N 2 ×N 2 is the matrix that assembles the coefficients of {U i,j } in (15), (17) and (18), which depends on the controls (a, θ).
4. Solving the discretized system. In this section, we solve the nonlinear discretized system (19) using iterative methods. Suppose the approximation of the solution U at the k-th iteration is U (k) , and the corresponding optimal controls are This is a nonlinear optimization problem. Interested readers can refer to [13,22] for how to solve this problem. We define the residual of (19) as We also define the Jacobian matrix of (19) as the Jacobian can be simplified as where δF a (k) ,θ (k) [U (k) ] represents the Fréchet derivative of F with respect to the vector U (k) , given the fixed controls (a (k) , θ (k) ). A standard approach of solving (19) is to apply Newton's method, which reads ]. It turns out that Newton's method fails to converge, since the Jacobian is singular.
To explain the reason behind singular Jacobian, we note that the differential operator L a,θ u under the periodic boundary condition (10) has two linearly-independent zero kernels. One is the function v 1 (x, y) ≡ x, which satisfies L a,θ v 1 = 0. The other is the function v 2 (x, y) ≡ y, which satisfies L a,θ v 2 = 0. v 1 and v 2 correspond to the translations along x and y directions respectively 2 . These two translation kernels are desirable, since they allow pure translation to be a solution of the image registration problem where R and T are related by translation. Furthermore, since the Jacobian J[U ] is related to the discretization of L a,θ u, it has discrete translation kernels 3 This results in a singular Jacobian.
In the event of a singular Jacobian, one may consider replacing the inverse of the Jacobian by its pseudo-inverse. Then the iteration becomes More generally, we may consider introducing a regularizer: This indeed leads to a known algorithm, called "Levenberg-Marquardt algorithm" [33,36]. One advantage of using Levenberg-Marquardt algorithm (24) is that convergence of this iterative solver has been proved [36]. The parameter λ can be changed dynamically to make the algorithm not only convergent, but also more efficient. More specifically, when the approximated solution U (k) is not close to the solution U , λ can be increased, such that the algorithm behaves like a gradient descent and is less likely to diverge. Conversely, when U (k) is close to U , λ can be decreased, such that the algorithm behaves like Newton's method and can converge faster [36]. In particular, Levenberg-Marquardt algorithm is able to converge even when the Jacobian becomes singular.
Despite the fact that Levenberg-Marquardt algorithm (24) converges, it may not converge to the solution of the discretized system (19). To explain this, we note that the algorithm is indeed designed for solving nonlinear least square problems. In the context of solving (19), the algorithm attempts to find a solution that minimizes the residual, which means that it attempts to solve U = arg minÛ R(Û ) 2 . Its global minimum, which satisfies R(U ) = 0, is the solution of (19). It is known that Levenberg-Marquardt algorithm may converge to a local minimum rather than the global minimum, depending on the initial guess U (0) . In terms of image registration, this issue is observed when the two images are related by translation. The algorithm may be stuck in a local minimum solution that does not correspond to translation.
The local minimum issue using Levenberg-Marquardt algorithm (24) can be traced back to the two translation kernels V 1 and V 2 . If we define the "translation components" of a vector U as the projections of U to the translation kernels, then an initial guess U (0) and the solution U usually have different translation components. Levenberg-Marquardt algorithm turns out to be inefficient in correcting the translation components.
In order to fix this problem, we add an additional step before each Levenberg-Marquardt iteration (24). In this additional step, we explicitly correct the translation components of the approximated solution U (k) . The amount of correction is added such that the corrected solution, denoted as U (k+ 1 2 ) , minimizes the residual R(U ). This gives rise to our final algorithm for solving the discretized system (19); see Table 1 for the algorithm.
We remark that in Line 5 of the algorithm, (c 2 ) is the amount of corrected translation components along the x and y directions. In order to find (c 2 ), we can use simple search, which means to discretize Π into a discrete set and then search for the minimum on the discrete set. Also, in practice, if (c Compute (a (k+ 1 2 ) , θ (k+ 1 2 ) ) by (20).

11:
12: translation components of the approximated solution is completed, then Line 4-7 can be skipped for future iterations. It turns out that Line 4-7 only needs to be implemented for very few steps (typically, around 1-3 steps in our experiments).
An additional benefit of this algorithm is that it decomposes the resulting transformation φ * = ∇u into a pure translation component and a non-rigid deformation component. The pure translation component is given by the accumulation of the corrections of the translation kernels, or more precisely, the accumulation of (c Subtracting φ tran from the resulting transformation yields the remaining non-rigid deformation component. Finally, to link the theory with the experiments, we summarize the complete implementation of our numerical scheme as follows: 1: Normalize ρ R and ρ T in order to satisfy (2). 2: Discretize the Monge-Ampère equation (7) as described in Section 3, yielding the discrete system (19). 3: Solve (19) using the modified Levenberg-Marquardt algorithm (Table 1), yielding the convex scalar potential u. 4: Compute the transformation φ * = (u x , u y ) T . 5: Decompose the transformation into translation component φ tran and the nonrigid deformation component by (25). 6: Compute the transformed image (4).

5.
Convergence to the optimal transformation. In this section, we prove that the transformation between R and T , given by our numerical scheme, is the optimal transformation that minimizes the cost function (5).
Lemma 5.1 (Consistency). The numerical scheme (19) is locally consistent. In other words, when h → 0, the local truncation error of (19) goes to 0.
Lemma 5.2 (Monotonicity). The numerical scheme (19) is a monotone scheme. More specifically, the left hand side of (19) at a grid point x i,j is an increasing function of U i,j and a non-increasing function of U p,q for any (p, q) = (i, j).
Interested readers are referred to the Appendix for the proofs of Lemma 5.1 and 5.2. The key point of introducing these two lemmas is that it leads to our main theoretical result: Theorem 5.3 (Optimal transformation). Suppose u is given by the numerical solution of (19) arising from the finite difference scheme in Section 3, and assume that u is bounded in the max norm 4 . Then the transformation φ * = ∇u is the optimal transformation that minimizes the cost function (5).
Proof. Since the numerical scheme (19) satisfies consistency (Lemma 5.1), monotonicity (Lemma 5.2) and the boundedness of the solution, by the theorem in [1], the solution is guaranteed to converge to the viscosity solution of the Monge-Ampère equation (7). We refer readers to [1] for a complete proof. Then by [23,24], the viscosity solution u satisfies the convexity constraint (8). Hence, φ * = ∇u is the optimal transformation. 6. Experimental results. This section includes several experimental results of image registration. 6.1. Evaluation criteria. One basic error measure for registered images is sum of squared differences (1). For mass transport registration model, (1) is always zero. The reason is that ρ R equals to ρ T φ * by (3) and (4). In other words, the transformed image T φ * and the reference R are always the same. This can be seen in Examples 2-7 in the following sections, where the mass transport registration under the periodic boundary condition is applied. Table 2 reports the sum of the squared differences for the images before registration ρ T − ρ R L2(Ω) , and after registration ρ T φ * − ρ R L2(Ω) . From Table 2, we notice that the error after the registration is significantly smaller than before the registration. Also, the error after the registration is not exactly but close to zero in practice, although it should be zero in theory. The reason is that Levenberg-Marquardt algorithm is implemented to solve the Monge-Ampère equation, which requires choosing a stop tolerance for convergence 5 .
Other than sum of squared differences, another criteria of evaluating the quality of the registration is the quality of the resulting transformation φ * . We expect φ * to reflect the underlying transformation between the two images, especially when T and R are related by translation, or by a combination of translation and nonrigid deformation. We will evaluate the resulting transformations qualitatively in Sections 6.2, 6.3 and 6.4, and also discuss the quantitative evaluation separately in Section 6.5.

Examples
Example 2 Example 3 Example 4 Table 2. Sum of the squared differences before registration ρ T − ρ R L 2 (Ω) , and after registration ρ T φ * − ρ R L 2 (Ω) . The values are computed for Examples 2-7 in Sections 6.3, 6.4 and 6.6. For each example, ρ R has been normalized to 1. The approach is the mass transport registration under the periodic boundary condition.  An additional criteria to consider is morphing effect, which is the change of intensity at each moved pixel. We refer readers to the end of Section 2.2 for a description of morphing effect. We emphasize again that morphing effect is an essential component of the mass transport registration. However, in some image registration applications where the physical object inside the two images is incompressible and cannot be treated as masses, morphing effect may misinterpret the physics of the deformation, and is thus undesirable. Considering this, it is good to suppress morphing effect under the framework of mass transport. We will see that this can be achieved by imposing our periodic boundary condition (10).
To quantify the morphing effect, we define morphing magnitude: which is the (logarithmic) ratio of the intensity before and after morphing at a moved pixel x → φ * (x). We visualize the morphing magnitude µ using color scale in Figure 4-10 in the following sections. We note that µ = 0, or white color, means no morphing effect; the larger |µ| is, or the more intense the red/blue color is, the more severe the morphing effect is. Table 3 summarizes some basic features of morphing effect, which can be derived from the theory in Section 2.2.

Optimal versus non-optimal transformations.
Example 1 (Figure 3). In this simple example, we assume that both the template image T and the reference image R are the same constant image, or, ρ T = ρ R = 1 on the entire Ω = [0, 1] × [0, 1]. The transformation given by our monotone numerical scheme is the optimal transformation, which is the identity mapping φ * (x) = x and maps a square grid to itself. However, a non-monotone finite difference scheme in [3] gives a non-optimal transformation, which severely deforms a square grid.  (c) The non-optimal transformation φ obtained by a non-monotone finite difference scheme in [3]. The figure shows that a square grid is severely deformed under φ.

Periodic versus Neumann boundary conditions.
Example 2 (Figure 4). We revisit the case where R is a translation of T . Here we let the true underlying translation φ * true be 6 (φ There is a clear distinction between the transformation φ * returned by the two boundary conditions. The periodic boundary condition yields a constant translation on the entire image, as expected; see Figure 4(d). Figure 4(e) shows that a square grid is translated under (φ * ) −1 and remains a square grid. In particular, we observe the translation of the boundary of [0, 1] × [0, 1], highlighted by the thick black lines. Also, the entire Figure 4(e) is white, which indicates that no morphing effect occurs. This is a natural consequence of φ * being a pure translation. On the contrary, under the Neumann boundary condition, the resulting transformation in Figure  4(f) is not a constant translation. In Figure 4(g), we observe that in some regions, red/blue color is intense and square elements are severely compressed/expanded. This indicates that the registration under the Neumann boundary condition relies heavily on morphing effect. Example 3 ( Figure 5). We consider two images R and T that are related by a combination of translation and dilation. We specify a true underlying transformation φ * true that satisfies with the scaling (or dilating) factor otherwise.
The resulting transformation under the periodic boundary condition is shown in Figure 5(d1). To better understand this resulting transformation, we use (25) and decompose it into translation and non-rigid deformation components, represented by green and red arrows respectively in Figure 5(d2). We observe that the non-rigid 6 The reason why we define φ * true in terms of (φ * true ) −1 is that it is more intuitive to express the mapping from Ω T to Ω R , whereas φ :    (g) A deformed grid obtained by applying the transformation (φ * ) −1 on a square grid. (φ * ) −1 is computed under the Neumann boundary condition.  Example 4 ( Figure 6). We consider two images R and T where the true underlying transformation is a combination of translation and rotation. Similar to the explanation in Example 3, Figure 6(d2) shows the translation and non-rigid deformation components under the periodic boundary condition. The non-rigid deformation (red arrows) is a local rotation. More explicitly, these red arrows rotate the pixels from p 1 to p 2 and from p 3 to p 4 around the center of rotation O. We note that the resulting transformation is not a global rotation. In other words, rotation does not occur in the black background area. Since the mass accumulates at p 2 and p 4 , the intensity will further increase after the pixels are moved toward them. Conversely, the intensity at p 1 and p 3 will further decrease after the pixels are moved away from them. Figure  6(e) illustrates such morphing effect. As a comparison, for the Neumann boundary condition, Figures 6(f) and 6(g) do not reflect the underlying combination of translation and rotation. Once again, morphing effect under the periodic boundary condition is much less severe compared to the Neumann boundary condition. One may consider incorporating global rotation into the mass transport registration model. However, since the optimal transformation field φ * = ∇u is curl-free, it would require modifying the mass transport model, which is beyond the scope of this paper.

Mass transport registration versus empirical two-step registration.
In this section, we apply mass transport image registration on medical images and compare it with the empirical two-step registration, which consists of a preregistration (rigid transformation) followed by an elastic registration.
Example 5 (Figure 7). To evaluate the performance of our approach, we perform the following test: We first specify a certain underlying transformation φ * true ; see Figure 7(e). Then we apply this underlying transformation to an image T , which gives the reference image R; see Figure 7(a)-(b). Once obtaining T and R, we register these two images using mass transport registration with the periodic boundary condition. We expect that the numerical scheme gives a transformed image T φ * that is the same as R and recovers the pre-specified underlying transformation. As expected, the resulting transformed image T φ * is the same as R, where the magnitude of the error is 10 −4 ; see Figure 7(c)-(d). Moreover, the resulting transformation given by our numerical scheme, as shown in Figure 7(f), is a good approximation of the pre-specified underlying transformation.
We also register T and R using the empirical two-step registration, which consists of a pre-registration (rigid transformation) followed by an elastic registration. It is implemented by FAIR package [38]. The results are shown in Figure 8. The error between the transformed image T φ and the reference image R by the empirical approach is not as small as mass transport model. The reason is that mass transport registration applies an additional morphing, which further corrects the registration error to (nearly) zero. The resulting transformation in Figure 8(c) also recovers the underlying transformation in Figure 7(e), although the numerical transformation in the black background area is not the same as the underlying one.
Compared to the more empirical approach, our approach does not require twostep process. In addition, our approach makes the non-rigid model more universal (capable of handling both translation and non-rigid deformation) than the individual method employed in each step of the empirical approach.
6.5. Quantitative evaluation for the transformations. To quantitatively measure the accuracy of the resulting transformations, we compute which is the difference between the resulting transformation φ * and the true underlying transformation φ * true in L 2 norm. We note that the evaluation of (26) requires knowing in advance the true underlying transformation, which is usually unavailable in practice. However, in Examples 2-5, we pre-specify the underlying transformations between T and R, which allows us to perform such evaluation. Table 4 reports the errors of the transformations. In each example, the error given by  (d) Difference between transformed image T φ * and R.
(e) Pre-specified underlying transformation between T and R.
(f ) Transformation given by the numerical scheme, which is a good approximation to the pre-specified underlying transformation in (e).
(g) A deformed grid obtained by applying the transformation (φ * ) −1 on a square grid. the mass transport registration with the periodic boundary condition is compared against either Neumann boundary condition or two-step empirical registration. Table 4 shows that the errors of the motion fields by the mass transport registration with the periodic boundary condition are the smallest. We note that the error for Example 4 is relatively large even under periodic boundary condition, since the mass transport model does not recover global rotation. 6.6. More examples. This section includes three more image registration examples that use the mass transport registration under the periodic boundary condition.
In these examples, the true underlying transformations are unknown so we focus on the qualitative evaluation only.
Example 6 ( Figure 9). We register two images taken from the brain of a patient. The transformed image T φ * (Figure 9(c)) is the same as the reference image R (Figure 9(b)). Regarding the resulting transformation, the numerical scheme can automatically translate T toward R, followed by a non-rigid deformation; see  the pixels alone may not be able to make these segments in T look the same as their counterparts in R. The mass transport model let pixels flow into/out of these segments, resulting in an increase/decrease of intensity, such that the corresponding segments of T and R can look the same.
Example 7 ( Figure 10). In this example, the two medical images are very different. Consequentially, the registration between them requires big non-rigid deformation and big morphing effect. This can be seen in Figures 10(d)-(e). A noticeable fact is that under the transformation, the boundary of Ω = [0, 1] × [0, 1], marked by the thick black lines in Figure 10(e), is not only translated, but also curved. Such curvature shows that the deformation from the actual object (dark background excluded) in T to its counterpart in R is smoothly propagated into the dark background. In other words, the transformation between the actual objects inside R and T is not influenced by the events in the dark background. This is different from the Neumann boundary condition, where the transformation between the actual objects is affected by the transformation imposed artificially on the boundary. Example 8 ( Figure 11). This example is from [27]. The objective is to register the template image, Lena, to the reference image, Tiffany. Figure 11 shows that our proposed algorithm is able to translate Lena's face towards Tiffany's face before performing non-rigid deformation. The resulting transformed image of Lena is the same as the reference image, Tiffany, even if the original images are distinct. Reference [27] also achieves this. We note that to the transformed and reference images look very similar due to the combined effect of displacement of pixels and morphing effect, as described in Section 2.2. 6.7. Unmorphed images. In some registration problems, morphing effect may be prohibited. In this case, we may want a transformed image without morphing correction, namely, ρ T unmorph φ * (x) ≡ ρ T (φ * (x)). When the morphing magnitude of the underlying non-rigid deformation is small (e.g. |µ| < 0.2), the unmorphed transformed image T unmorph φ * may still look similar to the reference image R, and the difference between T unmorph φ * and R is visually negligible. Figure 12 shows that the unmorphed image T unmorph φ * by the mass transport model under the periodic boundary condition has the smallest error, compared to Neumann boundary condition or rigid registration only.        (e) A deformed grid obtained by applying the transformation (φ * ) −1 on a square grid. (φ * ) −1 is computed under periodic boundary condition. A modification of the mass transport model, which aims at leaving out the morphing effect, is proposed in [28] and [39]. However, modifying the mass transport formulation for the purpose of dropping morphing effect is beyond the scope of this work.
6.8. Computational complexity. We first test the computational complexity of the modified Levenberg-Marquardt algorithm (Table 1) for Example 3 with image sizes of 100 × 100, 200 × 200, 400 × 400 and 800 × 800 using MATLAB. The number of steps for convergence is roughly a constant 3 as the image size increases. We record the CPU time of (i) corrections of translation kernels (Line 4-7 of Table 1), and (ii) the primary nonlinear solver (Line 8-12 of Table 1) separately. Table 5 shows that as the image size (number of pixels) increases by 4, the CPU time for (i) and (ii) increases by a factor of 8 and (approximately) 16 respectively. The total CPU time for 800 × 800 image is around 20 minutes.
We also test the computational time for different images of the same size 600×600. Since the translation components across different images are different, which is difficult to compare fairly, we skip the comparison of the CPU time for (i). Table 6 shows that for the images where the non-rigid deformation components are relatively larger, the number of steps for convergence and the CPU time for the primary nonlinear solver are also larger.
In addition, we report the convergence of residual versus the number of iteration k; see Table 7.
7. Conclusion. The objective of this paper is to propose a numerical scheme together with a boundary condition for the mass transport registration model. The main contributions of this paper include that • our numerical scheme gives an optimality-guaranteed transformation for the mass transport image registration; and • our periodic boundary condition incorporates both translation and non-rigid deformation into the mass transport model.  10 19 CPU time for the primary nonlinear solver (sec) 627 1613  We note that the standard mass transport model using the conventional boundary conditions can only do non-rigid registration. Our computation method can also provide translation component as a by-product.
In the course of achieving these two contributions, we design a mixed standard 7-point stencil and wide stencil finite difference discretization, and prove mathematically that the solution u is guaranteed to produce the optimal transformation. Also, due to the periodic boundary condition, Jacobian for the discretized system becomes singular. To address this, we propose a modified Levenberg-Marquardt algorithm.
In the experimental results, we first show that our numerical scheme yields the optimal transformation, whereas a non-monotone finite difference scheme gives a non-optimal transformation. In addition, the deformed grid images (Figure 4(e), 5(e), 6(e), 7(g), 9(e) and 10(e)) are smooth and do not contain foldings or crossings. Since these properties are possessed by the optimal transformations, they provide evidences that the resulting transformations are optimal.
We also show in the results that the periodic boundary condition outperforms the Neumann boundary condition. Under the periodic boundary condition, the transformed images T φ * are the same as the reference images R. Moreover, φ * recovers the underlying combination of translation and non-rigid deformation. In addition, the morphing effect is mild compared to the Neumann boundary condition.
There are two major limitations of the mass transport registration model. One is that morphing effect may misinterpret the physics of the deformation in some applications. The other is that the mass transport model does not recover rotation globally, although it recovers rotation locally. Fixing these limitations is beyond our objective of developing numerical scheme and choosing boundary condition. It would require modifying the mass transport model itself. Nevertheless, model modification, and furthermore, a numerical scheme for the modified model, could be future works. For instance, one possible approach to incorporate global rotation into the mass transport model would be to introduce Killing energy minimization, as investigated by the coauthor of this paper in [11].
Regarding the numerical scheme, the computation could be sped up by multilevel approaches [46,38] or multigrid algorithms [49], which is another potential future work.
Appendix A. Proofs of Lemma 5.1 and 5.2.
Proof of Lemma 5.1. The proof follows our previous work in [13]. For the standard 7-point stencil discretizations (15) and (17) (18). We will also prove that the discretization of the periodic boundary condition (10) does not introduce any truncation error. Without loss of generality, we only check the discretization of (u x − x) x=0 = (u x − x) x=1 , which is D + x U 0,j = D + x U N,j − 1.. The associated truncation error is then Notice that the periodic boundary condition (10) already implies that ϕ xx (x 0,j ) = ϕ xx (x N,j ), ϕ xxx (x 0,j ) = ϕ xxx (x N,j ), etc. In fact, such equalities hold for any derivatives equal to or higher than second order. As a result, the entire expression equals to 0, or, the discretization of the periodic boundary condition (10) induces zero truncation error. Based on these truncation error analysis, we can follow the proof in [13,22] and conclude that the local truncation error of the numerical scheme (19) goes to 0 when h → 0.
Proof of Lemma 5.2. Following the procedure in [13], one can verify the following: for the discretized differential operators (15), (17) and (18) under any admissible controls (a i,j , θ i,j ) ∈ Γ, the coefficient of U i,j is positive, while the coefficients of {U p,q | (p,q) =(i,j) } are non-positive. Hence, the discretized differential operators (15), (17) and (18) are monotonically increasing in U i,j , and monotonically decreasing in {U p,q | (p,q) =(i,j) }.
Eventually, we can follow [13,22] to conclude that the complete discretized system (19) is monotone.