Forward and backward filtering based on backward stochastic differential equations

In this paper we explore the problem of reconstruction of blurred and noisy images. The idea presented here provides a new methodology based on advanced tools of stochastic analysis which can be successfully used to solve the inverse problem. In order to solve this problem we use backward stochastic differential equations. The reconstructed image is characterized by smoothing noisy pixels and at the same time enhancing and sharpening edges. Our experiments show that the new approach gives very good results and compares favourably with deterministic partial differential equation methods.


1.
Introduction. Let D be a bounded, convex domain in R 2 , u : D → R be an original image and u 0 : D → R be the observed image of the form where η stands for a white Gaussian noise and B is a linear operator representing the blur. We assume that u 0 ∈ C 1 (D). We are given u 0 , the problem is to reconstruct u. This is a typical example of an inverse problem [2].
Problem of image reconstruction using fully automatic and reliable methods is one of the most important issues of digital image processing and computer vision. Efficient and effective reconstruction of images is an essential element of most image processing and recognising algorithms. Reconstruction algorithms allow us to make initial treatment of data for further analysis, which is very important, especially in astronomy, biology or medicine.
The most important methods in the last decade in image processing have been methods driven by nonlinear diffusion equation. Image processing based on nonlinear diffusion equation not only smooth out the image, but also keep some important features, such as edge and texture. Perona and Malik firstly applied nonlinear diffusion equation to image smoothing and proposed the famous PM model [32]. Their model can smooth out the image and preserve the edge properties well, but the result of the denoising of isolated noise is not satisfactory. Near the region's boundaries where the magnitude of the gradient is large, the regularization is stopped and if the initial data is very noisy then model cannot distinguish between true edges and false edges created by the noise. In order to overcome the drawbacks of PM model, Catte et al. [10] improved it and proposed a regularized version wherein the coefficient is a function of a smoothed gradient. This equation diffuses only if the gradient is estimated to be small. The nonlinear diffusion filtering using the variations calculus was proposed by Rudin-Osher-Fatemi [35]. They proposed denoising images by minimizing the total variation norm of the estimated solution and their algorithm was derived from a constrained minimization method as a time dependent nonlinear diffusion equation. Weickert [39] generalized the diffusion equations where the diffusion process is not controlled by a scalar diffusivity, but by a diffusion tensor, and the smoothing in some directions may be preferred. In the so called edge-enhancing variant, diffusion in the gradient direction is still controlled by a diffusivity function, but a maximum smoothing is always allowed in the coherence direction (i.e. perpendicular to the direction of maximum gradient and along edges). In [40] Weickert proposed a coherence enhancing flow, which preserves and enhances textures that exhibit a homogeneous structure tensor. While many diffusion filters act edge-enhancing, there are also so called coherence-enhancing diffusion filters. They are designed for the enhancement of oriented, flow like structures, appearing e.g. in fingerprint images. The basic idea is to diffuse anisotropically along the flow field such that gaps can be closed.
The forward and backward (FAB) diffusivity differs from the other diffusivities by the fact that it may even attain negative values. First diffusivities of such a type have been proposed by Gilboa et al. [21]. This FAB anisotropic diffusion is a non-linear feature preserving smoothing technique that efficiently eliminates the image noise and weak textures from the image while preserving and sharpening the edge information. Another example of FAB diffusivity was proposed in [18]. This model takes the advantages of mean curvature motion equation to smooth out the image and inverse heat diffusion equation to sharpen the image. Besides these works, some interesting papers have been published on theoretical foundations and the application of FAB diffusion to colour images [36,41].
In order to express the FAB filters in stochastic terms one needs to use backward stochastic differential equations (BSDEs).
The backward stochastic differential equations were introduced by Pardoux and Peng [29] who proved the existence and uniqueness of adapted solution under suitable assumptions. Independently, Duffie and Epstein [15,16] introduced stochastic differential utilities in economics models, as solution of certain BSDEs. Since then, it has been widely recognized that BSDEs provide a useful framework for formulating many problems in mathematical finance [24]. They also appear to be useful for problems in stochastic control and differential games [23]. Many papers (for instance [31]) show the connections between BSDEs driven by a diffusion process and solutions of a large class of quasilinear parabolic and elliptic partial differential equations (PDEs). These results may be seen as a generalization of the celebrated Feynman-Kac formula. Through all these results, a formal dictionary between BS-DEs and PDEs can be established, which suggests that existence and uniqueness results which can be obtained on the one side should have their counterparts on the other side. However in our case, we treat BSDEs as a starting point in the creation of the reconstruction model. In our opinion a stochastic description is more intuitive and what is more important in order to solve the BSDE model we use tools of stochastic analysis giving us completely new methodology.
In image processing one can find only theoretical results and some practical aspects [3,4,6] of BSDE-based applications. The papers [3,4,6] explore the inverse problem in the following form: where the blurring operator is omitted. In [3] the problem of reconstruction of the noisy chromaticity is considered. Presented model of denoising is expressed in terms of Skorokhod problem associated with the solution of BSDE and an epsilon neighbourhood of two dimensional sphere. In that paper BSDE is driven by a trivial drift function (f ≡ 0). This mean that presented equation is a model of forward filtering and has properties of smoothing filters. In [4] problems of reconstruction of the noisy grayscale image (smoothing filters) and enhancing of the blurred grayscale image (enhancing filters) are presented. Smoothing filter, similarly as in [3], models on anisotropic forward diffusion with BSDEs with f ≡ 0. Enhancing filters presented in [4] are driven by BSDE with non trivial drift function and correspond to inverse heat equation. This equation is a model of backward filtering (not forward) and in consequence this model fails to edge enhancing of the noisy image. The paper [6] is a generalisation of [4] to images with values in R n , in particular to colour images.
In the following we propose the model of FAB filtering expressed in terms of BS-DEs. Considering backward stochastic diffusion associated with forward stochastic diffusion can model forward anisotropic filtering in perpendicular to gradient direction and inverse anisotropic filtering in gradient direction. Finally we propose reconstruction method of blurred and noisy image where effects of smoothing, enhancing and sharpening join.
This paper places emphasis on the novel way of using backward stochastic differential equations as modelling tool of FAB filters.
The paper is constructed as follows. Section 2 contains definitions and fundamental facts of stochastic analysis. In Sections 3, 4 we recall basic ideas of filtering in terms of BSDEs taken from [4]. Section 5 provides new theoretical results of BSDE enhancing of noisy images. We define here the numerical scheme and prove a convergence of this scheme to the continuous model. This section is also devoted to presenting a forward and backward model as a conclusion of previous sections. Section 6 contains details of numerical approximation of the proposed method. Finally, in section 7 experimental results and comparison to other methods are presented.

Mathematical preliminaries.
In what follows, by (Ω, F, P) we will denote the probability space and by W = {W t ; t ∈ [0, T ]} an n-dimensional Wiener process starting from zero. We assume that we are given a point x 0 ∈ D and a function σ : [0, T ] × R n → R n × R n . Consider the stochastic diffusion process with reflection with values in domain D The equation (1) (reflected SDE) characterizes the behaviour of the continuous time stochastic process X as an Itö integral. A heuristic interpretation is that in a small time interval of length σ the stochastic process X changes its value by an amount that is normally distributed with variance σ(t, X t ) and is independent of the past behaviour of the process. This is so because the increments of a Wiener process are independent and normally distributed. The function σ is called the diffusion coefficient. The term K D t is the minimal push needed to keep process X in D. For each fixed ω ∈ Ω the function t → X t (ω) is called a trajectory of X and is denoted by X(ω). The proof of the existence and uniqueness of the solution to reflected SDEs can be found in [38].
The process X one can approximate using the following numerical scheme [33,37]: .., m, m ∈ N and Π D (x) denotes a projection of x on the set D. Since D is convex, the projection is unique.
Let (F t ) be a filtration generated by an l-dimensional Wiener process W , ξ ∈ L 2 (Ω, F T , P, R k ) be a square integrable random variable and let f : See [30] for the proof of the existence and uniqueness of the solution to BSDEs. The backward stochastic differential equation is solved starting from t = T until t = 0. Note that at time zero Y 0 the process is a deterministic value. A drift function f causes the correction of trajectories in expected strength and direction. Later on, we will see that the interesting for us is the process Y , i.e. the first component of the solution to the BSDE. The fact, that every martingale on the space with Wiener filtration has representation as a stochastic integral implies that and process Y can be approximated using the formula [27] where Here by E we denote the expected value and by F m the filtration generated by the discretization of a Wiener process.
A general model of the image reconstruction is the following: where ξ depends on noisy image u 0 and the process X. Note that the process X has values in domain of the image D and is driven by some function σ. The process Y has values in codomain of the image and is driven by some function f . The value of the process Y at time t = 0 is the reconstructed pixel u(x).
3. Forward filtering in terms of BSDEs.

Stochastic representation of solution to the heat equation.
Before presenting a general method, we will illustrate our ideas by constructing a model which is equivalent to a commonly used filter, namely, the convolution of the noise image with two-dimensional Gaussian mask. We suppose for a while that the image is given by a function defined on the whole plane. Put where W x is a Wiener process starting from x ∈ D. From the equation (4) we can deduce that A value of the process Y at time t = 0 is the reconstructed pixel u(x). Therefore, by (5) the image is the convolution of the noisy image with two-dimensional Gaussian mask. While discussing the above example, we assumed that the image is the function given on the whole plane. Since we want to consider the image as a function defined on the bounded, convex set, we have to introduce a new assumption for the process X. We assume that the process X is a stochastic process with reflection with values in D. In this case process X is a Wiener process with reflection, which we can write as In Figure 1 the role of the parameter T is presented. Greater values of T smooth out the image more. 3.2. Anisotropic diffusion. In the case of smoothing filters we will consider BS-DEs associated with ξ = u 0 (X T ) and f (t, y) = 0, where X is a diffusion process with reflection. Following [39] we provide a construction of a model where process X diffuses along edges. This condition may be achieved by imposing where u xi (y) = ∂u ∂xi (y). In particular Y 0 = E [u 0 (X T )]. To avoid false detections due to noise, u 0 is convolved with a Gaussian kernel G γ (in practice 3 × 3 Gaussian mask). The result from our evaluation experiment regarding the anisotropic diffusion method using BSDE (7) is presented in Figure  2 a). 4. Backward filtering in terms of BSDEs. In the case of enhancing we will consider BSDEs associated with ξ = u 0 (x) and f (t, y) = c(y − u 0 (X t )), where X is a Wiener process with reflection and c > 0 is some constant.
where W x is two-dimensional Wiener process starting from x, c > 0. If (Y, Z) is a solution to BSDE associated with ξ = u 0 (x) and f then where The proof of the above theorem and remark can be found in [4]. In accordance with the theorem 4.1 the image u(x) = Y m 0 is a combination of convolutions of the noisy image and two-dimensional Gaussian masks with coefficients a k where a 0 > 0, a k < 0, for k = 1, ..., m − 1 and m−1 k=0 a k = 1. This means that kernel of filtering is made up of positive weight for central pixel and negative weights for neighbourhood pixels and finally is a model of enhancing filter. In the picture Figure 3 we can see the shape (in one dimension) of the function for different values of parameter c. If c is greater than effect is more enhancing. Parameter T determines the size of the neighbourhood used in enhancing procedure. Since the image is a function defined on the bounded set we have to consider Wiener process with reflection in values in domain D, i.e. (8) u In Figure 4 we can see results of applying the filter (8) [4] if parameter c is greater than the result is more enhancing. Directly applying enhancing filter (8) to the noisy image fails (see Fig. 2 b)).

5.
Forward and backward filtering in terms of BSDEs. In this section we present a new theoretical result which leads to forward and backward filtering. This approach can be successfully used for deblurring and denoising of images. First, we consider the problem of edge enhancing of the blurred and noisy image.

Enhancing of edges.
In order to enhance edges of the blurred and noisy image we execute two assumptions: (i) the deblurring effect should be done in gradient direction; (ii) we should smooth out deblurring effects (from point (i)) in perpendicular to gradient direction (along edges).
These conditions may be achieved by the following model. Let S < T and σ(s, X s ) = for some nonnegative parameter c. In this model we deblur in gradient direction from time T to S and smooth out in perpendicular to gradient direction from S to 0. In Figure 5 we can see examples of trajectories of the process X for discretization 0 = t 0 < t 1 < · · · < t j ≤ S < t j+1 < · · · < t m = T . Trajectories of this process determine pixels which we use to the enhancing procedure. Using the scheme (3) we obtain the following inverse formula.
, c(t) = 0 for t < S and ξ = u 0 (X x S ), where X x is two-dimensional diffusion process with reflection with values in D and starting from x. If (Y, Z) is a solution to BSDE Proof. Note that, Remark given below guarantees that the brightness of the reconstructed image is the same as the brightness of the noisy image. Proof. In Figures 6 c) d) e) f) we can see results of BSDE (9) applied to the noisy image a). Similarly to (8), if parameter c is greater than the result is more enhancing. In this section we solved the problem of the reconstruction of edges but not noisy background. In Figure 6 f) evidently we can see the noise in the background of the image.

5.2.
Smoothing-enhancing method. Combining and applying model (7) for pixels where the image exhibits low gradient and model (9) for pixels in a neighbourhood of edges we have the following smoothing-enhancing method of the image reconstruction: The parameter d determines which pixels we will reconstruct with using smoothing BSDE (7) and which with using enhancing BSDE (9). In presented example Figure 7 we have used the parameter d = 30. 6. Approximation and implementation. Using the discretization (10) the reconstructed pixel can be approximated using the following formula (13) u where X x (ω) means the trajectory of X x starting from x such that 0 = t 0 < t 1 < · · · < t j ≤ S < t j+1 < · · · < t m = T, t i+1 − t i = T m and the parameter N is equal to the number of Monte Carlo iterations.
Using the Markov property for diffusion process X x the expected value can be written as where N 0 , N k are numbers of Monte Carlo iterations and should be chosen so that for fixed t j . Additionally if we use the property that the sum of expected values is the expected value of the sum, then by the above and (13) Smoothing out of deblurring effects in perpendicular to gradient direction , where N1 N0 ≈ tm−1−tj tj . In particular case of smoothing-enhancing method, where the function c(t) is given by formula (12) we have (15) Simulation of the trajectory. We now focus on simulation of the trajectory X x (ω).
In formula (14) the expression W t k (ω) − W t k−1 (ω) is approximated using random number generator and is equal to two independent values obtained using generator of the normal distribution with parameters N (0, t k − t k−1 ). After considering this observation we have simple method of simulation of values X x t0 (ω), X x t1 (ω), X x t2 (ω), . . . , X x tj (ω), . . . , X x tm−1 (ω), X x tm (ω). The numerical scheme (14) gives good results, but only with small value of timestep parameter h = T m (for example h = 0.1). Calculating the mean value using Monte Carlo method for small h is not effective and takes a long time (we need to do approximately N = 100 iterations). To omit this problem, we improve this scheme. Proposed modification is based on the following observations: -the reconstructed pixel given by formula (15) depends on values X x tj (ω), . . . , X x tm−1 (ω); -the values X x t0 (ω), . . . , X x t1 (ω), X x tj−1 (ω) we use to determine X x tj (ω); -the values X x tj+1 (ω), . . . , X x tm−1 (ω) we use to reconstruct pixels on edges. In practice the cardinality of deblurring times set {t j+1 , t j+2 , . . . , t m−1 } is several times less than the cardinality of smoothing times set {t 0 , t 1 , . . . , t j } (see Figure 5). After these remarks the request is the following: to fast implementation we need fast counting of the value X x tj (ω). In order to get X x tj (ω) we use the following modification of Euler's approximation taken from [5]: where by Θ we mean the condition |(G γ * u 0 )(H x t k (ω)) − (G γ * u 0 )(X x t k−1 (ω))| < p and τ j = min{k; k ≥ j and Θ is true j times}.
The condition Θ with parameter p > 0 guarantees that, if the image exhibits a strong gradient then the process X x diffuses as a process with small value of the parameter h (we need to be careful to don't lost information in the image) and at locations where variations of the brightness are weak -for background of the image, the process X x can diffuse with large value of h (for example h = 4). A terminal time τ j provides that the numerical simulation of the diffusion trajectory gives at least j values of X x (ω) which differ from the value in previous step. The Figure  8 illustrates a difference between the scheme (14) and the scheme (16). There are shown three examples of trajectories (G γ * u 0 )(X x t ) from the pixel A to the pixel B. Trajectories (I) and (III) were generated with using the scheme (14) for large and small value of the parameter h, respectively. Trajectory (II) was generated with using the scheme (16) for large h. It is easy to see, that at locations where the image is constant, trajectory (II) diffuses as trajectory (I). At locations where the image has strong gradient, the trajectory (II) is similar to the trajectory (III). Observe, that the scheme (16) works well only, if the model of the digital image G γ * u 0 is continuous. In practice, we can use linear interpolation to get value of the image G γ * u 0 , for any point x ∈ D. Continuity of the function G γ * u 0 guarantees that we will generate random numbers for which the condition Θ will be satisfied.
7. Experimental results. Some results from our evaluation experiments regarding our new method and deterministic FAB filtering [21] are presented in: Figure 9 and Figure 10. The results refer to grayscale images: pirate and cameraman blurred  and next corrupted with Gaussian noise with standard deviation σ = 10. Images Figure 9 d) e) and Figure 10 d) e) are results of the reconstruction using [21] and a new method based on backward stochastic differential equations, respectively. Results of both approaches, in the case of image background are comparable. However, the reconstruction of edges using our method produces a better result. Sharpened edges do not contain the noise, as opposed to the approach of Gilboa at al. Moreover, when comparing method with another classical denoising algorithm such as non local means (using patchwise implementation from [9]) and total variation (using split Bregman implementation from [20]) one can observe that the image created by the new method is visually more pleasant. The reason for this is that the non local means approach shows clear evidence of a halo of noise effect around the edges whereas total variation smooth edges too much. In presented example the time of the reconstruction of our method is comparable to non local means whereas the total variation PDE method is several times faster. 8. Conclusion. In this paper we have presented a new idea of image reconstruction based on backward stochastic differential equations. The obtained results demonstrate that proposed approach can provide a good alternative to PDE methods.