Generalized stability estimates in inverse transport theory

Inverse transport theory concerns the reconstruction of the absorption and scattering coefficients in a transport equation from knowledge of the albedo operator, which models all possible boundary measurements. Uniqueness and stability results are well known and are typically obtained for errors of the albedo operator measured in the $L^1$ sense. We claim that such error estimates are not always very informative. For instance, arbitrarily small blurring and misalignment of detectors result in $O(1)$ errors of the albedo operator and hence in $O(1)$ error predictions on the reconstruction of the coefficients, which are not useful. This paper revisit such stability estimates by introducing a more forgiving metric on the measurements errors, namely the $1-$Wasserstein distances, which penalize blurring or misalignment by an amount proportional to the width of the blurring kernel or to the amount of misalignment. We obtain new stability estimates in this setting. We also consider the effect of errors, still measured in the $1-$Wasserstein distance, on the generation of the probing source. This models blurring and misalignment in the design of (laser) probes and allow us to consider a discretized sources. Under appropriate assumptions on the coefficients, we quantify the effect of such errors on the reconstructions.


Introduction
We consider the inverse problem theory of the following stationary (timeindependent) linear transport equation: (1.1) The solution u(x, v) models the density of particles, such as photons, as a function of space x and velocity v. The spatial domain X is here a convex, bounded, open subset of R n for dimension n ≥ 2, with a C 1 boundary ∂X. The space of velocities V = S n−1 is chosen to be the unit sphere to simplify the presentation. The extinction coefficient σ(x, v), referred to as absorption coefficient, models particle loss due to scattering or intrinsic absorption at (x, v) while the scattering coefficient k(x, v ′ , v) models scattering from direction v ′ into direction v at position x.
The sets of incoming conditions Γ − and outgoing conditions Γ + are defined by (1.2) where ν(x) is the outward normal vector to X at x ∈ ∂X. Under a sub-criticality condition on (σ, k), we can show that the above equation admits a unique solution u(x, v) with a well-defined trace on Γ + . The albedo operator, mapping g on Γ − to u |Γ + is denoted by A so that u |Γ + = Ag.
The most general way to probe the domain X is to assume knowledge of the above albedo operator A. Indeed, any boundary measurement may be modeled as measuring the effect on Γ + of a probing signal emitted on Γ − . There is a significant literature on the analysis of such an inverse problem in the setting where A is known as an operator from L 1 (Γ − , dξ) to L 1 (Γ + , dξ) with dξ an appropriate weighted version of the Lebesgue measure on Γ ± . It is for instance known that A uniquely and stably determines (σ(x), k(x, v ′ , v)).
The main objective of this paper is to revisit the stability estimates available in the literature and obtain results for measurement errors that are more realistic experimentally. Imposing error estimates in L(L 1 (Γ − , dξ), L 1 (Γ + , dξ)) is indeed very constraining. Let us consider for instance a signal g supported in the ε vicinity of a point (x 0 , v 0 ) ∈ Γ − for some ε ≪ 1. In the absence of scattering (k ≡ 0), the solution u is then supported in the vicinity of the line segment (x 0 , x 1 ) with x 1 − x 0 collinear to v 0 and (x 1 , v 0 ) ∈ Γ + . Now let us assume that the measurements involve an amount of blurring equal to η ≪ 1. In other words, the solution u |Γ + is replaced by the convolution φ η * u |Γ + , with φ η an approximation of identity with support of width η (in each dimension). It is not difficult to convince oneself that u |Γ + − φ η * u |Γ + L 1 (Γ + ,dξ) will be of order O(1) independent of η as soon as ε is significantly smaller than η. In other words, the error between A and φ η * A (with obvious notation) will be of order O(1) independent of η (arbitrarily small positive).
Any inevitable, even arbitrarily small, blurring or misalignment of the detectors will result in an error of order O(1) when errors on A are measured in L(L 1 (Γ − , dξ), L 1 (Γ + , dξ)). The same error holds for imperfect sources, whose support is typically not of width ε for arbitrarily small ε and whose alignment may not be known with infinite precision as required by the above metric imposed on A. Standard error estimates then imply that the errors on the reconstruction on (σ, k) are of order O(1) as well, which is not very informative.
In this paper, we introduce measurement error on the albedo operator A that are significantly more forgiving and more likely to be met in practical settings. Our main objective is to show how the reconstruction of (σ, k) degrades as more general measurement errors are considered.
Heuristically, we wish to consider a metric in which small blurring and small misalignment result in small errors. As we have just seen, L p -based metrics are not adapted. Nor is the Radon norm on bounded measures. We want a metric in which the distance between δ(x − a) and δ(x − b) is small when b−a is small. A classical metric that is quite adapted to such a problem and whose manipulation is reasonably straightforward is the 1−Wasserstein metric.
We introduce the following family of Wasserstein distances: for κ a positive constant. The Wasserstein distance is typically introduced to measure the distance between probability measures µ and ν so that C, µ − ν = 0 for any constant C. The above bound on φ is then not necessary and κ, which scales linearly, is typically chosen equal to 1. In our applications, the measures of the form u |Γ + are not normalized and the above uniform bound on φ, normalized to 1, is necessary to form a metric. We verify that for each κ > 0, the above object is indeed a metric on the space of bounded measures and satisfies the typical properties of the Wasserstein metric on probability measures (such as, e.g., metrizing weak convergence of bounded measures); see, e.g., [32].
The role of κ allows us to introduce a parameter that measures confidence in the available measurements. Note that W 1,κ (δ a , δ b ) = κ|b − a|. A small error with large κ therefore means that we are quite confident in our measurement setting to distinguish between points a and b.
Our first objective is to assess the effects of measurement errors of Ag in the above Wasserstein metric on the reconstruction of the coefficients (σ, k). As we mentioned above, this allows us to model a wide class of measurement errors. We verify that the Wasserstein distance between a measure and its blurring by a kernel φ η is of order η. Errors in the position of detectors, whose locations are typically not known to arbitrary accuracy, are handled similarly. As we shall see, the same uniqueness results hold as in the setting of errors on A in the L 1 sense. However, compared to the latter case, stability estimates degrade. The reconstructions of appropriate functionals of σ and k are now obtained with an error that is Hölder and no longer Lipschitz. This means that errors ε on A in the Wasserstein sense translate into errors of order ε β for β < 1 on the coefficient reconstructions, whereas β = 1 holds when errors on A are measured in the L 1 sense.
A second objective is to consider the setting in which the source term g is replaced by another source term that is close in the same Wasserstein sense. This allows us to consider the case of errors in the alignment and design of (e.g.,) laser probes. This type of error, typically neglected in standard stability estimates, is important in practice to model probing signals that are never perfectly known. Such errors also allow us to model a discrete probing source configuration. Indeed, let g be a non-negative source with a support in the h vicinity of (x 0 , v 0 ) ∈ Γ − and integrating to 1. Then we verify that W 1,κ (g, δ x 0 (x)δ v 0 (v)) ≤ κh. Any source term may thus be approximated by a superposition of delta functions on a grid with an error proportional to the mesh size h. In such a setting, we allow the replacement of a line integral by the integral along a nearby line. It is thus clear that reconstructions may be accurate only when the absorption and scattering coefficients are sufficiently smooth. With this additional assumption, we obtain stability estimates in this setting.
What the optimal choice should be to quantify measurement and probing errors in an inverse problem is no doubt very subjective. Our choice of the 1−Wasserstein distance in inverse transport theory is based on two observations: (i) it is sufficiently versatile to provide small penalties on a large class of classical errors such as small blurring or misalignment, both at the detector and source levels; and (ii) its structure is quite amenable to relatively simple mathematical analysis and generalizations of results obtained in the setting of L 1 errors.
The inverse transport setting and the main results of the paper are summarized in section 2. The errors in the probing sources require that we analyze the propagation of Lipschitz constants by a forward transport problem. Such a forward analysis is carried out in section 3. The stability estimates are based on careful analyses of multiple scattering contributions to the albedo operator, which are presented in section 4. The detailed proofs of the results are given in sections 5 and 6.
The methodology followed here draws on the large body of works on stationary inverse transport theory as developed in, e.g., [13,14,33,3,28,24]. For concreteness, we concentrate here on the stationary inverse problem with full angular measurements. Our results, which primarily follow from the detailed analysis of the transport Green's kernel as developed in, e.g., [5,7], (most likely) adapt to other settings, such as the settings of time-dependent and/or angularly averaged measurements; see [6,4,5,8,9,18,19,29] for additional references as well as [2] for a recent review.

Inverse transport theory and main results
Let us first summarize our main assumptions on the coefficients and the geometry that allows us to define the albedo operator A.
We define the times of escape of free-moving particles from X as . On the boundary sets Γ ± , we introduce the measure dξ(x, v) = |v · ν(x)|dµ(x)dv, where dµ(x) is the surface measure on ∂X.
We say that the optical parameters (σ, k) are admissible when We say that the problem is sub-critical when either one of the following assumptions holds We now define the following Banach space with its natural norm. Under either (HSC1) or (HSC2), we obtain for admissible parameters that the albedo operator A is well posed from We are now in a position to consider the inverse transport problem, which aims to describe what may or may not be reconstructed in (σ, k) and with which type of stability. Before we describe our measurement setting, we need to decompose the albedo operator into components with different singular structures.
We first introduce the lifting operator It is proved in [14] that J is a bounded operator from L 1 (Γ − , dξ) to W . Let us next define the bounded operator for (x, v) ∈ X × V . Note that (1.1) may then be recast as (I − K)u = Jg.
With this notation, we decompose the albedo operator as Physically, the above decomposition separates the albedo operator into the ballistic component Bg, the single scattering component Sg, and the rest. In dimension n ≥ 3, the former two components are more singular than Mg when g approximates a point source. We also introduce S c = S + M the scattering component so that A = B + S c .
We are now ready to present our measurement setting. Consider two sets of coefficients (σ j , k j ) for j = 1, 2 and define A j as the corresponding albedo operator. Our objective is to analyze how errors on measurements of A j translate into errors on the reconstruction of (functionals of) (σ j , k j ).
Let g be a given (exact) source in L 1 (Γ − , dξ) normalized to g L 1 (Γ − ,dξ) = 1. We consider two approximations g j of g, which are used to probe the domain X. The resulting solutions on Γ + are thus given by A j g j . Let φ be a (test) function in L ∞ (Γ + ) such that φ L ∞ (Γ + ) ≤ 1. Then we may define the measurement ("projected" onto that test function) where ε j is a measurement error for such a projection. We assume the measurement to be given experimentally, i.e., independent of the configuration j. Taking the difference of the above relations for j = 1 and j = 2 yields (2.8) Appropriate (properly normalized) choices for g ∈ L 1 (Γ − ) and φ ∈ L ∞ (Γ + ) provide information about the coefficients (σ j , k j ). Note that the left-hand side is bounded by the error of the albedo operator A 1 − A 2 in the L 1 sense and is the starting point of all current stability estimates in inverse transport theory; see, e.g., [2]. Our objective is to generalize such estimates to the setting where g j is an approximation of g in the Wasserstein sense, and where errors between A 1 and A 2 are measured in the same sense. This implies that the test function φ can no longer be considered as a general bounded function but instead is a function bounded by 1 and with Lipschitz constant bounded by κ.
The error estimates proceed as follows. We first extract information from the ballistic part of the albedo operator: and next from the single scattering component, making sure that the ballistic contribution vanishes B j g, φ = 0, To explicit the information we obtain from the above decomposition, we need to introduce the following notation. For x and y inX, we define Let us also define for (x, v, w) ∈ X × S n−1 × S n−1 : σ(x + sw, w)ds , (2.10) the total attenuation factor on the broken line (x−τ − (x, v)v, x, x+τ + (x, w)w). Let us introduce the error term ε = |ε 1 | + |ε 2 |.
We are now ready to state our main stability results. The first one assumes that g j = g, i.e., that the probing source is exact. Since measurements errors are given in a Wasserstein sense, we assume the additional W 1,∞ regularity constraints on the coefficients and assume that L = max(Lip(σ), Lip(k)) is bounded and k vanishes in the vicinity of the boundary ∂X (see (3.7) and (3.15) for the precise assumption). To avoid uninteresting singularities arising when κ → 0, which corresponds to infinitely blurring detectors, we assume that the fixed constant κ ≥ 1.
In this setting, we have Then for a constant C that depends on k j W 1,∞ , σ j W 1,∞ and subcritical conditions, we obtain that Here, a ∨ b = max(a, b). Note the similarity with the theorems obtained in the L 1 setting; see [2]. The main difference is that Lipschitz stability estimates are typically replaced by less favorable Hölder estimates. This is expected since our setting is precisely aimed to incorporate a larger class of model and measurement errors. Note also that the errors improve as κ increases, i.e., as our confidence in our detectors to discriminate nearby phase-space points increases. If κ scales appropriately with ε, then Lipschitz stability is even restored. For instance, in (2.11) on the stability of line integrals of σ(x, v), we observe that a Lipschitz estimate (an error proportional to ε) is obtained for κ on the order of ε − 1 n−1 . Note that according to these estimates, there is no gain in increasing the resolution capabilities of the detector, i.e., increasing κ beyond this order.
Let us now consider the more constraining setting of approximate probing sources. As we indicated above, replacing g by an approximation g j in the Wasserstein sense includes replacing the "real" source by an approximation supported on a (discrete) grid. As a consequence, we do not expect to reconstruct parameters that oscillate rapidly at the grid level. We thus need to impose additional W 1,∞ regularity constraints on the coefficients and assume that L = max(Lip(σ), Lip(k)) is bounded (see (3.7), (3.15) for the precise assumptions) and we need to impose an additional constraint on the boundary ∂X (see (3.25)). Let κ ≥ 1 and let us introduce the maximal error that is made in the experimental setting to approximate the probe g. Let also δ = δ 1 + δ 2 . Then we have the following result: Under the aforementioned assumptions, we obtain using the above notation that Similarly, we obtain that in dimension n = 3 In the above theorem, the constant C in front of δ depends on the subcriticality assumption; see the proofs below.
It is now classical to deduce stability results on the coefficients (σ, k) from the stability results on the above functionals E j andẼ j k j of (σ, k) [2]. Let us consider for concreteness the setting developed in [28], which provides a simple setting to present stability estimates.
Inspection of (2.11) or (2.15) shows that the integral of σ(x, v) between any two points at the domain's boundary is stably reconstructed (and exactly when ε = δ = 0). In other words, we have access to the Radon transform of σ. When σ = σ(x, v) and k = 0, it is clearly not possible to uniquely reconstruct σ from its line integrals. The obstruction to unique reconstruction was analyzed in [28].
Note that u and φu then solve the same transport equation and satisfy the same boundary conditions. Let < σ, k > be the class of equivalence. What [28] show is that the class of equivalence is uniquely and stably characterized by the left-hand side in (2.11)-(2.13). Let e b (ε, δ) be the right-hand side in either (2.11) or (2.15) and e s (ε, δ) be the right-hand side in either (2.12) The remarks stated after Theorem 2.1 still hold for the estimates provided in Theorems 2.2 and 2.3. Let us conclude this section by a few remarks.
The estimates in Theorem 2.3 should be interpreted with some care. The stability estimates (2.11) and (2.15) provide an estimate on the reconstruction of the line integral, i.e., the X-ray transform, of σ, not of σ itself. When σ = σ(x) is independent of the variable v, such an information is sufficient to uniquely determine σ (by inverse Radon transform [26]); but not in the L ∞ sense. The error in (2.18) holds for (one element in) a class of equivalence, not for every element of that class of equivalence. For instance, when σ = σ(x), we may construct an elementσ(x, v) in the same class of equivalence by assigning toσ(x, v) the line integral of σ(x) of direction v passing through x divided by the length of that segment in X (so that the line integrals of σ(x) andσ(x, v) agree). Thenσ(x, v) is clearly reconstructed with stability in the L ∞ sense, but not σ(x) itself. The reconstruction of σ then requires inverting a Radon transform, which is a classical problem that is not stable in the L ∞ sense. The same remark holds for the reconstruction of the scattering coefficient k( The estimates in the preceding three theorems are written for functionals of the coefficients (σ, k) and, in the case of Theorem 2.3, for classes of equivalence. How such estimates can be converted to estimates on the reconstruction of (σ = σ(x), k = k(x, v ′ , v)) has already been addressed in the literature. We refer the reader to, e.g., [2,28,33] for such results.
As we indicated in the introduction, stability estimates for measurement errors in the L(L 1 (Γ − , dξ), L 1 (Γ + , dξ)) sense have been obtained in several works, for instance [2,3,28,33]. Our work [3] introduces estimates on the geometry of multiple scattering contributions to the albedo operator that are more precise than is necessary for stability estimates in the L(L 1 (Γ − , dξ), L 1 (Γ + , dξ)) sense. Such more precise estimates were used in [4] as a first attempt to consider more general errors than those that can be analyzed in the L(L 1 (Γ − , dξ), L 1 (Γ + , dξ)) setting. In the current paper, these estimates on multiple scattering contributions constitute the central tool in the derivation of the results in the above theorems. The rest of the paper is essentially devoted to the detailed, albeit lengthy, derivation of these estimates.

Forward Theory
In this section, we introduce the notation on forward transport equations that we follow throughout the paper and recall the relevant results on the forward transport theory in the L 1 setting. We next present the results we need in the dual setting of bounded functions. Finally, we consider the setting of transport solutions that are Lipschitz continuous. Such an analysis is necessary as we aim to model measurement errors in the Wasserstein distance.
3.1 Recall on the L 1 framework.
Using the notation introduced in the preceding section, we make the following assumption: The bounded domain X is assumed to be of class C 1 . We introduce the Banach space W which is the completion of C 1 (X × V ) with the norm The following trace properties hold.
Lemma 3.1 (see [14]). The following operators J : L 1 (Γ − , dξ) → W and j ± : W → L 1 (Γ + , dξ) are well defined and bounded for a.e. (x, v) ∈ X × V and for u ∈ L 1 (Γ − , dξ), and The linear Boltzmann equation is then recast as the following integral equation where K is a bounded operator in L 1 (X × V, τ −1 dxdv) and where T and P are bounded operators from L 1 (X × V ) to W and from , respectively and are given by We also have j − T = 0. We recall the following well known solvability condition for the transport equation.
Lemma 3.2 (see for instance [3]). Assume that Then (I −K) −1 J is a bounded operator from L 1 (Γ − , dξ) to W , and the albedo operator A = j + (I − K) −1 J is well defined and bounded.
We will use the following duality property The operator P * is the dual operator for P and is bounded from and we obtain The operator K * = P * T * is a bounded operator in τ −1 L ∞ (X × V ).

Consequences on the backward equation.
The backward equation is given formally by The first-order PDE is understood in the weak sense when acting on functions (3.10) Then, the following trace properties hold (see also for instance [16]): The operatorsJ andj are well defined bounded operators, wherẽ for u ∈ L ∞ (Γ + ) and for a.e. (x, v) ∈ Γ + , and Then the boundary value problem (3.8) is equivalent to the following integral equation By duality, condition (3.7) is equivalent to the invertibility of the bounded  .7), the bounded operator I −K in L ∞ (X × V ) is invertible and its inverse is given by As an application of Green's formula, we have (3.14) 3.3 Lipschitz behavior of the backward transport solutions when the optical parameters are Lipschitz.
To make the mathematical analysis reasonably straightforward, we now assume that: X is convex, T he f unctions σ and k are Lipschitz f unctions onX andX × V × V respectively, (3.15) and k vanishes in a neigborhood of ∂X × V × V.
Let us denote by supp X k the support of the function k in the x variable, i.e. supp X k := {y ∈ X | k(y, v, v ′ ) = 0 for some (v, v ′ ) ∈ V 2 }. Let us introduce a smooth compactly supported function ρ ∈ C ∞ (R n ) so that (3.16) We will use the following property. We denote by C α (X × V ) the space of Hölder continuous functions of order α, 0 < α < 1, (uniformly) on X × V .
Proof of Lemma 3.5. We still denote by ρ the multiplication operator by the function ρ: Let us extend k and σ by 0 outside X × V 2 and X × V , respectively, and still denote by k and σ these extensions. ThenK = P * T * can be written as (3.19) Under condition (3.15), Ψ is a compactly supported Lipschitz function on R n × V 2 × R andK is a bounded operator in Lip(X × V ).
We recall the explicit expression of P * K = P * T * P * : Then, we make the change of variables y = x + rv, dy = r n−1 drdv and obtain that for (x, y, θ, w, v ′ ) ∈ R n × R n × V 3 . Under condition (3.15), Φ is a compactly supported Lipschitz function on R n × R n × V 3 . Then, P * T * P * is a bounded operator from L ∞ (X × V ) to C α (X × V ) (see, for example, [17,Theorem 5.25]). In addition, we have P * T * P * is a bounded operator from C α (X×V ) to Lip(X × V ) (see, for example, [25, Chapter 9, Section 7] on weakly singular integrals after approximating Φ by a sequence of C 1 smooth functions Φ n that are compactly supported in R n × R n × V 3 with the following properties : lim n→+∞ Φ − Φ n L ∞ (R n ×R n ×V 3 ) = 0 and there exists a constant C that does not depend on Φ so that sup n∈N Lip(Φ n ) ≤ C(LipΦ + Φ L ∞ (R n ×R n ×V 3 ) )). From (3.21) it follows that P * K = ρP * K and that where Θ(x, t, w) = e − t 0 σ(x+sw,w)ds ρ(x + tw).

(3.24)
Under condition (3.15), Θ is a Lipschitz function on X × R × V . Therefore from (3.23) and the properties of P * K given above, we obtain thatK 2 is a bounded operator from Proof of Corollary 3.6. The boundedness of I −K in Lip(X ×V ) follows from the first statement of Lemma 3.5. It remains to prove that I −K is invertible as a bounded operator in Lip(X ×V ). Under condition (3.7), we already know that I −K is invertible as a bounded operator in L ∞ (X × V ). Hence I −K is still one-to-one when we restrict the operator to the subspace Lip(X × V ) of L ∞ (X ×V ). Therefore we just need to check that I −K is onto as a bounded operator in Lip(X × V ). Let g ∈ Lip(X × V ). Then by (3.7) and Lemma 3.3 there exists f ∈ L ∞ (X × V ) so that f −Kf = g. By Lemma 3.5, it follows thatK 4 f ∈ Lip(X ×V ) andf = g+Kg+K 2 g+K 3 g+K 4 f ∈ Lip(X ×V ).
From now on we will always assume in this Section that X is a strictly convex (in the strong sense) bounded domain of class C 2 . Proof of Lemma 3.7. From (3.11), it follows that for u ∈ Lip(X × V ) we havej(u)(x, v) = u(x, v) for (x, v) ∈ Γ + where u is extended by continuity toX × V . Therefore, we obtain that Lip(j(u)) ≤ Lip(u).
Proof. Under (3.25) let χ ∈ C 2 (R n ) be a defining function for X, i.e. χ(x) = 0 and ∇χ(x) = 0 when x ∈ ∂X, χ(x) < 0 when x ∈ X, χ(x) > 0 when x ∈ R n \X, and the Hessian matrix Hessχ(x) is positive at any point x ∈ ∂X. (3.26) Then, we prove that τ ± are C 1 -functions on X × V , resp. Γ ∓ , by applying the implicit function theorem on Then τ ± ∈ C 1 (Γ ∓ ) follows from (3.26) and the Taylor expansion In addition, we have and ∂ x i τ + (x, v) is not even bounded on X × V , and τ + is not a Lipschitz function onX × V . In particular, for φ the constant valued function 1 on Γ + when σ ≡ 0 on X × V and k ≡ 1 on X × V 2 (which breaks assumption (3.15)). Therefore when k does not vanish at the boundary thenK does not necessarily map Lip(X × V ) to Lip(X × V ).
Corollary 3.11 is used for the proof of Theorem 2.2. It is established under the assumption that k vanishes on the boundary of the domain (see (3.15)). Then (under the additional assumption (3.7)) solutions of the backward transport equation with Lipschitz boundary condition at Γ + remain Lipschitz inside X × V and one can take their traces on Γ − . It also allows us to easily studyK in the Banach space Lip(X × V). Whether this assumption can be weakened to allow for scattering coefficients that do not vanish on the boundary is left open.

Estimates on multiple scattering kernels
This central section presents the main technical results on the support of multiple scattering contributions in the albedo operator. The detailed proofs of the results,which revisit and generalize contributions in our earlier works, are postponed to section 6.
Throughout the section, we assume that X is a bounded convex domain of class C 1 . Under the general assumption (3.1), let us denote by α m the distributional kernel of the operator j + K m J (a map from L 1 (Γ − , dξ) to L 1 (Γ + , dξ)) and γ m the distributional kernel of the operator j + K m (a map from L 1 (X ×V ) to L 1 (Γ + , dξ)), m ∈ N. This section provides explicit expressions for these distributional kernels (Lemma 4.1) as well as estimates when the scattering coefficient k is bounded (Lemma 4.3). We next apply these estimates to control multiple scattering contributions to the albedo operator (Propositions 4.4 and 4.5).
By induction on m ≥ 2, we define the measurable function E on X m by the formula for a.e. (x 1 , . . . , x m ) ∈ X m , where the function E is initially defined as a measurable function on X 2 by formula (2.9). We use the same notation for the measurable function E defined on ∂X × X m−2 × ∂X or on ∂X × X m−1 .
Lemma 4.1. Under condition (3.1), the distributional kernel of j + K m is the measurable function γ m given on Γ + × X × V by the formula

3)
and for m ≥ 2 the distributional kernel of j + K m J is the measurable function α m on Γ + × Γ − given by The distributional kernel of K m is the measurable function β m also given by the right-hand side of (4.2) for a.e.
The proof of Lemma 4.1 is given in Section 6. In the sequel, we will use the following well known results: for some constants C, C ′ , C m , 2 ≤ m ≤ n, where the constants C and C n depend only on the diameter of the bounded domain X.
The proof of Proposition 4.5 is also given in Section 6.

Proof of Theorems 2.1, 2.2 and 2.3
Using the estimates presented in the preceding section, we are now ready to prove the main results of the paper. We first recall the following Lemma.

Proof of Theorem 2.2
We assume that assumptions (3.7), (3.15) and (3.25) are satisfied and start with the identity where C is the maximum between the norms of A 1,back and A 2,back as bounded operators from Lip(Γ + ) to Lip(Γ − ). Then the proof follows the same lines as the proof of Theorem 2.1.

Proof of Theorem 2.3
Let us denote M = max( τ σ 1 ∞ , τ σ 2 ∞ , τ σ 1,p ∞ ). Our main task is to select the right representative of (σ 1 , k 1 ). Set Then let us choose the following representative of (σ 1 , k 1 ): for (x, v) ∈ X × V . Hence using Theorem 2.1 or 2.2 we have for (x, v) ∈ X × V . Hence we obtain the first estimate of the Theorem.
After straightforward computations, we havẽ Hence for (x, v ′ ) ∈ X × V and some constant C. The second estimate is proved. Proof of Lemma 4.1. For m ≥ 2, let β m denote the distributional kernel of the operator K m where K is defined by (2.5). We first give the explicit expression of β 2 and β 3 . We then obtain the explicit expression of the kernel β m by induction and derive the explicit expression for γ m . Let ψ ∈ L 1 (X × V, τ −1 dxdv). From (2.5) it follows that for a.e. (x, v) ∈ X × V . Performing the change of variables x ′ = x − tv − t 1 v 1 (dx ′ = t n−1 1 dt 1 dv 1 ) on the right hand side of (6.1), we obtain for a.e. (x, v) ∈ X × V . Hence (6.3) and (2.5), it follows that for a.e. (x, v) ∈ X × V . Therefore performing the change of variables z = x − tv − t 1 v 1 (dz = t n−1 1 dt 1 dv 1 ) on the right hand side of (6.4), we obtain Then by induction we have This is the right hand side of (4.2). And we obtain (4.2) for γ m , m ≥ 3, the distributional kernel of j + K m . Now let ψ ∈ L 1 (Γ − , dξ). Then for m ≥ 3 we have (4.5). For m = 2, we have for a.e. (z 0 , v 0 , z n+2 , v n+2 ) ∈ Γ + × X × V .
Assume again that n ≥ 3 and set m = n + 1 in (6.13). From (4.7) and (6.12) it follows that there exist positive constants C and C ′ so that Estimate (6.15) also holds when n = 2 (see (4.7) and (6.12)). Then there exists a positive constant C 0 that depends only on diam(X) such that We apply the above inequality for "(w, θ) = (z 0 − z n+2 − tv 0 , v n+1 )" on (6.15) and obtain We will use the following Lemma, consisting of basic estimates, in the proof of Propositions 4.4 and 4.5. forη ∈ (0, +∞) and for 2 ≤ m ≤ n − 1 and some constant C m .
for a.e. (z ′′ , v ′′ ) ∈ Γ − and for some positive constant C, C ′ , C ′′ and C ′′′ that depend on the diameter of X. Combining (6.33) for m = n and this latter estimate we obtain (4.18), which concludes the proof of the proposition.