The Bayesian Formulation of EIT: Analysis and Algorithms

We provide a rigorous Bayesian formulation of the EIT problem in an infinite dimensional setting, leading to well-posedness in the Hellinger metric with respect to the data. We focus particularly on the reconstruction of binary fields where the interface between different media is the primary unknown. We consider three different prior models - log-Gaussian, star-shaped and level set. Numerical simulations based on the implementation of MCMC are performed, illustrating the advantages and disadvantages of each type of prior in the reconstruction, in the case where the true conductivity is a binary field, and exhibiting the properties of the resulting posterior distribution.

1. Introduction 1.1. Background. Electrical Impedance Tomography (EIT) is an imaging technique in which the conductivity of a body is inferred from electrode measurements on its surface. Examples include medical imaging, where the methodology is used to non-invasively detect abnormal tissue within a patient, and subsurface imaging where material properties of the subsurface are determined from surface (or occasional interior) measurements of the electrical response; the methodology is often referred to as electrical resistance tomography -ERT -in this context and discussed in more detail below. The concept of EIT appears as early as the late 1970's [15] and ERT the 1930's [28].
A very influential mathematical formulation of the inverse problem associated with EIT dates back to 1980, due to Calderón. He formulated an abstract version of the problem, in which the objective is recovery of the coefficient of a divergence form elliptic PDE from knowledge of its Neumann-to-Dirichlet or Dirichletto-Neumann operator. Specifically, in the Dirichlet-to-Neumann case, if D ⊆ R d and g ∈ H 1/2 (∂D) is given, let u ∈ H 1 (D) solve −∇ · (σ∇u) = 0 in D u = g on ∂D.
The problem of interest is then to ask does the mapping Λ σ : H 1/2 (∂D) → H −1/2 (∂D) given by g → σ ∂u ∂ν determine the coefficient σ in D? Physically, g corresponds to boundary voltage measurements, and Λ σ (g) corresponds to the current density on the boundary. Much study has been carried out on this problem -some significant results, in the case where all conductivities are in C 2 (D) and d ≥ 3, concern uniqueness [42], reconstruction [35], stability [2] and partial data [22]. Details of these results are summarised in [38].
In 1996, Nachman proved global uniqueness and provided a reconstruction procedure for the case d = 2, involving the use of a scattering transform and solving a D-bar problem [36]. The D-bar equation involved is a differential equation of the form ∂q = f , where ∂ denotes the conjugate of the complex derivative and f depends on the scattering transform. A regularised D-bar approach, involving the truncation of the scattering transform, was provided in [23,24], enabling the recovery of features of discontinuous permeabilities. The regularised D-bar approach is also used in [25], for the case when the data is not of infinite precision. Other work in the area includes joint inference of the shape of the domain and conductivity [26].
For applications, a more physically appropriate model for EIT was provided in 1992 in [40]. This model, referred to as the complete electrode model (CEM), replaces complete boundary potential measurements with measurements of constant potential along electrodes on the boundary, subject to contact impedances. The authors show that predictions from this model agree with experimental measurements up to the measurement precision of 0.1%. For this model they also prove existence and uniqueness of the associated electric potential. It is this model that we shall consider in this paper, and it is outlined in section 2.
When using the CEM, there is a limitation on the number of measurements that can be taken to provide additional information. This is due to the fact that there are only a finite number of electrodes through which current can be injected and voltages read, combined with the linear relationship between current and voltage. The data is therefore finite dimensional in the inverse problem, as distinct from the Calderón problem where knowledge of an infinite dimensional operator is assumed. As a consequence, reconstruction using the CEM often makes use of Tikhonov regularisation; such a regularisation can also be used to account for noise on the data. The paper [13] analyses numerical convergence when an H 1 or TV penalty term is used, with a finite element discretisation of the problem. We will adopt a Bayesian approach to regularisation, and this is discussed below.
A closely related problem to EIT is Electrical Resistivity Tomography (ERT), which concerns subsurface inference from surface potential measurements, see for example [28] which discussed the problem as early as 1933. Physically the main difference between EIT and ERT is that alternating current is typically used for the former, and direct current for the latter. Additionally, due to the scale of ERT, it is a reasonable approximation to model the electrodes as points, rather than using the CEM. Another difference between the two is that the relative contrast between the conductivities of different media are typically higher in subsurface applications than medical applications, which permits the approximation of the problem by a network of resistors in some cases [37]. Nonetheless, the Bayesian theory and MCMC methodology introduced here will be useful for the ERT problem as well as the EIT problem.
Statistical, in particular Bayesian, approaches to EIT inversion have previously been studied, for example in [18,19,31]. In [18], the authors prove certain regularity of the forward map associated with the CEM, formulate the Bayesian inverse problem in terms of the discretised model, and investigate the effect of different priors on reconstruction and behaviour of the posterior. The paper [31] focuses on Whittle-Matèrn priors, using EIT and ERT as examples for numerical simulation. The paper [19] presents a regularised version of the inverse problem, which admits a Bayesian interpretation.
The Bayesian approach to inverse problems, especially in infinite dimensions, is a relatively new technique. Two approaches are typically taken: discretise first and use finite dimensional Bayesian techniques, or apply the Bayesian methodology directly on function space before discretising. The former approach is outlined in [20]. The latter approach for linear problems has been studied in [12,32,33,34]. More recently, this approach has been applied to nonlinear problems [10,29,30,41]. It is this approach that we will be taking in this paper.
1.2. Our Contribution. The main contributions in our paper are as follows: (i) This is the first paper to give a rigorous Bayesian formulation of EIT in infinite dimensions. (ii) This setting leads to proof that the posterior measure is Lipschitz in the data, with respect to the Hellinger metric, for all three priors studied; further stability properties of the posterior with respect to perturbations, such as numerical approximation of the forward model, may be proved similarly. (iii) We employ a variety of prior models, based on the assumption that the underlying conductivity we wish to recover is binary. We initially look at log-Gaussian priors, before focusing on priors which enforce the binary field property. These binary field priors include both single star-shaped inclusions parametrised by their centre and by a radial function [8], and arbitrary geometric interfaces between the two conductivity values defined via level set functions [17]. (iv) Numerical results using state of the art MCMC demonstrate the importance of the prior choice for accurate reconstruction in the severely underdetermined inverse problems arising in EIT.

1.3.
Organisation of the Paper. In section 2 we describe the forward map associated with the EIT problem, and prove relevant regularity properties. In section 3 we formulate the inverse problem rigorously and describe our three prior models. We then prove existence and well-posedness of the posterior distribution for each of these choices of prior. In section 4 we present results of numerical MCMC simulations to investigate the effect of the choice of prior on the recovery of certain binary conductivity fields. We conclude in section 5.

The Forward Model
In subsection 2.1 we describe the complete electrode model for EIT as given in [40]. In subsection 2.2 we give the weak formulation of this model, stating assumptions required for the quoted existence and uniqueness result. Then in subsection 2.3 we define the forward map in terms of this model, and prove that this map is continuous with respect to both uniform convergence and convergence in measure.
be a bounded open set with smooth boundary representing a body, with conductivity σ : D → R. A number L of electrodes are attached to the surface of the body. We treat these as subsets (e l ) L l=1 of the boundary ∂D, and assume that they have contact impedances (z l ) L l=1 ∈ R L . A current stimulation pattern (I l ) L l=1 ∈ R L is applied to the electrodes. Then the electric potential v within the body and boundary voltages (V l ) L l=1 ∈ R L on (e l ) L l=1 are modelled by the following partial differential equation.
This model was first proposed in [40]; a derivation can be found therein. Note that the inputs to this forward model are the conductivity σ, input current (I l ) L l=1 and contact impedances (z l ) L l=1 . The solution comprises the function v : D → R and the vector V ∈ R L of voltages. Also note that solutions to this equation are only defined up to addition of a constant: if (v, V ) solves the equation, then so does (v + c, V + c) for any c ∈ R. This is because it is necessary to choose a reference ground voltage. D e l Figure 1. An example domain D with electrodes (e l ) L l=1 attached to its boundary.

Weak Formulation.
We first define the space in which the solution to equation (1) will live. Using the notation of [40] we set Since solutions are only defined up to addition of a constant, we define the quotient space (Ḣ, · Ḣ ) byḢ = H/R, We will often use the notation v = (v, V ) for brevity. It is more convenient to equiṗ H with an equivalent norm, as stated in the following lemma from [40]: Then · * and · Ḣ are equivalent.
We can now state the weak formulation of the problem as derived in [40]. For this let w = (w, W ).
Then if v ∈Ḣ is a strong solution to the problem (1), it satisfies We will use the weak formulation (2) to define our forward map for the complete electrode model (1). In order to guarantee a solution to this problem, we make the following assumptions.
Under these assumptions, existence of a unique solution to (2) is proved in [40] and stated here for convenience: Remark 2.5. Assumptions 2.3 (i) and (ii) are to ensure coercivity and boundedness of B(·, ·; σ). Assumption 2.3 (iii) is necessary for continuity of r(·), and physically may be thought of as a conservation of charge condition. Choosing a solution from the equivalence class corresponds to choosing a reference ground voltage.

Continuity of the Forward Map.
In what follows we will restrict to the set of admissible conductivities, which is defined as follows.
Definition 2.6. A conductivity field σ : D → R is said to be admissible if The set of all such conductivities will be denoted A(D).
Note that any σ ∈ A(D) will satisfy Assumptions 2.3(i). Assume that the current stimulation pattern (I l ) L l=1 ∈ R L and contact impedances (z l ) L l=1 ∈ R L are known and satisfy Assumptions 2.3(ii)-(iii). Then we may define the solution map M : A(D) → H to be the unique solution to (2) satisfying (3). The above existence and uniqueness result tells us that this map is well-defined.
In [18] it is shown that M : A(D) → H is Fréchet differentiable when we equip A(D) with the supremum norm. Though this is a strong result, this choice of norm is not appropriate for all of the conductivities that we will be considering. We hence establish the following continuity result.
Proposition 2.7. Fix a current stimulation pattern (I l ) L l=1 ∈ R L and contact impedances (z l ) L l=1 ∈ R L satisfying Assumptions 2.3. Define the solution map M : A(D) → H as above. Let σ ∈ A(D) and let (σ ε ) ε>0 ⊆ A(D) be such that either (i) σ ε converges to σ uniformly; or (ii) σ ε converges to σ in (Lebesgue) measure, and there exist σ − , σ + ∈ (0, ∞) such that for all ε > 0 and x ∈ D, σ − ≤ σ ε (x) ≤ σ + . It follows that In both cases (i) and (ii), we have that (σ ε ) ε>0 is bounded uniformly below by a positive constant. Hence for small enough ε, the left hand side above may be bounded below by C v ε − v 2 * . We then have by Cauchy-Schwarz If σ ε → σ uniformly, we deduce from (5) that v ε − v * → 0 and the result follows. If |σ ε − σ| → 0 in measure, then since |D| < ∞, it follows that the integrand in (4) tends to zero in measure, see for example Corollary 2.2.6 in [6]. Since σ ε is assumed to be uniformly bounded, the integrand is dominated by a scalar multiple of the integrable function |∇v| 2 . We claim that this implies that the integrand tends to zero in L 1 . Suppose not, and denote the integrand f ε . Then there exists δ > 0 and a subsequence (f εi ) i≥1 such that f εi L 1 ≥ δ for all i. This subsequence still converges to zero in measure, and so admits a further subsequence that converges to zero almost surely. An application of the dominated convergence theorem leads to a contradiction, hence we deduce that f ε tends to zero in L 1 and the result follows.
Denote the projection Π : The following lemma shows that the above result still holds if we replace M by Π • M.
Proof. We show that there exists C > 0 such that for all ( By the equivalence of · * and · Ḣ , Lemma 2.1, we have The infimum on the right-hand side is attained at Then by Proposition 2.7, we have We are interested in the inverse problem of determining the conductivity field from measurements of the voltages (V l ) L l=1 on the boundary, for a variety of input currents (I l ) L l=1 on the boundary. To this end we introduce the following version of Ohm's law. Observe that the mapping I → v , taking the current stimulation pattern to the solution of (2), is linear. Then given a conductivity field σ ∈ A(D), there exists a resistivity matrix R(σ) ∈ R L×L such that the boundary voltage measurements V (σ) arising from the solution of the forward model are related to I via V (σ) = R(σ)I By applying several different current stimulation patterns we should be able to infer more about the conductivity σ. Note however that since the mapping I → V is linear, only linearly independent stimulation patterns will provide more information 1 . Since we have the conservation of charge condition on I, there are at most L − 1 linearly independent patterns we can use.
Assume that J linearly independent current patterns I (j) ∈ R L , j = 1, . . . , J, J ≤ L − 1 are applied, and noisy measurements of V (j) = R(σ)I (j) are made: . Concatenating these observations, we write The inverse problem is then to recover the conductivity field σ from the data y. This problem is highly ill-posed: the data is finite dimensional, yet we wish to recover a function which, typically, lies in an infinite dimensional space. We take a Bayesian approach by placing a prior distribution on σ. The choice of prior may have significant effect on the resulting posterior distribution, and different choices of prior may be more appropriate depending upon the prior knowledge of the particular experimental set-up under consideration. In subsection 3.1 we outline three different families of prior models, and show the appropriate regularity of the forward maps arising from them. In subsection 3.2 we describe the likelihood and posterior distribution formally, before rigorously proving that the posterior distribution exists and is Lipschitz with respect to the data in the Hellinger metric.

Choices of Prior.
In this section we consider three priors, labelled by i = 1, 2, 3, defined by functions F i : X i → A(D) which map draws from prior measures on the Banach spaces X i to the space of conductivities A(D). Our prior conductivity distributions will then be the pushfoward of the prior measures by these maps F i . We describe these maps, and establish continuity properties of them needed for the study of the posterior later. In what follows, the space C 0 (D) of continuous functions on D will be equipped with the supremum norm · ∞ and the corresponding Borel σ-algebra.
3.1.1. Log-Gaussian prior. We first consider the simple case that the coefficient is given by the exponential of a continuous function. Let In this case, we will take our prior measure µ 0 on u to be a non-degenerate Gaussian measure N (m 0 , C 0 ) on C 0 (D). Note that the push forward of a Gaussian measure by F 1 is a log-Gaussian measure.
The covariance C is chosen such that the samples u almost surely have regularity u ∈ C s ,s− s (D) for all s < t, where from left to right t = 2, 1.5, 1, 0.5 respectively. Here the samples are generated on [−1, 1] 2 ⊇ D and then restricted to D, for computational simplicity.
3.1.2. Star-shaped prior. We now consider star-shaped inclusions, that is, inclusions parametrised by their centre and a radial function. These were studied in twodimensions in the paper [8] to parametrise domains for a Bayesian inverse shape scattering problem. In [8] the authors prove well-posedness of the inverse problem in an infinite dimensional setting through the use of shape derivatives and Riesz-Fredholm theory. Let be the continuous function representing the mapping from Cartesian to angular polar coordinates. Define the mapping A : is the space of continuous periodic functions on R d−1 . Then A(r, x 0 ) describes the set of points in D which lie within the closed surface parametrised in polar coordinates centred at x 0 by In two dimensions, we have R 1 = (−π, π] and the mapping h : R 2 → R 1 is given by where atan2 is the two-parameter inverse tangent function.
In three dimensions, we have R 2 = (−π, π] × [0, π] and the mapping h : R 3 → R 2 is given by Similar expressions for h exist in higher dimensions, though for applications we are only interested in the case d = 2, 3.
Define now the map where u + , u − > 0 are the scalar conductivity values. Again it can easily be seen that F 2 does indeed map into A(D). We claim that this map is continuous in the following sense: In order to show that a sequence of functions f ε : D → R converges to f : D → R in measure, it suffices to show that there exists a sequence of sets It follows that we have the inclusions Let ∆ denote the symmetric difference. We deduce that A(r ε )∆A(r) ⊆ A(r + γ(ε)) \ A(r − γ(ε)). 3 A sequence of functions (fε) ε>0 , fε : D → R, is said to converge in measure to a function

Now the right-hand side is given by
As ε → 0, this set decreases to the boundary set Since the graph of a continuous function has Lebesgue measure zero, we deduce that |∂A(r)| = 0. It follows that lim ε→0 |A(r ε )∆A(r)| = 0.
To conclude, note that By the distributivity of intersection over symmetric difference, we then have that ∆A(x 0 ) * . Therefore, using Theorem 1 from [39], we see that Since we assume that r is Lipschitz, the surface area H d−1 (∂A(x 0 ) * ) of the boundary of A(x 0 ) * is finite, and so it follows that lim ε→0 |A(x ε 0 )∆A(x 0 )| = 0. As before, we conclude by noting that x0)] . Now note that |A(r ε , y 0 )∆A(r, y 0 )| ≤ |A(r ε , y 0 ) * ∆A(r, y 0 ) * |.
The right hand-side is independent of y 0 by translation invariance of the Lebesgue measure. By the same argument as part (i) we conclude that it tends to zero. We then have that which tends to zero by the discussion above and part (ii).

Remark 3.3.
Above we assumed that r : R d−1 → R was Lipschitz continuous. This assumption is only used in the proof of part (ii) of the proposition. If the centre of the star-shaped region is known, this assumption may then be dropped to allow for rougher boundaries.

Level set prior.
We finally consider the case where the inclusions can be described by a single level set function, as in [17]. Let n ∈ N and fix constants . . , n − 1. Now given strictly positive functions f 1 , . . . , f n ∈ C 0 (D), we define the map F 3 : Since each f is continuous and strictly positive on a compact set D, they are uniformly bounded above and below by positive constants, and so F 3 does indeed map into A(D).
In this paper we are primarily concerned with the case of binary fields, n = 2 and f i constant above, however the theory in proved in the general case. We have the following result regarding continuity of this map, by the same arguments as in [17].
Proof. Denote by D i,ε and D 0 i,ε the sets as defined above associated with the approximating functions u ε . We can write Then we have for all x ∈ D and ε > 0 Hence for |j − i| > 1 and ε sufficiently small, By the uniform boundedness of the (f i ), for sufficiently small ε we can then write where Z ε ⊆ D is given by By the assumption that |D 0 i | = 0 for all i, it follows that |Z ε | → 0, and so the result follows from the comment at the start of the proof of Proposition 3.2.
Note that bound (6) actually above implies the slightly stronger result that, when the c i -level sets of u ∈ X have zero measure, F 3 is continuous into L p (D), 1 ≤ p < ∞, at u. The assumption that the level sets have zero measure is an important one, as illustrated by Figure 2: an arbitrarily small perturbation of u can lead to an order 1 change in F 3 (u).
In the Bayesian approach we are taking to this problem, we may choose a prior measure on u such that, almost surely, the Lebesgue measure of the level sets is zero. This is shown to hold for non-degenerate Gaussian measures in [17]. As a result, F 3 will be almost surely continuous under the prior, and this is enough to give the measurability required in Bayes' theorem, as shown in [17].  As in the log-Gaussian case, we take our prior measure µ 0 on u to be a Gaussian measure N (m 0 , C 0 ) on C 0 (D).
Example 3.6. Consider the case D = B(0, 1) ⊆ R 2 , n = 2, c 1 = 0, f 1 ≡ 1 and f 2 ≡ 2. Suppose that u is drawn from a centred Gaussian measure µ 0 = N (0, C) on C 0 (D). The covariance C is chosen such that the samples u almost surely have regularity u ∈ C s ,s− s (D) for all s < t, where from left to right t = 4, 3, 2, 1 respectively. As in the log-Gaussian case, here the samples are generated on [−1, 1] 2 ⊇ D and then restricted to D, for computational simplicity.
Remark 3.7. In the cases of the star-shaped and level set priors, we have assumed the values that the conductivity takes are known a priori. This may be appropriate in certain situations, for example in medical contexts where conductivities of different types of tissue are known [7]. However it may be preferable to have the flexibility of treating the conductivity values as part of the inference. The theory does not become significantly more involved, as it can be shown that the conclusions of propositions 3.2 and 3.5 still hold when we allow for perturbation of the conductivity values as well. We work with the case of fixed conductivity values for clarity of presentation, but we note that it is possible to place a prior upon these.
3.2. The Likelihood and Posterior Distribution. The inverse problem was introduced at the beginning of the section. Now that we have introduced prior distributions, we may provide the Bayesian formulation of the problem.
Let X be a separable Banach space and F : X → A(D) a map from the space X where the unknown parameters live to the conductivity space. Choose a set of current stimulation patterns I (j) ∈ R L , j = 1, . . . , J and let M j : A(D) → H denote the solution map when using stimulation pattern I (j) . Recall the projection map Π : H → R L was defined by Π(v, V ) = V .
The data y j from the jth stimulation pattern is assumed to arise from the map We concatenate these observations to get data y ∈ R JL given by where Γ = diag(Γ 0 , . . . , Γ 0 ) and G : X → R JL . This coincides with the setup at the start of the section, with σ = F (u).
Assume that u ∼ µ 0 , where µ 0 is independent of Q 0 . From the above, we see that y|u ∼ Q u := N (G(u), Γ). We use this to find the distribution of u|y. First note that dQ u dQ 0 (y) = exp −Φ(u; y) + 1 2 |y| 2 Γ where the potential (or negative log-likelihood) Φ : X × Y → R is given by Then under suitable regularity conditions, Bayes' theorem tells us that the distribution µ y of u|y is as given below: Theorem 3.8 (Existence and Well-Posedness). Let (X, F, µ 0 ) denote any of the probability spaces associated with any of the three priors introduced in the previous subsection, and let Φ : X × Y → R be the potential (7) associated with the corresponding forward map. Then the posterior distribution µ y of the state u given data y is well-defined. Furthermore, µ y µ 0 with Radon-Nikodym derivative dµ y dµ 0 (u) = 1 Z µ exp(−Φ(u; y)) (8) where for y Q 0 -a.s., Additionally, the posterior measure µ y is locally Lipschitz with respect to y, in the Hellinger distance: for all y, y ∈ Y with max{|y| Γ , |y | Γ } < ρ, there exists C = C(ρ) > 0 such that d Hell (µ y , µ y ) ≤ C|y − y | Γ .
In the proof of the above theorem we will make use of the following version of Bayes' theorem from [10]. Proposition 3.9 (Bayes' theorem). Define the measure ν 0 (du, dy) = µ 0 (du)Q 0 (dy) on X × Y . Assume that Φ : X × Y → R is ν 0 -measurable and that, for y Q-a.s.
We need to verify that the assumptions of this theorem are satisfied. To proceed we first give some regularity properties of the potential Φ: Proposition 3.10. Let (X, F, µ 0 ) denote any of the probability spaces associated with the priors introduced in the previous subsection. Then the potential Φ : X×Y → R associated with the corresponding forward map, given by (7), admits the following properties.
Proof. (i) From equation (5) in the proof of Proposition 2.7, we see that there exists C > 0 such that each M j : for all σ 1 , σ 2 ∈ A(D). Taking σ 2 ≡ 1, say, we deduce that Hence σ ∞ < ρ implies that M j (σ) * < C(1 + ρ). By Corollary 2.8, it follows that Π • M j : A(D) → R L is bounded on bounded sets with respect to · ∞ for all j.
(ii) Let u ∼ µ 0 and suppose F : X → A(D) is such that u ε − u X → 0 implies that F (u ε ) → F (u) either uniformly or in measure. Then Proposition 2.7 tells us that M j • F : X → H is continuous at u for each j. The projection Π : H → R L is continuous, and so G j = Π • M j • F is continuous at u for each j. In §3.1 it is shown that this is true for F = F 1 and F = F 2 for any u. For F = F 3 it is only true at points u whose level sets have zero measure, however since we are assuming u ∼ µ 0 , a Gaussian measure, it follows from Proposition 7.2 in [17] that u µ 0 -almost surely has this property.
(iii) Let u ∈ X and y 1 , y 2 ∈ Y with max{|y 1 | Γ , |y 2 | Γ } < ρ. Then we have We now consider cases separately based on the prior. In the log-Gaussian case, we may boundC using the bound from the proof of part (i). Square-integrability of C(ρ, · X ) follows since Gaussians have exponential moments.
In the star-shaped and level set prior cases, we have that |G(u)| is bounded uniformly by a constant. We may hence boundC(ρ, u) above by some C(ρ, u X ) := C(1 + ρ) that is independent of u, and so again the squareintegrability follows.
The second term tends to zero µ 0 -almost surely by continuity. For the first term, note that the sequences ( u n X ) n≥1 and (|y n | Γ ) n≥1 are bounded, by K and R respectively, say. Then we can use the local Lipschitz property to deduce that |Φ(u n , y n ) − Φ(u n , y)| ≤ C(R, K)|y n − y| Γ since C(·, ·) : R×R → R, as defined in the proof of Proposition 3.10, is monotonically increasing in both components. Therefore this term tends to zero, and we obtain the desired continuity. It follows, see for example Lemma 6.1 in [17], that Φ is ν 0 -measurable.
For the lower bound on Z µ , we consider cases separately based on the prior. First we consider the log-Gaussian and level set prior cases so that µ 0 is Gaussian. Let B ⊆ X be any ball. Fix any ρ > |y| Γ and define where K is the upper bound from Proposition 3.10(i). This supremum is finite by the continuity of K. Then we have Since µ 0 is Gaussian, µ 0 (B) > 0 and so Z µ > 0.
In the star-shaped prior case, proceed as above but take by the assumption that σ 0 assigns positive mass to balls, and so again Z µ > 0. The above hold for all y ∈ Y , and so in particular for y Q 0 -almost-surely. We may now apply Bayes' Theorem 3.9 to obtain the existence of µ y . The proof of well-posedness is almost identical to that of the analogous result Theorem 2.2 in [17] and is hence omitted.

Numerical Experiments
We investigate the effect of the choice of prior on the recovery of certain binary conductivity fields. The specific fields we consider are shown in Figure 3, where blue represents a conductivity of 1 and yellow a conductivity of 2. Simulations are performed using the EIDORS software [1] to solve the forward model using a first order finite element method; a mesh of 43264 elements is used to create the data and a mesh of 10816 elements is used for simulations in order to avoid an inverse crime [20].
In subsection 4.1 we describe the MCMC sampling algorithm that we will use. In subsection 4.2 we define the parameters we will use for the forward model and the MCMC simulations. We also describe how the data is created, and define our choices of prior distributions. Finally in subsection 4.3 we present the results of the simulation, looking at quality of reconstruction, convergence of the algorithm and some properties of the posterior distribution. 4.1. Sampling Algorithm. We aim to produce a sequence of samples from µ y on X, where µ y is given by (8). We make use of the preconditioned Crank-Nicolson Markov Chain Monte Carlo (pCN-MCMC) method. The pCN-MCMC method is a modification of the standard Random Walk Metropolis (RWM) MCMC method which is well-adapted to Gaussian priors in high dimensions. It was introduced in [5], and its dimension independent properties are analysed and illustrated numerically in [14] and [9] respectively; the pCN nomenclature was introduced in [9]. In the case of the star-shaped prior, we use a Metropolis-within-Gibbs algorithm [43], alternately updating the field with the pCN method above and updating the centre with the standard RWM method.
An advantage of these MCMC methods is that derivatives of the forward map are not needed, only black-box solution of the forward model. However in order to accurately compute some quantity of interest, such as the conditional mean, we may need to produce a very large number of samples and tuning the algorithm to minimise this effect is important. For this reason we compute the effective sample  size from the integrated autocorrelation (neglecting a burn-in period) of a quantity of interest, as in [21].

Data and Parameters.
We work on a circular domain of radius 1, with 16 equally spaced electrodes on its boundary providing 50% coverage. We take all contact impedances z l = 0.01. We stimulate adjacent electrodes with a current of 0.1, so that the matrix of stimulation patterns I = (I (j) ) 15 j=1 ∈ R 16×15 is given by The conductivity is chosen such that it takes values 1 and 2. We perturb the measurements with white noise η ∼ N (0, γ 2 I), γ = 0.0002, so that the mean relative error on both sets of data is approximately 10%. The true conductivity fields used to generate the data, henceforth referred to as Conductivity A and Conductivity B, are shown in Figure 3. In all cases we generate N = 2.5 × 10 6 samples with a burn-in of k 0 = 5 × 10 5 samples. Our priors on fields will make use of Gaussians with covariances of the form These are essentially rescaled Whittle-Matern covariances [31], with τ representing the inverse length scale of the samples, α proportional to their regularity, and q proportional to their amplitude.
In what follows, denote by A N the Laplacian with Neumann boundary conditions on [−1, 1] 2 , restricted to D, so that its domain is given by Defining the Laplacian first on a square and then restricting to D will allow for efficient generation of Gaussian samples via the fast Fourier transform. Note that if we were to consider priors of the form (9) with τ = 0, we should restrict D(A N ) further to ensure the invertibility of A N . Additionally, denote by A D the Laplacian with Dirichlet boundary conditions on R 1 = (−π, π], so that its domain is given by  Figure 6 shows the values of the misfit Φ at the different sample means. We consider two different types of mean conductivities. First the sample means are calculated in the sample spaces X i and then pushed forward to the conductivity space by the maps F i , so that we produce estimates of F i (E(u)). This preserves the binary nature of the fields in the cases of the star-shaped and level set priors. We also consider the sample means of the pushforwards of the posteriors by F i , E(F i (u)), which do not preserve the binary property but provide some visualisation of the interface uncertainty in the posterior.
For Conductivity A, the sample mean F 2 (E(u)) arising from the star-shaped prior provides a better reconstruction than the other two prior choices. This is expected, since the true conductivity was drawn from this prior. Whilst the sample means arising from the level set prior are fairly close to the true conductivity (both visually and in terms of Φ), the boundary of the interface appears to have too large a length-scale. Appropriate choice of prior length-scale is a key issue the with the level set method; treating the length-scale hierarchically as another unknown in the problem may be beneficial. On the other hand The sample means arising from the Gaussian prior fail to recover both the sharp interface and the values of the conductivity, which is reflected in the large values Φ takes. Nonetheless, general qualitative properties of the true conductivity can be seen in the sample means. As with the level set prior, reconstruction could be improved by treating hierarchically the parameters in the prior such as length-scale, marginal variance and mean. For Conductivity B, the level set prior is most effective in the reconstruction, since a specific number of inclusions isn't fixed a priori as it is for the star-shaped prior. Again the Gaussian prior fails to recover both the sharp interface and the values of the conductivity, however it appears to do a better job than the star-shaped prior at identifying the location and shape of the two inclusions.
In both of the above cases, even though the individual samples coming from using the level set prior contain many small inclusions, these do not show up in the sample means F 3 (E(u)), though they are visible in the means E(F 3 (u)).
In terms of the values of the misfit as shown in Figure 6, in all but one of the above cases the mean E(F i (u)) provides a better fit than F i (E(u)) despite the lack of the binary property. The quality of the recovery cannot be assessed solely on the misfit however; an advantage of the Bayesian approach is that both means are available (along with the full posterior), and whichever is more appropriate could depend on the context.

4.3.2.
Convergence. In Figure 7, we show the approximate effective sample size (ESS) associated with different quantities of interest. For all choices of prior, these are significantly smaller than the total 2.5 × 10 6 samples generated. Many more samples may hence be required to produce accurate approximations of the posterior mean.
The chain associated with the star-shaped prior results in the largest ESS, likely because we are only attempting to infer 2 8 +2 parameters rather than 2 14 parameters as in the log-Gaussian and level set cases.
In order to accelerate the convergence of the MCMC we can adjust the jump parameters β and δ. Larger choices of these parameters mean that accepted states will be less correlated with the current state, however the proposed states are less likely to be accepted. The choice β = 1 in pCN produces proposed states that are independent of the current state, but dependent upon how far the prior is from the posterior, very few or no states may be accepted so that the chain never moves. Similarly, smaller choices of these jump parameters mean that more proposals will be accepted, but the states will be more correlated. A balance hence must be achieved -in our simulations we choose the parameters such that approximately 20-30% of proposals are accepted, though in general the optimal acceptance rate is not known [3].
Alternatively, reconstruction may be accelerated by looking at an approximation of the posterior instead of the exact posterior, for example using the ensemble Kalman filter [16] or a sequential Monte Carlo method [4]. We could also initialise the MCMC chains from EnKF estimates to significantly reduce the burn-in period and hence computational cost. If the derivative of the forward map is available, Hybrid Monte Carlo (HMC) methods could be used to accelerate the convergence [11]. Emulators could also be used to reduce the computational burden of derivative calculation, allowing the use of geometric MCMC methods such as Riemannian Manifold Hamiltonian Monte Carlo (RHMC) and Lagrangian Monte Carlo (LMC) [27]. Figures 8-10 we show kernel density estimates for a number of quantities associated with each posterior distribution. These are calculated using subsequences of the MCMC chains, with lengths of the same order of magnitude as the effective sample sizes, in order to avoid over-fitting. The most regular densities arise in the star-shaped case, with the distribution of all quantities appearing to be very close to uni-modal. More irregularity is seen for the log-Gaussian case, especially in the joint distributions, but they are still very close to uni-modal.

Posterior Behaviour. In
The least regular, more multi-modal densities come from the level set prior. One reason for this is likely the lack of identifiability of the level set function: the forward model only 'sees' the zero level set of the state, and hence cannot distinguish between infinitely many different states. The prior can however distinguish between these states, and will weight them appropriately, which can help explain the shape of the posterior densities.

Conclusions
The primary contributions of this paper are: • We have formulated the EIT problem rigorously in the infinite dimensional Bayesian framework. • We have studied three different prior models, each with their own advantages and disadvantages based on prior knowledge and the nature of the field we are trying to recover. • With each of these choices of prior we obtain well-posedness of the problem.
We can obtain well-posedness using additional prior models, as long as the mapping from the state space to the conductivity space has appropriate regularity. • The infinite dimensional formulation of the problem leads to the use of state of the art function space MCMC methods for sampling the posterior distribution. • Simulations performed using these methods illustrate that the conditional mean provides a reasonable reconstruction of the conductivity, even with fairly significant noise on the measurements. They also illustrate the fact that the choice of prior has a significant impact on reconstruction and, in particular, that the geometric priors (star-shaped and level set) can be particularly effective for the (approximately) piecewise constant fields that arise in many applications.
Future research directions could include the following: • Sampling the exact posterior distribution using MCMC can be computationally expensive. Methods that approximate the posterior may be as effective for calculating quantities such as the conditional mean, with much lower computational load. The relative effectiveness versus cost of different methods could be studied. This could be especially important for simulations in three dimensions, where forward model evaluations are even more expensive. • When using the level set prior, the length scale of samples could be treated hierarchically as an additional unknown in the problem. • The star-shaped prior could be extended to describe multiple inclusions, either with the number of inclusions fixed or as an additional unknown. • The Calderón problem could be considered in place of the CEM. The data could then be either the full Neumann-to-Dirichlet (or Dirichlet-to-Neumann) map, or a finite number of pairs of Dirichlet and Neumann boundary values. In order to perturb the measurements with noise, the former case would require the definition of Gaussian distributions on spaces of operators, and the latter Gaussian distributions on manifolds.
0 ) and four Fourier coefficients of r. The diagonal displays the marginal densities of each quantity, and the off-diagonals the marginal densities of corresponding pairs of quantities.