RECONSTRUCTING A FUNCTION ON THE SPHERE FROM ITS MEANS ALONG VERTICAL SLICES

Abstract. We present a novel algorithm for the inversion of the vertical slice transform, i.e., the transform that maps a function on the two-dimensional unit sphere to all its integrals along circles that are parallel to one fixed direction. Our approach makes use of the singular value decomposition and resembles the mollifier approach by applying numerical integration with a reconstruction kernel via a quadrature rule. Considering the inversion problem as a statistical inverse problem, we find a family of asymptotically optimal mollifiers that minimize the maximum risk of the mean integrated squared error for functions within a Sobolev ball. By using fast spherical Fourier transforms and the fast Legendre transform, our algorithm can be implemented with almost linear complexity. In numerical experiments, we compare our algorithm with other approaches and illustrate our theoretical findings.


1. Introduction. The problem of reconstructing a function from its integrals along certain lower-dimensional submanifolds has been studied since the early twentieth century. It is associated with the terms reconstructive integral geometry [28] and geometric tomography [11]. In 1913, Funk [9] described the problem of reconstructing a function on the two-sphere knowing its mean values along all great circles of the sphere. The computation of these mean values is known as the Funk-Radon transform or simply the Funk transform or spherical Radon transform.
What Funk did for great circles can be generalized to other classes of circles on the unit sphere S². One can consider the integration along all circles with a fixed radius, which is known as the translation operator or the spherical section transform, cf. [34, 33, 7]. Another example is the spherical slice transform (see [1], [17, II.1.C]), which computes the means along all circles that contain a fixed point of the sphere.
In this paper, we look at vertical slices of the sphere, i.e., circles that are the intersections of the sphere with planes parallel to the ξ₃-axis. Let us denote with e_σ = (cos σ, sin σ, 0)ᵀ, σ ∈ T = [0, 2π), the point on the equator of the sphere with longitude σ. For a continuous function f : S² → C, we define the vertical slice transform T f(σ, t) as the mean value of f along the circle {ξ ∈ S² : ⟨ξ, e_σ⟩ = t}, i.e.,

T f(σ, t) = (2π √(1 − t²))⁻¹ ∫_{⟨ξ, e_σ⟩ = t} f(ξ) dℓ(ξ),  (σ, t) ∈ X := T × [−1, 1].

The vertical slice transform first arose in an article by Gindikin et al. [14] from 1994 that included an explicit inversion formula for T, which is numerically unstable. It reappeared in 2010 as the circular mean transform in a problem related to photoacoustic tomography, when Zangerl and Scherzer [44] described an algorithm for inverting the vertical slice transform using a connection to the circular Radon transform in the plane.
In this paper, we consider the reconstruction of f from the point of view of statistical inverse problems, cf. [5]. More precisely, we consider discrete data

g(x_m) = T f(x_m) + ε(x_m),  m = 1, …, M,

at sampling points x_m ∈ X, m = 1, …, M, which is perturbed by an uncorrelated white noise random field ε, and aim at optimal linear estimators Ef of f given the data g(x_m). As a measure of optimality, we use the mean integrated squared error (MISE) E ‖f − Ef‖²_{L²(S²)}. Assuming that the function f lies in the Sobolev ball

F(s, S) = { f ∈ H^s_e(S²) : ‖f‖_{H^s(S²)} ≤ S },

where s, S > 0 and H^s_e(S²) is the Sobolev space of degree s restricted to functions that are even in the third component (cf. Section 2), we look at the minimax risk, cf. [41, 5],

(1.1)  inf_E sup_{f ∈ F(s,S)} E ‖f − Ef‖²_{L²(S²)}.

The minimax risk of spherical deconvolution has been examined in [22, 20]. In this paper, we follow the ideas of [20] and restrict ourselves to a specific class of estimators that resemble the mollifier approach [24] in inverse problems, which has been successfully adopted for inverting the Funk-Radon transform in [25]. A Tikhonov-type regularization on Riemannian manifolds was examined in [39] and applied to the Funk-Radon transform using spherical harmonics.
Let ω_m ∈ R, m = 1, …, M, be quadrature weights for the sampling points x_m ∈ X, and let

L_N g = Σ_{(n,k), n ≤ N, n+k even} ( Σ_{m=1}^M ω_m g(x_m) conj(B_n^k(x_m)) ) B_n^k

be the corresponding hyperinterpolation operator [36] of degree N for the orthonormal basis system

B_n^k(σ, t) = √((2n + 1)/(4π)) e^{ikσ} P_n(t),  (σ, t) ∈ X,

where P_n is the Legendre polynomial. Then we define for a mollifier ψ : [−1, 1] → R the estimator

E_{N,ψ} g = ψ ⋆ T† L_N g,

where ⋆ denotes the spherical convolution and T† is the generalized inverse of T : L²(S²) → L²(X). Our aim is to find optimal mollifiers ψ* for which the infimum in (1.1), restricted to the class of estimators E_{N,ψ}, is attained asymptotically as the number of sampling points goes to infinity.
Outline. This paper is structured as follows. After collecting some basic facts about spherical harmonics in Section 2, we introduce in Section 3 the vertical slice transform T and give in Theorem 3.3 its singular value decomposition as well as the asymptotic behavior of the singular values. Section 4 introduces the hyperinterpolation operator and discusses its relationship to the underlying quadrature rule. The main result of the paper is formulated in Section 5 in Theorem 5.4, which gives an asymptotically optimal mollifier for the vertical slice transform T in dependence on the number of sampling points, the noise level and the Sobolev ball F(s, S). The proof of Theorem 5.4 is postponed to Section 6, where we derive lower and upper bounds for the bias as well as for the variance part of the MISE. Section 7 is devoted to numerical tests illustrating our theoretical findings. First of all, we give a fast algorithm for our optimal estimator. This algorithm is based on the fast spherical Fourier transform [23] in conjunction with a fast Legendre transform and has the numerical complexity O(M log² M). In the subsequent section, we compare our algorithm with two algorithms for the inversion of the Radon transform in the plane. This is possible since, by orthogonal projection, the vertical slice transform of a function f can be expressed as the Radon transform of the projected function multiplied with the weight (1 − |ξ|²)^{−1/2}, ξ ∈ R². The main drawback of this approach is that the weight increases to infinity at the boundary of the disc. This becomes clearly visible when inverting the Radon transform with a backprojection-type algorithm [21] and a little less visible when an algorithm based on orthogonal polynomial expansion on the unit disc, cf. [42], is used. The reason for this improvement is that the latter algorithm uses, similarly to our algorithm, a sampling grid that becomes denser close to the boundary.

2. Preliminaries and notation.
In this section, we summarize some basic facts on harmonic analysis on the sphere, as can be found, e.g., in [7]. We denote with Z the set of integers and with N₀ the nonnegative integers. We define the two-dimensional sphere S² = {ξ ∈ R³ : |ξ| = 1} as the set of unit vectors ξ = (ξ₁, ξ₂, ξ₃)ᵀ in three-dimensional Euclidean space and make use of its parametrization in terms of the polar angles

ξ(θ, ρ) = (cos ρ sin θ, sin ρ sin θ, cos θ)ᵀ,  θ ∈ [0, π], ρ ∈ [0, 2π).
Let f : S² → C be some measurable function. With respect to the polar angles, the surface measure dξ on the sphere reads dξ = sin θ dθ dρ.

Spherical harmonics. We define the associated Legendre polynomials P_n^k for all (n, k) ∈ I := {(n, k) ∈ N₀ × Z : |k| ≤ n}. They satisfy the three-term recurrence relation in the order k [15, 8.735.2]

(2.1)  P_n^{k+2}(t) = −2(k + 1) t (1 − t²)^{−1/2} P_n^{k+1}(t) − (n − k)(n + k + 1) P_n^k(t).

The Legendre polynomials P_n = P_n^0 of degree n ∈ N₀ form a system of orthogonal polynomials in L²([−1, 1]) and satisfy the three-term recurrence relation

(2.2)  n P_n(t) = (2n − 1) t P_{n−1}(t) − (n − 1) P_{n−2}(t)

for n ≥ 1 with the initialization P₀(t) ≡ 1 and P₋₁(t) ≡ 0. An orthonormal basis in the Hilbert space L²(S²) of square-integrable functions on the sphere is formed by the spherical harmonics

(2.3)  Y_n^k(ξ(θ, ρ)) = √( ((2n + 1)/(4π)) · ((n − k)!/(n + k)!) ) P_n^k(cos θ) e^{ikρ},  (n, k) ∈ I.

Accordingly, any function f ∈ L²(S²) can be expressed by its Fourier series f = Σ_{(n,k)∈I} f̂(n, k) Y_n^k with the Fourier coefficients f̂(n, k) = ∫_{S²} f(ξ) conj(Y_n^k(ξ)) dξ, and it satisfies Parseval's equality

(2.4)  ‖f‖²_{L²(S²)} = Σ_{(n,k)∈I} |f̂(n, k)|².

Sobolev spaces. A function p : S² → C that has a finite representation with respect to spherical harmonics is called a spherical polynomial; it has degree N ∈ N₀ if p̂(N, k) ≠ 0 for some k. For such a spherical polynomial p of degree up to N and some s ∈ R, we introduce the Sobolev norm

(2.5)  ‖p‖_{H^s(S²)} = ( Σ_{(n,k)∈I} (n + 1/2)^{2s} |p̂(n, k)|² )^{1/2}.

As usual, the Sobolev spaces H^s(S²) are defined as the completion of the space of all spherical polynomials with respect to the Sobolev norm ‖·‖_{H^s(S²)}, cf. [8]. If s > 1, the Sobolev space H^s(S²) is embedded in the space of continuous functions C(S²). We define H^s_e(S²) as the subspace of H^s(S²) that contains only functions that are even in the third coordinate. This definition is equivalent to saying that H^s_e(S²) is the subspace of functions whose Fourier coefficients vanish outside the set I_e = {(n, k) ∈ I : n + k even}.
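In code, the three-term recurrence n P_n(t) = (2n − 1) t P_{n−1}(t) − (n − 1) P_{n−2}(t) gives a simple and stable way to evaluate all Legendre polynomials up to a given degree at once; a minimal sketch in Python (the function name is ours):

```python
def legendre_values(N, t):
    """Evaluate P_0(t), ..., P_N(t) via the three-term recurrence
    n P_n(t) = (2n-1) t P_{n-1}(t) - (n-1) P_{n-2}(t),
    initialized with P_0 = 1 and P_{-1} = 0 (which yields P_1(t) = t)."""
    P = [1.0, t]  # P_0 and P_1
    for n in range(2, N + 1):
        P.append(((2 * n - 1) * t * P[n - 1] - (n - 1) * P[n - 2]) / n)
    return P[: N + 1]
```

For example, legendre_values(3, 0.5) reproduces the classical values P₂(1/2) = −1/8 and P₃(1/2) = −7/16.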
Asymptotic analysis. We define some symbols comparing asymptotic growth. Let (a_n)_{n∈N₀} and (b_n)_{n∈N₀} be two sequences of real numbers. We say a_n ∼ b_n for n → ∞ if there exist constants c₁, c₂ ∈ (0, ∞) and n₀ ∈ N₀ such that c₁ b_n ≤ a_n ≤ c₂ b_n for all n ≥ n₀. We say a_n ≃ b_n if lim_{n→∞} a_n / b_n = 1. Furthermore, we write a_n ≲ b_n if lim sup_{n→∞} a_n / b_n ≤ 1 and, analogously, a_n ≳ b_n if lim inf_{n→∞} a_n / b_n ≥ 1.
3. The vertical slice transform on the sphere. In the following definition, we denote by T = [0, 2π) the 2π-periodic torus.
Definition 3.1. For f ∈ C(S²), we define the vertical slice transform T as the mean value along the circle {ξ ∈ S² : ⟨ξ, e_σ⟩ = t},

(3.1)  T f(σ, t) = (2π √(1 − t²))⁻¹ ∫_{⟨ξ, e_σ⟩ = t} f(ξ) dℓ(ξ),  (σ, t) ∈ X := T × [−1, 1],

where ⟨·, ·⟩ denotes the scalar product and e_σ = ξ(π/2, σ) = (cos σ, sin σ, 0)ᵀ. The vector e_σ is located on the equator of the sphere and has longitude σ. The vertical slice transform T computes the average values along all circles that are parallel to the ξ₃-axis. Since all these circles are symmetric with respect to the ξ₁-ξ₂-plane, T f vanishes for functions f that are odd in the third coordinate, i.e., T f = 0 whenever f(ξ₁, ξ₂, −ξ₃) = −f(ξ₁, ξ₂, ξ₃).
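Numerically, T f(σ, t) can be approximated by sampling the slicing circle, whose points can be written as t e_σ + √(1 − t²)(cos φ n_σ + sin φ e₃) with n_σ = (−sin σ, cos σ, 0)ᵀ. A small sketch under this parametrization (function names are ours):

```python
import numpy as np

def vertical_slice_mean(f, sigma, t, n_phi=2048):
    """Approximate T f(sigma, t): the mean of f over the circle
    {xi in S^2 : <xi, e_sigma> = t}, sampled at n_phi equispaced angles."""
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    r = np.sqrt(1.0 - t * t)          # radius of the slicing circle
    x = t * np.cos(sigma) - r * np.cos(phi) * np.sin(sigma)
    y = t * np.sin(sigma) + r * np.cos(phi) * np.cos(sigma)
    z = r * np.sin(phi)
    return np.mean(f(x, y, z))
```

For f ≡ 1 the mean is 1, and for the odd function f(ξ) = ξ₃ it vanishes, reflecting that odd functions lie in the nullspace of T.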
Another symmetry property of the vertical slice transform is given by

T f(σ + π, −t) = T f(σ, t),  (σ, t) ∈ X,

where equality is to be understood modulo 2π in the σ variable. Hence, we can define an equivalence relation ∼ on X by saying (σ, t) ∼ (σ + π, −t). The quotient space of X by this equivalence relation is isomorphic to the Möbius strip.
An explicit inversion formula for the operator T was shown in [14]. It is based on the inversion formula [17] of the Radon transform in the plane applied to the function

f̃(ξ₁, ξ₂) = f(ξ₁, ξ₂, √(1 − ξ₁² − ξ₂²)) / (π √(1 − ξ₁² − ξ₂²))  for ξ₁² + ξ₂² < 1,  and f̃(ξ₁, ξ₂) = 0 otherwise.
Theorem 3.2 (Gindikin et al. [14]). Let f : S² → C be even in the third coordinate. Then f can be recovered from T f by the explicit inversion formula (3.2), in which the integral with respect to q has to be understood in the sense of a Cauchy principal value and where T f(σ, t) is extended by 0 for |t| > 1.
The inversion formula (3.2) is unfavorable for numerical computations. The singular value decomposition of T, which is given in the following theorem, promises more numerical stability.

Theorem 3.3. The vertical slice transform T admits the singular value decomposition

T Y_n^k = λ_n^k B_n^k,  (n, k) ∈ I,

consisting of the spherical harmonics Y_n^k from (2.3), the basis functions

(3.4)  B_n^k(σ, t) = √((2n + 1)/(4π)) e^{ikσ} P_n(t),

and the singular values λ_n^k, which are given by (3.5) for (n, k) ∈ I_e := {(n, k) ∈ I : n + k even} and satisfy λ_n^k = 0 for (n, k) ∈ I \ I_e, where n!! = n · (n − 2) · (n − 4) ⋯ denotes the double factorial (ending at 1 or 2) with (−1)!! = 1. Furthermore, for (n, k) ∈ I_e we have the asymptotic behavior (3.6), where ⌈·⌉₂ denotes the ceiling to the next multiple of two.
Proof. Let f ∈ C(S²), (n, k) ∈ I and (σ, t) ∈ X. A generalization of the Funk-Hecke formula was proven in [4, Section 4.2]; applying it, the vertical slice transform of a spherical harmonic Y_n^k gives T Y_n^k = λ_n^k B_n^k with λ_n^k proportional to the value of the associated Legendre polynomial at zero. In the next step, we compute the associated Legendre polynomials P_n^k at zero. According to the recurrence relation (2.2) for the Legendre polynomials, we have for n ≥ 1

P_n^0(0) = P_n(0) = −((n − 1)/n) P_{n−2}(0) = ((1 + (−1)ⁿ)/2) (−1)^{n/2} (n − 1)!!/n!!,

as well as P₀(0) = 1. With the recurrence relation (2.1) of the associated Legendre polynomials, we obtain P_n^k(0) for n, k ≥ 1. For k < 0, the values follow by the symmetry relation between P_n^{−k} and P_n^k, which proves (3.3) and the left part of (3.5). The right side of (3.5) follows by replacing the double factorials with factorials using the well-known equalities (2m)!! = 2^m m! and (2m − 1)!! = (2m)!/(2^m m!) for m ∈ N₀.

Next, we prove (3.6). Let (n, k) ∈ I_e; then λ_n^k = (−1)ⁿ λ_n^{−k}. For 0 ≤ k ≤ n − 2, we obtain a recurrence relation in k. Hence, λ_n^k ≤ |λ_n^n|. If n and thus also k is even, we observe that λ_n^k ≥ λ_n^0. If n is odd, we have λ_n^k ≥ λ_n^1. Analogously to the above recurrence, we bound these extremal singular values. The application of Stirling's formula n! ≃ √(2πn) nⁿ e⁻ⁿ then yields (3.6) for n → ∞.

The functions {B_n^k : n ∈ N₀, k ∈ Z} from (3.4) form an orthonormal basis in L²(X). The fact that the singular values λ_n^k vanish for all (n, k) ∈ I \ I_e shows that the nullspace of T consists of all functions that are odd with respect to the third coordinate. Hence, the vertical slice transform T is injective on the even functions.

4. Hyperinterpolation. Consider a quadrature rule

∫_X g(x) dx ≈ Σ_{m=1}^M w_m g(x_m)

with quadrature nodes x_m ∈ X and respective weights w_m, m = 1, …, M. For the orthonormal system of functions B_n^k from (3.4) in the space L²(X), we define an analogue of the trigonometric interpolation operator, namely the hyperinterpolation operator

(4.1)  L_N g = Σ_{(n,k) ∈ I_N^e} ĝ(n, k) B_n^k,  where  ĝ(n, k) = Σ_{m=1}^M w_m g(x_m) conj(B_n^k(x_m))

and I_N^e = {(n, k) ∈ I_e : n ≤ N}. We call the hyperinterpolation L_N exact if the underlying quadrature rule is exact for all products B_n^k conj(B_l^j) with (n, k), (l, j) ∈ I_N^e. In this case, L_N is a projection operator.
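The double-factorial formula for P_n(0) used in the proof of Theorem 3.3 can be sanity-checked numerically against the recurrence; a small sketch (all names are ours):

```python
def double_factorial(n):
    """n!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def legendre_at_zero(n):
    """P_n(0) via the recurrence P_n(0) = -((n-1)/n) P_{n-2}(0), P_0(0) = 1."""
    if n % 2 == 1:
        return 0.0
    p = 1.0
    for m in range(2, n + 1, 2):
        p *= -(m - 1) / m
    return p

def legendre_at_zero_closed(n):
    """P_n(0) = (-1)^(n/2) (n-1)!!/n!! for even n, and 0 for odd n."""
    if n % 2 == 1:
        return 0.0
    return (-1) ** (n // 2) * double_factorial(n - 1) / double_factorial(n)
```

Both routines agree, e.g. P₄(0) = 3!!/4!! = 3/8.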
Since the functions B_n^k are orthonormal, the last equality holds if and only if

Σ_{m=1}^M w_m B_n^k(x_m) conj(B_l^j(x_m)) = δ_{nl} δ_{kj}

for all (n, k), (l, j) ∈ I_N^e, where δ denotes the Kronecker delta.
We are going to impose the following condition on the sequence of hyperinterpolation operators L_N with quadrature weights w_m^N, m = 1, …, M(N).

Definition 4.2. We call this sequence applicable if the following three conditions hold: i) the hyperinterpolation L_N is exact for every N ∈ N₀; ii) the number of quadrature nodes satisfies M(N) ∈ O(N²); iii) there exist constants γ₁, γ₂ > 0 such that

(4.2)  γ₁/M(N) ≤ Σ_{m=1}^{M(N)} (w_m^N)² |B_n^k(x_m)|² ≤ γ₂/M(N)  for all (n, k) ∈ I_N^e.

It should be noted that for quadrature rules with constant weights w_m^N = M(N)⁻¹ that are exact of degree 2N, the condition (4.2) is satisfied with γ₁ = γ₂ = 1. Unfortunately, we could not find any such quadrature rules with M(N) ∈ O(N²) nodes, which is due to the fact that constant-weight quadratures on the unit interval that are exact of degree N have at least ∼ N² nodes, cf. [13].
In the following theorem, we give an example of an applicable sequence of hyperinterpolations. These hyperinterpolations of degree N ∈ N₀ make use of the tensor product of the (2N + 1)-point equidistant quadrature on the torus T with the (2N + 1)-point Fejér quadrature [12] on the unit interval [−1, 1].

Theorem 4.3. We define the Fejér hyperinterpolation of degree N ∈ N₀ for g ∈ C(X) as the hyperinterpolation (4.1) with the tensor-product quadrature consisting of the nodes (σ_i, t_j) and weights (2π/(2N + 1)) w_j^F, where σ_i = 2πi/(2N + 1), i = 0, …, 2N, t_j = cos θ_j with θ_j = (2j − 1)π/(2(2N + 1)), and

w_j^F = (2/(2N + 1)) ( 1 − 2 Σ_{m=1}^{N} cos(2m θ_j)/(4m² − 1) )

for j = 1, …, 2N + 1. Then the so-defined sequence of hyperinterpolations L_N is applicable.
Proof. The proof is included in Section A of the appendix.
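The Fejér rule on [−1, 1] can be generated directly from its closed-form weights; a sketch for a generic number of points n (not tied to the 2N + 1 of the theorem):

```python
import numpy as np

def fejer1(n):
    """Nodes and weights of the n-point Fejer (first) quadrature on [-1, 1].
    Nodes are cos(theta_j) with theta_j = (2j-1) pi / (2n); the rule has
    positive weights and integrates polynomials up to degree n-1 exactly."""
    j = np.arange(1, n + 1)
    theta = (2 * j - 1) * np.pi / (2 * n)
    m = np.arange(1, n // 2 + 1)
    # w_j = (2/n) * (1 - 2 * sum_m cos(2 m theta_j) / (4 m^2 - 1))
    w = (2.0 / n) * (1.0 - 2.0 * np.sum(
        np.cos(2.0 * np.outer(theta, m)) / (4.0 * m ** 2 - 1.0), axis=1))
    return np.cos(theta), w
```

For instance, the 21-point rule reproduces ∫₋₁¹ t² dt = 2/3 and ∫₋₁¹ t⁴ dt = 2/5 to machine precision.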

5. Reconstruction.
5.1. The mollifier method. We want to reconstruct a function f given its vertical slice transform g = T f. The singular value decomposition in Theorem 3.3 shows that the nullspace of T contains exactly the functions that are odd in the third coordinate. Hence, only the even part of f can be reconstructed, and we assume in the following that f is even in its third coordinate. As is the case for many other problems of this type, the inversion of the vertical slice transform is an ill-posed problem. We consider the discrete noisy data

g(x_m) = T f(x_m) + ε(x_m)

at data points x_m ∈ X, m = 1, …, M. We assume that the data error ε(x_m), m = 1, …, M, is a complex-valued random vector satisfying the following three conditions for all m, l = 1, …, M with m ≠ l:

i) E ε(x_m) = 0,  ii) E |ε(x_m)|² = δ²,  iii) E ( ε(x_m) conj(ε(x_l)) ) = 0,

where E denotes the expected value.
Remark 5.1. The consistency theorem of Daniell and Kolmogorov [3, p. 303] ensures the existence of a probability space (Ω, Σ, P) and a random field ε : Ω × X → C whose marginal distributions satisfy i)–iii) on every finite subset of X. Throughout this paper, we suppress the dependence of ε on the sample space Ω in our notation. An example of such marginal distributions are zero-mean Gaussian distributions whose covariance matrix is the identity matrix multiplied by δ² ∈ R.
To deal with the ill-posedness of the inverse problem, we use the mollifier method, cf. [24]. We consider a polynomial ψ : [−1, 1] → R of degree N and aim at computing the convolution

(5.1)  (ψ ⋆ f)(ξ) = ∫_{S²} ψ(⟨ξ, η⟩) f(η) dη,  ξ ∈ S².

For a sufficiently good choice of the mollifier ψ, the convolution ψ ⋆ f is an approximation of the desired function f. The Funk-Hecke formula [10, 16] states that

(5.2)  ψ ⋆ f = Σ_{(n,k) ∈ I} ψ̂(n) f̂(n, k) Y_n^k,  where  ψ̂(n) = 2π ∫_{−1}^{1} ψ(t) P_n(t) dt.

For g = T f, the singular value decomposition from Theorem 3.3 yields

ψ ⋆ f = Σ_{(n,k) ∈ I_e} (ψ̂(n)/λ_n^k) ĝ(n, k) Y_n^k,  ĝ(n, k) = ∫_X g(x) conj(B_n^k(x)) dx.

Replacing the integral in the last equation with the quadrature rule from Section 4, we define for any g ∈ C(X) the estimator

(5.3)  E_{N,ψ} g = Σ_{(n,k) ∈ I_N^e} (ψ̂(n)/λ_n^k) ( Σ_{m=1}^M w_m g(x_m) conj(B_n^k(x_m)) ) Y_n^k.

5.2. Asymptotically optimal mollifiers. As a measure for the accuracy of the estimator E_{N,ψ}, we use the mean integrated squared error (MISE) defined by E ‖f − E_{N,ψ}(T f + ε)‖²_{L²(S²)}. We want to bound the MISE over Sobolev balls

F(s, S) = { f ∈ H^s_e(S²) : ‖f‖_{H^s(S²)} ≤ S },

where s > 1 and S > 0 are some constants and H^s_e(S²) is the Sobolev space of functions that are even in the third coordinate, cf. (2.5). The condition s > 1 ensures that f and therefore T f is continuous, so that the estimator E_{N,ψ}(T f), which requires point evaluations, is well-defined. We are interested in the maximum risk

(5.4)  sup_{f ∈ F(s,S)} E ‖f − E_{N,ψ}(T f + ε)‖²_{L²(S²)}.

We call a mollifier ψ* optimal for degree N if it minimizes (5.4) amongst all mollifiers ψ ∈ L²([−1, 1]). If a mollifier is optimal, it achieves the minimax error over this class of estimators. We call a sequence of mollifiers ψ*_N, N ∈ N₀, asymptotically optimal if the maximum risk (5.4) for the mollifier ψ*_N is asymptotically equal to the minimax rate for N → ∞. We will show that a class of asymptotically optimal mollifiers is given in the following definition.

Definition 5.2. For L > 0 and s > 0, let ψ^s_L : [−1, 1] → R be the polynomial with the Legendre coefficients

(5.5)  ψ̂^s_L(n) = max{ 0, 1 − (n/L)^s },  n ∈ N₀.
Remark 5.3. The mollifiers from Definition 5.2 can be seen as a generalization of some well-known mollifiers, which are depicted in Figure 1. They lie between the Fejér kernel, whose Legendre coefficients decrease linearly from 0 to L like those of ψ¹_L, and the Dirichlet kernel, which is the limit of ψ^s_L for s → ∞. Furthermore, these mollifiers are similar to the CuP (cubic polynomial) scaling functions, whose Legendre coefficients are given by ψ̂(n) = (1 − n/L)²(1 + 2n/L) for n ≤ L and 0 elsewhere, see [26, p. 195]. The Fourier coefficients of the CuP functions are smoother at the upper end of the bandwidth n = L than those of the mollifier (5.5).
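The coefficient profiles mentioned in the remark are easy to tabulate; a sketch comparing the Fejér-type linear decay, the Dirichlet cutoff, the CuP profile (1 − n/L)²(1 + 2n/L), and a Pinsker-type profile max{0, 1 − (n/L)^s} modeling the mollifier coefficients (function names are ours):

```python
def fejer_coeff(n, L):      # linear decay on the band 0..L
    return max(0.0, 1.0 - n / L)

def dirichlet_coeff(n, L):  # hard cutoff at degree L
    return 1.0 if n <= L else 0.0

def cup_coeff(n, L):        # CuP scaling function coefficients
    return (1.0 - n / L) ** 2 * (1.0 + 2.0 * n / L) if n <= L else 0.0

def mollifier_coeff(n, L, s):  # Pinsker-type profile, modeling (5.5)
    return max(0.0, 1.0 - (n / L) ** s)
```

For s = 1 the Pinsker-type profile coincides with the Fejér profile, and for large s it approaches the Dirichlet cutoff; the CuP profile additionally has a vanishing derivative at n = L.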
The following theorem shows that the family of mollifiers (5.5) is asymptotically optimal for the class F(s, S).

Theorem 5.4. Let s > (1 + √10)/2 and S > 0, and let the sequence of hyperinterpolations L_N be applicable. Then the sequence of mollifiers ψ^s_{L(M)}, N ∈ N₀, with L(M) chosen as in (6.21), is asymptotically optimal for the estimator E_{N,ψ} for the inversion of the vertical slice transform T of functions f belonging to the class F(s, S). In particular, the maximum risk attains the minimax rate (5.6) for N → ∞. Furthermore, the minimax risk of the MISE is bounded asymptotically for N → ∞ by the two-sided bounds (5.7), both of the order M^{−2s/(2s+3)}.
Remark 5.5. The mollifier approach (5.1) can be generalized to Fourier multipliers, where the coefficients ψ̂(n) in the Funk-Hecke formula (5.2) are replaced by coefficients that depend not only on n but also on k. However, Theorem 5.4 would not change for this approach since the shape of the optimal mollifier ψ^s_N depends only on the Sobolev space H^s(S²), as we will see in Lemma 6.7.
6. Proof of Theorem 5.4. In the following, we prepare the proof of Theorem 5.4, which marks the very end of this section.
Theorem 6.1. The MISE can be decomposed into a bias and a variance term,

(6.1)  E ‖f − E_{N,ψ}(T f + ε)‖²_{L²(S²)} = ‖f − E_{N,ψ}(T f)‖²_{L²(S²)} + E ‖E_{N,ψ}(ε)‖²_{L²(S²)}.

Proof. Since the estimator E_{N,ψ} is linear, we have

E ‖f − E_{N,ψ}(T f) − E_{N,ψ}(ε)‖² = ‖f − E_{N,ψ}(T f)‖² + E ‖E_{N,ψ}(ε)‖² − 2 Re E ⟨f − E_{N,ψ}(T f), E_{N,ψ}(ε)⟩,

where the last summand vanishes because E(E_{N,ψ} ε) = 0.
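The decomposition (6.1) can be illustrated with a toy linear model in place of T: a diagonal operator with decaying singular values and a diagonal filter as estimator. All names and numbers below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
lam = np.arange(1, n + 1) ** -0.5                     # toy singular values
psi = np.clip(1.0 - np.arange(n) / 30.0, 0.0, 1.0)    # a Fejer-type filter
f = 1.0 / np.arange(1, n + 1) ** 2                    # toy ground-truth coefficients
delta = 0.1                                           # noise level

B = psi / lam                                         # diagonal estimator coefficients
bias2 = np.sum((f - B * lam * f) ** 2)                # ||f - E(estimate)||^2
var = delta ** 2 * np.sum(B ** 2)                     # E ||B eps||^2, unit-variance noise

# Monte Carlo estimate of the MISE
trials = 4000
err = np.empty(trials)
for i in range(trials):
    g = lam * f + delta * rng.standard_normal(n)      # noisy data
    err[i] = np.sum((f - B * g) ** 2)
mise = err.mean()
```

The Monte Carlo value mise is close to bias2 + var, in line with (6.1).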
The decomposition (6.1) is well known, see [41, Section 1.2.1]. In the following two subsections, we derive bounds for the variance and bias errors. In Section 6.3, the proof of the optimality of the mollifiers (5.5) follows eventually.

6.1. Variance error.
Lemma 6.2. Let ψ be a polynomial of degree N, let the noise ε satisfy the conditions i)–iii) from Section 5.1, and let the hyperinterpolation L_N be applicable in the sense of Definition 4.2. If N is sufficiently large, the variance term of (6.1) can be estimated by

(6.2)  E ‖E_{N,ψ}(ε)‖²_{L²(S²)} ≤ (γ₂ δ²/M) Σ_{(n,k) ∈ I_N^e} |ψ̂(n)/λ_n^k|².

Proof. By Parseval's equality (2.4) and the uncorrelatedness of the noise ε from Section 5.1,

E ‖E_{N,ψ}(ε)‖² = Σ_{(n,k) ∈ I_N^e} |ψ̂(n)/λ_n^k|² Σ_{m=1}^M w_m² E |ε(x_m)|² |B_n^k(x_m)|².

The applicability condition (4.2) implies (6.2).

The following lemma describes the asymptotic behavior (6.4) of the sum in (6.2) for the mollifier ψ^s_L as L → ∞, where the constant d_s is given in (5.8).
Proof. Let (n, k) ∈ I_e. At first we calculate an asymptotic expression for the singular values λ_n^k. The following version of Stirling's formula was proven in [32]: for n = 1, 2, …,

n! = √(2π) n^{n+1/2} e^{−n} e^{r(n)},  where 0 < r(n) < 1/(12n).

The application of this formula to the singular values yields the approximation (6.5) for |k| < n with an error term R(n, k), which is bounded for k ≥ 0 by

|R(n, k)| < 2/(3(n − k)).
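The error bound 0 < r(n) < 1/(12n) in this version of Stirling's formula is easy to verify numerically via the log-gamma function:

```python
import math

def stirling_remainder(n):
    """r(n) = ln n! - ln( sqrt(2 pi) * n^(n + 1/2) * e^(-n) )."""
    log_factorial = math.lgamma(n + 1)  # ln n!
    log_stirling = 0.5 * math.log(2.0 * math.pi) + (n + 0.5) * math.log(n) - n
    return log_factorial - log_stirling
```

For every n ≥ 1, the remainder lies strictly between 0 and 1/(12n).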
In the next step, we insert the approximation (6.5) and the Taylor series of the exponential function in order to compute the sum (6.6) of the inverse squared singular values over k. The first summand on the right-hand side of (6.6) gives the leading-order term. The second summand on the right-hand side of (6.6) is of order O(n), and since the remaining summands are even smaller than the second, these parts can also be bounded by O(n). Now we insert the Fourier coefficients of the mollifier (5.5) and sum over n, which yields for L → ∞ the claimed asymptotics and proves (6.4).

6.2. Bias error.
With the triangle inequality, we split the bias error into a smoothing part and an aliasing part,

‖f − E_{N,ψ}(T f)‖_{L²(S²)} ≤ ‖f − ψ ⋆ f‖_{L²(S²)} + ‖ψ ⋆ f − E_{N,ψ}(T f)‖_{L²(S²)}.

Smoothing error. If f ∈ H^s(S²) for s ≥ 0, then Parseval's equality (2.4) and (5.2) imply that the smoothing error is bounded by

(6.7)  ‖f − ψ ⋆ f‖²_{L²(S²)} = Σ_{(n,k) ∈ I} (1 − ψ̂(n))² |f̂(n, k)|² ≤ sup_{n ∈ N₀} (1 − ψ̂(n))² (n + 1/2)^{−2s} ‖f‖²_{H^s(S²)}.

In particular, for the mollifier ψ^s_L from (5.5) and a function f ∈ F(s, S), the smoothing error is bounded by S² L^{−2s}.

Aliasing error. The following lemma was proven in [36].

Lemma 6.4. Let g ∈ C(X), N ∈ N₀, and let the hyperinterpolation L_N from (4.1) be exact. Then

‖L_N g‖_{L²(X)} ≤ √(4π) ‖g‖_{C(X)}.

The constant 4π equals the surface area of the manifold X. The following lemma and the theorem thereafter are based on [18], where they were proven similarly for the Sobolev spaces H^s(S²) on the two-sphere.

Lemma 6.5. For s > 1, the Sobolev space H^s_T(X) can be embedded continuously into the space of continuous functions C(X). Let N ∈ N₀ and denote with P_N the L²-orthogonal projection onto span{B_n^k : (n, k) ∈ I_N^e}. Then there exists a constant c > 0 independent of N such that for all g ∈ H^s_T(X),

(6.10)  ‖g − P_N g‖_{C(X)} ≤ c (N + 1/2)^{1−s} ‖g‖_{H^s_T(X)}.

Proof. We show that the Fourier series of g − P_N g converges uniformly. Let (σ, t) ∈ X. By the definition of B_n^k, the Cauchy-Schwarz inequality, and since |P_n(t)| ≤ 1 for all n ∈ N₀, we can bound |g(σ, t) − P_N g(σ, t)| by the Sobolev norm of g times the square root of a tail sum over n > N. Since s > 1, this tail sum converges and is bounded by O((N + 1/2)^{2−2s}). Hence, the Fourier series of g − P_N g converges uniformly and we conclude the existence of a constant c independent of N and g such that (6.10) holds. Analogously, (6.11) follows by a similar calculation. Now we are able to prove the following upper bound for the aliasing error.
Theorem 6.6. Let s > 1, f ∈ H^s_e(S²) and g = T f. Furthermore, let for every N ∈ N₀ the hyperinterpolation L_N be exact. If the mollifier ψ satisfies ψ̂(n) ≤ 1 for all n ∈ N₀, then there exists a constant c > 0 that is independent of f and N such that the aliasing error satisfies the bound (6.12).

Proof. We first note that, because s > 1, g is continuous and hence L_N g is well-defined. We will show an estimate for the hyperinterpolation error, from which the claimed formula (6.12) follows by the assumptions on ψ and (5.2). Let P_N denote the orthogonal projection of L²(X) onto the range of L_N. Since L_N is a projector, we have L_N P_N = P_N and therefore

(6.13)  L_N g = L_N (g − P_N g) + P_N g.

Now we examine both summands of (6.13). For the second summand, we obtain by (6.11) from the previous lemma the bound (6.14). For the first summand of (6.13), we have by (6.8) a bound in terms of ‖L_N(g − P_N g)‖_{L²(X)}. By (3.6), there exists a constant c₁ independent of g and N such that the corresponding estimate (6.15) holds. By Lemma 6.4,

(6.16)  ‖L_N(g − P_N g)‖_{L²(X)} ≤ √(4π) ‖g − P_N g‖_{C(X)}.
By (6.10), there exists a constant c₂ independent of g and N such that (6.17) holds. Combining (6.13), (6.14), (6.15), (6.16), and (6.17) yields the claimed bound, where c is some positive constant. Since P_N is a projection, ‖g − P_N g‖_{H^s_T(X)} ≤ ‖g‖_{H^s_T(X)}, which finally proves the theorem.

6.3. Proof of the optimality.

Lemma 6.7. Let ψ ∈ L²([−1, 1]), s > 1 and S > 0. Furthermore, let the estimator E_{N,ψ} and the noise ε be as in Section 5.1, and let the hyperinterpolation L_N be exact.
Then there exists a largest number L* ≥ 0 such that ψ̂^s_{L*}(n) ≤ |ψ̂(n)| for all n ∈ N₀. Moreover, for every sufficiently large N, the maximum risk (5.4) is bounded from below in terms of L*.

Proof. We first show the existence of L*. The set

𝓛 = { L ≥ 0 : ψ̂^s_L(n) ≤ |ψ̂(n)| for all n ∈ N₀ }

is not empty because 0 ∈ 𝓛. Since ψ ∈ L²([−1, 1]), we have lim_{n→∞} ψ̂(n) = 0. On the other hand, lim_{L→∞} ψ̂^s_L(n) = 1 for all n ∈ N₀. Thus, the set 𝓛 is bounded. Hence, there exists L* = sup 𝓛. Let L_j, j ∈ N₀, be a sequence of elements of 𝓛 converging to L*. Since L ↦ ψ̂^s_L(n) is continuous and by definition ψ̂^s_{L_j}(n) ≤ |ψ̂(n)| for all n ∈ N₀, the latter must also hold for L* instead of L_j, which proves that L* ∈ 𝓛.
We choose an integer n* ∈ N₀ for which ψ̂^s_{L*}(n*) = |ψ̂(n*)|. We define the polynomial f* = S (n* + 1/2)^{−s} Y_{n*}^{k*} for some k* with (n*, k*) ∈ I_e. Hence ‖f*‖_{H^s(S²)} = S and we have f* ∈ F(s, S). By (6.1), the maximum risk is bounded from below by the sum of the bias and variance errors with respect to f*. We choose N to be larger than n*. Since f* is a polynomial of degree n*, we have E_{N,ψ}(T f*) = ψ ⋆ f*. Using the definition of f*, the bias error reads

‖f* − ψ ⋆ f*‖² = |1 − ψ̂(n*)|² S² (n* + 1/2)^{−2s}.

Since |1 − z|² ≥ (1 − |z|)² for all z ∈ C, we have

‖f* − ψ ⋆ f*‖² ≥ (1 − ψ̂^s_{L*}(n*))² S² (n* + 1/2)^{−2s}.

By the definition of L*, we have |ψ̂(n)| ≥ ψ̂^s_{L*}(n) for all n ∈ N₀. Therefore, we obtain a lower bound of the variance by (6.3).

Now we can finally prove the main theorem. The idea behind this proof is to calculate the parameter L such that the variance and smoothing errors are about equal, and to show that the aliasing error is asymptotically smaller if the smoothness parameter s is sufficiently large.
Proof of Theorem 5.4. We note that if N → ∞, the number of points M = M(N) must also go to infinity since L_N is assumed to be exact. If N and thus M is sufficiently large, then, by Lemma 6.7, there exists an L ≥ 0 such that the maximum risk (5.4) is bounded from below by (6.19); plugging in (6.4) and (6.2) turns this into (6.20), a sum of a smoothing term of order S² L^{−2s} and a variance term of order γ₁ δ² d_s L³/M. The constant d_s is given in (5.8). We want to minimize the right-hand side of (6.20) with respect to L and denote the optimal argument by L(M).
In order to minimize (6.20), we compute the zeros of its derivative with respect to L, which simplifies to the condition L^{2s+3} = 2s S² M/(3 γ₁ δ² d_s). Hence, (6.20) is asymptotically minimized for M → ∞ by the choice

(6.21)  L₁(M) = ( 2s S² M / (3 γ₁ δ² d_s) )^{1/(2s+3)}.

Plugging this value L₁(M) for L into (6.20) yields an asymptotic lower bound of the maximum risk of the order M^{−2s/(2s+3)} for M → ∞, which gives the first inequality of (5.7). Here we also see that, for the choice L₁(M), both terms of the sum on the right-hand side of (6.20) decrease of the order M^{−2s/(2s+3)}.
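The balancing step can be reproduced numerically with a toy objective S² L^{−2s} + d L³/M, whose constants stand in for those of the paper; its minimizer indeed scales like M^{1/(2s+3)}:

```python
import numpy as np

def optimal_L(M, s=2.0, S=1.0, d=1.0):
    """Minimizer of S^2 L^(-2s) + d L^3 / M over L > 0.
    Setting the derivative to zero gives L^(2s+3) = 2 s S^2 M / (3 d)."""
    return (2.0 * s * S ** 2 * M / (3.0 * d)) ** (1.0 / (2.0 * s + 3.0))

# empirical scaling exponent of L(M), estimated from two sample sizes
s = 2.0
M1, M2 = 1.0e4, 1.0e8
exponent = np.log(optimal_L(M2, s) / optimal_L(M1, s)) / np.log(M2 / M1)
```

For s = 2, the exponent equals 1/(2s + 3) = 1/7.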
In the second part of the proof, we derive the upper bound of the maximum risk with the mollifier ψ^s_L. By the decomposition (6.1) combined with the upper bound (6.7), we have for f ∈ F(s, S) and any L > 0 a bound consisting of the smoothing, aliasing and variance errors. By (6.4) and Theorem 6.6, there exists a constant c > 0 such that the maximum risk is bounded from above by (6.22), consisting of the smoothing term S² L^{−2s}, the aliasing term c (N + 1/2)^{3/2−s} S, and the variance term γ₂ δ² d_s L³/M, up to lower-order terms. Now we plug in L₁(M) from the first part of the proof and obtain

(6.23)

in which the smoothing and variance terms decrease of the order M^{−2s/(2s+3)}.
By the applicability of the hyperinterpolations, we have M = O(N²). We have also assumed that s > (1 + √10)/2, which implies 3/2 − s < −2s/(2s + 3). Hence, only those terms in the sum on the right-hand side of (6.23) that decrease like M^{−2s/(2s+3)} play a significant role as M goes to infinity. Thus, the term c (N + 1/2)^{3/2−s} S, which comes from the aliasing error, is asymptotically negligible for M → ∞. This yields the upper bound of the order M^{−2s/(2s+3)} as N and thus M(N) approaches infinity, which proves the second inequality of (5.7).
It remains to show the asymptotic optimality (5.6) of the family of mollifiers ψ^s_L to complete the proof. When we insert the optimal value L(M) into both (6.19) and (6.22), those two bounds coincide except for the aliasing term; but since L(M) grows at the same asymptotic order as L₁(M), the aliasing part is again negligible.

7. Numerical tests.

7.1. Spherical Fourier algorithm.
The algorithm. The estimator E_{N,ψ} g can be computed numerically in a fast and stable way. To this end, we decompose (5.3) into a three-step process. Suppose we are given a quadrature rule on X with nodes x_m ∈ X and weights w_m as well as function values g(x_m) for m = 1, …, M.

i) Compute the Fourier coefficients of g,

(7.1)  ĝ(n, k) = Σ_{m=1}^M w_m g(x_m) conj(B_n^k(x_m)),  (n, k) ∈ I_N^e.

If the quadrature has a tensor product structure like the one in Theorem 4.3, this computation can be accelerated, yielding a complexity of O(N² log² N) operations, which we show below.

ii) Perform the inversion and regularization in Fourier space,

(7.2)  f̂(n, k) = (ψ̂(n)/λ_n^k) ĝ(n, k),  (n, k) ∈ I_N^e,

which needs O(N²) arithmetic operations.

iii) Compute the estimator at some nodes ξ_j ∈ S² from its Fourier coefficients,

E_{N,ψ} g (ξ_j) = Σ_{(n,k) ∈ I_N^e} f̂(n, k) Y_n^k(ξ_j),  j = 1, …, J.

This computation can also be done in O(N² log² N) steps using the nonequispaced fast spherical Fourier transform (NFSFT) from [23], provided that the number of evaluation points satisfies J ∈ O(N²). This algorithm has an overall numerical complexity of O(N² log² N).
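Step i) can be sketched for a tensor-product grid with a naive (non-fast) transform; the grid sizes and the Gauss-Legendre rule in t are our illustrative choices, not the fast algorithm of the paper:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

N = 8
K = 2 * N + 1
sigma = 2.0 * np.pi * np.arange(K) / K   # equispaced angles on the torus
t, wt = leggauss(N + 1)                  # Gauss-Legendre nodes/weights in t

def basis_B(n, k, sig, tt):
    """B_n^k(sigma, t) = sqrt((2n+1)/(4 pi)) e^{i k sigma} P_n(t)."""
    Pn = legval(tt, np.eye(n + 1)[n])    # P_n(t)
    return np.sqrt((2 * n + 1) / (4 * np.pi)) * np.exp(1j * k * sig) * Pn

# sample a test function g = B_3^2 on the tensor grid
S, T = np.meshgrid(sigma, t, indexing="ij")
g = basis_B(3, 2, S, T)

def coeff(n, k):
    """Step i): ghat(n, k) = sum_m w_m g(x_m) conj(B_n^k(x_m)),
    with the tensor-product weights (2 pi / K) * wt_j."""
    W = (2.0 * np.pi / K) * wt[None, :]
    return np.sum(W * g * np.conj(basis_B(n, k, S, T)))
```

Here coeff(3, 2) recovers 1 and the other coefficients vanish up to rounding, confirming that the tensor rule is exact for these products.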
The FFT of length 2N + 1 has a numerical complexity of O(N log N ). We compute N + 1 such FFTs, which yields a complexity of O(N 2 log N ).
In equation (7.3), the sum over j is a discrete Legendre transform of length N + 1, which is performed 2N + 1 times (for every k = −N, …, N). Discrete Legendre transforms of length N + 1 can be computed in O(N log² N) operations [29]. So the computation of (7.1) has a complexity of O(N² log² N).

7.2. Inversion via the Radon transform in the plane. The vertical slice transform on the sphere is closely related to the Radon transform in the plane, which was first described in [31]. Before we show this connection, we need the following lemma, which gives a parametric formula for computing the vertical slice transform T.

Lemma 7.1. Let f ∈ C(S²) and (σ, t) ∈ X. Then

T f(σ, t) = (1/2π) ∫_0^{2π} f( t cos σ − √(1 − t²) cos φ sin σ, t sin σ + √(1 − t²) cos φ cos σ, √(1 − t²) sin φ ) dφ.
Proof. Let t ∈ [−1, 1]. We first look at the case σ = 0. Then the integration domain of (3.1) is a circle perpendicular to the ξ₁-axis with center (t, 0, 0)ᵀ and radius √(1 − t²), and the definition of T is normalized with respect to the circumference of this circle. Parametrizing this circle by the angle φ yields the claimed formula for σ = 0. The general case follows by rotation around the ξ₃-axis with the angle σ.
Theorem 7.2. For a function f : S² → C that is even in the third coordinate, denote with f̃ its weighted orthogonal projection onto the ξ₁-ξ₂-plane defined in Section 3. Then

T f(σ, t) = R f̃(σ, t),  (σ, t) ∈ X,

where R denotes the Radon transform in the plane. The function f̃ can be interpreted as follows: the function f defined on the sphere is projected orthogonally onto the ξ₁-ξ₂-plane and then divided by the weight π √(1 − ξ₁² − ξ₂²). This projection is well-defined since f is even with respect to ξ₃.
Proof. According to Lemma 7.1 and because f is even in the third component, T f can be expressed in terms of f̃. Performing the substitution y = −√(1 − t²) cos φ turns the circle mean into a line integral, which is exactly R f̃(σ, t).

This theorem gives a convenient way of reconstructing a function f ∈ C(S²) given its vertical slice transform T f: we first compute the inverse Radon transform of T f and then multiply it by π √(1 − ξ₁² − ξ₂²). However, if the function f does not vanish on the equator, the respective function f̃(ξ₁, ξ₂) grows infinitely near the boundary of the unit disc.
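The relation between T f and the Radon transform of the weighted projection f̃ can be checked numerically for a simple even test function; the sketch below (our own naming) computes the circle mean on the sphere and, independently, the line integral of the weighted projection with a substitution that removes the boundary singularity:

```python
import numpy as np

def slice_mean(f, sigma, t, n=4000):
    """T f(sigma, t): mean of f over the circle {xi in S^2 : <xi, e_sigma> = t}."""
    phi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    r = np.sqrt(1.0 - t * t)
    x = t * np.cos(sigma) - r * np.cos(phi) * np.sin(sigma)
    y = t * np.sin(sigma) + r * np.cos(phi) * np.cos(sigma)
    z = r * np.sin(phi)
    return np.mean(f(x, y, z))

def radon_of_projection(f, sigma, t, n=4000):
    """R ftilde(sigma, t) with ftilde = f(projected)/(pi sqrt(1-x^2-y^2)),
    integrated along the line <(x,y), e_sigma> = t via the offset y' = r sin(u)."""
    u = (np.arange(n) + 0.5) * np.pi / n - np.pi / 2   # midpoints in (-pi/2, pi/2)
    r = np.sqrt(1.0 - t * t)
    x = t * np.cos(sigma) - r * np.sin(u) * np.sin(sigma)
    y = t * np.sin(sigma) + r * np.sin(u) * np.cos(sigma)
    z = np.sqrt(np.maximum(1.0 - x * x - y * y, 0.0))  # upper sphere point height
    ftilde = f(x, y, z) / (np.pi * z)                  # weighted projection of even f
    return np.sum(ftilde * r * np.cos(u)) * (np.pi / n)  # dy' = r cos(u) du

f = lambda x, y, z: x * x + 0.5 * y + 1.0              # even in the third coordinate
```

Both routes give the same value of T f(σ, t), as predicted by the theorem.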
The inversion of the Radon transform has been treated by many authors, see e.g. [31, 35, 21, 27, 30, 42, 37]. For our comparison, we focus on two algorithms: the standard filtered back projection (FBP) algorithm [21] and an algorithm based on orthogonal polynomial expansion on the unit disc (OPED) [42]. One important difference between the two algorithms is that the standard filtered back projection algorithm samples the Radon transform Rf(σ, t) uniformly in t, while the OPED algorithm requires sampling at Chebyshev points. For the standard backprojection algorithm, we use the iradon routine from the Matlab imaging toolbox, and for the algorithm based on orthogonal polynomial expansion on the unit disc, we use the so-called fast OPED algorithm [43]. The numerical complexity of both algorithms is O(N³), provided we have sampled Rf(σ, t) at O(N²) points and we compute f at the same number of points.

7.3. Comparison of the algorithms. In this section, we present numerical results to compare our NFSFT algorithm based on spherical harmonics with algorithms based on the Radon transform in the plane.
Smooth test function. We chose the test function

(7.6)  f(ξ) = sin(9(ξ₁ − ξ₂²)) exp(−ξ₁⁴ − ξ₂²) cos(5ξ₂) + 1,  ξ ∈ S².

Since f is even with respect to ξ₃, it suffices to consider the projection of f onto the ξ₁-ξ₂-plane, which is depicted in Figure 2. Its vertical slice transform T f is computed by a quadrature applied to (3.1). Although the forward data could also be generated by formally reverting the reconstruction algorithms we want to compare, we apply a simple quadrature-based method for the computation of the forward transform in order to avoid an inverse crime.
At first, we apply the three algorithms to undisturbed data. The NFSFT and FBP algorithms are applied to the same number of M = 513 · 257 = 131 841 data points, while the OPED algorithm uses 513² = 263 169 points. However, while the grid for the FBP algorithm is equally spaced, the grids for the OPED algorithm as well as for our NFSFT algorithm based on spherical harmonics are denser close to the boundary t = ±1. In Figure 3, the reconstruction errors of the three methods are illustrated. One can clearly see the main disadvantage of the FBP algorithm, which is due to the function f̃ being unbounded near the boundary of the unit disc. Even though f̃ is multiplied by √(1 − ξ₁² − ξ₂²), the error of the reconstruction is still considerably larger near the boundary. The smaller error of the OPED reconstruction close to the boundary of the disc compared to the FBP algorithm is due to the different sampling and the fact that the former is specifically designed for functions supported on the disc. Nevertheless, the OPED reconstruction does not meet the accuracy of the NFSFT algorithm based on spherical harmonics.
For our next test, we add Gaussian noise with a standard deviation of 0.01 to the data. In Figure 4, one can see the reconstruction error for the NFSFT algorithm without any regularization, which is the same as using the Dirichlet kernel of degree N = 256 as mollifier. Figure 4b shows the error with regularization, where the regularization parameter L is chosen optimally to minimize the L^2-error. The reconstruction with the FBP algorithm shown in Figure 4c looks slightly better than the one with the unregularized NFSFT algorithm, as long as we stay away from the boundary. For the FBP algorithm, we used the Hann window.

To illustrate the minimax rate of the MISE from Theorem 5.4, we did the following. We chose the smoothness parameter s = 2 and computed the norm S = ‖f‖_{H^2(S^2)} ≈ 166 numerically. Then, for different degrees N, we calculated the parameter L as in (6.21) with γ_1 = γ_2 = 1, reconstructed the function f at the Gauss-Legendre nodes using the NFSFT algorithm as above with the calculated value of L, and computed the integrated squared error ‖f − E_{N,ψ}‖^2 via quadrature. In order to obtain an estimate of the MISE, we repeated this procedure 20 times, each time with a new instance of the random error ε, computed the mean of the integrated squared errors, and compared it with the theoretical minimax rate (5.7). As one can see in Figure 5, the achieved error almost coincides with the minimax rate. In fact, the achieved error is slightly higher than the minimax rate, which may be explained by two facts: we set γ_2 = 1, although this value might actually be slightly larger than one, and inaccuracies caused by rounding errors occur during the numerical computation.
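The Monte-Carlo estimation of the MISE can be illustrated with a toy sequence-space model. This is a hypothetical stand-in for the spherical reconstruction, not the paper's setup: we observe noisy coefficients of a Sobolev-type decay, truncate at a level L (the analogue of the regularization parameter), and average the integrated squared error over repeated noise realizations, exactly as in the 20-trial procedure above.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256                                  # number of observed coefficients
c = (1.0 + np.arange(N + 1)) ** -2.0     # true coefficients with Sobolev-type decay
sigma = 0.1                              # noise level
L = 10                                   # truncation (regularization) level
trials = 20                              # noise realizations, as in the experiment

ise_unreg, ise_reg = [], []
for _ in range(trials):
    y = c + sigma * rng.standard_normal(N + 1)   # noisy data
    # Unregularized estimator keeps all coefficients; by Parseval the
    # integrated squared error equals the squared coefficient error.
    ise_unreg.append(np.sum((y - c) ** 2))
    # Truncated estimator keeps only coefficients up to level L.
    chat = np.where(np.arange(N + 1) <= L, y, 0.0)
    ise_reg.append(np.sum((chat - c) ** 2))

mise_unreg = np.mean(ise_unreg)   # approx. (N+1)*sigma^2, pure variance
mise_reg = np.mean(ise_reg)       # approx. (L+1)*sigma^2 plus a small tail bias
```

The truncated estimator trades a small bias (the discarded tail of c) for a large variance reduction, which is the same bias-variance balance that the choice of L in (6.21) optimizes.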
Non-smooth test function. For this test, we chose a function whose vertical slice transform is known analytically. In [40], explicit formulas for the Radon transform of some simple functions are given. If we take such a function f̃(ξ_1, ξ_2) supported in the unit disc ξ_1^2 + ξ_2^2 < 1 in the plane, we can use the construction from Theorem 7.2 to define a test function on the sphere. We start with

h(ξ_1, ξ_2) = 1 − √(ξ_1^2 + ξ_2^2) if ξ_1^2 + ξ_2^2 < 1, and h(ξ_1, ξ_2) = 0 otherwise,

whose graph is a circular cone with height 1 and radius 1, centered at the origin. Since h is radially symmetric with respect to the origin, its Radon transform Rh(σ, t) is independent of σ. Hence, we can choose σ = 0, see that Rh(σ, t) is the integral along the line ξ_1 = t, and obtain

Rh(σ, t) = √(1 − t^2) − t^2 Arcsch(|t| / √(1 − t^2)), 0 < |t| < 1,

for all (σ, t) ∈ X, where Arcsch denotes the inverse hyperbolic cosecant function, see [2, Section 4.6]. To define the function f̃, we first scale h with the factors a, b > 0 in the directions ξ_1 and ξ_2, then rotate it through the angle α about the origin, and shift it to (ζ_1, ζ_2) ∈ R^2. We set

(7.7) f̃(ξ_1, ξ_2) := h( ((ξ_1 − ζ_1) cos(α) + (ξ_2 − ζ_2) sin(α)) / a, (−(ξ_1 − ζ_1) sin(α) + (ξ_2 − ζ_2) cos(α)) / b ).

Then, by [40, Appendix B], its Radon transform is known explicitly. We made a particular choice of f̃_0 as a linear combination of functions of the form (7.7), which is illustrated in Figure 6. Since the Radon transform is linear, we also have an explicit formula for the vertical slice transform T f_0 = R f̃_0.

Figure 6. Non-smooth test function f_0 (left) and its vertical slice transform T f_0 (right) for (ξ_1, ξ_2) ∈ B^2.

The reconstructions of f_0 with the three algorithms for exact data are plotted in Figure 7. The FBP algorithm yields much better results than before. The overshoot of the FBP reconstruction around the boundary of the disc does not occur here because the test function f_0 vanishes around the equator (or, equivalently, f̃_0 vanishes at the boundary of the unit disc).
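The cone's Radon transform is easy to check numerically. The sketch below compares a direct line integral of the cone along ξ_1 = t with the closed form √(1 − t^2) − t^2 Arcsch(|t|/√(1 − t^2)), which follows by elementary integration, using the identity arcsch(z) = arcsinh(1/z). The function names are our own.

```python
import numpy as np
from scipy.integrate import quad

def cone(x, y):
    """Circular cone of height 1 and radius 1 centered at the origin."""
    r = np.hypot(x, y)
    return np.where(r < 1, 1 - r, 0.0)

def radon_cone_closed_form(t):
    """Radon transform of the cone along the line xi_1 = t, 0 < |t| < 1:
    sqrt(1 - t^2) - t^2 * Arcsch(|t| / sqrt(1 - t^2)),
    written via arcsch(z) = arcsinh(1/z)."""
    s = np.sqrt(1 - t**2)
    return s - t**2 * np.arcsinh(s / np.abs(t))

t = 0.5
s = np.sqrt(1 - t**2)
# Direct line integral of the cone over the chord |y| <= sqrt(1 - t^2)
numeric, _ = quad(lambda y: cone(t, y), -s, s)
```

The two values agree to quadrature precision; since the cone is radially symmetric, the result is independent of the slice direction σ, as stated above.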
However, it still does not achieve the accuracy of the NFSFT method. Moreover, the error of the FBP reconstruction is concentrated where f is not smooth, while this concentration is not as pronounced in the NFSFT reconstruction. The accuracy of the OPED algorithm is comparable to that of the FBP algorithm.
Appendix A. Proof of the applicability of the Fejér quadrature. This section is devoted to the proof of Theorem 4.3. Due to the exactness of the equidistant quadrature on the torus and the Fejér quadrature on the unit interval, the hyperinterpolation L_N from (4.3) is also exact. Furthermore, we have M(N) = (2N + 1)^2 ∈ O(N^2). So the first two conditions of Definition 4.2 are satisfied. Proving the last condition requires more work. Since |B_n^k(σ, t)|^2 = ((2n + 1)/(4π)) |P_n(t)|^2, we observe (A.1).

Figure 7. Reconstruction of the non-smooth test function from exact data.

In order to prove (4.2), we have to show that the expression (A.1) is equal to 4π/(2N + 1)^2 up to some constants. We are going to do this with the help of the following two lemmas. Consider the partial sum

S_N(θ) = Σ_{r=1}^{N} sin((2r − 1)θ) / (2r − 1), θ ∈ (−π, π),

which is equal to the sum in (4.4) if θ = θ_j^{2N+1}. The function S_N is the N-th partial sum of the Fourier series of the step function θ ↦ (π/4) sgn(θ), θ ∈ (−π, π), where sgn denotes the sign function, see [15, 1.442]. So S_N(θ) converges for N → ∞ to the constant π/4 for all θ ∈ (0, π), but this convergence is not uniform in θ. In fact, S_N(θ) oscillates heavily around θ = 0 and θ = π, which is known as the Gibbs phenomenon. In [19], it was shown that S_N(θ_j^{2N+1}) = γ_{N,j} · π/4 with constants γ_{N,j} satisfying γ_1 < γ_{N,j} < γ_2 for all j = 1, ..., 2N + 1 and N ∈ N_0. Hence, the values S_N(θ_j^{2N+1}) are bounded from above and away from zero uniformly in N and j.

The Gauss-Chebyshev quadrature of the second kind (see [2]) uses the same nodes cos θ_j^{2N+1} as the Fejér quadrature and the weights (π/(2N + 2)) (sin θ_j^{2N+1})^2. Because the (2N + 1)-point Gauss-Chebyshev quadrature of the second kind is exact of degree 4N + 1 for the integration with respect to the measure √(1 − t^2) dt, we obtain the corresponding quadrature identity for n ≤ N.

Proof. This proof is based on the asymptotic approximation of the Legendre polynomials by Stieltjes' generalization of the Laplace-Heine formula, cf.
[38, 8.21], which states that for n → ∞

P_n(cos θ) = √(2/(πn)) · cos((n + 1/2)θ − π/4) / √(sin θ) + O((n sin θ)^{−3/2}), 0 < θ < π.
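The quality of this leading-order approximation is easy to check numerically against exact Legendre values; the helper name below is our own choice.

```python
import numpy as np
from scipy.special import eval_legendre

def laplace_heine(n, theta):
    """Leading term of the Laplace-Heine/Stieltjes asymptotic for P_n(cos theta):
    sqrt(2/(pi*n)) * cos((n + 1/2)*theta - pi/4) / sqrt(sin(theta))."""
    return np.sqrt(2 / (np.pi * n * np.sin(theta))) * np.cos((n + 0.5) * theta - np.pi / 4)

n, theta = 500, 1.0
exact = eval_legendre(n, np.cos(theta))   # exact Legendre polynomial value
approx = laplace_heine(n, theta)
# The remainder is O((n*sin(theta))^{-3/2}), so the absolute error
# is far below the amplitude ~ n^{-1/2} of P_n itself.
```

For n = 500 and θ = 1 the absolute error is well below 10^{-3}, consistent with the stated remainder term.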
Using this equation and the integral formula [15, 2.532.1], we obtain for n ≥ 1