The radial mass-subcritical NLS in negative order Sobolev spaces

We consider the mass-subcritical NLS in dimensions $d\geq 3$ with radial initial data. In the defocusing case, we prove that any solution that remains bounded in the critical Sobolev space throughout its lifespan must be global and scatter. In the focusing case, we prove the existence of a threshold solution that has a compact flow.


Introduction
We consider the large-data critical problem for the mass-subcritical nonlinear Schrödinger equation (NLS):

$$(i\partial_t + \Delta)u = \mu|u|^p u, \qquad u|_{t=0}\in\dot H^{s_c}(\mathbb{R}^d). \tag{1.1}$$

Here $u:\mathbb{R}_t\times\mathbb{R}^d_x\to\mathbb{C}$ for some $d\geq 3$, $\mu\in\{\pm1\}$, and $s_c := \frac d2-\frac2p$. This problem is critical in the sense that the space of initial data is invariant under the rescaling that preserves the class of solutions, namely, $u(t,x)\mapsto\lambda^{2/p}u(\lambda^2t,\lambda x)$. Mass-subcriticality then refers to the restriction $0<p<\frac4d$, in which case $s_c<0$. The problem (1.1) is known to be ill-posed in $\dot H^{s_c}(\mathbb{R}^d)$ (cf. [3,19]). However, if one restricts to radial (i.e. spherically-symmetric) data, one can recover local well-posedness for $d\geq 3$ and $p>\frac{4}{d+1}$ (see [14,15]). For technical reasons to be detailed below, our main results do not address this entire range for all dimensions $d\geq 3$. Instead, we consider $p_0(d)<p<\frac4d$, where $p_0(d)$ is the threshold defined in (1.2).

Equation (1.1) with $\mu=1$ is called defocusing, while $\mu=-1$ corresponds to the focusing case. Our main result for the defocusing equation is the following theorem.
Then $I_{\max}=\mathbb{R}$ and $u$ scatters in both time directions, that is, there exist $u_\pm\in\dot H^{s_c}_x$ such that

$$\lim_{t\to\pm\infty}\|u(t)-e^{it\Delta}u_\pm\|_{\dot H^{s_c}_x(\mathbb{R}^d)}=0.$$

Remark 1.2. See Definition 2.14 for the precise notion of solution used here. The scattering result in Theorem 1.1 is a consequence of the fact that the solution $u$ will be shown to satisfy a critical space-time bound, in terms of some function $C(\cdot)$, on $\|u\|_{S(\mathbb{R})}$, the scattering norm defined in Section 2.4.
We will prove Theorem 1.1 via the concentration-compactness approach to induction on energy. As such, this result constitutes the first successful application of this type of analysis in negative-regularity Sobolev spaces.
While the well-posedness theory for the mass-subcritical NLS is well-studied (see, for example, [2] for a textbook treatment), there are very few long-time large-data critical results available and none in the setting of negative-regularity Sobolev spaces. Our previous work [20] established a large-data critical result in weighted spaces; see also [32]. Specifically, in [20] we proved that any solution $u$ to (1.1) with $d\geq1$ and $\max\{\frac2d,\frac{4}{d+2}\}<p<\frac4d$ that satisfies $\||x|^{|s_c|}e^{-it\Delta}u(t)\|_{L^\infty_tL^2_x}<\infty$ on its maximal lifespan must be global and scatter in the sense that $|x|^{|s_c|}e^{-it\Delta}u(t)$ converges in $L^2$ as $t\to\pm\infty$. In fact, this result was shown to hold in both the defocusing and focusing settings.
We would like to point out that when compared with Theorem 1.1, the result in [20] has strictly stronger hypotheses and conclusions. Indeed, by unitarity of $e^{it\Delta}$, Sobolev embedding, and Hölder's inequality, one has

$$\||\nabla|^{s_c}u(t)\|_{L^2_x}\lesssim\|e^{-it\Delta}u(t)\|_{L^{\frac{dp}{2},2}_x}\lesssim\||x|^{|s_c|}e^{-it\Delta}u(t)\|_{L^2_x}.$$

However, employing Theorem 1.1 and persistence of regularity type arguments, we can improve the result in [20] for a range of nonlinearities in the radial defocusing case in the following sense: if $u_0$ is radial with $|x|^{|s_c|}u_0\in L^2$ and the corresponding solution remains bounded in $\dot H^{s_c}_x$ throughout its lifespan, then the solution scatters in the weighted norm. For more details, see Section 2.6 below.
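The Lebesgue exponent $\frac{dp}{2}$ appearing in the chain of inequalities above is the one dictated by scaling. As a quick consistency check, here is a sketch of the arithmetic; it uses only the definition of $s_c$ and the standard dual form of Sobolev embedding in Lorentz spaces, and the auxiliary exponent $r$ is notation introduced here for illustration.

```latex
% Dual Sobolev embedding L^{r,2} \hookrightarrow \dot H^{s_c} (with -d/2 < s_c < 0) requires
\frac1r \;=\; \frac12-\frac{s_c}{d} \;=\; \frac12+\frac{|s_c|}{d}.
% Inserting s_c = d/2 - 2/p:
\frac1r \;=\; \frac12-\frac1d\Bigl(\frac d2-\frac2p\Bigr) \;=\; \frac{2}{dp},
\qquad\text{so}\qquad r=\frac{dp}{2}.
% The Hoelder step pairs |x|^{-|s_c|} \in L^{d/|s_c|,\infty} with L^2, and the exponents match:
\frac{2}{dp} \;=\; \frac{|s_c|}{d}+\frac12 .
```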
As mentioned above, we prove Theorem 1.1 via the concentration-compactness approach to induction on energy. To this end, we introduce the quantity $E_c$ of (1.3) (cf. [23,31]), where the infimum is taken over all radial maximal-lifespan solutions that do not scatter forward in time. Note that the small-data theory implies $E_c\in(0,\infty]$ (cf. Corollary 2.17). Theorem 1.1 may then be rephrased simply as $E_c=\infty$ for the appropriate range of $d$, $p$. The strategy of proof will be to derive a contradiction under the assumption $E_c<\infty$.
The advantage of working with the quantity $E_c$ is that it allows us to treat the defocusing and focusing problems in parallel. Note that in the focusing case, one does have $E_c<\infty$. Indeed, if $Q$ is the ground state solution to $-\Delta Q+Q=Q^{p+1}$, then $e^{it}Q(x)$ is a global non-scattering radial solution to (1.1) with $\mu=-1$ that is uniformly bounded in $\dot H^{s_c}_x$. In particular, $E_c\leq\|Q\|_{\dot H^{s_c}}$. We will prove that in this case, there exist minimizers to (1.3).
Then $E_c\leq\|Q\|_{\dot H^{s_c}}$, and there exists a solution that attains $E_c$.
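That $e^{it}Q$ solves the focusing equation can be checked by direct substitution. The following short computation is a sketch under the assumption (standard for ground states, but not restated in the text) that $Q$ is real-valued and positive, so that $|e^{it}Q|=Q$.

```latex
% Set u(t,x) = e^{it} Q(x), with Q > 0 solving -\Delta Q + Q = Q^{p+1}.
i\partial_t u = i\cdot i\,e^{it}Q = -e^{it}Q,
\qquad
\Delta u = e^{it}\Delta Q = e^{it}\bigl(Q-Q^{p+1}\bigr).
% Adding the two contributions:
(i\partial_t+\Delta)u = -e^{it}Q^{p+1} = -|u|^{p}u,
% which is (1.1) with \mu = -1.
```

Since $|u(t,x)|=Q(x)$ for all $t$, the $\dot H^{s_c}_x$ norm of this solution is constant in time, which is consistent with the bound $E_c\leq\|Q\|_{\dot H^{s_c}}$.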
Moreover, we will prove that there exists a 'soliton-like' minimizer of (1.3); see Theorem 1.7 below. However, the connection between these solutions and the ground state Q (if any) is not clear at this time. We discuss a related question after the statement of Theorem 1.7 below.
In both the defocusing and focusing cases, we will show that if $E_c<\infty$, then there exist almost periodic solutions that attain $E_c$.

Definition 1.4 (Almost periodic). A non-zero radial solution $u:I\times\mathbb{R}^d\to\mathbb{C}$ to (1.1) is called almost periodic if $u\in L^\infty_t\dot H^{s_c}_x(I\times\mathbb{R}^d)$ and there exist functions $N:I\to(0,\infty)$ and $C:(0,\infty)\to(0,\infty)$ such that for any $\eta>0$,

We then have the following:

Theorem 1.5 (Minimal counterexamples). Let $\mu\in\{\pm1\}$, $d\geq3$, and $\frac{4}{d+1}<p<\frac4d$. If $E_c<\infty$, then there exists a radial almost periodic solution $u:I\times\mathbb{R}^d\to\mathbb{C}$ that attains $E_c$ and fits into one of the following scenarios:
• Self-similar scenario: $I=(0,\infty)$ and $N(t)=t^{-1/2}$.
• Cascade scenario: $I=\mathbb{R}$, $\sup_{t\in\mathbb{R}}N(t)\leq1$, and there exists a sequence of times along which $N(t)\to0$.
• Soliton scenario: $I=\mathbb{R}$ and $N(t)\equiv1$.
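The name "self-similar" reflects the fact that the profile $N(t)=t^{-1/2}$ is unchanged by the scaling symmetry of (1.1). The sketch below assumes the usual convention that $N(t)$ records the frequency scale of $u(t)$, so that a rescaled solution has frequency scale $\lambda N(\lambda^2t)$; the symbols $u^\lambda$ and $N^\lambda$ are notation introduced here.

```latex
% Rescaled solution and its frequency scale (assumed convention):
u^{\lambda}(t,x)=\lambda^{2/p}u(\lambda^{2}t,\lambda x),
\qquad
N^{\lambda}(t)=\lambda\,N(\lambda^{2}t).
% For the self-similar profile N(t) = t^{-1/2}:
N^{\lambda}(t)=\lambda\,(\lambda^{2}t)^{-1/2}=t^{-1/2}=N(t)
\quad\text{for every }\lambda>0 .
```

By contrast, the soliton profile $N(t)\equiv1$ is carried to $N^\lambda(t)\equiv\lambda$, so in that scenario the normalization $N\equiv1$ fixes the scale.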
Note that Theorem 1.5 readily implies Theorem 1.3, since $E_c\leq\|Q\|_{\dot H^{s_c}_x}$ in the focusing case. The existence of an almost periodic solution that attains $E_c$ follows along standard lines (see e.g. [23,25]); however, as we are working in negative-regularity Sobolev spaces, some details of the argument change, particularly those pertaining to concentration compactness. We discuss the relevant details in Section 3. Once the existence of a minimal almost periodic solution is established, the arguments in [21,44] carry over to further reduce attention to the three specific solutions recorded above.
The proof of Theorem 1.1 therefore reduces to precluding the possibility of these three scenarios in the defocusing case. Our arguments are similar in spirit to those originally used to treat the radial mass-critical problem (cf. [21,44]). The key to precluding each of these three scenarios is to exploit almost periodicity in order to establish additional regularity. For the self-similar case, it is enough to show that solutions belong to $L^2$. Indeed, $L^2$ solutions to the mass-subcritical NLS are automatically global, while self-similar solutions blow up at $t=0$. In the remaining two scenarios, we are able to prove that the solution is in fact bounded in $H^{1/2}_x$; it is here that we encounter the restriction $p>p_0(d)$. This allows us to prove that cascade solutions have zero mass, yielding a contradiction. In fact, as the arguments precluding the first two scenarios are equally valid in the focusing case, we deduce the following:

In the defocusing case, we can preclude the possibility of a solution as in Theorem 1.6 and hence conclude the proof of Theorem 1.1. To do this, we utilize the Lin-Strauss Morawetz inequality, which relies on the defocusing nature of the nonlinearity. While we can rule out self-similar and cascade solutions for the focusing problem, we cannot rule out solitons. Thus we arrive at the following extension of Theorem 1.3, which is our last result for the focusing problem.
Then $E_c\leq\|Q\|_{\dot H^{s_c}}$ and there exists an almost periodic solution to (1.1) that attains $E_c$ and has the properties given in Theorem 1.6.
We emphasize that Theorem 1.7 is a true existence theorem. In previous studies of the focusing mass- and energy-critical NLS (e.g. [9,23]), the above approach is used to yield a contradiction to the stronger and false assumption that $E_c$ is strictly smaller than the size of the ground state measured in the appropriate critical Sobolev norm. In our case, we do not know whether or not the identity $E_c=\|Q\|_{\dot H^{s_c}}$ (1.4) is true. Whether or not this holds in the mass-subcritical case is an interesting open question.
As mentioned above, the analogue of (1.4) was proved by contradiction for the focusing mass- and energy-critical problems. The key ingredient to preclude the soliton scenario in those settings is a variational characterization of ground states. Unfortunately, there is no characterization of the ground state to the mass-subcritical problem in the negative-regularity Sobolev space $\dot H^{s_c}_x$. One reason to believe (1.4) might hold is that it is consistent with the soliton resolution conjecture, which states that a generic solution to (1.1) decomposes into a sum of solitons plus radiation. This implies that a generic non-scattering solution will contain at least one soliton component. As one expects the soliton components and the radiation to be mutually asymptotically orthogonal around $t=\sup I_{\max}$, one might expect that $E_c$ is attained by the pure one-soliton solution, which would imply (1.4). Even belief in this conjecture does not fully settle the matter, since it does not address the behavior of certain non-generic solutions, such as excited-state solitons or other threshold behaviors. It is equally possible that the minimizer in Theorem 1.7 is such an exceptional solution.
In fact, there are other reasons to believe that (1.4) may fail. In [29,30,32], it is shown that there exists a threshold solution in the critical weighted space (which is contained in $\dot H^{s_c}_x$) that has minimal weighted norm at a fixed time among all non-scattering solutions. The threshold is strictly smaller than the weighted norm of the ground state. However, this does not necessarily imply failure of (1.4), as minimality at a fixed time is not always equivalent to minimality around $t=\sup I_{\max}$ (see [31]). For further analysis of this type of minimization problem for mass-subcritical dispersive equations, we refer the reader to [29-34].

The rest of this paper is organized as follows. In Section 2 we set up notation and collect some useful lemmas. These include the radial Strichartz estimates (Section 2.3), which play a key role in the analysis throughout the paper. In Section 3, we discuss the proof of Theorem 1.5. The general scheme is well-established; thus we focus on those parts of the argument that are new in our setting. In Section 4 we rule out the self-similar scenario. In Section 5 we establish additional regularity for the cascade and soliton scenarios. It is in this section that we encounter the technical restriction appearing in (1.2). Finally, in Section 6 we rule out the cascade scenario for $\mu=\pm1$ and the soliton scenario for $\mu=1$, thereby completing the proofs of Theorems 1.1 and 1.3.

Notation and useful lemmas
For non-negative $X$ and $Y$, we write $X\lesssim Y$ to denote $X\leq CY$ for some $C>0$. If $X\lesssim Y\lesssim X$, we write $X\sim Y$. The dependence of implicit constants on parameters is indicated by subscripts; for example, $X\lesssim_uY$ denotes $X\leq CY$ for some $C=C(u)$. We write $a'\in[1,\infty]$ for the dual exponent to $a\in[1,\infty]$, that is, the solution to $\frac1a+\frac1{a'}=1$. We write $x\pm$ to denote $x\pm\varepsilon$ for any small $\varepsilon>0$. For a monomial $X$ we write Ø$(X)$ to denote a finite linear combination of products of the factors of $X$, where Littlewood-Paley projections (see below) and/or complex conjugation may be applied in each factor. We write Ø$(X+Y)$ = Ø$(X)$ + Ø$(Y)$.
We will frequently encounter weighted Lebesgue spaces. At times it will be convenient to employ the following notation:

We can then define $L^q_tL^r_x(I\times\mathbb{R}^d)$ and $L^q_tL^{r;\alpha}_x(I\times\mathbb{R}^d)$ in the usual fashion. For example, if $q,r<\infty$, then

When $q=r$, we use the shorthand $\|u\|_{L^{q;\alpha}_{t,x}}=\|u\|_{L^q_tL^{q;\alpha}_x}$.

We need the following Gronwall-type inequality from [26].

Lemma 2.1 (Acausal Gronwall inequality). Let $r\in(0,1)$ and $K\geq4$. Suppose $\{b_k\}$ is a bounded non-negative sequence and $x_k\geq0$ satisfies

2.1. Harmonic analysis tools.

We will work with the following definition of the Fourier transform:

For $s\in\mathbb{R}$, $|\nabla|^s$ is defined as the Fourier multiplier operator with multiplier $|\xi|^s$. Let $\varphi$ be a radial bump function on $\mathbb{R}^d$ supported on $\{|\xi|\leq\frac53\}$ and equalling one on $\{|\xi|\leq\frac43\}$. For $N\in2^{\mathbb{Z}}$ we define the Littlewood-Paley projection operators , and $P_{>N}f=f-P_{\leq N}f$. We will often write $f_N=P_Nf$ and so on.
These operators are bounded on the usual Lebesgue and Sobolev spaces; in addition, they satisfy the following Bernstein estimates.
We remind the reader that a function $\omega:\mathbb{R}^d\to\mathbb{R}$ is an $A_q$ weight for $1<q<\infty$ precisely when the Hardy-Littlewood maximal function is bounded on $L^q(\mathbb{R}^d;\omega(x)\,dx)$. For example, the function $\omega(x)=|x|^a$ is an $A_q$ weight for $1<q<\infty$ if and only if $-d<a<d(q-1)$. Further, if $\omega$ is an $A_q$ weight, one also has the Littlewood-Paley square function estimate
For this and much more see [40, Chapter V].
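The two endpoints in the range $-d<a<d(q-1)$ can be understood from a necessary condition: an $A_q$ weight $\omega$ and its dual weight $\omega^{-1/(q-1)}$ must both be locally integrable. The following sketch records only this necessity check (sufficiency is the part that requires the theory in the cited reference); the numerical values of $d$ and $q$ in the last line are illustrative choices, not taken from the paper.

```latex
% Local integrability of \omega(x) = |x|^a and of \omega^{-1/(q-1)} near the origin in \mathbb{R}^d:
\int_{|x|\le1}|x|^{a}\,dx<\infty \iff a>-d,
\qquad
\int_{|x|\le1}|x|^{-\frac{a}{q-1}}\,dx<\infty \iff \frac{a}{q-1}<d \iff a<d(q-1).
% Illustrative instance: d = 3, q = 4 gives the admissible range -3 < a < 9.
```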
We will need the following weighted Bernstein estimate.
Proof. The operator $NP_N\Delta^{-1}\nabla$ is a convolution operator with an $L^1$-normalized Schwartz function, and hence it is controlled pointwise by the Hardy-Littlewood maximal function. Thus boundedness of this operator on $L^q(|x|^a\,dx)$ follows from the fact that $|x|^a$ is an $A_q$ weight. To deduce the estimate, write $P_Mf=\Delta^{-1}\nabla\cdot\nabla P_Mf$ and sum over $M>N$.

The linear Schrödinger equation. The Schrödinger group
From these two identities one can deduce Interpolation then yields the dispersive estimate These estimates are used to establish the standard Strichartz estimates: for any t 0 ∈ I ⊂ R.
In particular, we need the following corollary of Lemma 2.5.
Then for spherically-symmetric functions $u$ and $v$ we have

Proof. The proof is almost identical to the one appearing in [45, Lemma 2.5]. There are two small changes: (i) We use frequency-localized solutions and always put the solutions in $\dot H^{s_c}$, yielding the appropriate powers of $M$ and $N$. (ii) We use the dual radial Strichartz estimate instead of the standard dual Strichartz estimate (see Section 2.3).

Radial Strichartz estimates.
We will rely heavily on improved Strichartz estimates that are available in the radial setting. We begin by recording a weighted dispersive estimate.
Proof. Writing $P_{\mathrm{rad}}$ for the projection onto radial functions, we have the kernel estimate

for any $\theta\in[0,1]$ (cf. (2.2) and [26, (2.9)]). This implies the desired estimate with $r=\infty$; the remaining estimates follow from interpolation with $\|e^{it\Delta}\varphi\|_{L^2_x}=\|\varphi\|_{L^2_x}$.

To obtain a wide range of radial Strichartz estimates, we will interpolate between several existing results, which we record in the following proposition:

Proposition 2.8 (Radial Strichartz estimates). Let $d\geq2$, let $\alpha,s\in\mathbb{R}$, and let $q,r\in[2,\infty]$ satisfy the following scaling condition:

$\|f\|_{\dot H^s}$ in each of the following three scenarios:

By interpolation, Proposition 2.8 yields estimates for every $(\frac1r,\frac1q)\in(0,\frac12)\times(0,\frac12)$ (as well as points on the boundary of this square). For each such pair, one can interpolate in two different ways, so as to obtain the largest and smallest possible values of the weight $\alpha$. The result is the following:

Theorem 2.9 (Interpolated radial Strichartz estimates). Let $d\geq2$, let $\alpha,s\in\mathbb{R}$, and let $q,r\in(2,\infty)$ satisfy the following scaling condition:

For any spherically-symmetric function $f$ we have

Proof. We define the points $O$, $A$, $B$, $C$, $D$, $E$ in the $(\frac1r,\frac1q)$-plane and consider the following figure:

[Figure: the points $O$, $A$, $B$, $C$, $D$, $E$ in the $(\frac1r,\frac1q)$-plane.]

We first consider the problem of obtaining the largest possible weight. The interior of the triangle $OAE$ is defined by

The largest possible weight is at the point $E$; the points $C$ and $D$ require $\alpha=0$. This leads to the upper bound

Again, the largest possible weight is at the point $E$, with the line $AD$ requiring $\alpha=0$. Interpolating between $E$ and points on $AD$ leads to the upper bound

The points $A$ and $D$ require $\alpha=0$, while the largest weight at $B$ is $\alpha<-\frac12$. This again leads to the upper bound

We turn now to the problem of obtaining the smallest possible weight. We first consider the triangle $ABO$. The smallest weight at $A$ and $O$ is $\alpha=0$, while the smallest weight at $B$ is $\alpha>-\frac d2$. Thus interpolating between $B$ and $AO$ leads to the lower bound $\alpha>-\frac dq$. We next consider the triangle $OBC$. Interpolating between $O$ and $C$, we find the smallest weight on $OC$ is $\alpha=0$. Thus, interpolating between $B$ and $OC$ leads to the lower bound $\alpha>-\frac dr$. This completes the proof of the theorem.
We also record the following dual/inhomogeneous Strichartz estimates.
holds for any spherically-symmetric function F .
x holds on I × R d for any spherically-symmetric function F .
We conclude with the following frequency-localized inhomogeneous Strichartz estimate.

Function spaces.
In this section we define some specific function spaces that will be used throughout the paper. Recall that we are considering (1.1) with $d\geq3$ and $\frac{4}{d+1}<p<\frac4d$. First we remark that we may choose an exponent $q$ such that

To see that we may choose $q$ as in (2.5), we note that $p>\frac{4}{d+1}$ is equivalent to . Indeed, the larger root of the polynomial appearing in the numerator above is 5−d+ , which is $\leq\frac{4}{d+1}$ (with equality when $d=3$).
Remark 2.13. Some of the above restrictions are based on the requirements of the Strichartz estimates of [15] (cf. Proposition 2.8 (ii)). Even though Theorem 2.9 provides slightly better estimates, we cannot improve on the restriction $p>\frac{4}{d+1}$.

Recalling the notation in (2.1), we now define

We write $S(I)=\{f:\|f\|_{S(I)}<\infty\}$, and similarly for $N(I)$. With this notation,

Finally, we remark that $|x|^\gamma$ is an $A_q$ weight, as $p<q$ implies $\gamma>-d$.
An analogous result holds backward in time.
The solution scatters in $\dot H^{s_c}$ forward in time if and only if $S_{[t_0,T_{\max})}(u)<\infty$ for some $t_0\in I_{\max}$, in which case $T_{\max}=\infty$. An analogous result holds backward in time.
We also have the standard stability result.

for $0<\varepsilon<\varepsilon_1$, then there exists a unique radial solution $u$ :

Finally, applying Theorem 2.16 with $\tilde u(t)=e^{it\Delta}u_0$ leads to the following small-data scattering result.

2.6. Scattering in weighted spaces.

As mentioned in the introduction, Theorem 1.1 yields an improvement to our previous results in [20] for a range of nonlinearities in the radial defocusing case. Specifically, in certain cases we can show that solutions with radial data in the weighted space that remain bounded in $\dot H^{s_c}$ actually scatter in the stronger norm $\mathcal{F}\dot H^{|s_c|}$, in the sense that $e^{-it\Delta}u(t)$ converges in $\mathcal{F}\dot H^{|s_c|}$ as $t\to\pm\infty$. Note that the embedding $\mathcal{F}\dot H^{|s_c|}\hookrightarrow\dot H^{s_c}$ for $-\frac d2<s_c\leq0$ follows from Hardy's inequality and Plancherel.

As a simple example, let us demonstrate this fact when $d=3$ and $p_0(3)<p<\frac43$. We suppose that $u_0\in\mathcal{F}\dot H^{|s_c|}$ is radial, and that the solution $u$ to (1.1) with $u(0)=u_0$ remains uniformly bounded in $\dot H^{s_c}$ throughout its lifespan. Theorem 1.1 then implies that $u$ is global, with $u\in S(\mathbb{R})$. A further application of Strichartz (Theorem 2.9) shows that $u$ also belongs to the critical space $L^{2p-}_tL^{3p+}_x(\mathbb{R}\times\mathbb{R}^3)$. To use this norm without weights requires only $p>\frac{16}{15}$ (see (2.4)).

We will show that $\{e^{-it\Delta}u(t)\}$ is Cauchy in $\mathcal{F}\dot H^{|s_c|}$ as $t\to\infty$. The case $t\to-\infty$ is similar. To this end, we recall the operator

$$J^s(t)=e^{it\Delta}|x|^se^{-it\Delta}=e^{i|x|^2/4t}(-4t^2\Delta)^{s/2}e^{-i|x|^2/4t},$$

which featured prominently in [20]. We will first show that $J^{|s_c|}u$ remains bounded in $L^2$. To see this, we use the Strichartz estimates in [20] (see e.g. Proposition 2.6 therein) and the Duhamel formula

Writing $\tilde u(t)=e^{-i|x|^2/4t}u(t)$, we get the following estimate on any spacetime slab $(t_0,t_1)\times\mathbb{R}^3$:

where $L^{a,b}_t$ denotes the Lorentz space. Combining this estimate with a bootstrap argument (splitting $\mathbb{R}$ into finitely many intervals on which the $L^{2p-}_tL^{3p+}_x$-norm is small) yields boundedness of $J^{|s_c|}u$ in $L^2$.
Applying the same estimates once more, we now get x → 0 as s, t → ∞, which yields the desired result.

Concentration compactness
In this section we discuss the proof of Theorem 1.5. We denote the $\dot H^{s_c}$-preserving dilation by $[D(\lambda)f](x)=\lambda^{-2/p}f(\frac x\lambda)$ and recall the notation $\|f\|_{S(I)}=\|f\|_{L^{q;\gamma}_{t,x}(I\times\mathbb{R}^d)}$, with $q$ and $\gamma$ defined in Section 2.4.
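That $D(\lambda)$ preserves the $\dot H^{s_c}$ norm, and no other homogeneous Sobolev norm, can be seen on the Fourier side. The computation below is a sketch; the constant $c_d$ stands for whatever normalization the Fourier transform carries (it does not depend on $\lambda$), and the general exponent $s$ is introduced only for comparison.

```latex
% Fourier transform of the dilated function:
\widehat{D(\lambda)f}(\xi)=\lambda^{d-\frac2p}\,\hat f(\lambda\xi).
% Homogeneous Sobolev norm, substituting \eta=\lambda\xi (so d\xi=\lambda^{-d}d\eta):
\|D(\lambda)f\|_{\dot H^{s}}^{2}
 = c_d\int|\xi|^{2s}\lambda^{2d-\frac4p}|\hat f(\lambda\xi)|^{2}\,d\xi
 = \lambda^{\,d-\frac4p-2s}\,c_d\int|\eta|^{2s}|\hat f(\eta)|^{2}\,d\eta .
% Hence
\|D(\lambda)f\|_{\dot H^{s}}=\lambda^{\,s_c-s}\,\|f\|_{\dot H^{s}},
\qquad s_c=\frac d2-\frac2p,
% which is independent of \lambda exactly when s=s_c; and s_c<0 \iff p<4/d.
```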
The proof of Theorem 1.5 relies heavily on the following linear profile decomposition.
Proposition 3.1 (Linear profile decomposition). Let $\{u_n\}\subset\dot H^{s_c}$ be a bounded sequence of radial functions. There exist radial $\{\psi^j\}\subset\dot H^{s_c}$, radial $\{R^J_n\}\subset\dot H^{s_c}$, $\{N^j_n\}\subset2^{\mathbb{Z}}$ and $\{t^j_n\}\subset\mathbb{R}$ such that (passing to a subsequence) the following decomposition holds for each $J\geq1$:

Furthermore, this decomposition has the following properties:
(1) The parameters are asymptotically orthogonal:
(2) There is asymptotic decoupling of the $\dot H^{s_c}$ norm: for each $J\geq1$,
(3) The remainders vanish in the following sense:

The proof of Proposition 3.1 follows well-known arguments detailed, for example, in [25]. Hence, we emphasize only the main points and the details that need to be modified in our setting. The proof proceeds by induction, successively removing packets of concentration from the sequence by appealing to the following proposition.

Then there exists $\beta=\beta(\varepsilon,A)$, $\{N_n\}\subset2^{\mathbb{Z}}$, and $\{t_n\}\subset\mathbb{R}$ such that (passing to a subsequence) $[D(N_n)e^{it_n\Delta}]^{-1}u_n\rightharpoonup\psi$ weakly in $\dot H^{s_c}$ as $n\to\infty$, with $\|\psi\|_{\dot H^{s_c}}\geq\beta$.
In turn, the starting point for Proposition 3.2 is the following improved Strichartz estimate, which first identifies a scale at which concentration occurs.

Lemma 3.3 (Improved Strichartz).
For any radial $f\in\dot H^{s_c}$,

Proof. Fix an integer $k\geq2$ with $2(k-1)<q\leq2k$ and take $r_1>q>r_k$ so that $\frac1{r_1}+\frac1{r_k}=\frac2q$. Note that by Strichartz (choosing $r_1$ and $r_k$ close enough to $q$) and Bernstein, we have

Thus, writing $F(t)=e^{it\Delta}f$ and recalling that $|x|^\gamma$ is an $A_q$ weight, we have by the square function estimate, Hölder, Strichartz, and Bernstein,

The result now follows from an application of Cauchy-Schwarz.
With Lemma 3.3 in place, we can now sketch the proof of Proposition 3.2.
Proof of Proposition 3.2. Using Lemma 3.3, we deduce the existence of N n ∈ 2 Z such that |x| γ q e it∆ P Nn u n L q As 1 2 − s c < 2 q , we may choose θ satisfying

Then by Hölder's inequality and Strichartz,
By Banach-Alaoglu, there is a subsequence along which $N_n^{-2/p}(e^{it_n\Delta}u_n)(\frac{x}{N_n}+x_n)$ converges weakly in $\dot H^{s_c}$. Taking inner products with the test function $P_1\delta$ and using (3.1) shows that this weak limit is non-trivial, with $\dot H^{s_c}$ norm bounded below by $\beta$.
Finally, we observe that the sequence $\{N_nx_n\}$ is bounded. Indeed, by radial Sobolev embedding,

which combined with (3.1) yields

Thus, passing to a subsequence, we conclude that $N_nx_n$ converges to a fixed point in $\mathbb{R}^d$. After an additional translation, we may thus assume $x_n\equiv0$.
As stated above, successive applications of Proposition 3.2 yield the linear profile decomposition Proposition 3.1. For a textbook treatment, see [48].
To prove Theorem 1.5, we set

where the supremum is taken over all compact time intervals $K$ and over all radial solutions $u\in C(K;\dot H^{s_c})$ satisfying $\sup_{t\in K}\|u(t)\|_{\dot H^{s_c}}\leq E$. By the results in [20,31], we have the following characterization of the quantity $E_c$ introduced in (1.3):

Hence, with the linear profile decomposition Proposition 3.1 and the stability theory of Section 2.5 in place, the existence of an almost periodic modulo symmetries solution attaining $E_c$ follows from well-known arguments; see, for example, [48]. Finally, the rescaling and limiting arguments in [21,44] apply in our setting and allow us to reduce attention to the three scenarios recorded in Theorem 1.5.

We conclude this section with a few useful properties of almost periodic solutions.

Proof. First, note that we may choose $\delta=\delta(u)$ small enough that $I_\delta(t_0)\subset I$ for any $t_0\in I$ (see e.g. [25, Lemma 5.18]). Let $\varepsilon>0$. We first claim that there exists $\delta=\delta(u,\varepsilon)>0$ such that

To see this, we first use almost periodicity to write

Using Strichartz, monotone convergence, and the precompactness of $K$, we deduce that for $\delta=\delta(u,\varepsilon)$ small enough,

This gives (3.2).
Next, we apply the stability theory with the approximate solution $\tilde u$

on the interval $I_\delta(t_0)$. Note that $\tilde u$ solves (1.1) up to an error of

Furthermore, $\tilde u(t_0)=u(t_0)$. Thus, for $\varepsilon$ small we can apply the stability result Theorem 2.16 to deduce that

The result follows.
Using that $N(t)=t^{-1/2}$ for self-similar solutions and that $N(t)\leq1$ in the remaining two scenarios, we deduce:

Corollary 3.5. Let $u:I\times\mathbb{R}^d\to\mathbb{C}$ be a radial almost periodic solution as in Theorem 1.5. For any $\varepsilon>0$, there exists $0<\delta=\delta(u,\varepsilon)<1$ such that:
• If $u$ is self-similar, then
• If $u$ is a cascade or soliton solution, then $\|u\|_{S(I)}<\varepsilon$ for any interval such that $|I|<2\delta$.
As a consequence of Lemma 3.4 and the frequency-localized Strichartz estimate Proposition 2.12, we obtain

Finally we have the following reduced Duhamel formula for almost periodic solutions, which follows from the fact that $e^{-it\Delta}u(t)$ must converge weakly to zero in $\dot H^{s_c}$ at the endpoints of its maximal lifespan (cf. [25, Proposition 5.23]).

The self-similar scenario
In this section we prove the following:

It suffices to prove that any self-similar solution must belong to $L^2$. Indeed, $L^2$ solutions to the mass-subcritical NLS are automatically global, while self-similar solutions blow up at $t=0$. Throughout this section, we write $F(z)=\mu|z|^pz$.
Proof of Theorem 4.1. Suppose toward a contradiction that u is a self-similar solution in the sense of Theorem 1.5. We fix ε > 0 to be determined below (cf. To prove u ∈ L 2 , we will prove quantitative decay for M(A) as A → ∞. The first result towards this goal shows that decay of N implies decay of M and S.
By the bilinear estimate from Corollary 2.6 and Corollary 3.5, we get

On the other hand, by Hölder (in time), Hardy (cf. (4.9)), and Bernstein,

Finally, an application of Strichartz yields

We now combine these three estimates and collect the total powers of $T^{-1/2}$ and $A$. For the power of $T^{-1/2}$, we have

$$\theta\bigl(\tfrac d2-1+2|s_c|\bigr)+(p-\theta)\bigl(\tfrac d2-\tfrac{d+2}{r}+|s_c|+a\bigr)=0.$$

Next, using the scaling relations (e.g. (4.9) above) and recalling $\alpha=k\beta$, we write the power of $A$ as

This may be written in the form $-\tilde\sigma\beta$ for $\tilde\sigma>0$ provided

Proof. Suppose the left-hand side of (4.10) holds for some $0<\sigma<b<1$. We apply Lemma 4.3 to obtain

Choosing $0<\beta=\beta(k,p,\sigma,\tilde\sigma)<1$ sufficiently close to 1 yields the desired estimate for $\mathcal{N}(A)$. Combining this with Lemma 4.2 we obtain the same bounds for $\mathcal{M}(A)$ and $\mathcal{S}(A)$.
With Corollary 4.4 in place, it remains to obtain some initial quantitative decay.

Lemma 4.5 (Initial estimate for $\mathcal{S}$). There exists $0<\sigma_0<1$ so that (4.11)

In particular, choosing $\varepsilon$ sufficiently small, we get

Proof. We first prove (4.11). Writing the Duhamel formula for $u$ beginning at $t=\frac{T}{1+\delta}$ and applying Strichartz, we get

By Lemma 4.3 (with $\beta=1$) and (4.5), we have

$$\mathcal{N}(A)+\mathcal{N}\bigl(A(1+\delta)^{-1/2}\bigr)\lesssim\mathcal{N}\bigl(\tfrac A2\bigr)\lesssim\mathrm{RHS}(4.11). \tag{4.14}$$

We will treat (4.13) by estimating at a fixed frequency $B>A$ and summing. Fix $\theta\in(0,1)$ (close to one) to be determined below and define

Note that $a>2$ for $\theta$ close to 1. By Hölder's inequality, we get

where all space-time norms are over $I_\delta(T)\times\mathbb{R}^d$.
To estimate (4.15) we use Theorem 2.9. This requires $\alpha>-\frac da$, which we can guarantee by choosing $\theta$ close enough to 1. Indeed, when $\theta=1$, we have $\alpha=\frac\gamma q$ and $a=q$, in which case the desired bound follows from $p<q$. Setting $s=\frac d2-\frac{d+2}{a}-\alpha$, we then estimate

We estimate (4.16) using the weighted dispersive estimate (Lemma 2.7). This requires

$\geq0$, which follows from the final upper bound in (2.5). Using the reduced Duhamel formula (Lemma 3.7) and Lemma 2.7, we get

Summing over $B>A$ we obtain

This completes the proof of (4.11).
We are now in a position to complete the proof of Theorem 4.1. We first claim that and some σ 0 > 0. In view of (4.5), we only need to verify this claim for A large. By Strichartz combined with Lemmas 4.5 and 4.3 (with β = 1), we find Taking A sufficiently large, we deduce (4.19).
We now choose $2|s_c|<b<1$ and apply Corollary 4.4 finitely many times to deduce

Thus, for any $t\in(0,\infty)$, we have

while by Bernstein,

Lemma 5.2. The operator $P^++P^-$ is the identity on radial functions in $L^2(\mathbb{R}^d)$.
Moreover, for a spherically-symmetric function f ∈ L 2 (R d ), We will also need the following kernel estimates for the frequency-localized free propagator: As usual, we denote F (z) = µ|z| p z. x (R × R d ).
Proof. Suppose $u$ is a cascade or soliton solution as in Theorem 1.5. For $N\in2^{\mathbb{Z}}$, we define

Note that $\mathcal{M}(N)\lesssim_u1$ uniformly in $N$,

for some $s>\frac12-s_c$ and for sufficiently small $\eta$ and sufficiently large $N$.
Let $\chi$ be the characteristic function of $[1,\infty)$ and let $\chi_N(x)=\chi(N|x|)$. Using the in/out decomposition and the reduced Duhamel formula (Lemma 3.7), we decompose

and

where $\delta>0$ will be determined below and the limits are in the weak $\dot H^{s_c}_x$ topology. We first estimate the contribution of (5.3) and (5.6). To this end, we first use Corollary 3.5 to find $\delta=\delta(u,\eta)$ small enough that $\|u\|^p_{S(0,\delta)}<\eta$.

Proof. It suffices to show
uniformly for $N>0$. Writing

we can decompose

We begin by estimating the contribution of $F(u_{\leq M})$, namely,

We remark that $A$ is bounded on $L^2$ (uniformly in $M\geq N$), and that by the radial Strichartz estimate (cf. Proposition 2.8 (i)) and Bernstein,

Then by Strichartz, Hölder, and Bernstein,

where we used the fact that $p>\frac{2}{d-1}$ to sum. Note that . (5.11)

We will collect all such constraints at the end of the lemma. We turn to

To estimate this term will require a careful choice of exponents and the interpolated Strichartz estimates in Theorem 2.9. We introduce three sets of parameters, $(r_j,\alpha_j,s_j)$, that will satisfy the scaling relations

We will take $r_1=r_2$, $s_1=s_c-$, $s_2=s_c$. For $r\in[2,6]$, the minimum appearing in (2.4) of Theorem 2.9 (restricted to the diagonal $q=r$) is given by

we may choose

$$\max\bigl\{2,\tfrac65(p+1)\bigr\}<r_1=r_2<\min\bigl\{\tfrac{6p}{8-p(2d-1)},\,2(p+1),\,6\bigr\}. \tag{5.14}$$

This constraint guarantees that we may apply Strichartz (Theorem 2.9) with the weights

It also guarantees that if we define

then $r_0\in(2,6)$. Finally, we choose −, which also determines $s_0$ via (5.13). We now claim that for $r_1$ large enough, we have

Having chosen exponents as above, we use Strichartz, Hölder, and Lemma 3.6 to estimate

where in the last line we use the scaling relations. Finally, note that . That this is compatible with (5.18) follows from $p>\frac{4}{d+5}$. Collecting our estimates for (5.10) and (5.12), we conclude that the desired estimate holds provided

This completes the proof of Lemma 5.7.
Proof. We begin by combining Lemma 5.1 and Lemma 5.3 to deduce the following kernel estimates: for N ≫ δ − 1 2 , Indeed, for (5.21) we note that for |t| ≥ δ ≫ N −2 , |x| ≤ N −1 , and |y| ≪ M |t|, we have |y − x| ≪ M |t|. For (5.22), we note that |y| − |x| ≪ M |t| under the given constraints; we also use Note that by Schur's test, we have We decompose the nonlinearity as in (5.9). We first consider the contribution of F (u ≤M ) to (5.19) and (5.20), for which it suffices to bound For this, we use Hölder and Bernstein to estimate M sc (M 2 2 k δ) −100d 2 k δM dp 2 u (δN 2 ) −50d .
Having chosen these parameters, we estimate (5.25) using (5.23), Hölder's inequality, and Lemma 3.6. Choosing ε sufficiently small, we have It follows that which completes the proof.
Collecting the results of Lemmas 5.6, 5.7, and 5.8, we deduce that (5.2) holds. This completes the proof of Lemma 5.5 and so proves Theorem 5.4.

The cascade and soliton scenarios
In this section, we rule out the cascade scenario of Theorem 1.5, as well as the soliton scenario in the defocusing case, thereby completing the proofs of Theorems 1.1 and 1.7. The key ingredient in both cases is the additional regularity afforded by Theorem 5.4. We will prove that cascade solutions must have zero mass, while soliton solutions are inconsistent with the Lin-Strauss Morawetz inequality in the defocusing case.
We first rule out the cascade scenario. The following theorem completes the proof of Theorem 1.6. Proof. Suppose toward a contradiction that u is a cascade solution as in Theorem 1.5. We will reach a contradiction by proving that u has zero mass.
Fix $\eta>0$. By Theorem 5.4, we know that $u\in L^\infty_t\dot H^{|s_c|}_x$. Thus, using almost periodicity and interpolation and choosing $C=C(\eta)$ sufficiently large, we can first estimate

$$\|u_{>CN(t)}(t)\|_{L^2_x}\lesssim_u\eta^{1/2}$$

uniformly for $t\in\mathbb{R}$. Next, by Bernstein we have

$$\|u_{\leq CN(t)}(t)\|_{L^2_x}\lesssim[C(\eta)N(t)]^{|s_c|}\||\nabla|^{s_c}u\|_{L^\infty_tL^2_x}\lesssim_u[C(\eta)N(t)]^{|s_c|},$$

uniformly for $t\in\mathbb{R}$. Thus, by conservation of mass, we have

$$\|u(0)\|_{L^2_x}=\|u(t)\|_{L^2_x}\lesssim_u\eta^{1/2}+[C(\eta)N(t)]^{|s_c|},$$

uniformly for $t\in\mathbb{R}$. Applying this to a sequence of times along which $N(t)\to0$ (recall $u$ is a cascade solution) and noting that $\eta>0$ was arbitrary, we deduce that $u(0)=0$, a contradiction.
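The interpolation step used for the high frequencies is, presumably, the elementary fact that $L^2$ sits exactly halfway between $\dot H^{s_c}$ and $\dot H^{|s_c|}$. A sketch via Plancherel and Cauchy-Schwarz follows; here $f$ stands for the high-frequency part of $u(t)$, the Fourier transform is assumed normalized to be an isometry on $L^2$, and the smallness $\|f\|_{\dot H^{s_c}}<\eta$ is what we assume almost periodicity supplies.

```latex
% Split |\hat f|^2 = (|\xi|^{s_c}|\hat f|)(|\xi|^{-s_c}|\hat f|) and apply Cauchy-Schwarz:
\|f\|_{L^2}^{2}
 =\int\bigl(|\xi|^{s_c}|\hat f(\xi)|\bigr)\bigl(|\xi|^{-s_c}|\hat f(\xi)|\bigr)\,d\xi
 \le\|f\|_{\dot H^{s_c}}\,\|f\|_{\dot H^{|s_c|}},
 \qquad\text{since } -s_c=|s_c| .
% With \|f\|_{\dot H^{s_c}}<\eta and \|f\|_{\dot H^{|s_c|}}\lesssim_u 1 (Theorem 5.4):
\|f\|_{L^2}\lesssim_u\eta^{1/2}.
```

This accounts for the power $\eta^{1/2}$ appearing in the final mass bound.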
We turn to the soliton scenario in the defocusing case.
We will need the following standard Morawetz estimate (cf. [27]).  Note that by Theorem 5.4, the right-hand side of (6.1) is bounded uniformly in I ⊂ R when u is a soliton. To prove Theorem 6.2, we will prove that the left-hand side of (6.1) is bounded below by |I|, thus obtaining a contradiction for I long enough.
Proof of Theorem 6.2. Suppose u is a soliton in the sense of Theorem 1.5. As noted above, it suffices to show that the left-hand side of (6.1) is bounded below by |I|.
To this end, we first observe that since the orbit $\{u(t):t\in\mathbb{R}\}$ is bounded in $H^{1/2}_x$ and precompact in $\dot H^{s_c}$, it is precompact in $L^2$.
On the other hand, , |u(t, x)| p+2 } x dx and the right-hand side is continuous on L 2 and vanishes only at u ≡ 0. Thus LHS(6.1) is bounded below by a multiple of |I|, thereby proving the theorem.