Scattering for a mass critical NLS system below the ground state with and without mass-resonance condition

We consider a mass-critical system of nonlinear Schr\"{o}dinger equations \begin{align*} \begin{cases} i\partial_t u +\Delta u =\bar{u}v,\\ i\partial_t v +\kappa \Delta v =u^2, \end{cases} (t,x)\in \mathbb{R}\times \mathbb{R}^4, \end{align*} where $(u,v)$ is a $\mathbb{C}^2$-valued unknown function and $\kappa>0$ is a constant. If $\kappa =1/2$, we say that the system satisfies the mass-resonance condition. We are interested in the scattering problem for this system under the condition $M(u,v)<M(\phi ,\psi)$, where $M(u,v)$ denotes the mass and $(\phi ,\psi)$ is a ground state. In the mass-resonance case, we prove scattering by the argument of Dodson \cite{MR3406535}. Scattering is also obtained without the mass-resonance condition under the restriction that $(u,v)$ is radially symmetric.

We consider the system of quadratic nonlinear Schr\"{o}dinger equations \begin{align*} \begin{cases} i\partial_t u +\frac{1}{2m}\Delta u =\lambda v\bar{u},\\ i\partial_t v +\frac{1}{2M} \Delta v =\mu u^2, \end{cases} (t,x)\in \mathbb{R}\times \mathbb{R}^d, \tag{1.1} \end{align*} where $(u, v)$ is a $\mathbb{C}^2$-valued unknown function, $m$ and $M$ are positive constants, and $\lambda, \mu \in \mathbb{C} \setminus \{0\}$ are constants. From the viewpoint of physics, (1.1) is related to the Raman amplification in a plasma; see [2] for details. Furthermore, (1.1) is regarded as a non-relativistic limit of the system of nonlinear Klein-Gordon equations under the mass-resonance condition $M = 2m$ (see [11]). Under the assumption $\lambda = c\mu$ for some $c > 0$, the solution to (1.1) conserves the mass and the energy. To use these conservation laws, we impose $\lambda = c\mu$ for some $c > 0$. Then the system (1.1) is reduced to the following system by some changes of variables: \begin{align*} \begin{cases} i\partial_t u +\Delta u =\bar{u}v,\\ i\partial_t v +\kappa \Delta v =u^2, \end{cases} (t,x)\in \mathbb{R}\times \mathbb{R}^d, \tag{1.2} \end{align*} where $\kappa$ is a positive constant. If $(u, v)$ is a solution to (1.2), then $(u_\lambda, v_\lambda) := \lambda^{-2}(u, v)(\lambda^{-2}t, \lambda^{-1}x)$ also solves (1.2) for any $\lambda > 0$. This property is called the scaling symmetry. Setting $s_c := d/2 - 2$, we see that $\|(u, v)\|_{\dot{H}^{s_c}_x}$ is invariant under this scaling transformation. In particular, if $d = 4$, the mass is invariant under the scaling transformation, and for this reason we say that the system (1.2) is mass-critical when $d = 4$. Similarly, we call the cases $d = 5$ and $d = 6$ $\dot{H}^{1/2}$-critical and energy-critical, respectively. In this paper, we treat the following mass-critical system of NLS: \begin{align*} \begin{cases} i\partial_t u +\Delta u =\bar{u}v,\\ i\partial_t v +\kappa \Delta v =u^2, \end{cases} (t,x)\in \mathbb{R}\times \mathbb{R}^4, \tag{1.3} \end{align*} where $\kappa$ is a positive constant. If $\kappa = 1/2$, (1.3) has the Galilean invariance, i.e., if $(u, v)$ is a solution to (1.3), then $(e^{ix\cdot\xi} e^{-it|\xi|^2} u(t, x - 2t\xi), e^{2ix\cdot\xi} e^{-2it|\xi|^2} v(t, x - 2t\xi))$ is also a solution to (1.3) for any $\xi \in \mathbb{R}^4$. The mass $M(u,v)$ and the energy $E(u,v)$ are conserved for (1.3) as well. It is well known that local well-posedness and small-data scattering hold for (1.3) in $L^2(\mathbb{R}^4)^2$. In this paper, we are interested in the behavior of large solutions.
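The scaling symmetry and the value of $s_c$ can be checked by a direct computation. The following is a sketch for the system $i\partial_t u+\Delta u=\bar{u}v$, $i\partial_t v+\kappa\Delta v=u^2$ on $\mathbb{R}\times\mathbb{R}^d$; the shorthand $(t',x')$ is introduced here only for illustration and does not appear in the original text.

```latex
% Scaling check. (t', x') := (\lambda^{-2} t, \lambda^{-1} x) is notation assumed for this sketch.
\begin{align*}
u_\lambda(t,x) &= \lambda^{-2}u(t',x'), \qquad v_\lambda(t,x)=\lambda^{-2}v(t',x'),\\
(i\partial_t+\Delta)u_\lambda(t,x) &= \lambda^{-4}\big[(i\partial_t+\Delta)u\big](t',x')
  = \lambda^{-4}(\bar{u}v)(t',x') = \overline{u_\lambda}\,v_\lambda(t,x),\\
(i\partial_t+\kappa\Delta)v_\lambda(t,x) &= \lambda^{-4}\big[(i\partial_t+\kappa\Delta)v\big](t',x')
  = \lambda^{-4}u^2(t',x') = u_\lambda^2(t,x),\\
\|u_\lambda(t)\|_{\dot{H}^s_x} &= \lambda^{-2}\cdot\lambda^{-s}\cdot\lambda^{d/2}\,\|u(t')\|_{\dot{H}^s_x}
  = \lambda^{\frac{d}{2}-2-s}\,\|u(\lambda^{-2}t)\|_{\dot{H}^s_x}.
\end{align*}
```

The exponent of $\lambda$ in the last line vanishes exactly when $s = d/2-2$, which gives $s_c = 0$, $1/2$, and $1$ for $d = 4$, $5$, and $6$, in agreement with the terminology above.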

1.2. Known results for the single mass-critical NLS. In this subsection, we introduce known results about the following mass-critical single NLS: \begin{align*} i\partial_t u + \Delta u = \mu |u|^{4/d} u, \quad (t,x) \in \mathbb{R} \times \mathbb{R}^d, \tag{1.4} \end{align*} where $u$ is a $\mathbb{C}$-valued unknown function and $\mu \in \{-1, 1\}$. If $\mu = +1$, we say that (1.4) is defocusing; otherwise, (1.4) is called focusing. In the defocusing case, any $L^2$-solution $u$ of (1.4) exists globally and scatters to a free solution, which means that $e^{-it\Delta} u(t) \to u_\pm$ in $L^2(\mathbb{R}^d)$ as $t \to \pm\infty$ for some $u_\pm \in L^2(\mathbb{R}^d)$.
This fact was proved by Dodson in [4], [6], and [7] in the cases $d \ge 3$, $d = 2$, and $d = 1$, respectively. On the other hand, in the focusing case, non-scattering solutions are known to exist. Indeed, the existence and uniqueness of the ground state $Q$, which is the positive radial solution to $-\Delta Q + Q = Q^{1+4/d}$, are known (see [1], [15]), and $e^{it}Q$ is then a global solution to (1.4) with $\mu = -1$ which does not scatter. In [5], Dodson proved that a solution to (1.4) with $\mu = -1$ exists globally and scatters to a free solution under the condition $\|u(0)\|_{L^2_x} < \|Q\|_{L^2_x}$. Concerning the system (1.3), we expect a scattering result similar to that for the focusing single NLS (1.4) with $\mu = -1$, at least when $\kappa = 1/2$, where the system has the Galilean invariance. In this paper, we also treat the case $\kappa \neq 1/2$.
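The role of the condition $\kappa = 1/2$ in the Galilean invariance of the system $i\partial_t u+\Delta u=\bar{u}v$, $i\partial_t v+\kappa\Delta v=u^2$ can be made explicit by the following computation. This is only a sketch; the names $u_\xi$, $v_\xi$, and $y$ are introduced here for illustration.

```latex
% Galilean transform check. u_\xi, v_\xi, y are notation assumed for this sketch.
\begin{align*}
u_\xi(t,x) &:= e^{ix\cdot\xi}e^{-it|\xi|^2}u(t,y), \qquad
v_\xi(t,x) := e^{2ix\cdot\xi}e^{-2it|\xi|^2}v(t,y), \qquad y := x-2t\xi,\\
\overline{u_\xi}\,v_\xi &= e^{ix\cdot\xi}e^{-it|\xi|^2}(\bar{u}v)(t,y), \qquad
u_\xi^2 = e^{2ix\cdot\xi}e^{-2it|\xi|^2}u^2(t,y),\\
(i\partial_t+\Delta)u_\xi &= e^{ix\cdot\xi}e^{-it|\xi|^2}\big[(i\partial_t+\Delta)u\big](t,y),\\
(i\partial_t+\kappa\Delta)v_\xi &= e^{2ix\cdot\xi}e^{-2it|\xi|^2}
  \big[(i\partial_t+\kappa\Delta)v+(4\kappa-2)\,i\xi\cdot\nabla v+(2-4\kappa)|\xi|^2v\big](t,y).
\end{align*}
```

The nonlinear terms and the first equation transform correctly for every $\kappa$, while the two extra terms in the second equation vanish for all $\xi$ only when $\kappa = 1/2$. This is why the frequency translation parameter cannot be removed by a symmetry when $\kappa \neq 1/2$.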
Associated functional for (1.7) is defined by Using this functional we give the definition of ground states.
is a nontrivial critical point of I }.
(i) There exists at least one ground state of (1.7).
(ii) Let (φ, ψ) be a ground state of (1.7). Then, it holds that for any (u, v) ∈ H 1 (R 4 ) 2 Moreover, equality is attained by the ground state.
However, it is not clear whether global well-posedness and scattering in $(L^2)^2$ hold for initial data satisfying (1.8). We first give the following answer when $\kappa = 1/2$. Then by the stability result in Section 2, $L_{\mathrm{rad}}$ is continuous, and so there exists a radially symmetric critical mass $M_{c,\mathrm{rad}}$ such that $L_{\mathrm{rad}}(M) < \infty$ for $M < M_{c,\mathrm{rad}}$ and $L_{\mathrm{rad}}(M) = \infty$ for $M \ge M_{c,\mathrm{rad}}$.
Since the ground state is radially symmetric, we obtain $M_{c,\mathrm{rad}} \le M(\phi, \psi)$. Note also that, by the definition of $M_c$, it holds that $M_c \le M_{c,\mathrm{rad}}$. Our aim is to derive a contradiction by supposing $M_c < M(\phi, \psi)$ with the mass-resonance condition and $M_{c,\mathrm{rad}} < M(\phi, \psi)$ without it.
Toward a contradiction, we construct a minimal blow-up solution which has the critical mass $M_c$ (in the radial case, the critical mass is $M_{c,\mathrm{rad}}$) by means of the profile decomposition for the system. When we construct the minimal blow-up solution in the case $\kappa \neq 1/2$, we need the radial assumption due to the lack of Galilean invariance. After that, we refine the minimal blow-up solution so that the argument of Dodson [5] applies. We exclude two possible scenarios, called the rapid frequency cascade and the quasi-soliton. We eliminate the rapid frequency cascade scenario by additional regularity of the minimal blow-up solution, which comes from the long-time Strichartz estimate. To exclude the quasi-soliton scenario, we rely on an estimate based on the virial identity (cf. [11,17]), which is called the frequency localized interaction Morawetz estimate in [5].
The mass-critical case $d = 4$ is quite different from the $\dot{H}^{1/2}$-critical case $d = 5$. Hamano [9] gave the threshold for scattering or blow-up below the ground state in the $\dot{H}^{1/2}$-critical case $d = 5$ in the $H^1$ setting under the mass-resonance condition. To prove scattering, Hamano used the argument of Kenig-Merle [12], which consists of stability, profile decomposition, construction of a critical element, and its rigidity. There are two differences between his argument and ours. One is the regularity of the initial data. More precisely, Hamano assumed $H^1$ regularity to solve the $\dot{H}^{1/2}$-critical problem, while we only assume the minimal regularity $L^2$. The other is the variety of parameters in the profile decomposition, which comes from the lack of compactness in $L^2$. Indeed, the translation on the frequency side and the scaling transformation additionally break the compactness in $L^2$. The radial assumption is used to remove the former in the case $\kappa \neq 1/2$.
Organization of this paper. In Section 2, we prepare the stability result and bilinear Strichartz estimate which are used in Section 3 and Section 4, respectively. In Section 3, we introduce the profile decomposition for the system and use it to construct the minimal blow-up solution. In Section 4, we prove the long time Strichartz estimate. In Sections 5 and 6, we treat the rapid frequency cascade scenario and the quasi-soliton scenario, respectively.

Preliminaries
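First, we collect some notation. Let $\|(u, v)\|_X := \|(u, v)\|_{X\times X}$ for any function space $X$ and $(u, v) \in X \times X =: X^2$. We denote the (spatial) Fourier transform of a function $f$ by $\hat{f}$ or $\mathcal{F}f$. Let $\theta : [0, \infty) \to [0, 1]$ be a smooth non-increasing function such that $\theta \equiv 1$ on $[0, 1]$ and $\operatorname{supp}\theta \subset [0, 2]$. For each number $N > 0$, we define the Littlewood-Paley projection operators by \begin{align*} \widehat{P_{\le N}f}(\xi) := \theta(|\xi|/N)\hat{f}(\xi), \qquad P_{>N} := 1 - P_{\le N}. \end{align*} These operators are bounded uniformly in $N$ on $L^p(\mathbb{R}^d)$ for any $1 \le p \le \infty$, and they commute with each other, as well as with differential operators and Fourier multipliers such as $i\partial_t + \Delta$ and $e^{it\Delta}$. It is easy to see that $\operatorname{supp}\widehat{P_{\le N}f} \subset \{|\xi| \le 2N\}$ and $\operatorname{supp}\widehat{P_{>N}f} \subset \{|\xi| \ge N\}$ for any function $f$, and in particular, that $P_{>N}(P_{\le N/4}f \cdot P_{\le N/4}g) = 0$ for any $f, g$. We also define the Fourier projections with the frequency center $\xi_0 \in \mathbb{R}^d$ as $P_{|\xi-\xi_0|\le N} := e^{ix\cdot\xi_0}P_{\le N}e^{-ix\cdot\xi_0}$ and $P_{|\xi-\xi_0|>N} := e^{ix\cdot\xi_0}P_{>N}e^{-ix\cdot\xi_0}$.
2.1. Local well-posedness. We follow the argument of Cazenave and Weissler for (1.4) to develop a standard local theory.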
Then there exists a unique maximal-lifespan solution (u, v) : This solution also has the following properties: (In this case we say that (u, v) blows up forward in time.) A similar statement holds in the negative time direction.
• (Scattering) If $\sup I = +\infty$ and $(u, v)$ does not blow up forward in time, then there exists $(u_+, v_+) \in L^2(\mathbb{R}^4)^2$ such that (1.5) holds. Conversely, given $(u_+, v_+) \in L^2(\mathbb{R}^4)^2$, there exists a unique solution to (1.3) in a neighborhood of $+\infty$ so that (1.5) holds. A similar statement holds in the negative time direction.
• (Small data global existence) There exists $\eta > 0$ such that any solution whose initial data $(u(0), v(0)) = (u_0, v_0)$ is small in $L^2(\mathbb{R}^4)^2$, in a sense quantified by $\eta$, exists globally and satisfies the following bound:
Remark 2.2. Note that scattering is equivalent to the finiteness of the $L^3_{t,x}$-norm on the maximal lifespan for any maximal-lifespan solution to (1.3). This is a consequence of the Strichartz estimate and a standard continuity argument.
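The appearance of the $L^3_{t,x}$-norm can be explained by the Strichartz exponents in dimension four. The following sketch assumes the standard Strichartz estimates for $e^{it\Delta}$ and $e^{i\kappa t\Delta}$ on $\mathbb{R}^4$; the time interval $I \ni t_0$ is notation introduced for illustration, and the implicit constant is allowed to depend on $\kappa$.

```latex
% Why L^3_{t,x}: the diagonal Strichartz pair in d = 4, and Hölder for the quadratic terms.
\begin{align*}
&\text{admissible pairs in } d=4:\quad \frac{2}{q}+\frac{4}{r}=2,\quad 2\le q,r\le\infty;
  \qquad q=r\ \Longrightarrow\ q=r=3,\\
&\|\bar{u}v\|_{L^{3/2}_{t,x}(I\times\mathbb{R}^4)}\le\|u\|_{L^3_{t,x}(I\times\mathbb{R}^4)}\|v\|_{L^3_{t,x}(I\times\mathbb{R}^4)},
  \qquad \|u^2\|_{L^{3/2}_{t,x}(I\times\mathbb{R}^4)}=\|u\|_{L^3_{t,x}(I\times\mathbb{R}^4)}^2,\\
&\|(u,v)\|_{L^\infty_tL^2_x\cap L^3_{t,x}(I\times\mathbb{R}^4)}
  \lesssim\|(u,v)(t_0)\|_{L^2_x}+\|(u,v)\|_{L^3_{t,x}(I\times\mathbb{R}^4)}^2.
\end{align*}
```

Since $L^{3/2}_{t,x}$ is the dual of the diagonal space $L^3_{t,x}$, the quadratic nonlinearity closes in this single norm, which is what makes a continuity argument in $L^3_{t,x}$ possible.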

Stability result.
In this section, we prepare the stability result. Note that the stability result holds regardless of any condition on the coefficients $\lambda$ and $\mu$.
Theorem 2.3 (Mass-critical stability result). Let $I$ be an interval and let $(\tilde{u}, \tilde{v})$ be an approximate solution to (1.1) in the sense that \begin{align*} i\partial_t\tilde{u} + \frac{1}{2m}\Delta\tilde{u} = \lambda\tilde{v}\bar{\tilde{u}} + e_1, \qquad i\partial_t\tilde{v} + \frac{1}{2M}\Delta\tilde{v} = \mu\tilde{u}^2 + e_2, \end{align*} for some functions $(e_1, e_2)$. Assume that Moreover, assume the smallness conditions:
Proof. The proof is very similar to that of [19, Lemma 3.6], and so we omit the details.
Proof. Let a = 0 by time translation. We show only the homogeneous case: Moreover, we may assume |θ 1 |M ≪ |θ 2 |N , since otherwise the claim follows from the L 4 t L 8/3 x -Strichartz estimate and the Sobolev embeddingẆ x . By a suitable decomposition with respect to the angle and rotation, we may further assume that By duality, (2.1) is equivalent to with the assumptions on the Fourier supports of φ and ψ, by which we have We apply the Cauchy-Schwarz inequality in (u, ξ 1 , v), which yields , and then change back to the original variables to obtain (2.2).

Minimal mass blow-up solution
3.1. Inverse Strichartz inequality, Linear profile decomposition. In this subsection, we prepare the profile decomposition for the system of NLS. First we give some notations.
We also assume that Then, passing to a subsequence if necessary, there exist .
Using the above lemma, we give the inverse Strichartz inequality for the system.
Proof. Passing to subsequences if necessary, we may assume that For simplicity, we suppose that $\varepsilon_1 \le \varepsilon_2$. Then we can apply Lemma 3.2 to $\{v_n\}$. Therefore, passing to a subsequence if necessary, there exist $\{\xi_n\}$, $\{y_n\}$, $\{\lambda_n\}$, $\{s_n\}$, and $\psi \in L^2(\mathbb{R}^d)$ satisfying the stated properties. Since $\{h(0, \kappa\xi_n, y_n, \lambda_n)^{-1} e^{i\frac{s_n}{\kappa}\Delta} u_n\}_n$ is bounded in $L^2(\mathbb{R}^d)$, passing to a subsequence if necessary, there exists $\phi \in L^2(\mathbb{R}^d)$ such that $h(0, \kappa\xi_n, y_n, \lambda_n)^{-1} e^{i\frac{s_n}{\kappa}\Delta} u_n \rightharpoonup \phi$ weakly in $L^2(\mathbb{R}^d)$.
Then (3.1) and (3.2) follow immediately. We can prove (3.3) as follows: . Now we are ready to give the profile decomposition.
where we set $U_\kappa(t) = (e^{it\Delta}, e^{i\kappa t\Delta})$. Furthermore, for each $j \neq k$ it follows that
Proof. This theorem follows from Proposition 3.3 and the standard induction argument. For the details, see [14].

3.2. Construction of a minimal blow-up solution. In this subsection we fix $\kappa = 1/2$. Note that (1.3) has the Galilean invariance in this case.
First we show the existence of a minimal blow-up solution which does not necessarily have additional properties (Theorem 3.8 below).
Definition 3.5. Let $g = g(\theta, \xi_0, x_0, \lambda) \in G$ and $(u, v) : I \times \mathbb{R}^4 \to \mathbb{C}^2$. Then we define $T_g(u, v) : \lambda^2 I \times \mathbb{R}^4 \to \mathbb{C}^2$ as follows: For a solution $(u, v)$ to (1.3), $T_g(u, v)$ is a solution to (1.3) with initial data $g(u_0, v_0)$ since we now assume $\kappa = 1/2$.
for any η > 0. The functions N (·), x(·), ξ(·), and C(·) are called the frequency scale function, the spatial center function, the frequency center function, and the compactness modulus function, respectively.
Note that the choice of $N(\cdot)$, $x(\cdot)$, and $\xi(\cdot)$ is not unique. For instance, if another function $\tilde{N} : I \to \mathbb{R}_+$ satisfies $C^{-1} \le N(t)/\tilde{N}(t) \le C$ on $I$ for some $C > 1$, then we can replace $N(t)$ with $\tilde{N}(t)$. In fact, it turns out that we can choose these functions to be continuous with respect to $t$, although we will not do so here.
Before proving this theorem, we prepare the following proposition. where $S_{\le t_n}(u, v) := \int_{(\inf I_n, t_n]}\int_{\mathbb{R}^4} |u|^3 + |v|^3 \,dxdt$ and $S_{\ge t_n}(u, v) := \int_{[t_n, \sup I_n)}\int_{\mathbb{R}^4} |u|^3 + |v|^3 \,dxdt$. Then $G(u_n(t_n), v_n(t_n))$ has a subsequence which converges in the $G\backslash L^2_x(\mathbb{R}^4)^2$ topology.
Proof. By time translation symmetry, we may assume $t_n = 0$ for any $n \in \mathbb{N}$. Applying Theorem 3.4 to the bounded sequence $\{u_n(0), v_n(0)\}_n$ and passing to a subsequence if necessary, we obtain the profile decomposition with the stated properties, where we may assume $t^j_n \equiv 0$ or $t^j_n \to \pm\infty$ as $n \to \infty$. We define the nonlinear profile $(a^j, b^j) : I^j \times \mathbb{R}^4 \to \mathbb{C}^2$ associated to $(\phi^j, \psi^j)$ as follows:
• If $t^j_n \equiv 0$, we define $(a^j, b^j)$ to be the maximal-lifespan solution with $(a^j(0), b^j(0)) = (\phi^j, \psi^j)$.
• If $t^j_n \to \infty$, we define $(a^j, b^j)$ to be the maximal-lifespan solution which scatters forward in time to $(e^{it\Delta}\phi^j, e^{i\frac{t}{2}\Delta}\psi^j)$.
• If $t^j_n \to -\infty$, we define $(a^j, b^j)$ to be the maximal-lifespan solution which scatters backward in time to $(e^{it\Delta}\phi^j, e^{i\frac{t}{2}\Delta}\psi^j)$.
Finally, for each $j, n \ge 1$ we define $(a^j_n, b^j_n)$ : where $I^j_n := \{t \in \mathbb{R} \mid (\lambda^j_n)^{-2}t + t^j_n \in I^j\}$. Note that each $(a^j_n, b^j_n)$ is a solution with initial value $(a^j_n(0), b^j_n(0)) = g^j_n(a^j(t^j_n), b^j(t^j_n))$. By the definition of $(u^j_n, v^j_n)$ and (3.8) we can easily check for each $J$ that By the property stated in Theorem 3.4, we also obtain that and in particular $\sup_j M(\phi^j, \psi^j) \le M_c$. Now we prove by contradiction that \begin{align*} M(\phi^1, \psi^1) = M_c. \tag{3.12} \end{align*}
If not, then we have for some ε > 0. In this case, by the definition of the critical mass M c , all (a j n , b j n ) are defined globally and obey the estimates Moreover by small data scattering, we get In order to lead a contradiction, we define the approximate solution Furthermore, (u J n , v J n ) satisfies the following equation: To apply Theorem 2.3, we prepare the following lemmas: Lemma 3.10. For any J ≥ 1, the following holds: Proof. This follows immediately by (3.9), (3.10), and the definition of (u J n , v J n ). Lemma 3.11. The following holds: Proof. By the definition, we have Using the triangle inequality and the Hölder inequality, we get From asymptotic orthogonality (3.5), first terms of I, II converge to zero. Moreover the other terms converge to zero by (3.4) and (3.14).
Combining these Lemmas 3.10-3.12 and Theorem 2.3 , we obtain boundedness of {S R (u n , v n )} n which contradicts (3.7). Therefore we establish (3.12). Then we see J * = 1. Consequently, the profile decomposition simplifies to So we consider the case t n → +∞. The case t n → −∞ can be treated similarly, and so we omit it. By the Strichartz inequality we have Let θ n , ξ n , x n , λ n be parameters of g n and set h 1,n := h(θ n , ξ n , x n , λ n ), h 2,n := h(2θ n , 2ξ n , x n , λ n ). Then we establish Applying Theorem 2.3 (using (0, 0) as the approximate solution ), we conclude that This contradicts (3.7).
Proof of Theorem 3.8. Since L(M c ) = ∞, we can find a sequence (u n , v n ) : By time translation invariance we may take t n = 0. Invoking Proposition 3.9 and passing to a subsequence if necessary, we find g n ∈ G such that g n (u By applying T gn to (u n , v n ) we may take g n = I for all n ∈ N.
We claim that $(u, v)$ blows up both forward and backward in time. Indeed, if $(u, v)$ does not blow up forward in time, then $[0, \infty) \subset I$ and $S_{\ge 0}(u, v) < \infty$. By Theorem 2.3 this implies that, for sufficiently large $n$, we have $[0, \infty) \subset I_n$ and $\limsup_{n\to\infty} S_{\ge 0}(u_n, v_n) < \infty$. This is a contradiction. It remains to show that the solution $(u, v)$ is almost periodic modulo symmetries. Consider an arbitrary sequence Since $(u, v)$ blows up both forward and backward in time, we have Applying Proposition 3.9 once again, we see that $G(u(s_n), v(s_n))$ has a convergent subsequence in
3.3. Further refinements. In this subsection, we again let $\kappa = 1/2$. In what follows, we refine the solution given in Theorem 3.8 so that we can apply the argument of [5]. To this end, we give some definitions and lemmas. The following argument is based on [13] and [18].
be another solution, and K be a compact time interval. We say that (u n , v n ) converges uniformly to (u, v) on K if K ⊂ I, K ⊂ I n for sufficiently large n, We say that (u, v) is normalized if the lifespan I contains zero and More generally , we can define the normalization of a solution (u, v) at time t 0 ∈ I by ) is a normalized solution which is almost periodic modulo symmetries and has lifespan The parameters of (u [t0] , v [t0] ) are given by which is almost periodic modulo symmetries with parameters N n , x n , ξ n and compactness modulus function C. Suppose that (u n , v n ) converges locally uniformly to a non-zero solution (u, v) with lifespan I. Then (u, v) is almost periodic modulo symmetries with some parameters N (t), x(t), ξ(t) and the same compactness modulus function C. In particular we may take N (t) = lim sup n→∞ N n (t).
Proof. 1. We first show that for all $t \in I$. If (3.17) fails for some $t \in I$, then, passing to a subsequence if necessary, In the case $N_n(t) \to 0$, we can get the result by a similar argument using Plancherel's theorem. This contradicts the assumption that $(u, v)$ is non-zero. We can easily obtain (3.18) by a similar argument, and so we omit the details. Passing to a subsequence if necessary, we may assume Then we can easily prove the almost periodicity of $(u, v)$ with parameters $N(t)$, $\xi(t)$, $x(t)$.
Lemma 3.16. Let (u n , v n ) be a sequence of normalized maximal-lifespan solutions with lifespans 0 ∈ I n = (−T − n , T + n ), which are almost periodic modulo symmetries on [0, T + n ) with parameters N n (t), x n (t), ξ n (t) and a uniform compactness modulus function C. Assume that we also have Then, after passing to a subsequence if necessary, there exists a non-zero maximal lifespan solution (u, v) with lifespan 0 ∈ I = (−T − , T + ) that is almost periodic modulo symmetries on [0, T + ), such that (u n , v n ) converge locally uniformly to (u, v) on I.
Proof. By almost periodicity and the assumption $\sup_n M(u_n, v_n) < \infty$, the set $\{(u_n, v_n)(0) \mid n \in \mathbb{N}\}$ is precompact in $L^2(\mathbb{R}^4)^2$, and so, passing to a subsequence if necessary, we may assume that Let $(u, v) : I \to \mathbb{C}^2$ be the maximal-lifespan solution with initial data $(u, v)(0) = (u_0, v_0)$. Then we can apply the stability result, and so we establish $(u_n, v_n) \to (u, v)$ locally uniformly on $I$. Almost periodicity of $(u, v)$ on $[0, T_+)$ follows from Lemma 3.15.
and
Proof. First we prove (3.19) by contradiction. If not, there exist sequences $t_n \in [0, T_+)$ and $\delta_n > 0$ such that Applying Lemma 3.16, there exists the maximal-lifespan solution $(u, v)$ : Then there exists $n_0 \in \mathbb{N}$ such that $\delta_n \in [0, T_+)$ for all $n \ge n_0$, and so there exists . This is a contradiction. Next, we prove (3.20), also by contradiction. If not, we can find sequences $t_n, t'_n \in [0, T_+)$ such that $\to 0$ or $+\infty$ as $n \to \infty$.
On the other hand, by Lemma 3.16, there exists a maximal-lifespan non-zero solution (u, v) : and so (u, v) = (0, 0). This is a contradiction .
Proof. If T + < ∞, the result follows easily by Lemma 3.17. Consider the case Proof. Let (ũ,ṽ) : J → C 2 be an almost periodic solution constructed in Theorem 3.8 with frequency scale functionÑ : J → R >0 . Take a sequence {J n } of compact intervals satisfying J n ր J. Since sup t∈JnÑ (t) < ∞, there exist t n ∈ J n such that t,x (Jn×R 4 ) → ∞. Therefore we obtain J [tn] n ⊂ K for sufficiently large n. Since 0 ∈ K is an arbitrary compact subinterval of I, after passing to a subsequence, we may assume one of the following holds: • For every t ∈ (0, T + ), t ∈ J [tn] n for all sufficiently large n.
• For every t ∈ (0, T − ), t ∈ J [tn] n for all sufficiently large n.
By time reversal symmetry, it suffices to consider the former possibility. Then it follows that Applying Lemma 3.15 , we see that (u, v) is almost periodic modulo symmetry with frequency scale function N 0 (t) = lim sup n→∞Ñ [tn] (t) ≤ 2. By Corollary 3.18, then we get T + = ∞. Setting we can calculate as follows : By the dispersive estimate and Corollary 3.18, we obtain the desired result. Therefore we get (u + , v + ) = (0, 0). This is a contradiction.

3.4. Minimal blow-up solution in the radial case. In this subsection we introduce the minimal blow-up solution in the radial case. The proof of this theorem is parallel to the proof of Theorem 3.19. However, we cannot use the Galilean invariance, and so we remove the parameter $\xi^j_n$ in the profile decomposition by using the radial assumption. Indeed, for a sequence of radial $L^2$ functions, we may refine the profile decomposition as follows: Then in the profile decomposition given in Theorem 3.4, we can replace all $\xi^j_n$ and $x^j_n$ by zero. Furthermore, we may take $W^J_n$ and $(\phi^j, \psi^j)$ to be radially symmetric.
Proof. The proof of this fact is very similar to Theorem 7.3 in [20] but we give it in Appendix B.
In Theorem 3.21 we may also assume $\theta^j_n \equiv 0$ after modifying the remainder term. Then we can prove Theorem 3.20 by an argument quite similar to the proof of Theorem 3.19, because we do not need to use the Galilean invariance.
3.5. Properties of almost periodic solutions. We collect various properties of almost periodic solutions (cf. [13], Lemma 5.13-Proposition 5.23). Proofs will be given in Appendix C.
The following is an immediate consequence of Lemma 3.25 and its proof.
In particular, from Lemma 3.22 we have Then, we see that N (t) and ξ(t) given in Theorem 3.19 or 3.20 can be taken so that for some C 0 = C 0 (u, v) > 1. In fact, Lemmas 3.22 and 3.23 show that (3.6) still holds if we modify N (t) and ξ(t) on [0, ∞) tō Therefore, in what follows we additionally assume (3.22). These additional properties will be useful later.
Furthermore, in the case $\xi(t) \not\equiv 0$, we see from Lemmas 3.23 and 3.26 that Hence, we can take a constant $C_* = C_*(u, v) \gg 1$ such that for any $k \ge 0$. We set $C_* := 1$ when $\xi(t) \equiv 0$.

Long-time Strichartz estimate
This section is devoted to the following estimate. Let $J \subset [0, \infty)$ be an arbitrary interval which is a union of a finite number of characteristic intervals $\{J_k\}$, and let $K := \int_J N(t)^3\,dt$ $(< \infty)$. Then, for any $N \le C_*K$ (with $C_*$ given in (3.23)) we have
Remark 4.2. The minimality of the mass of $(u, v)$ is not used in the proof of Theorem 4.1, and in fact we can obtain a similar estimate for general almost periodic solutions satisfying $N(t) \le 1$. However, when $\xi(t) \not\equiv 0$ we still need to assume $\kappa = 1/2$ so that the system has the Galilean invariance.
Proof. Since the time-dependent projection operator P |ξ−ξ(t)|>N does not commute with i∂ t + ∆, we need to freeze the frequency center ξ(t) before applying the Strichartz estimate.
(i) Note that both sides of (4.2) are finite by the assumption $S_J(2^{-6}N) < \infty$. We make a special decomposition of $J = \cup_k J_k$. Let $\{B_j\}$ be the collection of the characteristic intervals $J_k$ for which by Lemma 3.26, the number of $B_j$ is at most $O(RK/N)$. Also note that no such characteristic interval exists if $N \ge R$.
On each B j , we estimate the left hand side of (4.2) crudely by O(1) using the Duhamel formula and the Strichartz estimates. Then, the contribution from ∪ j B j is We next find a decomposition of J \ ∪ j B j into mutually disjoint intervals {G l }, each of which is a union of characteristic intervals, such that for each G l it holds that (3.23) and the assumption R ≥ C * imply that if t, t * ∈ G l , for any t, t * ∈ G l , which implies that P |ξ−ξ(t)|>N = P |ξ−ξ(t)|>N P |ξ−ξ(t * )|>N/4 . In the same manner, we have P |ξ−ξ(t * )|>N/4 = P |ξ−ξ(t * )|>N/4 P |ξ−ξ(t)|>2 −4 N .
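The two projection identities above reduce to a comparison of Fourier supports. The following sketch assumes that $P_{|\xi-\xi_0|>N}$ is the Fourier multiplier with symbol $1-\theta(|\xi-\xi_0|/N)$, and that $|\xi(t)-\xi(t_*)|\le N/8$ on the interval in question; the symbol name $m_{>N}$ and the constant $1/8$ are chosen here for illustration, since the precise bound used in the text is not reproduced above.

```latex
% m_{>N}(\eta) := 1-\theta(|\eta|/N) vanishes for |\eta|\le N and equals 1 for |\eta|\ge 2N.
\begin{align*}
&\text{Assume } |\xi(t)-\xi(t_*)|\le N/8.\\
&m_{>N}(\xi-\xi(t))\neq 0
  \ \Longrightarrow\ |\xi-\xi(t)|\ge N
  \ \Longrightarrow\ |\xi-\xi(t_*)|\ge\tfrac{7}{8}N\ge\tfrac{N}{2}
  \ \Longrightarrow\ m_{>N/4}(\xi-\xi(t_*))=1,\\
&m_{>N/4}(\xi-\xi(t_*))\neq 0
  \ \Longrightarrow\ |\xi-\xi(t_*)|\ge\tfrac{N}{4}
  \ \Longrightarrow\ |\xi-\xi(t)|\ge\tfrac{N}{8}
  \ \Longrightarrow\ m_{>2^{-4}N}(\xi-\xi(t))=1.
\end{align*}
```

In each line the second symbol equals one on the support of the first, so multiplying by the second projection changes nothing. This is what allows the time-dependent frequency center $\xi(t)$ to be replaced by the frozen center $\xi(t_*)$ at the cost of a smaller frequency threshold.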
Let us focus on the estimate for u; the argument for v is analogous. For each G l , using the Duhamel formula and the Strichartz estimates, we have l is an arbitrary point in G l and the implicit constant does not depend on t * l . By squaresumming the above estimate over G l 's and applying the bound on #G l , we obtain  ξ φF x f . Also define the shift operator τ (ξ 0 ) and the Galilean transforms In the last inequality we have used the identities and the Bernstein inequality.
We make further decomposition for (4.5) as For (4.7) and (4.8), we estimate as We sum up this bound over J k 's and obtain Cδ(R)S J (2 −6 N ). Finally, we apply the bilinear Strichartz estimate (Lemma 2.4) to (4.9). 7 We have 1.
Similarly, we have . For the second term on the right-hand side, we make a decomposition similar to (4.4)-(4.6). Applying the Hölder inequality, interpolation, and the bound $S_{J_k}(M) \lesssim 1$ for any $M > 0$, we have Thus, the total contribution from (4.9) is bounded by This completes the estimate for (4.5). The term (4.6) can be treated in a similar manner, and we have (4.2).
7 The Fourier supports of $P_{|\xi-\xi(t_k)|>2^{-6}N}u(t)$ and $P_{|\xi-2\xi(t_k)|\le 2^{-9}N}v(t)$ are not necessarily separated when $\xi(t_k) \neq 0$. To apply the bilinear Strichartz estimate, we exploit the Galilean transforms to adjust the frequency centers to the origin. This is the only part where we essentially use the Galilean invariance of the system in the proof of Theorem 4.1.
(ii) Since N ≥ R and N (t) ≤ 1, we have no B j . Moreover, from (3.23) and N ≥ C * K we have for any t, t * ∈ J. Similarly to the estimate on G l in (i), we show The estimate on the nonlinear term is the same as before, except that at the last step we use This implies (4.3).
We are now ready to prove Theorem 4.1 in the case of κ = 1/2.
Proof of Theorem 4.1 ($\kappa = 1/2$). The proof proceeds by induction on $N$. Note that $S_J(N) < \infty$ for any $N$, since $J$ consists of finitely many $J_k$'s. We begin with the base case. We always have the crude bound $S_J(N) \lesssim (\#J_k)^{1/2}$, which can be seen by applying the Duhamel formula and the Strichartz estimates on each $J_k$. This bound is acceptable as long as $N$ is so small that $\#J_k \le K/N$. Hence, there exists $C_2 = C_2(u, v) > 0$ such that With the base case in mind, we impose the following condition on the function $\rho$: On the other hand, if we had (4.1), then the formula (4.2) would imply that We first fix $R \ge C_*$ sufficiently large so that $8C_1\delta(R) \le \frac{1}{2}$, and then define the function $\rho_{N_*}$ for $N_* \in 2^{\mathbb{Z}}$ by It is easy to verify that $\rho_{N_*}$ is bounded uniformly in $N_*$, and that (4.11), (4.12) hold if $N_* \ge K/\#J_k$. Moreover, $\rho_{N_*}$ is non-increasing and thus has a limit $\rho_{N_*}(\infty) \ge 0$ as $N \to \infty$. Letting $N \to \infty$ in the above recursive formula, we have $\rho_{N_*}(\infty) = \frac{1}{2}\rho_{N_*}(\infty)$, concluding $\rho_{N_*}(\infty) = 0$. Now, since $K = \int_J N(t)^3\,dt \le \int_J N(t)^2\,dt \lesssim S_J(u, v) = \#J_k$ by Lemma 3.25, the quantity $K/\#J_k$ has an upper bound $N_0$ which is independent of $J$. Therefore, we finally define $\rho := \rho_{N_0}$. The claimed estimate (4.1) can be shown by induction on $N$, with the base case (4.10) and the recursive formula (4.2), noticing (4.11) and (4.12).
This completes the proof.

Radial case.
Here, we consider the case $\xi(t) \equiv 0$. Since we do not need the Galilean invariance, the same argument as above can be applied for any $\kappa > 0$. Moreover, it turns out that we do not have to consider the intervals $\{G_l\}$. Note that the estimate of (4.8)-(4.9) is slightly different; we decompose $v$ as $P_{>\varepsilon N}v + P_{\le\varepsilon N}v$ with $\varepsilon > 0$ sufficiently small depending on $\kappa$, so that we can apply Lemma 2.4. As a result, we show the following:
Lemma 4.5. Let $(u, v)$ be as in Theorem 4.1, and assume that $\xi(t) \equiv 0$. Then, there exists $C_1 = C_1(u, v) > 0$ such that the following inequalities hold: Let $J \subset [0, \infty)$ be an arbitrary interval which is a union of (possibly infinitely many) $\{J_k\}$ such that $K := \int_J N(t)^3\,dt < \infty$.
(i) For any R > 0 and 0 < N ≤ K such that S J (2 −2 N ) < ∞, we have Here, Then, Theorem 4.1 can be shown by the same argument as before using Lemma 4.5 (i).

Additional regularity: Rapid frequency cascade scenario
For the minimal mass blow-up solution (u, v) given in Theorem 3.19 or 3.20, the following two scenarios are possible: In this section we shall derive additional regularity for the former case from Theorem 4.1, Lemma 4.3 (ii) or Lemma 4.5 (ii) and use it to preclude this scenario.
Theorem 5.1. In the rapid frequency cascade scenario, the solution (u, v) is in L ∞ (R + ; H s (R 4 ) 2 ) for any s > 0 and, with K : Proof. Let us focus on the case κ = 1/2; for the case ξ(t) ≡ 0 the claim is shown similarly by means of Lemma 4.5 (ii) instead of Lemma 4.3 (ii). We continue to use the notation in the preceding section.
In order to prove Theorem 5.1 it suffices to show that, for any s > 0, there exist C s > 0, C ′ s ≥ C * depending on u, v, s such that and similarly for v, proving Theorem 5.1.
To show (5.1), we first observe that $A_{\mathbb{R}_+}(C_*K) + S_{\mathbb{R}_+}(C_*K) \lesssim 1$. In fact, by the monotone convergence theorem it suffices to show this for any compact interval $J \subset \mathbb{R}_+$, and this case follows from Theorem 4.1. (We do not exploit the decaying factor $\rho(N)$ here.) In particular, we have $A_{\mathbb{R}_+}(N) + S_{\mathbb{R}_+}(N) \lesssim 1$ for any $N \ge C_*K$.
Next, we set R ≥ C * and C 3 ≥ max{R, RK −1 } sufficiently large (according to s) so that = 0 for any N by the almost orthogonality (3.6), and similarly for v. From Lemma 4.3 (ii), we have Then, it is easy to show by induction that (5.1) holds with C s = C s 3 C 4 and C ′ s = C 3 .
Suppose that there exists a minimal mass blow-up solution $(u, v)$ in the rapid frequency cascade scenario. Since We first consider the radial case: $\xi(t) \equiv 0$ and $\kappa > 0$ is arbitrary. From the almost periodicity of $(u, v)$, we see that By Theorem 5.1 with $s = 2$ (in fact $s = 1 + \varepsilon$ is sufficient) we obtain lim sup which implies that $\|(u, v)(t_n)\|_{\dot{H}^1} \to 0$ as $n \to \infty$. Using the Gagliardo-Nirenberg inequality we have $E((u, v)(t_n)) \to 0$ as $n \to \infty$, and then $E((u, v)(0)) = 0$ by the energy conservation. This, combined with $M(u, v) = M_{c,\mathrm{rad}} < M(\phi, \psi)$ and the sharp Gagliardo-Nirenberg inequality (Lemma 1.3 (ii)), shows that $\|(u, v)(0)\|_{\dot{H}^1} = 0$. By the Sobolev embedding we conclude that $\|(u, v)(0)\|_{L^4} = 0$, which clearly contradicts the fact that $(u, v)$ is a non-zero solution.
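The step from $\|(u,v)(t_n)\|_{\dot{H}^1}\to 0$ to $E((u,v)(t_n))\to 0$ can be sketched as follows. The sketch assumes that the energy has the form $E(u,v)=\|\nabla u\|_{L^2}^2+\frac{\kappa}{2}\|\nabla v\|_{L^2}^2+\operatorname{Re}\int_{\mathbb{R}^4}\bar{u}^2v\,dx$; this normalization is an assumption made here, since the definition is not reproduced above, and $C$ denotes a generic constant.

```latex
% Gagliardo-Nirenberg in d = 4: \|f\|_{L^3}^3 \le C \|f\|_{L^2} \|\nabla f\|_{L^2}^2.
\begin{align*}
\Big|\operatorname{Re}\int_{\mathbb{R}^4}\bar{u}^2v\,dx\Big|
  &\le \|u\|_{L^3}^2\|v\|_{L^3}
   \le C\big(\|u\|_{L^2}\|\nabla u\|_{L^2}^2\big)^{2/3}\big(\|v\|_{L^2}\|\nabla v\|_{L^2}^2\big)^{1/3}\\
  &\le C\,\|(u,v)\|_{L^2}\,\|\nabla(u,v)\|_{L^2}^2,\\
|E(u,v)| &\le \big(1+\tfrac{\kappa}{2}+C\,\|(u,v)\|_{L^2}\big)\,\|\nabla(u,v)\|_{L^2}^2 .
\end{align*}
```

Since the mass is conserved, the prefactor stays bounded along the solution, so the energy tends to zero whenever the $\dot{H}^1$ norm does.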

Virial argument: Quasi-soliton scenario
Now we consider the scenario We notice that We will derive a contradiction by taking a close look at the quantity (We need to localize to low frequencies so that $M(t)$ will be finite for $L^2$ solutions.) Also, a $C^1$ function $\tilde{N} : [0, \infty) \to \mathbb{R}_+$ is a variant of the frequency scale function $N(\cdot)$ of $(u, v)$ which will also be defined in the proof, and we just assume for now that

is the constant given in (3.22).
Remark 6.1. Recall that Dodson's argument for (1.4) in [5] essentially used the virial identity; if $u$ is a nontrivial solution of (1.4) and $\|u\|_{L^2} < \|Q\|_{L^2}$, then energy for (1.4). In our case, the following analogue is valid; for a solution $(u, v)$ Roughly speaking, $M(t)$ is a modification of the virial quantity $\operatorname{Im}\int x \cdot (\bar{u}\nabla u + \frac{1}{2}\bar{v}\nabla v)\,dx$, for which the time derivative is always positive and bounded away from zero by (6.2) and the minimality of the mass $M(u, v) = M_{c,\mathrm{rad}} < M(\phi, \psi)$. If we assume $N(t) \equiv 1$, then $M(T) - M(0) \gtrsim T = K$, whereas the almost periodicity suggests that the solution is in some sense localized to low frequencies uniformly in $t \ge 0$; we can in fact show that $\|\nabla(U, V)\|_{L^\infty([0,T);L^2)} = o(K)$ as $K \to \infty$. Since the weight function $\Theta(|x|/L)x$ is bounded, we obtain $M(t) = o(K)$ uniformly in $t$, which contradicts the fact that $M(T) - M(0) \gtrsim K$ for $K$ sufficiently large. We will explain later the idea for the case that $N(t)$ varies, where we need to introduce a time-dependent weight function with a carefully chosen $\tilde{N}(t)$.
(U, V ) solves the following perturbed system: Using equations, we calculate time derivative of the momentum density as where (and hereafter) we always take summation with respect to repeated indices and Then, we have where we have written A := N (t)x. We now see Therefore, we obtain We shall prove the following: We can find the energy of the system in it, so by the minimality assumption M (u, v) = M c < M (φ, ψ) and the sharp Gagliardo-Nirenberg inequality (Lemma 1.3 (ii)) we expect these terms to be positive and the main part of  1 2 ] and χ(r) = 0 on [1, ∞), so that θ ≡ 1 on the support of χ. For a small ε > 0, we have 8 8 We keep a small amount of theḢ 1 norm for later use.
where $(\widetilde U, \widetilde V) := \chi(|A|/L)(U,V)$ and we have used the inequality We invoke the sharp Gagliardo-Nirenberg inequality (Lemma 1.3 (ii)) for the function $(\widetilde U(t), \widetilde V(t))$. Since we can select $\varepsilon > 0$ according to $\eta_0$, $\kappa$ so that which is, via the Gagliardo-Nirenberg inequality for a single function, bounded from below by Consequently, we have $\int_0^T (6.4) + (6.6)\,dt$ We next observe that $(6.5) \ge 0$. This follows from the fact that the matrix $(B_{jk}(x))_{1 \le j,k \le 4}$ is non-negative for any $x$.
All the remaining terms are considered as error terms. Among them, (6.7) + (6.8) is easy to handle. In fact, we see Let us turn to the control of (6.3) including the time derivative of $\tilde N(t)$. By the Cauchy-Schwarz inequality, for $\varepsilon > 0$ we have Taking $\varepsilon = \varepsilon(\eta_0,\kappa) > 0$ sufficiently small, together with the estimates obtained so far and $\tilde N(t) \le N(t)$, we come to $\int_0^T (6.3) + \cdots + (6.8)\,dt$ for some $C = C(\eta_0,\kappa) \gg 1$. Now, we deduce from the almost periodicity and Lemma 3.24 that for sufficiently large $K$, $L$, uniformly in $k \ge 0$, and Since we are assuming $N(t) \sim N(t_k)$ for $t \in J_k$ uniformly in $k \ge 0$, it follows from Corollary 3.26 that for sufficiently large $K$, $L$ and We therefore have proved the following: Lemma 6.3. Let $\tilde N : [0,T) \to \mathbb{R}_+$ be a $C^1$ function satisfying (6.1). Then, there exist constants $C \gg 1$ and $L \gg 1$ depending only on $(u,v)$ and $\kappa$ such that we have whenever $K > 0$ is sufficiently large.
From now on we fix such an $L \gg 1$. Our next task is to find an appropriate function $\tilde N(t)$ satisfying (6.1) and This is the key step in the proof of Theorem 6.7, and we follow the argument of Dodson [5]. To see the idea, we observe that (6.1) and Corollary 3.26 imply Let us consider the case $\tilde N(t) \sim N(t)$. (In fact, we can easily construct $\tilde N(t)$ satisfying (6.1) and which is not sufficient to conclude (6.11). However, if we consider the extremal case where $N(t)$ is monotone on the whole interval $[0,T)$, then we can construct $\tilde N(t)$ which is also monotone on $[0,T)$ and since we can take $K$ arbitrarily large. This observation suggests that (6.11) is easier to achieve if $N(t)$ is less oscillatory. The idea for constructing $\tilde N(t)$ is that we deform $N(t)$ to be less undulating by leveling 'peaks' in the graph of $N(t)$.
for any T > 0.
Proof. Recall that $N(t)$ is a step function associated with the partition $\{J_k\}_{k=0}^{\infty}$ of $[0,\infty)$ and satisfies (3.22). Define the $C_0^{\mathbb{Z}_{\le 0}}$-valued step functions $N_m(t)$ ($m = 0, 1, 2, \dots$) associated with $\{J_k\}$ inductively in $m$ as follows: Let $N_0(t) := N(t)$. Then, for given $N_m(t)$, define $N_{m+1}(t)$ by where for a positive integer $l$ we call a union of consecutive $l$ characteristic intervals $[t_{k_0}, t_{k_0+l})$ , 1, $C_0$ } for any $m, k \ge 0$. Moreover, we claim that: (i) Every peak in $N_m$ has length $\ge 2m+1$; (ii) If $N_m(t_k) \ne N(t_k)$, then $N_m(t_k) = N_m(t_{k+1})$. (i) follows from the definition. To verify (ii), assume $N_m(t_k) \ne N(t_k)$ for some $m, k \ge 0$. Then, there exists 0 . We see that this coincidence of the value of $N_{m'}$ at consecutive points $t_k$, $t_{k+1}$ persists throughout the construction procedure of $\{N_m\}$.
Let $k_*$ be a positive integer, and let be the set of all peaks of $N_m$ included in the interval $[0, t_{k_*})$. ($j_*(m,k_*)$ is the number of peaks in $N_m$ before $t_{k_*}$, and for the $j$-th peak in $N_m$ we denote by $k_j^m$ and $l_j^m$ the index of the beginning characteristic interval and the length, respectively. Note that this set is possibly empty.) Then, the total variation of $N_m$ on $[0, t_{k_*})$ is estimated as Hence, the properties (i), (ii) imply that . (For instance, it suffices to connect the points $\{(t_k, N_m(t_k))\}_{k \ge 0}$ on the graph smoothly and multiply it by $C_0^{-1}$.) This $N_m$ also satisfies for any $T > 0$, as desired.
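The effect of leveling peaks on the total variation can be illustrated on a finite list of step values. The sketch below is not the definition of $N_{m+1}$ used in the proof. It assumes a simplified rule: a peak is a maximal run of equal values whose two neighboring values are strictly smaller, and every peak is lowered to the larger of its two neighbors. The names `level_peaks` and `total_variation` are hypothetical.

```python
def level_peaks(values):
    """Lower every peak of a step function to the larger of its two neighbors.

    Assumed (simplified) notion of a peak: a maximal run of equal values whose
    left and right neighboring values both exist and are strictly smaller.
    All peaks are detected on the input and leveled simultaneously.
    """
    n = len(values)
    out = list(values)
    i = 0
    while i < n:
        # Find the end j of the maximal constant run starting at index i.
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        is_peak = (
            i > 0 and j < n
            and values[i - 1] < values[i]
            and values[j] < values[i]
        )
        if is_peak:
            level = max(values[i - 1], values[j])
            for k in range(i, j):
                out[k] = level
        i = j
    return out


def total_variation(values):
    """Sum of absolute jumps between consecutive step values."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))
```

Under this rule, a peak of height $h$ with neighbors $a, b < h$ contributes $(h-a) + (h-b)$ to the total variation before leveling and $|a-b|$ afterwards, so each leveled peak lowers the total variation by $2(h - \max\{a,b\})$. For example, `level_peaks([1, 4, 4, 2, 8, 1, 2])` levels the run of 4's and the single 8 to the value 2.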
From Lemma 6.4 we deduce the following results.
All that remains is to estimate the error term (6.9) arising from frequency localization by $P_{\le K}$. To conclude the proof of Theorem 6.2, we claim that $\int_0^T (6.9)\,dt = o(K)$ $(K \to \infty)$. (6.12) The long-time Strichartz estimate (Theorem 4.1) now plays an essential role, since each of $F$, $G$ contains at least one high-frequency function $P_{>K/4}(u,v)$; for instance, $P_{\le K}(\bar u v) - \bar U V = P_{>K}\bar u\, v + P_{\le K}\bar u\, P_{>K}v - P_{>K}(P_{>K/4}\bar u\, v) - P_{>K}(P_{\le K/4}\bar u\, P_{>K/4}v)$.
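The decomposition above, read as an identity for $P_{\le K}(\bar u v) - \bar U V$ with $(U,V) = P_{\le K}(u,v)$, can be checked numerically in a toy setting. The sketch below makes several assumptions that are not in the text: one space dimension, a periodic grid, and sharp Fourier cutoffs in place of whatever Littlewood-Paley projections the paper uses. The helper names (`dft`, `project`, `max_identity_error`) are hypothetical. The only non-algebraic input is that a product of two functions with frequencies at most $K/4$ has frequencies at most $K/2$, so $P_{>K}$ annihilates it.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive discrete Fourier transform, O(n^2); fine for a small check."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def project(x, cutoff, low=True):
    """Sharp projection onto |freq| <= cutoff (low=True) or |freq| > cutoff."""
    n = len(x)
    xh = dft(x)
    for k in range(n):
        freq = k if k <= n // 2 else k - n  # signed frequency of mode k
        if (abs(freq) <= cutoff) != low:
            xh[k] = 0
    return dft(xh, inverse=True)


def mul(a, b):
    """Pointwise product of two grid functions."""
    return [p * q for p, q in zip(a, b)]


def max_identity_error(n=32, K=8, seed=0):
    """Largest pointwise gap between the two sides of the decomposition."""
    rng = random.Random(seed)
    u = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    ubar = [z.conjugate() for z in u]

    def lo(f, c):
        return project(f, c, low=True)

    def hi(f, c):
        return project(f, c, low=False)

    # Left side: P_{<=K}(ubar v) - (P_{<=K} ubar)(P_{<=K} v).
    lhs = [a - b for a, b in zip(lo(mul(ubar, v), K), mul(lo(ubar, K), lo(v, K)))]

    # Right side: four terms, each containing a high-frequency factor.
    t1 = mul(hi(ubar, K), v)
    t2 = mul(lo(ubar, K), hi(v, K))
    t3 = hi(mul(hi(ubar, K / 4), v), K)
    t4 = hi(mul(lo(ubar, K / 4), hi(v, K / 4)), K)
    rhs = [a + b - c - d for a, b, c, d in zip(t1, t2, t3, t4)]

    return max(abs(a - b) for a, b in zip(lhs, rhs))
```

Calling `max_identity_error()` returns the largest pointwise difference between the two sides for random data, which in this setting is at the level of floating-point rounding. The grid size must be large enough compared with $K$ that the low-low product does not alias.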
First, we see that which is a direct consequence of Theorem 4.1 (see footnote 9). We prepare one more estimate: for $s > 1/2$, (6.14) which is also deduced from Theorem 4.1 as To verify (6.12), we begin by observing that $\int_0^T (6.9)\,dt = 2$ Noticing $\Theta(r) \le 2/r$ and using (6.13)-(6.14), we estimate the first integral by The second integral is estimated by the Sobolev embedding and (6.13)-(6.14) as . This completes the proof of Theorem 6.2.
In view of Theorem 6.2, the quasi-soliton scenario is precluded once we show $\sup_{t \in [0,T)}$ Now, by the almost periodicity of $(u,v)$ and the fact $N(t) \le 1$, we have as $K \to \infty$. This and the boundedness of the weight function imply (6.15).
This completes the proof of Theorem 1.5.
6.2. Non-radial, mass-resonance case.

For the non-radial case, we need the Galilean invariance of the system and thus restrict ourselves to the mass-resonance case $\kappa = 1/2$. To treat the situation where the spatial center function $x(t)$ varies, we follow [5] and introduce the interaction-type modification of $\mathcal{M}(t)$ of the form on some time interval $[0,T)$ which is a union of characteristic intervals. $\tilde N : [0,\infty) \to \mathbb{R}_+$ is the same $C^1$ function as we used in the radial case, which satisfies (6.1). In this subsection, we use the Fourier projection operator $P_{\le C_*K}$ instead of $P_{\le K}$, and so $(U,V) := P_{\le C_*K}(u,v)$, where the constant $C_* = C_*(u,v) > 0$ is given in (3.23) so that The definition of the weight function $\Theta_L$ is quite different from that in the radial case. Let $L$ be a large positive number to be chosen later, and let $\theta : [0,\infty) \to [0,1]$ be the same as before. First, we define the smooth radial function $\vartheta_L$ : Note that $\vartheta_L$ is non-increasing in $|x|$ and satisfies $\vartheta_L(x) = 1$ for $0 \le |x| \le L-1$ and $\vartheta_L(x) = 0$ for $|x| \ge L$. Then, define the functions $\theta_L$, where $e_1 := (1,0,0,0) \in \mathbb{R}^4$. It is worth noticing that, since $\vartheta_L$ is radially symmetric, $\theta_L$ satisfies This helps us separate two spatial variables in the analysis of the interaction-type quantity $\mathcal{M}(t)$.
It follows from the definition that $\theta_L$ and $\Theta_L$ are non-negative, bounded uniformly in $L$, smooth (on $(0,\infty)$, for $\Theta_L$) functions such that $\theta_L \equiv 0$ outside $[0,2L]$ and $\Theta_L(r) \lesssim \min\{1, L/r\}$. Moreover, we can show that: Lemma 6.6. The following hold.
(ii) We observe that Integrating the second quantity with respect to $r$ we obtain $|\theta_L'(r)| \lesssim r/L$, since $\theta_L'$ which implies that $\Theta_L$ is also non-increasing. The above identity also yields that $|\Theta_L'(r)| \lesssim L/r^2$. Furthermore, by the mean value theorem and (ii), We now compute $\partial_t\mathcal{M}(t)$. Derivatives of the weight functions are given by and the time derivative of mass density is Then, introducing the notation (6.26) where in this subsection we define A : Our goal is to establish the following: Theorem 6.7. There exist a large constant $L > 0$ and a $C^1$ function $\tilde N : [0,\infty) \to \mathbb{R}_+$ satisfying (6.1) such that $\mathcal{M}(t)$ defined as above satisfies Proof. The main part of $\partial_t\mathcal{M}(t)$ will be (6.18)+(6.20)+(6.24), which corresponds to (6.4)+(6.6) in the radial case. The new term (6.24) can be concealed by an appropriate gauge transformation.
Proof. By symmetry it suffices to show that $\sum_{j,k=1}^{4}$ Since $B_{jk}$ is real valued and $B_{jk} = \sum_l B_{lj}B_{lk}$, the left hand side is equal to as claimed.
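The algebra behind this step can be written out for a generic vector. In the display below, $z = (z_1,\dots,z_4) \in \mathbb{C}^4$ is a hypothetical placeholder for the quantities paired with $B_{jk}$ in the lemma (it is not the paper's notation), and only the two stated properties of $B$ are used: it is real valued and satisfies $B_{jk} = \sum_l B_{lj}B_{lk}$.

```latex
\begin{align*}
\sum_{j,k=1}^{4} B_{jk}\, z_j \overline{z_k}
  &= \sum_{j,k=1}^{4} \sum_{l=1}^{4} B_{lj} B_{lk}\, z_j \overline{z_k} \\
  &= \sum_{l=1}^{4} \Bigl( \sum_{j=1}^{4} B_{lj} z_j \Bigr)
       \overline{\Bigl( \sum_{k=1}^{4} B_{lk} z_k \Bigr)} \\
  &= \sum_{l=1}^{4} \Bigl| \sum_{j=1}^{4} B_{lj} z_j \Bigr|^2 \;\ge\; 0 .
\end{align*}
```

Real-valuedness of $B$ is what allows $B_{lk}$ to be moved inside the complex conjugate in the second line.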
The others will be error terms. We begin with: Proof. These two terms correspond to (6.7) + (6.8) in the radial case, but we need more careful treatment. For (6.21) we use Lemma 6.6 (iii) and the tightness in $L^\infty_t L^2_x$ and $L^3_{t,x}$. On each characteristic interval $J_k$, we split the integral as follows: In the first and the second term we bound $\Theta_L - \theta_L$ by 1, while in the last one we see from Lemma 6.6 (iii) that $(\Theta_L - \theta_L)(\tilde N(t)|x-y|) \lesssim L^{1/2}/L = L^{-1/2}$. Since $\tilde N(t) \le N(t)$, we apply the almost periodicity and Lemma 3.24 to conclude that the above is $N(t_k)\cdot o(1)$ $(K, L \to \infty)$ uniformly in $k \ge 0$. The claim for (6.21) then follows from Corollary 3.26.
As a result, we still have the same lower bound given in Lemma 6.8 if we add the terms (6.19), (6.25), (6.21), and (6.22).
We now fix $L \gg 1$ and proceed to the control of (6.17), which is parallel to that of (6.3). From the relation (6.27), for fixed $t, z$ we have where the last integral is equal to zero by symmetry. Since the triangle inequality implies for any $t, x, y, z$, we estimate (6.17) similarly to the radial case with the Cauchy-Schwarz inequality as for any $\varepsilon > 0$. We can choose $\varepsilon$ small and $\tilde N(t)$ via Corollary 6.5 to make the contribution from (6.17) smaller than the lower bound given in Lemma 6.8. Consequently, we have $\int_0^T (6.17) + \cdots + (6.22) + (6.24) + (6.25)\,dt \gtrsim K$, whenever $K$ is sufficiently large. Only (6.23) + (6.26) remains. It is then sufficient to prove $\int_0^T (6.23) + (6.26)\,dt = o(K)$ $(K \to \infty)$, (6.29) which corresponds to (6.12) in the radial case.
Hence, by an integration by parts in $x$ we see that $\int_0^T (6.23) + (6.26)\,dt$ Recall that the weight functions satisfy $|\Theta_L(|A|)A| \lesssim L$, $|\theta_L + 3\Theta_L| \lesssim 1$.
Since $(F_*, G_*)$ has at least one of $e^{-ix\cdot\xi(t)}P_{>C_*K/4}u$ and $e^{-2ix\cdot\xi(t)}P_{>C_*K/4}v$; for instance, we can estimate (6.32) and (6.33) using (6.30)-(6.31) similarly to the radial case. We focus on the last integral (6.34), which does not have a counterpart in the radial case. Notice that $\operatorname{Im}[F_*U_* + G_*V_*]$ has two types of products of three functions; two highs and a low, or two lows and a high. The former case is easier to treat. In fact, we use the Hölder inequality in $(t,y)$ as $L^2L^4 \cdot L^2L^4 \cdot L^\infty L^2$ and apply (6.30) to obtain a bound of $o(K)$: For the latter case, we extract one derivative from the high-frequency functions as u, $e^{-2ix\cdot\xi(t)}P_{>C_*K/4}v$ and integrate it by parts (in $y$). When the derivative moves to one of the low-frequency functions, we obtain a bound like which is again $o(K)$ by (6.30)-(6.31). If the derivative moves to the weight function $\Theta_L(|A|)A$, then we have another factor of $\tilde N(t)$ as in (6.33), and the resulting bound will be Similarly to (6.33), the Sobolev embedding and (6.30)-(6.31) show that this is also $o(K)$.
This completes the proof of (6.29), and hence that of Theorem 6.7.
We finish the proof of which precludes the quasi-soliton scenario for the non-radial case. To prove that, we first observe that $\mathcal{M}(t)$ is also invariant under the gauge transformation $(U,V) \mapsto (U_*,V_*)$, which can be seen in the same manner as (6.17) is under $(U,V) \mapsto (U_*,V_*)$. Now, the almost periodicity of $(u,v)$ yields and similarly for $V_*$. An application of the Cauchy-Schwarz inequality finally leads us to the conclusion.
Lemma A.6. We define the metric $d$ on $G \backslash L^2(\mathbb{R}^d)^2$ as follows: where $\mathcal{O}_d$ denotes the topology defined by the metric $d$.
Appendix B. Proof of the profile decomposition in the radial case

Lemma B.1. Let $\{u_n\} \subset L^2(\mathbb{R}^d)$, $u \in L^2(\mathbb{R}^d)$. Then the following three conditions are equivalent. 1. $u_n \rightharpoonup u$ weakly in $L^2(\mathbb{R}^d)$.
Proof. This follows from the same argument as in the proof of Lemma A.3.
Proof of Lemma 3.24. By the Duhamel formula and the Strichartz estimates, we have The desired estimate follows from an interpolation between the above and (3.6) if $S_J(u,v) \le 1$. When $S_J(u,v) > 1$, we first divide $J$ into $O(S_J(u,v))$ subintervals $\{J_k\}$ so that $S_{J_k}(u,v) \sim 1$ on each $J_k$, and sum up the estimates obtained on the $J_k$.
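The subdivision step can be pictured with a discrete stand-in. The sketch below is an illustration under assumptions that are not in the text: the interval $J$ is replaced by a list of consecutive cells, `cell_sizes[i]` plays the role of the scattering size $S$ on the $i$-th cell, and the function name `split_by_unit_size` is hypothetical. Cells are grouped greedily until a block reaches size `unit`, and a short leftover tail is merged into the last block.

```python
def split_by_unit_size(cell_sizes, unit=1.0):
    """Group consecutive cells into blocks (start, end) of total size ~ unit.

    A block is closed as soon as its running total reaches `unit`. A leftover
    tail with total below `unit` is merged into the previous block, or kept
    as the only block if no block has been closed.
    """
    blocks = []
    start = 0
    running = 0.0
    for i, size in enumerate(cell_sizes):
        running += size
        if running >= unit:
            blocks.append((start, i + 1))
            start = i + 1
            running = 0.0
    if start < len(cell_sizes):
        if blocks:
            # Merge the short tail into the last closed block.
            last_start, _ = blocks.pop()
            blocks.append((last_start, len(cell_sizes)))
        else:
            blocks.append((start, len(cell_sizes)))
    return blocks
```

If every cell has size at most `unit` and the total is at least `unit`, each block has total size between `unit` and `3 * unit`, and the number of blocks is at most the total size divided by `unit`. This mirrors the count $O(S_J(u,v))$ with $S_{J_k}(u,v) \sim 1$.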
Applying the Hölder inequality to the left hand side, we have for any t ∈ J. The first inequality in (3.21) then follows after an integration in t.
For the second inequality in (3.21), we may focus on the case $S_J(u,v) > 1$. Applying Lemma 3.24 with $\eta = \frac{S_J(u,v)/100}{1 + S_J(u,v)}$ $(\sim 1)$, we see that there exists $R = R(u,v) > 0$ satisfying , which is, via the Hausdorff-Young inequality followed by the Hölder inequality, bounded by as desired.