STRONGLY NONLINEAR PERTURBATION THEORY FOR SOLITARY WAVES AND BIONS

. Strongly nonlinear perturbation theory would seem to be an oxy-moron, that is, a contradiction of terms. Nonetheless, we here describe pertur- bation methods for wave categories that are intrinsically nonlinear including solitons (solitary waves), bound states of solitons (bions) and spatially periodic traveling waves (cnoidal waves). Examples include the Kortweg-deVries and Benjamin-Ono equations with general power law nonlinearity and the Fifth Order KdV equation. The perturbation strategies include (i) the Gorshkov-Ostrovsky-Papko near-equal-amplitude soliton interaction theory (ii) pertur- bation series in the Newton-homotopy parameter and (iii) approximations for large values of the nonlinearity exponent. A long section places strongly non- linear perturbation theory for waves in a larger context as a subset of unconventional perturbation expansions including phase transition theory in 4 − (cid:15) dimen- sions, the (cid:15) = 1 /D expansion where D is the dimension in quantum chemistry, the renormalized quantum anharmonic oscillator, the Yakhot-Orszag expan- sion in the exponent of the energy spectrum in hydrodynamic turbulence, and the Newton homotopy expansion.


1.
Introduction. When a nondimensional physical parameter is small, a perturbation series in is an obvious strategy. However, unobvious choices of an expansion parameter can greatly extend the range and usefulness of perturbation series. In the next section, we discuss a wide variety of unorthodox expansions. The reader who is only interested in nonlinear waves can skip Sec. 2 without loss of continuity, but this section does place unconventional perturbation theory for waves in a larger context.
In the remainder of the article, we describe multiple ways to apply perturbation theory to solitons, bions and cnoidal waves -all waves for which nonlinearity is not a slight distortion of linear waves, but rather essential and strong.
2. Unconventional perturbation parameters: Phase transitions in 3.99 dimensions, the 1/D expansion and All That. A wide variety of unorthodox perturbation parameters are catalogued in Table 1. In the rest of the article, we elaborate on this bestiary of perturbative species.
The author is supported by NSF grant DMS1521158. 2.1. Dimensional perturbation theory. Sometimes a very counterintuitive choice of perturbation parameter is best. In the theory of critical phenonemon, an important tool is perturbation series in the difference between the dimension D and 4, which led to a famous paper with the amusing title "critical phenomenon in 3.99 dimensions" [62,61]. In the words of the physicist and Fields Medalist Ed Witten, "After decades in which the study of critical phenomena was thwarted by the absence of an expansion parameter, [ Nobel Laureate] Kenneth Wilson and Michael Fisher suggested that to introduce a parameter, one should regard the number of spatial dimensions not as a fixed number, three, but as a variable parameter. They showed that critical phenomena are simple in four dimensions and that in 4 − δ dimensions critical phenomena can be understood by perturbation theory in δ. Even at δ = 1, the original three-dimensional problem, this perturbation theory is quite successful." [63] A number of other problems simplify when the dimension D is infinite and the perturbation parameter is the reciprocal of D. As Witten explains, the 1/D series was first invented in atomic physics.
Intrigued by Witten's review, the chemistry Nobel Laureate, Dudley Herschbach, applied the 1/D expansion and related scaling laws to a large number of problems in quantum chemistry as summarized in his scientific autobiography [43] and reviews [42,40]. His most elementary example, also treated by Witten [63], is the Ddimensional hydrogenic atom which solves the eigenproblem where ψ is the wave function of the electron, D is the spatial dimension, is the orbital angular momentum, E is the energy (the eigenvalue), r is the radial coordinate in spherical or hyperspherical coordinates and Z is the nuclear charge. (Herschbach also exploits a 1/Z expansion, but we must leave this to his articles.) The exact smallest ("ground state") eigenvalue is exactly Witten notes with disgust that the 1/D expansion converges rather slowly as Herschbach is made of sterner stuff and obtains high accuracy by deploying series acceleration tricks.
The Padé approximation f [M/N ] is a rational approximation, the ratio of a polynomial of degree M divided by a polynomial of degree N , chosen so that the approximant has a Taylor series that matches the first (M + N + 1) terms of the target function f ( ) where here the perturbation parameter is ≡ 1/D. All Padé approximations with N ≥ 2 are exact for hydrogenic atoms. The inexact Padé approximants and the truncated power series from when they come are listed below with their errors at = 1/3 ↔ D = 3: It is typical that the [2/1] Padé approximation is more than six times more accurate than the cubic polynomial (truncated power series) whose coefficients are the sole input for calculating the numerator and denominator polynomials of the [2/1] Padé approximant. For more complicated problems, Herschbach shows that the Padé approximants with M = N are much more accurate than the truncated 1/D power series from whence they came. This is generic for Padé approximations in all applications, and not peculiar or otherwise restricted to 1/D series. In quantum chemistry, exact solutions are often available for D = 1 as well as D = ∞. This allows Herschbach to interpolate in D very successfully. Witten's frown was changed to Herschbach's smile through Padé approximations, dimensional interpolation and other accelerators given in Herschbach's review papers [30,40,42,43] and more generally in [51] and [56].
This illustrates a general theme applicable to unconventional perturbation theory including expansions for strongly nonlinear waves: often, the series is accurate and useful only when the solution is post-processed by a series acceleration method.

2.2.
is the reciprocal of the number of quarks in quantum chromodynamics. The Nobel Laureate Gerald t'Hooft extended inverse dimension expansions to quantum chromodynamics (QCD) in 1974 by pretending there are N q quarks instead of the three quarks of standard QCD. Quantum chromodynamics simplifies as the number N q of quarks becomes large because only those terms in the perturbation series which are associated with "planar" Feynman diagrams contribute in this limit. Unfortunately, as Witten observes, "the planar diagrams are a vast class . . . one might think that without being able to sum the planar diagrams, one could learn very little from the 1/N q expansion. However, we encounter a surprise. Even without being able to sum the planar diagrams, we can get a considerable insight into the phenomenology of quantum chromodynamics . . . the reason is that there are certain "selection rules" or qualitative properties that are preserved, diagram by diagram, for each of the planar diagrams, but which are violated by the nonplanar diagrams.", page 42 of [63].
2.3. The delta-expansion: The perturbation parameter is the exponent of nonlinearity. Bender, Milton, Pinsky and Simmons treated the exponent of nonlinearity, which in most applications is physically restricted to an integer, as a continuous small parameter [7]. Thus, a polynomial equation may be rewritten as and solved as a perturbation series in powers of δ. This article has been cited more than a hundred times and has spawned many applications including differential equations as well as algebraic equations. This same idea has been applied to many problems by He [36,37,39,38]. Solving the redefined problem by perturbation theory is often called the "linear delta expansion" by physicists [3,4]. An application to waves is given in Section 3.
2.4. = h, the finite difference ("lattice") grid spacing. A common strategy in non-perturbative quantum field theory is to regularize the partial differential equation, removing the divergence of the perturbation series, by replacing derivatives by finite differences on a lattice of discrete points. Bender and Tovbis extended this strategy to boundary layer problems in fluid mechanics [10,9]. For example, they treated the Blasius problem (1908) [12,13,19,23,45] in the form δu xxx + uu xx = 0 (11) This nonlinear boundary value problem is transmuted to a much easier initial value problem if one can compute the missing initial condition which they take as the goal of their perturbation series. 1 Their finite difference scheme is Their procedure is to (i) solve for u 0 n for = 0 (ii) impose lim n→∞ u 0 n+1 − u 0 n = 1 (iii) compute the O( ) solution u 1 n [their b m ] and (iv) extrapolate to → ∞. Unfortunately, their best estimate for the missing second derivative at the origin is 6.3% too low. This is not very satisfactory given that a variety of much more accurate ad hoc approximations are known [45,23]. 2 Bender and two new collaborators [9] had another go at lattice expansions for the instanton [Taylor shock] and Blasius problem. 3 Despite calculating two hundred orders for the instanton and three hundred for the Blasius flow and also applying Kleinert's variational perturbation method, Padé resummation of the series always converged to a wrong answer, off by 1 part in 1000 for the instanton and 1.5 % for the Blasius second derivative at the origin. It must be freely acknowledged that unconventional perturbation theories are useful only about half the time; failure is always an option, and, metaphorically, one doesn't know how the play turns out until the end of the third act.
2.5. The ground state eigenvalue as perturbation parameter. Bender and Jones solved the stationary one-dimensional Schrodinger equation by means of a perturbation series in which the small parameter is the eigenvalue itself. They are able to solve this expansion order by order; the j-th order as a (2j − 1)-fold iterated integral. The good news is that the series converges and with the aid of acceleration by Padé approximants, they are able to obtain numerically accurate approximations. The bad news is that although they are able to calculate many terms in the series for test examples like the quantum harmonic oscillator whose exact solution is known, the iterated integrals rapidly become unmanageable for nontrivial problems, even as simple a problem as the quartic anharmonic oscillator.
2.6. The small parameter as the exponent of the energy spectrum: Renormalization group perturbation theory in hydrodynamic and MHD turbulence. Renormalization group theory [60] has been applied to turbulence by multiple authors in multiple variants, but the -RG theory of Yakhot and Orszag [64,65,66] has generated more than 1200 citations of the earliest paper in their series [64]. Turbulence is always strongly nonlinear but in the work of Yakhot and Orszag, "the properties of an [energy spectrum] E(k) ∝ k 1−2 /3 are examined through an expansion, in powers of , of a spectrum E(k) ∝ k" in the words of Kraichnan (pg. 2403 of [47]. Kolmogorov had showed through irrefutable dimensional reasoning that for three-dimensional turbulence, E(k) ∝ k −5/3 , so the physically-interesting value of the "small" (!) parameter is . Nevertheless, their theory gave very good values for the fundamental constants of turbulence theory.
Orszag team-taught the introductory graduate-level applied mathematics course at MIT in the seventies with perturbation theory expert Carl Bender, and perhaps never entirely recovered from the experience. As he generated numerical tests and graphical evaluations of perturbative method after method and problem after problem as catalogued in their outstanding textbook [8], he observed again and again that the expansion was accurate even when was not small. The audacity of applying the renormalization group expansion when = 4 was thus based on extensive experience.
The -RG theory continues to influence turbulence studies [29,66]. Zhou wrote in 2010 [66]: It therefore appears that RG theories, like all effectively computable turbulence theories, are compromises -compromises in the interest of analytical tractability. Continued interest in this kind of compromise is certainly justified provided that the simplifications of the dynamics that are introduced are well understood.

2.7.
Quantum quartic anharmonic oscillator: Renormalization and Chebyshev polynomials. The quantum anharmonic oscillator is the eigenvalue problem where λ ≥ 0 is the "coupling constant" and E is the eigenvalue (energy level). The obvious expansion is a power series in λ. This diverges for all λ > 0 because for sufficiently large |y| the perturbation λy 4 is not small compared to the unperturbed potential energy y 2 . However, it has been proved that Padé approximants from the "small-coupling" series converge for all positive λ, albeit more and more slowly with increasing λ. The subtle, unobvious perturbation theory begins by interchanging the definitions of unperturbed and perturbed potential energy terms by the transformations and definitions For λ → ∞, the unperturbed strong coupling problem is what we shall dub the "pure quartic quantum oscillator" Unfortunately, this has no simple explicit solution. It is easily solved numerically but the dream of perturbation theory is always to make number crunching unnecessary. Even without an explicit or numerical solution, however, the scaling gives useful information that Furthermore, E is asymptotically a function not of 1/λ, but rather of λ −2/3 . One strength of a perturbation series is that it is not a single discrete solution at a particular value of the parameter, as furnished by a numerical solution, but rather the series is an approximation which is continuous in the parameter , accurate not at a point but over an entire continuous range. If a numerical problem-solver is available, its discrete solutions can be turned into perturbation-like series.
For the quantum quartic oscillator, for example, Hioe and Montroll [44] combined many numerical solutions to extrapolate the coefficients of the inverse power series for large . Converting from their notation to ours, they obtained Only five digits are shown for the first correction and just three for the O( −4/3 ) term because extrapolating derivatives is very ill-conditioned, and the pernicious effects of roundoff error grow rapidly with the order of the derivative.
2.7.1. Chebyshev series. Boyd pointed out forty years ago that a much better procedure is to numerically solve the problem at a set of discrete and then apply the usual machinery of Chebyshev interpolation [20] to approximate the eigenvalue and eigenfunction as accurately as one pleases [15]. For example, the ground state eigenvalue is given for all where = 1.0857670466379. 4 Boyd gives bivariate Chebyshev series in both and x for the lowest few eigenfunctions that cover all positive and all real x. The power series about = 0 are factorially divergence; in contrast, the Chebyshev series for both large and small converge exponentially fast.
There is, however, an alternative to Chebyshev methods for large that is purely perturbative.

2.7.2.
Renormalized perturbation theory. Vinette andČižek [55,58,57] developed an improved perturation series that has no more deficiencies than the weak coupling expansion. The first step in renormalization is to rescale the coordinate and the eigenvalue E so that: The coefficients are given to many more digits and higher degree in [15]. However, if we use E and to denote the eigenvalue and coupling constant used here in 2017 and E and Λ as the analogous quantities in [15], then Λ = /2 and E( ) = 2 E( /2). The earlier article employed the "summation-prime" convention of defining the coefficient of T 0 (x) to be twice its true contribution to the approximation and then halving this coefficient when evaluating the sum. This was popularized by the British Chebyshev school of Leslie Fox and Charles Clenshaw because it simplified theory, but has fallen out of favor. The Chebyshev coefficients of [15] are incorrect and should be doubled except for the zero degree coefficients which are unchanged because rejecting the "summation-prime" convention cancels E = 2E for the coefficient of T 0 . Lastly, the earlier paper employed Λ −2/3 /4 as the coupling coefficient but the 1/4 was omitted in several places. The asymptotic series in [15], The heart of renormalization is introduce a new coupling constant κ such that τ 2 x 2 is partitioned between a −x 2 which is identical with the unperturbed term in the original problem plus a second term which is proportional to the new coupling constant κ, and thus will become part of the perturbation. Further, it is desirable that the κx 2 should be opposite in sign to the x 4 term so as to effectively weaken the perturbation. These considerations motive the definition To make the fourth degree part of the coefficient proportional to the new coupling constant, − τ 3 = (κ/3)x 4 , where the "3" is a convention found convenient by Vinette andČižek, but otherwise irrelevant. This requires that τ is a function of implicitly as the solution to the cubic The renormalized problem is then The perturbation series can be calculated in exactly the same way as for the "weak coupling" series, albeit the coefficients are different [55]: Table 2 gives the leading coefficients in this series. The c n , which are all rational numbers, were calculated up to c 200 in [57].
In the limit → ∞, κ → 1 and τ → 0. The most extreme test of the usefulness of the renormalized expansion is to apply it in this limit by computing the smallest eigenvalue of the pure quartic oscillator. This is given by the renormalized series as The renormalized series is factorially divergent for all κ, but forming diagonal Padé approximations and then setting κ = 1 gives the rapid convergence catalogued in Table 3.
Hopefully buoyed by these successes of unorthodox, Coloring-Outside-the-Lines perturbation theory, we now tackle in turn and in greater detail three successful applications of perturbation theory to strongly nonlinear waves.
3. Gorshkov-Ostrovsky-Papko perturbation theory: When solitary waves are of almost equal amplitude. When a tall, narrow solitary wave of the Korteweg-deVries (KdV) equation is behind a short, wide solitary wave, the taller solitary wave will overtake the smaller solitary wave. The two soliton KdV solution is given by an exact analytical formula in textbooks such as Whitham (1974) [59]. This formula is very complicated and it does not apply to generalizations such as the fifth order KdV equation (FKdV). Gorshkov, Ostrovsky and Papko then asked themselves if there was any way to analyze soliton-soliton interactions using perturbation theory [31,32,34]. They realized there was a limit that allows such an analysis. If the two solitary waves are of almost identical height and width, then the difference in phase speeds will be very small. The solitary waves will approach each other very slowly. Although the amplitude of a solitary wave falls off exponentially with the distance from the center of the solitary wave so that two solitons separated by a distance s will have an interaction whose magnitude is O(exp(−qs)) for some constant q > 0, the two solitary waves may nevertheless cause O(1) changes in each other over the long interaction time interval precisely because the time interval is long. When the difference in amplitudes is not small, it is known from the analytical solution that the two solitons may superimpose to generate a single peak during the collision. This superposed structure splits up to reproduce the two original solitary waves. However, the analytical solution also shows that if the amplitudes of the two solitary waves are very close, then the two solitary waves reach a finite minimum separation, but never fuse and remain two separate and distinct peaks for all time.
With soliton-soliton interactions, the phase speeds of the solitons, c 1 and c 2 , are not constant but rather vary with the slow timescale τ = t where 1 is some measure of the strength of the interaction of the two solitons.
For example, consider the Fifth-Order Korteweg-deVries (FKdV) equation, For traveling waves of phase speed c, it is convenient to shift into a frame of reference moving with the wave by defining In the moving frame, the wave is steady and the partial differential equation simplifies to the ODE Define Ψ(X) to be the isolated solitary wave for unit phase speed. Nagashima and Kuwahara computed the approximate solution [48] Ψ(X) = 2.65758756 exp −0.16X 2 × as illustrated in Fig. 1. A more accurate approximation for the soliton is given in Appendix A. The approximation for weakly interacting solitons is where At first order in in the vicinity of the peak of U 1 , define The following consistent approximation and derivation are discussed at greater lengths in [16].
1. Define s(τ ) as the separation between the peaks of the two solitons. Then 2. The adjoint of the linear operator L is where by "adjoint operator" we mean a linear differential operator L * such that for arbitrary functions v(X) and w(X) where for arbitrary a(X) and b(X). 3. U 1 (X) is a non-trivial eigenfunction of L * with zero eigenvalue, that is, L * U 1 = 0; therefore, the first order equation L∆ 1 = H(X) has a bounded solution only if the "solvability condition" is satisfied. 4. U 1,τ ∼ −c t U 1 + higher order terms 5. s t = c ↔ s tt = c t 6. U 2 (X 2 ) ∼ U 1 (X 1 − s) 7. By integration by parts, after dropping subscripts, where the soliton interaction potential energy is This differential equation has the form of the usual second order Newton equation of classical particle physics. The left solitary wave creates a potential energy function whose gradient accelerates the soliton on the right, which in turns exerts an equal and opposite force on the leftmost soliton. This "potential energy" is graphed as the left two plots in Fig 2. Both show the same curve but the vertical scale is linear in the upper plot and logarithmic in the lower graph. It can be shown that because the FKdV soliton oscillates as a decaying oscillation for all X the same is true for V (s).
In contrast, KdV solitary waves decay monotonically with |X|. Gorshkov, Ostrovsky and Papko pointed out that this implies that the KdV interaction potential energy decays monotonically, too, as shown in the right two plots in Fig. 2. The KdV potential energy has no minima except at s = ∞. When the solitary wave decreases monotonically from its peak without oscillations, as true of KdV solitary waves, Gorshkov, Ostrovsky and Papko show that the interaction of the two solitary wave is always repulsive. This repulsion eventually brings the two solitary waves to a stop relative to one another and then at a larger time the solitary waves begin to move farther apart. This repulsive interaction increases the velocity of the rightmost solitary wave while decreasing the speed of the solitary wave which was initially overtaking it from the left. For KdV solitary waves, the phase speed is proportional to the square of the amplitude. It follows that when the initially-smaller rightmost solitary wave is accelerated, its height increases whereas the height of the trailing solitary wave decreases as it is deccelerated by its interaction with the solitary wave to its right. For large times, the solitary waves recover their initial amplitude and height but have exchanged positions so that the leftmost solitary wave, initially the taller, is now the short, wide solitary wave while the leading solitary wave has become the tall, narrow soliton. The only long-term consequence of their interaction is a phase shift. Remarkably, the KdV phase shift is given exactly by the lowest order theory of Gorshkov, Ostrovsky and Papko [31].
The solitary waves of the fifth order Korteweg-deVries equation do not decay monotonically but oscillate as illustrated in Figure 1. This implies that the potential energy for soliton-soliton interactions is also spatially oscillatory with an infinite number of increasingly shallow potential "wells" (local minima). One solitary wave can be trapped at the minimum of the potential well created by the other solitary wave. The result is that the two solitary waves become a single coherent structure traveling at a common phase speed. Such two-peak solitary waves are usually called "bions". Boyd has given an extensive study of FKdV bions [17,16]. The Gorshkov-Ostrovsky perturbation theory gives the exact phase shifts for the collision of two nearly equal solitary waves of the classic KdV equation. The theory is not exact for the fifth order KdV equation. Nevertheless, the predictions of the theory give an excellent first guess to initialize Newton's iteration to compute FKdV bions [17].  Gorshkov, Ostrovskii, and Papko [31,34,32,33] and Kolossovski, Champneys, Buryak and Sammut (2002) [46] have developed the method further. Long ago, Grimshaw and Malomed predicted (bions) [bound states] of weakly nonlocal solitary waves [18] for the generalized FKdV equation that includes a third derivative term [35]. This was confirmed numerically by Boyd [17], who showed the intimate relationship between nearly-singular modes of the linerized wave equation and the convergence of Newton's iteration to a bound state of solitary waves. 4. Perturbation theory in an artificial homotopy parameter.

4.1.
Homotopy in an artificial parameter. When the soliton for the ordinary Benjamin-Ono equation [1,2] is used as the first guess for the solution to the cubic Benjamin-Ono equation, Newton's iteration for the spectrally-discretized equation fails. This is an extremely common difficulty arising in many other problems both within and without the field of nonlinear waves. A very general remedy to solve N(u) = 0 is to inflate the problem to a more complicated with one additional paramete δ that interpolates between the target at δ = 1 and a problem with a known solution, Q(u 0 ) = 0, at δ = 0. The inflated equation is This inflation trick is called a "homotopy" method by numerical analysts. (This particular scheme is usually labeled the "Newton homotopy".) It can be implemented purely numerically. The known solution for δ = 0 is a successful initialization for Newton's iteration for δ = δ 1 for sufficiently small δ 1 . Once a solution has been obtained for δ = δ 1 , it provides a first guess for δ = 2δ 1 . The computation can thus proceed in small steps in δ to δ = 1. This stepwise numerical march is the "continuation" method described in [49,5,24].

4.2.
Solitons of the generalized Korteweg-deVries equation by means of homotopy perturbation theory. The cubic KdV, also called the "Modified KdV" (MKdV) in the literature, can be connected to the usual KdV through the artificial parameter homotopy This is a useful illustration because the cubic KdV soliton is known in explicit form and all orders of the perturbation series can be expressed as polynomials.
The initial and final points on δ ∈ [0, 1] are A valuable trick is to make the change of coordinate This allows one to convert the solution at each perturbative order into a finite polynomial in z. This technique is illustrated in detail with a table of helpful identities and an application to shocks of the KdV-Burgers equation in Appendix B of [18]. Useful auxiliary formulas needed here include In z, the KdV homotopy ODE becomes with the limits into the ODE, matching powers of δ, and solving the linear inhomogeneous problem at each order by matching powers of z gives A symbolic manipulation language such as Reduce or Mathematica is very good at manipulating polynomials; the complete Maple code is listed in Table 3 of [26].
Unfortunately, the perturbation series converges too poorly to be useful at δ = 1. However, if we form Padé approximants from the power series, the sixth order approximation -a ratio of one cubic polynomial in δ divided by another of the same degree -has a maximum pointwise error at δ = 1 (the cubic-KdV soliton) of only 0.0009. The [3/3] (62) where z = tanh(X/2) as in (56). Of course, this rational function of hyperbolic functions is much more complicated than the exact Modified KdV soliton (55), but the success of the perturbation theory here encourages applications to other examples where the higher order solution is unknown.

4.3.
Perturbative extrapolation in δ for the cubic Benjamin-Ono equation. For traveling waves propagating steadily at a speed c, define X ≡ x − ct as the spatial coordinate in a frame of reference that moves at the speed of the wave. The generalized Benjamin-Ono equation becomes the ordinary integro-differential equation (OIDE) subject to the boundary condition u → 0 as |X| → ∞. The quadratically nonlinear (m = 1) case was independently derived by Benjamin [11] and Davis and Acrivos [28] as an approximation to weakly nonlinear water waves in deep water. Bona and Kalisch [14] have numerically studied the quartic nonlinearity (m = 3). Boyd and Xu studied the solitary waves for general integer m [26].
Unfortunately, when the soliton for the ordinary Benjamin-Ono equation is used as the first guess for the solution to the cubic Benjamin-Ono equation, Newton's iteration fails as noted earlier. An artificial parameter homotopy linking the known soliton of the quadratic Benjamin-Ono equation with the unknown soliton of the cubic BO equation is For this problem, the numerical-continuation-in-small-δ-steps was a failure with steps of 1/6 in δ, but ten steps of 1/10 was successful.
In theory, the continuation method will always succeed for a sufficiently small step size in δ, but continuation often fails in the real world as explained in [22,21] or succeeds only with multiple precision arithmetic and other heroic measures [5]. Boyd and Zhu therefore experimented with perturbative continuation [26].
Matching powers in δ yields at O(δ n ) a linear integro-differential equation of the form H[u n,XX ] − u ,nX + u 0 u n,X + u n u 0,X = r n (X) where r 1 = − u 2 0 u 0,X − u 0 u 0,X r 2 = − u 2 0 u 1,X + 2u 0 u 1 u 0,X − u 0 u 1,X − u 1 u 0,X + u 1 u 1,X and so on. The linear equations at each order cannot be solved explicitly, but the rational basis pseudospectral method is just as effective as for the full nonlinear problem [25,26]. Indeed, the perturbation series can be regarded as a quasi-Newton method in which the linearization is not performed about the current iterate, but always about δ = 0; this is called the "chord" or "Shamanski" method [50,24]. Fig. 4 is a "seesaw diagram" that graphically calculates the radius of convergence δ c of a series. For δ < δ c , the error norm falls with increasing order and also with increasing |δ − δ c |. In contrast, the error grows with order and |δ − δ c | for δ > δ c . There is little change with order for δ ≈ δ c . The graphs of the errors (or here, error norms) resemble snapshots of a seesaw, hence the name of this species of plot. Here, the "pivot point" is at roughly δ c ≈ 1/2. This is discouraging, however, since the cubic Benjamin-Ono soliton is at δ = 1.
In contrast the Padé approximations formed from the series converge at δ = 1 as shown in Fig. 5. At δ = 1 -the cubic Benjamin-Ono soliton -the [3/3] Padé approximant has an error of only about 0.03. This is more than sufficient so that Newton's iteration for the pseudospectral discretization of the nonlinear integro-differential equation will converge to the cubic BO soliton from the Padéperturbation-series initialization.
Although this is not an important consideration for a one-space-dimensional soliton, the perturbative extrapolation is much cheaper than numerical continuation in floating point operations because only a single matrix need be computed and Cholesky-factored. Numerical continuation requires roughly n s n i such matrix computations where n s is the number of continuation steps and n i is the number of Newton iterations at each step. Fig. 6 compares the cubic Benjamin-Ono soliton of unit phase speed (solid) with its counterpart for the classical Benjamin-Ono equation. The cubic-BO soliton decays from its maximum of 3.282 at the origin to half its maximum at |X| = 0.414. The decay for larger |X| is much slower, asymptotically proportional to roughly 4.15/X 2 . (That the asymptotic decay is proportional to 1/X 2 was rigorously proved in [26].)

5.
Solitary waves when the exponent of nonlinearity is large.

Overview. Our target is solitary waves for wave equations of the general form
Lu where L is a linear operator and m > 0 is an integer. In the limit that m 1, the nonlineari term becomes more and more narrow relative to u(X) itself. This makes it possible to carry out a perturbative analysis in which the small parameter is The real axis in X divides into an outer region where the linear terms are dominant and an inner region of width O(1/m) where nonlinearity cannot be neglected. It is necessary to analyze each region separately.

Outer region.
Let G denote the usual Green's function for the linear part of the problem, that is, G is the solution which decays as |X| → ∞ to where δ(X) is the usual Dirac delta function. When m 1, the (m + 1)-st power of a function is very tall and narrow, rather like the Dirac delta function. It is therefore useful to write the full nonlinear problem as where f (X) ≡ −u m+1 (X)/(m + 1).  Figure 5. Padé approximations formed from the perturbation series in δ for the quadratically-to-cubically-nonlinear Benjamin-Ono homotopy.
For large nonlinearity exponent m, the function f (X), which is proportional to u m+1 (X), will be very narrow compared to the scale of u(X) itself, and has negligible amplitude outside small |X|. By making the replacement f (X) → (1/ε)f (X/ε) where the parameter ε << 1, we can build this narrowness into the inhomogeneous term of the linear OIDE (84). For a symmetric f (X), the convolution solution can be written When the forcing is narrow, by changing the integration variable to ζ ≡ Z/ε. When ε << 1, where the symmetry of f (ζ) implies that ∞ −∞ ζ f (ζ) dζ = 0. This approximation is not uniformly valid in space because G(X) is logarithmically singular at X = 0 whereas u(X; ε) is not singular. Outside of a small neighborhood of the origin, however, the approximation u(X; ε) ∼ F (0) G(X) is accurate with an error O(ε 2 ).
which is just a constant times the Green's function. Unlike the Generalized Benjamin-Ono equation described in the next subsection, the leading order asymptotic approximation and the Green's function are proportional to one another for all |X|, not merely for |X| large.

5.3.
Benjamin-Ono equation with high order nonlinearity. It is straightforward to compute Benjamin-Ono solitons of higher nonlinearity exponent m as shown in Fig. 8. As m increases, the solitary wave of unit phase speed become narrower, and the region of strong curvature around the peak narrows as O(1/m).
To integrate the integro-differential equation H(u XX ) − u X = − u m u X with respect to X, it is convenient to recall that the X-derivative commutes with the Hilbert transform. The OIDE becomes where the constant of integration is zero because u vanishes for large |X|. This is of the form of the linear inhomogeneous equation where f (X) is symmetric with respect to the origin. It turns out that one can learn much about the nonlinear problem by studying this linear equation, leaving f (X) unspecified for the moment.
The following is proved in Boyd and Zhu [26] which also provides additional properties of the Green's function that are not repeated here.
where f (X) is symmetric with respect to the origin, that is, f (X) = f (−X) ∀X. Then 1.
is the Fourier transform of f . 2.

Define the Green's function G(X) as the solution to
where δ(X) is the Dirac delta function. Then Even before the inner region is analyzed, the outer approximation is useful. Boyd and Xu prove that the solitons of the Generalized Benjamin-Ono equation have the large X asymptotic approximation given by the following.
Other wave equations are treated, again with Green's function reasoning, in Theorem 5.1 of Chen and Bona [27].
For moderate and small |X|, the Green's function approximation is much superior to the O(1/X 2 ) prediction of the theorem, (93). Figs. 4 and 5 of [26] are a visual proof for the Cubic and Quintic Benjamin-Ono equations. Fig. 9 shows the outer region for the solitary wave of the Octic Benjamin-Ono equation for unit phase speed (c = 1) and its two outer approximations. The Green's function and the inverse square approximation merge for large X, but the Green's function is much better for small and moderate X, being visually identical with the solitary wave.
5.4. Inner region. It would be nice to have an analytic approximation for that narrow region where the Green's function approximation fails and the nonlinear term is important. Alas, an explicit inner approximation is an important unsolved problem, for the Benjamin-Ono equation and for high-power nonlinearity in general. Still, much can be learned even without an explicit inner solution. Fig. 10 is a zoom plot that shows the inner region for Octic Benjamin-Ono equation. 6. Summary. We end with a few final remarks on perturbative theory: 1. Unconventional choices of perturbation parameter are often very successful. 2. Through unconventional , perturbation theory can be usefully applied to strongly nonlinear waves. 3. Unconventional perturbation theory sometimes fails to be useful. In the words of Sir Horace Lamb, "A traveler who refuses to pass over a bridge until he has personally tested the soundness of every part of it is not likely to go far; something must be risked, even in mathematics." 4. Series acceleration methods like Padé approximants are often crucial to success. 5. Perturbation theory often hits an Explicit Wall; numerical solutions, or equivalently, a symbolic/numeric (symbolurgical/arithmurgical) approach is necessary to penetrate further into the unknown.
After spending many hours checking paper-and-pencil calculations one long day in 1821, Charles Babbage said to his friend and fellow calculator, John Herschel, "I wish to God these calculations had been done by steam!" [53]. Babbage's efforts to realize "steam algebra" through brass, oil and coal were unsuccessful except to inspire future computer architects (and that only in hindsight [53]), but Babbage's dream has been realized. Computer algebra in systems like Maple and Mathematica has condensed months of chirugery ("handwork") into a few seconds on a laptop or  It is a truism in arithmurgy (literally "number-working") that advances in numerical computing have progressed through hardware advances and algorithmic invention in roughly equal proportion, decade after decade. The solution of the Poisson equation has benefited enormously from the evolution from the CDC 6600 of 1965 ( perhaps six-tenths of a megaflop in Fortran) to the petaflop computers of today, but multigrid has reduced an O(N 3 ) matrix problem to O(N ).
It is no different in perturbation theory. "Steam algebra" has enormously increased the range and possibilities of perturbation theory. As we have tried to show, unconventional perturbation parameters and series acceleration strategies have also and equally vastened the scope and possibilities of perturbation theory. through grant DMS1521158. I thank Pedro Jordan and Barbara Kaltenbacher for inviting me to contribute to this special issue of Evolution Equations and Control