The general recombination equation in continuous time and its solution

The process of recombination in population genetics, in its deterministic limit, leads to a nonlinear ODE in the Banach space of finite measures on a locally compact product space. It has an embedding into a larger family of nonlinear ODEs that permits a systematic analysis with lattice-theoretic methods for general partitions of finite sets. We discuss this type of system, reduce it to an equivalent finite-dimensional nonlinear problem, and establish a connection with an ancestral partitioning process, backward in time. We solve the finite-dimensional problem recursively for generic sets of parameters and briefly discuss the singular cases, and how to extend the solution to this situation.


Introduction
This contribution is concerned with differential equation models for the dynamics of the genetic composition of populations that evolve under recombination. Here, recombination is the genetic mechanism in which two parent individuals are involved in creating the mixed type of their offspring during sexual reproduction. The essence of this process is illustrated in Fig. 1 and may be idealised and summarised as follows.
Genetic information is encoded in terms of sequences of finite length. Eggs and sperm (i.e., female and male germ cells or gametes) each carry one such sequence. They go through the following life cycle: At fertilisation, two gametes meet randomly and unite, thus starting the life of a new individual, which is equipped with both the maternal and the paternal sequence. At maturity, this individual generates its own germ cells. This process may include recombination, that is, the maternal and paternal sequences perform one or more crossovers and are cut and relinked accordingly, so that two 'mixed' sequences emerge. These are the new gametes and start the next round of fertilisation (by random mating within a large population).
Models of this process aim at describing the dynamics of the genetic composition of a population that goes through this life cycle repeatedly. These models come in various flavours: in discrete or continous time; with various assumptions about the crossover pattern; and in a deterministic or a stochastic formulation, depending on whether or not the population is assumed to be so large that stochastic fluctuations may be neglected. We will employ the deterministic continuous-time approach here, but allow for very general crossover patterns.
From now on, we describe populations at the level of their gametes and thus identify gametes with individuals. Their genetic information is encoded in terms of a linear arrangement of sites, indexed by the set S := {1, . . . , n}. For each site i ∈ S, there is a set X i of 'letters' that may possibly occur at that site. For the sake of concreteness, we restrict ourselves to Each line symbolises a sequence of sites that defines a gamete (such as the two at the top that start the cycle as 'egg' and 'sperm'). The pool of gametes at the left and the right comes from a large population of recombining individuals. These sequences meet randomly to start the life of a new individual. Altogether, each of these sequences has been pieced together from two randomly chosen parental sequences.
finite sets X i for the moment; we generalise this to arbitrary locally compact spaces X i in Section 2.
A type is thus defined as a sequence x = (x 1 , . . . , x n ) ∈ X 1 × · · · × X n =: X, where X is called the type space. By construction, x i is the i-th component or coordinate of x, and we define x I := (x i ) i∈I as the collection of 'coordinates' with indices in I, where I is a subset of S. A population is identified with a probability vector p = p(x) x∈X on X, where p(x) denotes the proportion of individuals of type x in X. Note that we assume the sequences to have fixed length. Additional processes that may change this, such as copying blocks, are disregarded here; see [16] and references therein for possible extensions.
With Fig. 1 in mind, recombination may now be modelled as follows. A new ('offspring') sequence is formed as the 'mixture' of two randomly chosen parental sequences (say x and y) from the population: it copies the letters of x at some of its sites and those of y at all others. If, for example, a double crossover happens between sites i and i + 1 and between j and j + 1 (i < j), then the offspring sequence reads (x 1 , . . . , x i , y i+1 , . . . , y j , x j+1 , . . . , x n ). The offspring sequence replaces a randomly chosen sequence (possibly one of the parents, but this is negligibly rare in a large population). Viewed differently, the offspring sequence in our example reads x whenever the parents are of the form (x 1 , . . . , x i , * , . . . , * , x j+1 , . . . , x n ) and ( * , . . . , * , x i+1 , . . . , x j , * , . . . , * ). Here, a ' * ' at site i stands for an arbitrary element of X i , so means marginalisation. This will be helpful when formulating the differential equation.
The sites that come from the paternal and the maternal sequence, respectively, define a partition A of S into two parts. Due to the random choice of the parents, we need not keep track of which sequence was 'maternal' and which was 'paternal'. In principle, all partitions of S into two parts (A = {A 1 , A 2 }) can be realised, via a suitable number of crossovers at suitable positions. If no crossover happens, then the partition is A = {S}, and the offspring is an exact copy of the first parent. Reproduction with recombination according to a partition A happens at rate ̺(A).
We shall introduce all notions with more care later. For now, we turn the verbal description into a differential equation system and obtain (1) for all x ∈ X, where P 2 (S) denotes the set of partitions of S into two parts. Eq. (1) may be understood as a 'mass balance' equation: For every A ∈ P 2 (S), sequences of type x are 'produced' from the corresponding parental sequences at overall rate ̺(A) p t (x A 1 , * ) p t ( * , x A 2 ), where the product reflects the random combination; at the same time, sequences of type x are lost (i.e., replaced by new ones) at overall rate ̺(A) p t (x). Note that the case A = {S} provides no net contribution toṗ t (x), since gain and loss are equal in this case. The resulting ODE system appears difficult to handle, due to the large number of possible states and the nonlinearity of the right-hand side. In previous papers [6,5,4,19,3,7,8], we have concentrated on a special case, namely, the situation in which at most one crossover happens at any given time. That is, we restricted attention to ordered partitions into two parts, corresponding to the sites before and after a single crossover point. We have analysed the resulting models in continuous time (both deterministic and stochastic), as well as in discrete time. For the deterministic continuous-time system, a simple explicit solution is available [6,5]. This simplicity is due to some underlying linearity.
It is now time to tackle the case of general partitions (in continuous time). Therefore, in this contribution, we give up the single-crossover assumption -and even allow for an arbitrary number of parents in a given recombination event, which leads to partitions with more than two parts. Even though this is not a common biological feature, we will see that it requires little extra mathematical effort. Also, it is a very natural structure on the lattice of partitions of a (finite) set. This contribution is motivated by the pioneering work of Geiringer [14] and Bennett [10], who worked on a similar system in discrete time (but restricted to a special type space); and by more recent work of Dawson [12,13], who presented a solution in 2000 and 2002. It relies on a certain nonlinear transformation from (gamete or type) frequencies to suitable correlation functions, which decouple from each other and decay geometrically. If sequences of more than three sites are involved, this transformation must be constructed via recursions that involve the parameters of the recombination process.
Dawson's construction testifies to remarkable insight into the problem. However, it is not easy to penetrate to the mathematical core of his arguments. We therefore start at the very beginning and formulate the model on a fairly general type space, in a measure-theoretic framework, and allowing for arbitrary partitions. More importantly, we put the problem into a systematic lattice-theoretic setting; this will turn out as the key for the transparent construction of the solution.
The paper is organised as follows. After introducing the mathematical objects we need and some of their properties in Section 2, the general recombination equation is discussed in Section 3. As a first step, this is done in the setting of a measure-valued ordinary differential equation (ODE), which is then reduced to a finite-dimensional ODE system. Section 4 solves this system under a linearity assumption, which is motivated by previous work, but does not give the solution in sufficient generality.
As a further preparation for the general solution, we study the behaviour of the system under marginalisation in Section 5, which is then followed by the derivation of the general solution in Section 6, which is recursive in nature and applies to the generic choice of the recombination rates. More detailed properties of the solution are investigated in Section 7, while Section 8 deals with various types of non-generic cases. The Appendix provides some material for the treatment of degenerate cases. This paper builds on previous work, most importantly on [6,5]. Some of the results from these papers will freely be used below, and not re-derived here (though we will always provide precise references).

Partitions, measures and recombinators
Let S be a finite set, and consider the lattice P = P(S) of partitions of S; see [1] for general background on lattice theory. When the cardinality of S is |S| = n, the set P contains B(n) elements, known as the Bell number ; compare [15, A000110]. With B(0) := 1, these numbers are recursively computed as B(n + 1) = n k=0 n k B(k) for n 0, with generating function F (z) := ∞ n=0 B(n) n! z n = exp(e z − 1) and explicit formula B(n) = 1 e ∞ k=0 k n k! . A subset of relevance to us, for the biological applications, consists of all partitions of S into two parts, P 2 (S) = A ∈ P(S) |A| = 2 , which contains 2 n−1 − 1 elements. Note that P 2 (S) generates the lattice P(S) in an obvious way.
Here, we write a partition of S as A = {A 1 , . . . , A r }, where r = |A| is the number of its parts, and one has A i ∩ A j = ∅ for all i = j together with A 1 ∪ · · · ∪ A r = S. The natural ordering relation is denoted by , where A B means that A is finer than B, or that B is coarser than A. The conditions A B and B A are synonymous, while A ≺ B means A B together with A = B.
The joint refinement of two partitions A and B is written as A ∧ B, and is the coarsest partition below A and B. The unique minimal partition within the lattice P is denoted as 0 = {{x} | x ∈ S}, while the unique maximal one is 1 = {S}. When A B, we also employ the interval notation [A, B] := {C | A C B}. For a general subset G of P, we write the complement as G c = P \ G. Finally, when G ⊂ P, the coarsest partition below all elements of G is denoted by G. Note that ∅ = 1 by convention.
When U and V are disjoint (finite) sets, two partitions A ∈ P(U ) and B ∈ P(V ) can be joined to form an element of P(U ∪ V ). We denote such a joining by A ⊔ B, and similarly for multiple joinings. Conversely, if U ⊂ S, a partition A ∈ P(S), with A = {A 1 , . . . , A r } say, defines a unique partition of U by restriction. The latter is denoted by A| U , and its parts are precisely all non-empty sets of the form A i ∩ U with 1 i r.
Let now S = {1, 2, . . . , n} and define X = X 1 ×· · ·×X n , where each X i is a locally compact space. In many concrete applications, the X i will be finite sets, but we do not make such a restriction as it is neither necessary nor desirable. In particular, there are situations in quantitative genetics [11] that will profit from the more general setting we employ here.
When S and X are given, we denote the natural projection to the ith component by π i , so π i (X) = X i . Similarly, for an arbitrary non-empty subset U ⊂ S, we use the notation π U : X −→ X U := × i∈U X i for the projection to the subspace X U .
Let M(X) denote the space of finite, regular Borel measures on X, equipped with the usual variation norm . , which makes it into a Banach space. Also, we need the closed subset (or cone) M + (X) of positive measures, which we mean to include the zero measure. Within M + (X), we denote the closed subset of probability measures by P(X). Note that M + (X) and P(X) are convex sets. The restriction of a measure µ ∈ M(X) to a subspace X U is written as π U .µ := µ • π −1 U , which is consistent with marginalisation of measures. For any Borel set A ⊂ X U , one thus has the relation π U .µ (A) = µ π −1 U (A) . Given a measure ν ∈ M(X) and a partition A = {A 1 , . . . , A r } ∈ P, we define the mapping where the symbol :µ⊗ν : means 'site ordering' and is used to indicate that the product measure refers to the ordering of the sites as specified by the set S. We shall also use site ordering for cylinder or product sets. When the meaning is clear, we shall drop these extra symbols. We call a mapping of type R A a recombinator. Note that recombinators are nonlinear whenever A = 1. Proposition 1. Let S = {1, 2, . . . , n} and X = X 1 × · · · × X n with all X i locally compact. Now, let A ∈ P(S) be arbitrary, and consider the corresponding recombinator as defined by Eq. (2). Then, the following assertions are true.
(1) R A is positive homogeneous of degree 1, which means that R A (aν) = aR A (ν) holds for all ν ∈ M(X) and a 0.
ν holds for all ν ∈ M(X). (4) R A maps M + (X) into itself. It remains to prove (2). Let A = {A 1 , . . . , A r } be a partition with r parts, where 1 r n. Then, one can show inductively in r that for arbitrary µ, ν ∈ M(X). If one of them is the zero measure, the Lipschitz estimate is a consequence of claim (3). Let now µ, ν ∈ M(X) both be non-zero, and assume that µ ν . Using Eq. (3) together with positive homogeneity of R A , we have where we used that µ/ µ and ν/ ν are measures of norm 1. Next, one has which, on inserting above, gives the inequality from which our claim follows, with L 2r + 1.

Remark 1.
Let us mention that there is an alternative way to see the Lipschitz property, at least for finite X. This is because the dynamics may then be reformulated in terms of a chemical reaction system. This is a large class of models, for which a substantial body of theory is available; see [17] for a review. In particular, the Lipschitz property applies under very general conditions, which are satisfied in our case. ♦ The estimate used above is rather coarse, but suffices for our needs here. When using the invariance of P(X), one simply needs Eq. (3) from the last proof to establish the following consequence.
Before we embark on the recombination equation and its solution, we need to establish one technical result on the relation between recombinators and projectors to subsystems, as defined by non-empty sets U ⊂ S. Here, the system on which a recombinator acts is marked by an upper index S (for the full system) or U (for the subsystem). Later, we will drop this index whenever the meaning is unambiguous. Lemma 1. Let S be a finite set as above, and A = {A 1 , . . . , A r } ∈ P(S) an arbitrary partition. If U ⊂ S is non-empty and ω ∈ M + (X), one has where A| U ∈ P(U ) and the upper index of a recombinator indicates on which measure space it acts.
Proof. We shall show the claimed identity by verifying it for certain 'rectangular' measurable sets that suffice for the equality of the two measures as Baire measures, and then rely on the unique extension of Baire measures to regular Borel measures; compare the discussion around [6,Fact 1]. To this end, let A ∈ P(S) be given as A = {A 1 , . . . , A r }, and write A| U ∈ P(U ) as A| U = {V 1 , . . . , V s }, where s r. Each V i is contained in precisely one part of A. Without loss of generality, we may thus assume that V i = A i ∩ U for 1 i s, while A j ∩ U = ∅ for all j > s (when r = s, this case does not occur). Now, let E i be a Borel set in X V i = π V i (X), and consider the corresponding 'rectangular' set E = : E 1 × · · · × E s : ⊂ X U , where : · · · : once again means proper site ordering. When evaluated on E, the right-hand side of our claim gives where we have used that π U .ω = ω and where π (U ) denotes a projector that is only defined on the subspace X U . This is now to be compared with the evaluation of the left hand side of the claim, which gives (with U c = S \ U ) which agrees with our previous expression and thus proves the claim. Lemma 1 has some interesting consequences for the structure of recombinators. We only state the result and omit the proof, as it is analogous to the corresponding result in [6].
Proposition 2. Let A, B ∈ P(S). On M + (X), the corresponding recombinators satisfy In particular, each recombinator is an idempotent and any two recombinators commute.
Below, we shall also need probability vectors on P = P(S), which form the compact space P(P). An interesting subclass of them can be constructed as follows. Let f : P −→ [0, 1] be a function on the lattice P, and consider for A ∈ P, which clearly satisfies c(A) 0. Moreover, one has wherefore c ∈ P(P(S)), which means that c is a probability vector on the lattice P(S).
At this point, we have gathered the core material to embark on the discussion of the recombination equation and its properties.

The general recombination equation
With the tools at hand, it is now rather obvious how to generalise the 'mass balance' equation (1) of the Introduction to our measure-theoretic setting. Within the Banach space (M(X), . ), we thus consider the nonlinear ODE with non-negative numbers ̺(A) that have the meaning of recombination rates in our context. They are written in this way because we consider ̺ : P(S) −→ R as an element of the Möbius algebra over R; see [2,18] for background. We will usually assume that an initial condition ω 0 ∈ M(X) for t = 0 is given for the ODE (6), and then speak of the corresponding Cauchy problem (or initial value problem).

Remark 2.
Since R 1 = ½, the value of ̺(1) is immaterial, and the corresponding term could clearly be omitted on the right-hand side of Eq. (6). We nevertheless keep it here, as it will become useful in connection with the reduction to subsystems.
Another way to write the ODE iṡ but we must keep in mind that Φ is a nonlinear operator. Nevertheless, one has the following basic result; compare [2] for background on ODEs on Banach spaces.
Proposition 3. Let S be a finite set and X the corresponding locally compact product space as introduced above. Then, the Cauchy problem of Eq. (6) with initial condition ω 0 ∈ M(X) has a unique solution. Moreover, the cone M + (X) is forward invariant, and the flow is norm-preserving on M + (X). In particular, P(X) is forward invariant under the flow.
Proof. By part (2) of Proposition 1, all R A are globally Lipschitz on M(X), which is then also true of Φ. Therefore, the uniqueness statement for the Cauchy problem is clear. Next, let ν ∈ M + (X) and consider an arbitrary Borel set E ⊂ X with ν(E) = 0. Then, with ̺ tot = A∈P(S) ̺(A) as above and Remark 2, we have which implies the preservation of the norm of a positive measure under the forward flow. The last claim is then obvious.
In view of our underlying biological problem, we restrict ourselves to the investigation of the recombination equation on the cone M + (X), and on P(X) in particular. To proceed, we will first bring the ODE (6) into a simpler form. To this end, observe the structure of Φ, which suggests that, as the flow proceeds in forward time from an initial measure ν ∈ P(X), the solution picks up fractions of components of the form R B (ν) for various or even all B ∈ P(S), depending on which recombination rates are positive and which partitions can thus be reached in the course of time.
Let |S| = n and fix some ν ∈ M + (X). Now, define the (finite-dimensional) set which is the closed convex set that consists of all convex linear combinations of the measures R C (ν) with C ∈ P(S). The dimension of ∆ ν depends on the nature of ν, and can even be 0. When the dimension is maximal (meaning B(n) − 1, which is the generic case), ∆ ν is a simplex and the B(n) measures R C (ν) are the extremal measures of the simplex in the sense of convex analysis. Observe that holds for any B ∈ P(S), which follows from a straightforward though slightly technical calculation (details of which are given below). This now suggests the ansatz for the solution of our above Cauchy problem, with (generic) initial condition ω 0 and coefficient functions a t , the latter thus with a 0 (C) = δ C,1 . For each t, we view a t : P(S) −→ R as an element of the Möbius algebra. We shall prove a posteriori that the strategy of Eq. (7) works, and that the orginal Banach space ODE is thus reduced to a finite-dimensional system of ODEs, whose solution consists of a one-parameter family of probability vectors on P(S). The case of non-generic initial conditions will be discussed afterwards.
To proceed, we now have to calculate the action of Φ on a general measure ω ∈ ∆ ν . It is convenient to first consider probability measures, as the extension to general positive measures is then immediate via the positive homogeneity of the recombinators. So, let ν ∈ P(X), which means ∆ ν ⊂ P(X). Consider a single partition B = {B 1 , . . . , B r } and observe that where we have used Lemma 1 in the last step. Observing that the product measure in the last expression is a measure of the form R A (ν) for some A ∈ P(S) with A B, and that each such A must occur here, we see that where we use a dot under a symbol to mark it as the summation variable, thus following the notation of [1]. Next, if α 0, Proposition 1 implies that are two equivalent ways to write ω ′ = αω. Consequently, we extend our previous formula as where we define the function γ as for any A, B ∈ P(S) with A B, and as γ(q; A, B) = 0 otherwise. In Eq. (10), q may be any real vector of dimension B(n), even though we have only considered positive ones above. Also, one has γ(0; A, B) = 0 for consistency. Remark 3. When q is a probability vector on P(S), one has q 1 = q(P(S)) = 1. The right-hand side of Eq. (9) can then be read as a product of marginalised probabilities for the subsystems defined by the parts B i of the partition B. It is this structure that will later pave the way to a recursive solution of the recombination equation. ♦ For fixed q, the function γ(q; ·, ·) is an element of the incidence algebra of our lattice P(S) over the field R. Following [2], we denote this algebra by A(P(S)), with convolution as multiplication. The latter is defined as Note that this definition automatically gives (α * β)(A, B) = 0 if A B, as it must. Important elements of A(P(S)) include the unit δ, defined by δ(A, B) = δ A,B , the zeta function ζ, defined by ζ(A, B) = 1 for A B and ζ(A, B) = 0 otherwise, and the Möbius function µ, which is the multiplicative (left and right) inverse of ζ. One then has ζ * µ = µ * ζ = δ. We refer to [18] for details and more advanced aspects of incidence algebras.
With ̺ tot = A∈P(S) ̺(A) and ω 0 as above, Eq. (9) allows us to continue as follows, which is the desired action of Φ on elements of M + (X). Inserting Eq. (7) into the recombination equation (6) now gives the following result.
Lemma 2. The ansatz (7) for the solution ω t of the recombination equation (6) on M + (X) leads to a system of induced ODEs for the coefficient functions a t , namely for A ∈ P(S), where γ is defined as in Eq. (10).
Proof. Clearly, Eq. (11) is the result of a comparison of coefficients. For a generic ν ∈ M + (X), this is justified by the fact that the measures R A (ν) with A ∈ P(S) then are the extremal measures of the forward-invariant simplex ∆ ν . Since the set of generic ν is dense in M + (X), and the solution of the Cauchy depends continuously on the initial condition, the extension to all ν ∈ M + (X) is consistent.
Note that our above calculation was based upon the action of recombinators on positive measures, because we have used Lemma 1. Consequently, we do not know how Eqs. (6) and (11) are related beyond this case. Fortunately, this is not required, as the next result shows, where we use Proposition 4. The ODE system defined by Eq. (11), with γ as in Eq. (10), is of dimension B(n) = |P(S)|, and has a unique solution for its Cauchy problem.
The closed cone M + (P(S)) ≃ R is invariant in forward time, and the · 1 -norm of non-negative initial conditions is preserved. In particular, the simplex P(P(S)) of probability vectors on P(S) is invariant in forward time.
Proof. The ODE system of Eq. (11) can be written asȧ t = Ψ (a t ), where Ψ is nonlinear. The first claim is clear, while solution uniqueness once again follows from Lipschitz continuity of Ψ , which is obvious here.
Let q ∈ M + (P(S)) = R be arbitrary and consider any subset G ⊂ P(S) such that q(G) := A∈G q(A) = 0. Then, one has Now, the last sum (for any fixed B) is as follows from Eq. (9) by taking norms on both sides and observing that R A (ν) = ν = 1 for all A B as well as R B (ω) = 1, while one has γ(q; A, B) 0 for any q ∈ M + (P(S)); alternatively, one can also verify this with a direct calculation on the basis of the definition of γ. Consequently, Ψ (q) (P(S)) = 0, which implies the claimed norm preservation. The positive invariance of P(P(S)) is then clear.

Remark 4.
Observing that γ(q; A, 1) = q(A) holds for any q ∈ R P(S) , one can rewrite the ODE of Eq. (6) asȧ which corresponds to the observation made in Remark 2. ♦ With Proposition 4, we may now conclude that our ansatz (11) is consistent and suitable for the reduction of the original problem to a finite-dimensional one.
Proof. The solution property is clear from our above calculations, while the correspondence of the initial conditions is obvious for the generic case, and extends to the general case by a standard continuity argument. The claim now follows from the uniqueness statements for the two Cauchy problems.
Remark 5. Let us mention that an alternative path to Theorem 1 is possible via [2, Thm. 16.5 and Rem. 16.6], by showing that the convex set ∆ ν is forward invariant for any ν ∈ M + (X) as initial condition. This requires the verification of the 'inside reflection property' for any piece of the boundary of ∆ ν , which is somewhat tedious in view of the possible degeneracies. This is the reason why we opted for our approach above. ♦ Let us briefly discuss the structure of ∆ ν for a general ν ∈ M + (X), including the nongeneric cases. Each ν gives rise to a set of partitions which is non-empty (since 1 ∈ H ν for all ν) and defines a unique partition The partition U ν also defines the sublattice P ν := P(U 1 ) × . . . × P(U r ) of product form, with B(|U 1 |) · . . . · B(|U r |) elements. It is precisely this sublattice of P(S) that determines the structure of the convex set ∆ ν , which is now the Cartesian product of r simplices, and of total dimension B(|U 1 |) − 1 + . . . + B(|U r |) − 1 .
It is clear that the time evolution of a measure ν ∈ M + (X) under the flow of the recombination equation (6) can thus be reduced to a smaller ODE system, which is fully consistent with our above treatment as a consequence of Eq. (8), as Φ(ν) = Φ R Uν (ν) then means that effectively only partitions from the set P ν are involved. We leave it to the reader to spell out the details for the modified correspondence and the appropriate initial conditions for the finite-dimensional ODE system.
We are now in the position to approach a solution of the recombination equation.

Solution under a linearity assumption
The relatively simple solution structure in the case of single crossover dynamics (see references [6,5]) was due to the fact that the nonlinear recombinators acted linearly along solutions. Since this is a special case of our general model, via setting ̺(A) = 0 for any A that fails to be an ordered partition with (at most) two parts, it is reasonable to consider this point of view also more generally. It will turn out that the linearity assumption is false in general, but we can still learn some interesting things along the way.
Thus, let us assume that also the more general recombinators act linearly along the solution. A simple calculation shows that our coefficient functions then have to satisfy the ODEs for all A ∈ P(S), where, for any fixed q ∈ M(P(S)), is another element of the incidence algebra A(P(S)). The difference of Eq. (12) to the general equation (11) thus lies in the replacement of the function γ by the significantly simpler linear function β.
Proposition 5. The Cauchy problem defined by Eq. (12) together with the initial condition a lin 0 (A) = δ(A, 1) has the unique solution given by which, for t 0, constitutes a one-parameter family of probability vectors on P(S).
Proof. Consider the (upper) summatory function F a lin t (A) := B A a lin t (B), which is and hence satisfies the simple ODE together with the initial condition F a lin 0 (A) = 1 for all A ∈ P(S). where µ ∈ A(P(S)) denotes the Möbius function for the lattice P(S).
The fact that a t , for each t 0, is a probability vector on P(S) follows from our earlier calculation in Eq. (4), with f (x) = e x .
This approach clearly has a lattice-theoretic basis, which lends itself to a number of interesting further aspects and insights [9]. Remark 6. Let us note that Eqs. (13) and (14) allow a re-interpretation of the coefficient formula from Proposition 5 as where it is assumed that the total recombination rate ̺ tot is known. This detail corresponds to the fact that ̺(1) does not contribute to any of the χ(A). Either version of a lin t permits the determination of the asymptotic properties of the coefficients a lin t as t → ∞, and hence that of the measure ω t . In particular, when the partition subset G := {A ∈ P(S) | ̺(A) > 0} satisfies G = 0, one has lim t→∞ ω t = R 0 (ω 0 ), with convergence in the . -topology. This is the obvious generalisation of the known asymptotic properties in the special cases treated in [11,6]. Proof. Let a probability vector q on P(S) be given. The definitions of β and γ imply, via a simple calculation, that γ(q; A, 1) = q(A) = β(q; A, 1) holds for any A ∈ P(S). This gives the claim for A = 1, because Eqs. (11) and (12) are equal in this case, hence a t (1) = a lin t (1). More generally, when A = {i}, S \ {i} for some i ∈ S, one finds that γ(q; A, A) = q(1) + q(A) = β(q; A, A), so that the ODEs from Eqs. (11) and (12) coincide also for such partitions A, which proves the second claim.
Corollary 2. The coefficient formula from Proposition 5 gives the correct solution of Eq. (11) for all A ∈ P(S) whenever S is a finite set with 1 |S| 3.
Proof. There is nothing to prove for |S| = 1. The claim for |S| = 2 is obvious, and also follows from [5,Prop. 3]. When |S| = 3, all partitions except A = 0 satisfy the conditions of Theorem 2. Since a t (0) = 1 − B =0 a t (B), the claim follows.

Remark 8.
In the special situation of single crossover recombination, where ̺(A) > 0 only for ordered partitions A into two parts, the solution formula of Proposition 5 reduces to the known solution for this case from [6,5]. In particular, the linearity assumption is satisfied, and the solution holds for all system sizes and all values of the single crossover rates. ♦ Note, however, that already for |S| = 4, when we are beyond the single crossover case, the coefficients a t and a lin t can differ, for instance for A = {{1, 2}, {3, 4}}, which is a biologically relevant partition. We thus need to proceed without the linearisation assumption.

General case: Marginalisation consistency
It is clear that our general recombination equation can only be considered a reasonable model if it is marginalisation consistent. By this we mean that the restriction to a subsystem, via appropriate marginalisation, gives a solution of the recombination equation for the subsystem. We now discuss this in more detail, and then establish this consistency property for our model, both in the measure-theoretic and in the finite-dimensional version. The latter case will depend on an interesting interplay between elements of the Möbius and the incidence algebras at hand.
Let S be as above, or any other finite set with n elements, and consider a subsystem as specified by ∅ = U ⊂ S. When ω S t is the solution of the general recombination equation (6) with initial condition ω S 0 ∈ M + (X) according to Proposition 3, it is natural to define as the corresponding (marginalised) measure for the subsystem defined by U . Then, recalling that the projector π U is linear, we get where the fourth step is an application of Lemma 1, while the last step anticipates the definition of the induced recombination rates for the subsystem as for any A ∈ P(U ). It is obvious that non-negativity of the rates ̺ S implies that of the induced rates ̺ U . Our little calculation proves the following result.
Proposition 6. Let S be a finite set and U ⊂ S a non-empty subset. If ω S t is a solution of the recombination equation (6), with recombination rates ̺ S (B) 0 for B ∈ P(S) and with initial condition ω S 0 ∈ M + (X), the marginalised measure ω U t from Eq. (16) solves the recombination equation for the subsystem defined by U, provided the recombination rates ̺ U (A) for A ∈ P(U ) are defined according to Eq. (17).
Let us see how this translates to the finite-dimensional ODE systems on the level of the coefficient functions a U t (A). Given a general probability vector q = q S on P(S), we define the corresponding marginal probabilities via for any partition A ∈ P(U ), in complete analogy to Eq. (17). Clearly, one has q U (A) 0, and the proper normalisation follows from where the penultimate step follows from the observation that the inner sum over C consists of precisely one term because the restriction of a partition to a subset U ⊂ S is unique.
B A q U (B) can also be calculated by marginalisation, namely as wherefore consistency at this level is obvious. ♦ Let us return to the recombination rates ̺ S (B) for B ∈ P(S) together with compare Remark 2, and consider the induced rates ̺ U (A) with A ∈ P(U ) and ∅ = U ⊂ S.
Repeating the above calculation of the normalisation condition, one finds (19) ̺ U tot = ̺ S tot = ̺ tot , as it should be. The total recombination rate is thus the same on all levels, and independent of U . Note that, in this process, ̺ U ({U }) = ̺ U (1) need not vanish. This does not matter because the corresponding recombinator (on the subsystem) is the identity, and hence does not affect the solution; compare Remark 2.
Let us now see how the result of Proposition 6 translates to properties of the coefficients for the ODE system and its subsystems. Here, the desired marginalisation consistency will depend on the following slightly technical, but somewhat surprising identity. Lemma 3. Let U ⊂ S be as before, and let A, B ∈ P(U ) with A B be arbitrary, but fixed. Then, for any D ∈ P(S) with D| U = B, one has P(S)∋ .
where q S is any probability vector on P(S) and q U its marginalisation according to Eq. (18).
Proof. Let A, B ∈ P(U ) with A B be given, and assume B = {B 1 , . . . , B r }. With our definition of the γ-function, the right-hand side evaluates as For an arbitrary D ∈ P(S) with D| U = B, this is now to be compared with (20) To this end, we write D = {D 1 , . . . , D r , D ′ 1 , . . . , D ′ s } with r = |B| and D i ∩ U = B i for 1 i r (which is without loss of generality) and D ′ j ∩ U = ∅ for 1 j s. Note that s = 0 is possible, in which case no D ′ j is present. Now, any partition C ∈ P(S) in the summation in Eq. (20) must be a joining of the form , subject to the additional condition that we always have C| U = A, which now means C| B i = A| B i for all 1 i r. The outer summation in Eq. (20) can now be broken into smaller sums that can be absorbed into the factors, which yields that the right-hand side of Eq. (20) is a product of two terms, namely Now, the double sum in each factor of the second product is clearly equivalent to summing over all E ∈ P(S), so that each factor is 1 since q S is a probability vector on P(S). Consequently, the entire second term is 1, which is also true if s = 0 (in which case we have the empty product here). Likewise, the double sum in the ith factor of the first product is equivalent to summing over all E ∈ P(S) subject to the condition that by our previous calculation, which proves the lemma.
The marginalisation consistency can now be stated as follows.
Proposition 7. Let S be a finite set and U ⊂ S a non-empty subset. If the family of probability vectors {a S t | t 0} is a solution of the Cauchy problem of Eq. (11) with initial condition a S 0 (B) = δ S (B, 1), the marginalised family {a U t | t ≥ 0}, with a U t (A) defined according to Eq. (18) for all A ∈ P(U ), solves the corresponding Cauchy problem for the subsystem, with initial condition a U 0 (A) = δ U (A, 1) and the induced recombination rates of Eq. (17). The analogous statement remains true for a general probability vector a S 0 as initial condition, with the marginalised initial condition a U 0 according to Eq. (18) on the subsystem.
Proof. Let {a S t | t ≥ 0} be a solution of Eq. (11), and let A ∈ P(U ) be fixed. Then, we havė where we have used Lemma 3 in the second last step. This made the last term independent of D, which in turn allowed the last step on the basis of Eq. (17). This shows that the marginalised family indeed satisfies the proper ODE for the subsystem, as it must. It is an easy exercise that δ U (A, 1) is the initial condition for the subsystem that emerges as the marginalisation of the original initial condition δ S (B, 1). Since A ∈ P(U ) was arbitrary, the main statement is proved.
The last claim is an obvious generalisation.
Note that Proposition 7 can also be viewed as a consequence of Proposition 6 and Theorem 1. We have nevertheless opted for an explicit verification because several steps of the above proof will reappear when we now proceed to a solution of the recombination equation.

Generic case: Recursive solution
Motivated by the Möbius formula in Eq. (15) and by a reminiscent structure in the solution of the discrete recombination equation, compare [3], we now restrict Eq. (11) to the forwardinvariant simplex P(P(S)), compare Proposition 4, and make the ansatz for any non-empty U ⊂ S, with decay rates ψ U and coefficient functions θ U . Our aim is to determine θ U by a comparison of coefficients once we know a (recursive) formula for the rate function ψ U together with linear independence of the exponentials in Eq. (21). Note that this will then also solve the ODE (11) with an initial condition from M + (P(S)) via multiplying the right-hand side of Eq. (21) by the norm of the initial condition. From our results of Section 4, we know that In particular, this gives ψ U (A) = 0 for any U with |U | = 1, which is the trivial limiting case with an empty sum on the right-hand side. All other values are now defined recursively by which is motivated by the known structure in the discrete analogue of our model [19]. For any U with |U | = 2, this results in ψ U (A) = δ U (A, 1), while one finds ψ U (0) = 0 for all non-empty U ⊂ S. This is consistent with the fact that no further partitioning (or 'decay') is possible when starting from the partition 0 of U . The decay rates defined this way have important summation properties as follows.
Lemma 4. Let S be a finite set and ∅ = U ⊂ S be arbitrary, but fixed. If A = {A 1 , . . . , A r } is a partition of U , with r = |A|, one has ψ U (A) + r i=1 ̺ A i (1) = r̺ tot . If s A of the parts of A are singletons, this can be simplified as In particular, ψ U (0) = 0 holds for all non-empty U ⊂ S.
Moreover, for arbitrary B, C ∈ P(U ) with C B, one has Proof. The first claim follows from the relation ψ A i (1) = ̺ tot − ̺ A i (1), where ̺ tot does not depend on U as a consequence of Eq. (19). For any part A i that is a singleton, one has ̺ A i (1) = ̺ tot , which implies the second identity. The latter, in turn, confirms that ψ U (0) = 0 for all U ⊂ S, since s A = r in this case. For the remaining identity, fix B = {B 1 , . . . , B r } ∈ P(U ). If C = {C 1 , . . . , C s } ∈ P(U ) is a refinement of B, we have s r and C must be a joining of the form implies the existence of a bijection between the two families {C k | 1 k s} and {C (i) j | 1 i r and 1 j k i }. Eq. (23) now implies that the right-hand side of our claim is and the argument is complete.
If U = ∅, with |U | 3, one sees with Corollary 2 that holds for any A ∈ P(U ). For larger U , this relation holds for A = 1 and any A ∈ P(U ) with |A| = 2 if one of the parts is a singleton set, as follows from Theorem 2.
Lemma 5. Let the decay rates ψ S (A) with A ∈ P(S) be distinct and assume that the recombination rates for subsystems are calculated according to Eq. (17). Then, for any non-empty U ⊂ S, the decay rates ψ U (B) with B ∈ P(U ) are distinct as well. In this case, the exponential functions e −ψ U (B)t are linearly independent over R.
Proof. Assume distinctness of the rates ψ S (A) with A ∈ P(S) and fix a non-empty U ⊂ S.
Since S = U∪ U c , we can employ Lemma 5 to derive that which is a contradiction to the distinctness assumption. This proves the first claim, while the second assertion is standard.
Below, we will need to understand the decay rates in various ways. Let us thus expand on their relation with the recombination rates. Starting from the definition in Eq. (23), with A = A 1 , . . . , A |A| , one obtains In particular, the coefficients κ U (A, B) are non-negative integers, and one has κ U (A, B) = 0 if and only if B A.
Proof. The first claim summarises our above calculation. It is obvious from the explicit formula that κ U (A, B) ∈ N 0 , for all A, B ∈ P(U ). The coefficient vanishes if and only if each condition under the summation is true, which means But the latter condition, in turn, is equivalent with B A, which proves the assertion. Let us now return to the original ansatz (21). The coefficient function θ U is an element of the incidence algebra A(P(U )). Since a U 0 (A) = δ U (A, 1) is our initial condition, we see that, for ∅ = U ⊂ S and any A ∈ P(U ), we must have This way, for non-empty U ⊂ S and A ∈ P(U ), the coefficients θ U (A, 1) are either fixed or recursively determined from θ U (A, C) with A C ≺ 1.
Taking the derivative of our ansatz (21) leads to which has to be compared with the right-hand side of Eq. (11). Observing that and inserting the corresponding expression for the coefficients of the subsystem according to our ansatz yields the expressioṅ where the second part of Lemma 4 was used in the last step. Inserting the expression for a U t (A), changing summation variables and regrouping terms giveṡ where the second step effectively removes all terms with ̺ U (1); compare Remark 2.
Let U ⊂ S be non-empty and let us assume that the exponential functions on the righthand side of Eq. (21) are linearly independent. By Lemma 5, this is certainly the case when the rates ψ S (A) with A ∈ P(S) are distinct. A comparison of coefficients with the right-hand side of Eq. (26) then yields the relations for all A, B ∈ P(U ) with A B. Note that our assumption entails the condition that ψ U (1) = ψ U (B) for all 1 = B ∈ P(U ), so that we get where the θ coefficients for the subsystems are themselves determined recursively, under the distinctness condition for the subsystem decay rates, which follows from our assumption by Lemma 5. Since we know θ U for any U with 1 |U | 3 from Corollary 2 and Remark 6, see also Corollary 5 below, the recusion (28) uniquely determines all θ coefficients.
We have thus shown the following result for the generic situation of Remark 10. Let us mention in passing that a similar result can be derived for other initial conditions as well. We concentrate on this one as it fits Theorem 1 and thus also provides a solution of the original recombination equation.
In the situation of Theorem 3, we thus know that our original ansatz (21) leads to a solution. Since we then also know that ψ S (A) = 0 holds only for A = 0, we have the following immediate consequence for the asymptotic behaviour of the solution. which means that the corresponding solution ω t of the ODE (6) with initial condition ω 0 , . -converges to the equilibrium R 0 (ω 0 ) as t → ∞.
The recursive nature of our solution does not facilitate to understand its structure and meaning. Let us thus study some aspects of it in more detail.

Generic solution: Structure and further properties
To begin, let us observe that Theorem 3 has an interesting consequence which contrasts the result of Lemma 6 and shows a surprising structural similarity with the corresponding relation of Remark 6.
Corollary 4. Let S be as in Theorem 3 and assume that the decay rates ψ S (A) with A ∈ P(S) are distinct. Then, the recombination and the decay rates are related by and the corresponding relation holds for any subsystem that is defined by a non-empty U ⊂ S.
Proof. By Theorem 3 and Proposition 4, we know that the ansatz of Eq. (21), for U = S, leads to the unique solution of the ODE (11) with initial condition a S 0 (A) = δ S (A, 1). Thus, we may equate the right-hand side of Eq. (11) with that of Eq. (26), again for U = S, and consider the resulting identity at t = 0. Observing that γ(a S 0 ; A, B) = δ S (A, B) holds for our probability vector a S 0 as a result of Eq. (10), the first claim is clear. The second assertion follows by the corresponding calculation with a U t , which is justified by Proposition 7 together with Lemma 5.
Note that the relation of Corollary 4 is a nonlinear one, because the θ coefficients themselves generally depend on the recombination rates. To make any further progress, we need to better understand these coefficients, beyond θ U (1, 1) = 1 and the relations in Eq. (25).

Proposition 8.
Under the assumptions of Theorem 3, we have the following properties.
Proof. For the first claim, observe that 0 = 1 for any U with |U | = 1, so θ U (0, 0) = θ U (1, 1) = 1 in this case. Assume now that the claim is true for all U with |U | r, and consider a larger set, U = {u 1 , . . . , u r+1 } say. With ψ U (0) = 0, we then get from Eq. (28) that where the second step uses the induction hypothesis (note that we always have |C i | r), while the last step employs Eq. (24) which applies here.
Next, the normalisation property of the a U t (A) implies Since ψ U (0) = 0 and since the decay rates ψ U (B) with B ∈ P(U ) are distinct by our assumption together with Lemma 5, the functions e −ψ U (B)t are linearly independent. Consequently, the last identity is equivalent to the second claim. For the third claim, recall that the dimension of the ODE system (11) is B(n) if the set of partitions A ∈ P(S) with ̺ S (A) > 0 generates the entire lattice P(S), which is the case under our assumptions. The convex set that is spanned by the solution functions a S t is then a simplex of dimension B(n) − 1, and the normalisation condition A∈P(S) a S t (A) = 1 is the only linear relation between these functions.
We already know that θ S (1, 1) = 1, and one easily finds for any A ∈ P(S) with |A| = 2. One may now proceed inductively in the number of parts. If there were some A = 0 with θ S (A, A) = 0, where we may assume A to be the coarsest partition with this property, we would get from Eq. (21) an additional linear relations among the solution functions a S t , which is impossible. Since θ S (0, 0) = 1 by the second assertion, the claim is also true on the top level (defined by S). Repeating the argument for any non-empty U ⊂ S completes the argument.
Finally, the invertibility of θ U as an element of the incidence algebra is standard; compare [2].
Consequently, θ U has a unique (left and right) inverse, η U say. This means that holds for all A C. The coefficients η U are thus determined by η U (A, A) = 1/θ U (A, A) together with the recursion for A ≺ C, which derives from the left equality above. Alternatively, one may use the corresponding formula that derives from the other identity.
Either from direct calculations, or by invoking the results from Section 4, the following result is obvious.
Corollary 5. For any U with 1 |U | 3, one has η U = ζ U , and hence also θ U = µ U . For larger sets U, one has η U (A, A) = 1 for A = 1 and for all A ∈ P(S) with two parts, if one of them is a singleton set. In the latter case, also θ U (A, A) = 1.
Proof. Corollary 2 implies the first claim, while Theorem 2 gives the second one for A = 1 as well as for any A with two parts, provided one of them is a singleton set. The last claim follows from θ U (A, A) η U (A, A) = 1.
Let us now define which is an analogue of the summatory function from Section 4. Using η U * θ U = δ U , we now obtain S as a consequence of ψ U (0) = 0; compare Lemma 4. One difference to the linear case is that the summation weights in Eq. (29) generally depend on the recombination rates, while they were constant (in fact, given by the ζ-function of the incidence algebra) in Section 4. Due to the properties of the decay rates ψ U , one inherits the corresponding relations among the coefficient determines all coefficients, while Lemma 4 implies the additional relation for any A, B ∈ P(U ) with A B. Note that the function b U t generally does not emerge from b S t via marginalisation in the sense of Eq. (17), which is another important difference to the special situation of Section 4 and Remark 9.
Lemma 7. Under the assumptions of Theorem 3, one finds the following properties.
Proof. The relation η U (0, 0) = 1 is an obvious consequence of the first assertion of Proposition 8.
Next, we already know that η U (0, 0) = 1. For A ≻ 0, we can proceed inductively via the standard inversion formula [1] for η U = θ U −1 , which gives where the second step employs the induction hypothesis while the second assertion of Proposition 8 was used in the penultimate step. The third claim is a consequence of the initial conditions together with Eqs. holds for any A ∈ P(U ).
It is quite clear that the structure of η is somewhat simpler than that of θ. At present, we have η defined as the inverse function to θ in the incidence algebra, but it would be nice to also have a direct way to calculate it, for instance via another recursion.

Some comments on the singular cases
Our focus in the previous section was on the generic case that allowed for a recursively defined, general solution. Let us now briefly look into what happens when certain degeneracies among the decay rates occur.
Recall that our ansatz (21) requires, a priori, the linear independence of the exponential functions e −ψ U (A)t , and thus the distinctness of the decay rates ψ U (A), for A ∈ P(U ), separately for all non-empty U ⊂ S. A posteriori, we need to understand what happens when two decay rates, as functions of the recombination rates ̺ S of the system on the top level, become equal. Once again, this is best looked at inductively. If ∅ = U ⊂ S satisfies |U | 3, we are in the realm of the 'linear' solution of Section 4, and no consequences emerge from degeneracies. In other words, the solution formula from Proposition 5 is valid for all values of the recombination rates, irrespective of possible degeneracies; compare Corollaries 5 and 2.
When we step up in system size, there can be degeneracies that are still 'harmless' in the sense that the θ coefficients extend continuously to these situations. In this case, the (now reduced) set of exponentials in Eq. (21) is still sufficient, and the solution valid. However, as is evident from Eq. (28), the situation changes when ψ U (B) = ψ U (1) for some B ∈ P(U ), as we then hit a singularity. To understand the underlying phenomenon, let us assume that U corresponds to the smallest subsystem where this type of 'bad' degeneracy occurs. Now, rewrite Eq.
Note that the ε(A, B) are well-defined for all A, B ∈ P(U ) and all values of the recombination rates, because only θ coefficients of subsystems smaller than U occur in the product. We are now in the standard situation of ODE theory that we have summarised in Lemma 8 of the Appendix: Precisely when ψ U (B) = ψ U (1) for some B ∈ P(U ), we need an additional function for our ansatz, namely t · e −ψ U (B)t . So, our original ansatz (21) no longer suffices, and has to be modified accordingly. At the same time, in line with our previous observation, additional degeneracies of the form ψ U (B) = ψ U (B ′ ) with B, B ′ = 1 are harmless. Now, this dichotomic structure continues on each new level: There are 'harmless' degeneracies that do not require additional functions for our ansatz, while degeneracies of the type ψ U (B) = ψ U (1) once again render the function set of the ansatz incomplete, even one that was augmented in the previous step. Lemma 9 of the Appendix reviews a typical case, while the remarks following it show how the degenerate case produces extra monomial factors of increasing exponents for each degeneracy of the 'bad' type. At this point, we hope that the general structure is sufficiently clear, so that we can leave further details to the reader.
Let us summarise our discussion as follows.
An illustration of the function E 0 is shown in the left panel of Figure 2. A similar type of result emerges for mixtures of exponentials with higher order monomials.
Lemma 9. Let ̺ and σ be non-negative real numbers, and m ∈ N 0 . Then, the Cauchy problemġ with initial condition g(0) = g 0 has the unique solution g(t) = g 0 e −̺t + ε E m (̺, σ; t), where Some examples are illustrated in the right panel of Figure 2. By standard results from ODE theory, it is clear how to combine the results of Lemmas 8 and 9 to cover further situations. Let us only add that the ODEġ with initial condition g(0) = g 0 has the unique solution which explains the appearance of monomial factors with increasing powers in the solution of such equations with degenerate rates.