Nonholonomic and constrained variational mechanics

Equations governing mechanical systems with nonholonomic constraints can be developed in two ways: (1) using the physical principles of Newtonian mechanics; (2) using a constrained variational principle. Generally, the two sets of resulting equations are not equivalent. While mechanics arises from the first of these methods, sub-Riemannian geometry is a special case of the second. Thus both sets of equations are of independent interest. The equations in both cases are carefully derived using a novel Sobolev analysis where infinite-dimensional Hilbert manifolds are replaced with infinite-dimensional Hilbert spaces for the purposes of analysis. A useful representation of these equations is given using the so-called constrained connection derived from the system’s Riemannian metric, and the constraint distribution and its orthogonal complement. In the special case of sub-Riemannian geometry, some observations are made about the affine connection formulation of the equations for extremals. Using the affine connection formulation of the equations, the physical and variational equations are compared and conditions are given that characterise when all physical solutions arise as extremals in the variational formulation. The characterisation is complete in the real analytic case, while in the smooth case a locally constant rank assumption must be made. The main construction is that of the largest affine subbundle variety of a subbundle that is invariant under the flow of an affine vector field on the total space of a vector bundle.


Introduction
For mechanical systems not subject to nonholonomic constraints (sometimes called "holonomic mechanical systems"), it is well-known that the physical motions are the extremals for a problem in the calculus of variations involving a physically meaningful Lagrangian. It is equally well-known that this property of physical motions breaks down when the system is subject to nonholonomic constraints. There is a natural problem in the calculus of variations, namely a problem with constraints, that one can associate to mechanical systems with nonholonomic constraints; it is just that the extremals from the calculus of variations problem are not generally physical motions. In the case when the conservative forces are absent, this calculus of variations problem is, however, equivalent to the determination of extremals in sub-Riemannian geometry. This, therefore, gives rise to two natural sets of governing equations associated with the data that describe a mechanical system subject to nonholonomic constraints, one physical and one variational, and both interesting in their own right.
There is a body of literature devoted to the comparison of the two sorts of equations describing constrained motion, and the modern take on this seems to originate with papers of Kozlov [1992] and Kharlamov [1992]. Other work includes [Borisov, Mamaev, and Bizyaev 2017, Cardin and Favretti 1996, Favretti 1998, Gràcia, Marin-Solano, and Muñoz-Lecanda 2003, Kupka and Oliva 2001, Lewis and Murray 1995, Vershik and Gershkovich 1990, Zampieri 2000]. Our interest is in providing a characterisation of those physical motions that also arise as extremals for the constrained variational problem. Some work has been done on this problem [e.g., Cortés, de León, Martín de Diego, and Martínez 2002, Crampin and Mestdag 2010, Favretti 1998, Fernandez and Bloch 2008, Jóźwikowski and Respondek 2019, Rumiantsev 1978, Terra 2018]; we refer to the introduction of [Jóźwikowski and Respondek 2019] for a nice review of the literature on this topic. Our approach and conclusions differ from what presently exists in the literature. While much (but not all) of the existing literature considers general Lagrangians, we work exclusively with kinetic energy minus potential energy Lagrangians. This allows us to take advantage of the geometric structure of such systems. Also, most of the work in the literature derives certain sufficient conditions, sometimes involving additional system structure, that allow one to conclude that all physical trajectories are also constrained variational trajectories. While the work of Cortés, de León, Martín de Diego, and Martínez [2002] in principle offers a complete resolution to the problem of when all physical trajectories are also constrained variational trajectories, this resolution comes in the form of an iterative "algorithm" which requires certain regularity conditions and which offers very little insight as to just when the algorithm yields an affirmative answer. Also, Cortés, de León, Martín de Diego, and Martínez do not consider singular trajectories.
By contrast, we are able here to offer a complete resolution to the comparison problem for the most important class of Lagrangians in the real analytic case, while our results for the smooth case give sufficient conditions and require assumptions of the locally constant rank of some subbundles. Moreover, our results are of an insightful nature in multiple ways, connecting the detailed geometry of the interaction of the constraint distribution and the Riemannian metric defining the kinetic energy.
1.1. Contribution of paper. We restrict ourselves to "kinetic energy minus potential energy" Lagrangians, and characterise in Section 7 the comparison of solutions of the two constrained problems using the interaction of the Levi-Civita affine connection and the distribution. The end result of our detailed constructions is an affine vector field that describes the evolution of the adjoint variable (i.e., the Lagrange multiplier) for the constrained variational problem, and it is this vector field that allows us to nicely characterise cases where nonholonomic trajectories are also constrained variational trajectories. The most interesting of our results can be seen as analogous to the following question from linear algebra: Let $V$ be a finite-dimensional $\mathbb{R}$-vector space, let $U \subseteq V$ be a subspace, let $A \in \mathrm{End}(V)$, and let $b \in V$. Determine all solutions to the problem
$$\dot{\xi}(t) = A(\xi(t)) + b, \qquad \xi(t) \in U.$$
Of course, there are some technicalities that distinguish our problem from this simple one, but this simple problem is useful to keep in mind.
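To make the linear algebra analogy concrete, here is a minimal sketch of how its solution set can be described. It assumes the problem reads $\dot{\xi}(t) = A(\xi(t)) + b$ with $\xi(t) \in U$ for all $t$; the symbols $\xi$, $x$, $n = \dim V$, and $S$ are introduced here for illustration only.

```latex
% Sketch under the stated assumption; x = \xi(0) is the initial condition, n = \dim V.
\begin{align*}
  \xi(t) &= x + \sum_{k=0}^{\infty} \frac{t^{k+1}}{(k+1)!}\, A^k(Ax + b),
  \qquad \xi^{(k+1)}(0) = A^k(Ax + b), \\
  \xi(t) \in U \ \text{for all } t
  &\iff x \in U \ \text{and}\ A^k(Ax + b) \in U, \quad k \in \{0, 1, \dots, n-1\}, \\
  S &= \{\, x \in U \mid A^k(Ax + b) \in U,\ k \in \{0, 1, \dots, n-1\} \,\}.
\end{align*}
```

The equivalence uses the fact that $\xi$ is real analytic and $U$ is a subspace, so $\xi$ remains in $U$ exactly when $\xi(0)$ and all derivatives at $t = 0$ lie in $U$; the Cayley–Hamilton Theorem reduces this to the first $n$ conditions. The set $S$ is an affine subspace of $U$ (possibly empty) that is invariant under the flow, which is the finite-dimensional shadow of an invariant affine subbundle contained in a subbundle.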
Another objective of our presentation is to develop, in a simple context, a methodology for doing Sobolev-type nonlinear analysis on manifolds; this is given in Section 3. For the setting of this paper, the analysis involves the space of curves on a manifold. A typical technique for doing this type of analysis is to develop the structure of an infinite-dimensional Hilbert manifold for the space of curves [e.g., Klingenberg 1995, Kupka and Oliva 2001, Terra and Kobayashi 2004a, Terra and Kobayashi 2004b]. This type of analysis has the benefit that, once one has at hand the manifold structure, all of the standard tools of differential geometry are made available. The drawback of the methodology is that the infinite-dimensional manifold structure can be difficult to work with. The approach we develop in this paper is that, given finite-dimensional manifolds $M$ and $N$, one can replace a single mapping $\Phi : M \to N$ with the family of functions $f \circ \Phi : M \to \mathbb{R}$, one for each smooth function $f : N \to \mathbb{R}$. By taking this point of view, one works, not with the space of mappings, which does not have a vector space structure, but with the space of functions, which does have a vector space structure. Indeed, we are able to do all of the analysis we need in the paper while working explicitly only with the space $H^1([t_0, t_1]; \mathbb{R})$ of absolutely continuous functions on the interval $[t_0, t_1]$ that are square integrable with square integrable derivative. This is a point of view that has been explored in a variety of ways in a variety of settings. For example, the replacement of the nonlinear manifold structure with the linear structure of its space of functions is a device reminiscent of algebraic geometry, and gives rise to a sort of "algebraic analysis" that is explored for smooth differential geometry, for example, in the book of Nestruev [2003].
Agrachev and Gamkrelidze [1978] use this idea of function evaluations as the basis for their "chronological calculus" used to study flows of vector fields. These ideas are further explored by Jafarpour and Lewis [2014], and indeed this latter work, combined with our modest undertakings here, can be used as a basis for a comprehensive methodology for Sobolev-type analysis on manifolds. Some explorations along these lines have been carried out by Convent and Van Schaftingen [2016a, 2016b]. In this paper, we make use of these ideas to characterise spaces of curves that satisfy a linear velocity constraint and/or endpoint constraints. We reproduce in our Section 5.1 the results of Kupka and Oliva [2001, §5], while only using elementary methods (the proofs themselves are not necessarily trivial, mind).
Another novel feature of our presentation is the development in Section 4.1 of some results for the invariance of subsets, not generally submanifolds, of a manifold under the flow of a vector field. Of special interest is the situation where the subset is a (not necessarily locally constant rank) subbundle of a vector bundle and where the vector field has some interesting structure relative to the vector bundle structure, e.g., linear or affine. We give a useful infinitesimal characterisation for the invariance of such subbundles under such vector fields in Sections 4.2, 4.3, 4.4, and 4.5. Using these constructions, in Section 4.6 we are able to build the "largest invariant affine subbundle variety contained in a subbundle." This construction plays a crucial rôle in our comparison results of Section 7.
1.2. An outline of the paper. In Section 2 we overview some constructions and notation concerning vector bundles, subbundles and affine subbundles, connections in vector bundles, vector fields on the total space of a vector bundle, Riemannian geometry, and the geometry of subbundles of the tangent bundle. In Section 3 we develop our methodology for the nonlinear analysis that we will use to deduce the two sets of equations of motion which we will ultimately compare. Results concerning subbundles and affine subbundles invariant under linear and affine vector fields in vector bundles are developed in Section 4. The equations governing nonholonomic mechanics and constrained variational mechanics are developed in Section 5. The equations we produce are those derived by Kupka and Oliva [2001], but we do this a little more comprehensively than do Kupka and Oliva, filling in some gaps in their written arguments, correcting some confusing typography, and banishing the use of coordinates. We also cast the equations in a new way using constructions from Section 2.11 involving affine connections adapted to distributions. It is these new representations of the governing equations that make possible a systematic and comprehensive comparison of nonholonomic mechanics and constrained variational mechanics. In Section 6 we make some connections between constrained variational mechanics and sub-Riemannian geometry. The affine connection formalism we use here provides some new tools for problems in sub-Riemannian geometry, where the Hamiltonian approach mainly prevails in the current literature. In Section 7 we present the main new results of the paper, which constitute this comprehensive comparison of nonholonomic mechanics with constrained variational mechanics. We point out how we can encompass existing results, in cases when this is easily done.
1.3. Background and notation. We use standard set theoretic terminology, with the possible exception that we use "⊆" to denote the inclusion of a set in another, and use "⊂" to denote strict inclusion. By $\mathrm{id}_X$ we denote the identity map on a set $X$. If $f : X \to Y$ is a map of sets and if $A \subseteq X$, we denote by $f|A$ the restriction of $f$ to $A$. For sets $X_1, \dots, X_k$, we denote by $\mathrm{pr}_j : X_1 \times \dots \times X_k \to X_j$, $j \in \{1, \dots, k\}$, the projections. By $\mathbb{Z}$ we denote the set of integers, while $\mathbb{Z}_{>0}$ and $\mathbb{Z}_{\ge 0}$ denote the sets of positive and nonnegative integers. By $\mathbb{R}$ we denote the set of real numbers, while $\mathbb{R}_{>0}$ denotes the set of positive real numbers. By $\mathbb{R}^{m \times n}$ we denote the set of $m \times n$ matrices with real entries. The rank of a matrix $A \in \mathbb{R}^{m \times n}$ we denote by $\mathrm{rank}(A)$. The $n \times n$ identity matrix we denote by $I_n$.
An affine subspace of an $\mathbb{R}$-vector space $V$ is a subset $A$ such that $sv_1 + (1 - s)v_2 \in A$ for every $v_1, v_2 \in A$ and $s \in \mathbb{R}$. We denote by $L(A)$ the linear part of $A$, defined by $L(A) = \{v - v_0 \mid v \in A\}$ for some $v_0 \in A$. We refer to [Berger 1987, Chapter 2] for background on affine spaces. For an $\mathbb{R}$-vector space $V$ and for $S \subseteq V$, we denote by $\mathrm{span}_{\mathbb{R}}(S)$ the smallest subspace of $V$ containing $S$ and by $\mathrm{aff}_{\mathbb{R}}(S)$ the smallest affine subspace of $V$ containing $S$.
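As a quick illustration of these notions, the following worked example uses $V = \mathbb{R}^2$ and a line not passing through the origin. The particular set $A$ and the points chosen are hypothetical, and the linear part is taken to be $L(A) = \{v - v_0 \mid v \in A\}$, which we assume is the intended definition.

```latex
% Hypothetical example in V = R^2; L(A) = { v - v_0 : v in A } is the assumed definition.
\begin{align*}
  A &= \{(x, y) \in \mathbb{R}^2 \mid x + y = 1\}, \\
  s(x_1, y_1) + (1 - s)(x_2, y_2) &\in A
    \quad\text{since}\quad s(x_1 + y_1) + (1 - s)(x_2 + y_2) = s + (1 - s) = 1, \\
  L(A) &= \{(x, y) - (1, 0) \mid (x, y) \in A\} = \{(u, w) \in \mathbb{R}^2 \mid u + w = 0\}, \\
  \mathrm{aff}_{\mathbb{R}}(\{(1, 0), (0, 1)\}) &= A,
  \qquad
  \mathrm{span}_{\mathbb{R}}(\{(1, 0), (0, 1)\}) = \mathbb{R}^2 .
\end{align*}
```

Choosing a different base point, say $v_0 = (0, 1)$, gives the same subspace $L(A)$, which is why the linear part does not depend on the choice of $v_0 \in A$.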
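For $\mathbb{R}$-vector spaces $U$ and $V$, $L(U; V)$ denotes the set of linear mappings from $U$ to $V$. By $V^* = L(V; \mathbb{R})$ we denote the algebraic dual of $V$. For $A \in L(U; V)$, we denote by $A^* \in L(V^*; U^*)$ the algebraic dual. The pairing of $\alpha \in V^*$ with $v \in V$ will be denoted by one of $\alpha(v)$, $\alpha \cdot v$, $\langle \alpha; v \rangle$, whichever seems most aesthetically pleasing in the moment. By $T^r_s(V)$ we denote the $r$-contravariant and $s$-covariant tensors on $V$, i.e.,
$$T^r_s(V) = \underbrace{V \otimes \dots \otimes V}_{r\ \text{times}} \otimes \underbrace{V^* \otimes \dots \otimes V^*}_{s\ \text{times}}.$$
We denote by $\mathrm{End}(V)$ the endomorphisms of $V$, i.e., $\mathrm{End}(V) = L(V; V)$. By $\bigwedge^k(V^*)$ and $S^k(V^*)$ we denote the $k$-fold alternating and symmetric tensors on $V$, respectively. If $A$ is a $(0, 2)$-tensor and $B$ is a $(2, 0)$-tensor on a finite-dimensional $\mathbb{R}$-vector space $V$, we denote by $A^\flat : V \to V^*$, $B^\sharp : V^* \to V$ the mappings defined by
$$\langle A^\flat(u); v \rangle = A(v, u), \qquad \langle \alpha; B^\sharp(\beta) \rangle = B(\alpha, \beta), \qquad u, v \in V,\ \alpha, \beta \in V^*.$$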
If (V, G) is a R-inner product space and if S ⊆ V, we denote by S ⊥ the subspace orthogonal to S.
For a topological space X and for A ⊆ X, we denote by int(A), cl(A), and bd(A) the interior, closure, and boundary of A, respectively.
By $B(r, x) \subseteq \mathbb{R}^n$ we denote the open ball of radius $r$ and centre $x$. For Banach spaces $E$ and $F$, an open set $U \subseteq E$, and a mapping $\Phi : U \to F$, we denote by $D\Phi(u) : E \to F$ the Fréchet derivative of $\Phi$ at $u$, when this exists.
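We shall be concerned with functions of two variables, $(s, t) \mapsto f(s, t)$. For such functions, we have the partial derivatives
$$\partial_1 f(s, t) = \lim_{h \to 0} \frac{f(s + h, t) - f(s, t)}{h}, \qquad \partial_2 f(s, t) = \lim_{h \to 0} \frac{f(s, t + h) - f(s, t)}{h},$$
defined when the limits exist.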
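For an interval $I \subseteq \mathbb{R}$, $A \subseteq \mathbb{R}$, and for $p \in [1, \infty)$, we denote by $L^p(I; A)$ the set of measurable $A$-valued functions $f$ on $I$ for which $\int_I |f(t)|^p \, dt < \infty$.
The norm on $L^p(I; \mathbb{R})$ we denote by
$$\|f\|_p = \Bigl( \int_I |f(t)|^p \, dt \Bigr)^{1/p}.$$
For $s \in \mathbb{Z}_{\ge 0}$, by $H^s(I; \mathbb{R})$ we denote the set of measurable functions whose first $s$ distributional derivatives are in $L^2([t_0, t_1]; \mathbb{R})$. We denote the norm on $H^s(I; \mathbb{R})$ by
$$\|f\|_{H^s} = \Bigl( \sum_{a=0}^{s} \|f^{(a)}\|_2^2 \Bigr)^{1/2},$$
$f^{(a)}$ being the $a$th derivative of $f$. Of course, $H^0([t_0, t_1]; \mathbb{R}) = L^2([t_0, t_1]; \mathbb{R})$.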
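A short worked example may help fix these definitions. It assumes the standard Hilbert norm $\|f\|_{H^1}^2 = \|f\|_2^2 + \|f'\|_2^2$ on $H^1([0, 1]; \mathbb{R})$; the functions $f$ and $g$ below are hypothetical test cases chosen for illustration.

```latex
% Assumed norm: \|f\|_{H^1}^2 = \|f\|_2^2 + \|f'\|_2^2 on the interval [0, 1].
\begin{align*}
  f(t) = t: \quad
    \|f\|_2^2 &= \int_0^1 t^2 \, dt = \tfrac{1}{3}, \qquad
    \|f'\|_2^2 = \int_0^1 1 \, dt = 1, \qquad
    \|f\|_{H^1} = \sqrt{\tfrac{4}{3}} = \tfrac{2}{\sqrt{3}}, \\
  g(t) = \sqrt{t}: \quad
    \|g\|_2^2 &= \int_0^1 t \, dt = \tfrac{1}{2}, \qquad
    \|g'\|_2^2 = \int_0^1 \frac{1}{4t} \, dt = \infty .
\end{align*}
```

Thus $f \in H^1([0, 1]; \mathbb{R})$, while $g$ is absolutely continuous and square integrable but does not have a square integrable derivative, so $g \notin H^1([0, 1]; \mathbb{R})$. Absolute continuity alone is therefore not enough.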
Our geometric notation mainly follows [Abraham, Marsden, and Ratiu 1988]. Manifolds will be assumed to be smooth, Hausdorff, and paracompact. We shall at some crucial points require real analyticity of the manifolds and geometric objects we use. To cover the smooth and real analytic cases, we shall allow the regularity classes $r \in \{\infty, \omega\}$, $r = \infty$ being the smooth case and $r = \omega$ being the real analytic case. For a manifold $M$, the tangent bundle is denoted by $\pi_{TM} : TM \to M$ and the cotangent bundle is denoted by $\pi_{T^*M} : T^*M \to M$.
The set of $C^r$-mappings from a manifold $M$ to a manifold $N$ is denoted by $C^r(M; N)$. We abbreviate $C^r(M) = C^r(M; \mathbb{R})$.
For a $C^r$-vector bundle $\pi : E \to M$, we denote the fibre at $x \in M$ by $E_x$. We will sometimes denote the zero vector in $E_x$ by $0_x$. The set of $C^r$-sections of $\pi : E \to M$ we denote by $\Gamma^r(E)$. If $S \subseteq M$, we denote by $E|S$ the restriction of $E$ to $S$, i.e., $E|S = \pi^{-1}(S)$. The trivial bundle $\mathbb{R}^k \times M$ is denoted by $\mathbb{R}^k_M$. If $\pi : E \to M$ is a $C^r$-vector bundle and if $\Phi \in C^r(N; M)$ is a $C^r$-mapping of manifolds, then $\Phi^*\pi : \Phi^*E \to N$ is the pull-back vector bundle, with $\Phi^*E = \{(e, y) \in E \times N \mid \pi(e) = \Phi(y)\}$ and $\Phi^*\pi(e, y) = y$. The bundle of $k$-jets of local sections of a $C^r$-vector bundle $\pi : E \to M$ we denote by $J^kE$. The derivative of $\Phi \in C^\infty(M; N)$ is denoted by $T\Phi : TM \to TN$. We denote $T_x\Phi = T\Phi|T_xM$. If $f \in C^r(M)$, then $df \in \Gamma^r(T^*M)$ is defined by $\langle df(x); v_x \rangle = T_xf(v_x)$, $v_x \in T_xM$, under the identification $T_{f(x)}\mathbb{R} \simeq \mathbb{R}$. If $I \subseteq \mathbb{R}$ is an interval and if $\gamma : I \to M$ is differentiable at $t \in I$, then we denote $\gamma'(t) = T_t\gamma(1)$. For a diffeomorphism $\Phi : M \to N$, for tensor fields $A$ on $M$ and $B$ on $N$, $\Phi_*A$ is the push-forward of $A$ by $\Phi$ and $\Phi^*B$ is the pull-back of $B$ by $\Phi$.
We shall completely eschew any use of local coordinates, but for certain technical results we shall properly embed manifolds in some Euclidean space $\mathbb{R}^N$, assuming in such instances that manifolds are second countable, e.g., connected. The existence of such embeddings is proved by Whitney [1936] in the smooth case and by Grauert [1958] in the real analytic case. The principal manner in which we shall use these embeddings is according to the following result.
Proof: We assume that we have a proper embedding $\iota : M \to \mathbb{R}^N$. If $x \in M$, we shall simply write $\iota(x) = x$. We denote by $\hat{X}_1, \dots, \hat{X}_N$ the coordinate vector fields on $\mathbb{R}^N$ and by $\hat{g}_1, \dots, \hat{g}_N$ the coordinate functions.
(i) Denote by $X_1, \dots, X_N \in \Gamma^r(TM)$ the vector fields on $M$ obtained by requiring that $X_j(x)$ be the orthogonal projection of $\hat{X}_j(x)$ (with respect to the Euclidean metric) onto $T_xM$. If $x \in M$ and if $v \in T_xM$, then $v \in T_x\mathbb{R}^N$ and so there are unique $v_1, \dots, v_N \in \mathbb{R}$ such that $v = v_1\hat{X}_1(x) + \dots + v_N\hat{X}_N(x)$.
We then have, by orthogonal projection, $v = v_1X_1(x) + \dots + v_NX_N(x)$, and the result follows by performing the previous constructions for $v = X(x)$ for every $x \in M$.
The flow of a vector field $X \in \Gamma^r(TM)$ is denoted by $\Phi^X_t$, so that the solution to the initial value problem $\xi'(t) = X \circ \xi(t)$, $\xi(0) = x$, is $t \mapsto \Phi^X_t(x)$. Many of our constructions and results do not require vector fields to be complete, but a crucial component of our analysis requires completeness of a certain vector field. If $X \in \Gamma^r(TM)$ and if $f \in C^r(M)$, by $\mathscr{L}_Xf \in C^r(M)$ we denote the Lie derivative of $f$ with respect to $X$. The Lie bracket of vector fields $X, Y \in \Gamma^r(TM)$ is the vector field $[X, Y]$ defined by $\mathscr{L}_{[X,Y]}f = \mathscr{L}_X\mathscr{L}_Yf - \mathscr{L}_Y\mathscr{L}_Xf$, $f \in C^r(M)$.
Let $(M, G)$ be a Riemannian manifold. By $\|\cdot\|_G$ we denote the fibre norm defined by $G$.
We denote by $\overset{G}{\nabla}$ the Levi-Civita affine connection. We denote by $\exp$ the Riemannian exponential, which we regard as a mapping from a neighbourhood of the zero section in $TM$ into $M$. If $f \in C^\infty(M)$, we denote $\operatorname{grad} f = G^\sharp \circ df$.
For a $C^r$-manifold $M$, we denote by $\mathscr{C}^r_M$ the sheaf of $C^r$-functions over $M$. The stalk of this sheaf at $x \in M$ is denoted by $\mathscr{C}^r_{x,M}$. For a $C^r$-vector bundle $\pi : E \to M$, we denote by $\mathscr{G}^r_E$ the sheaf of $C^r$-sections of $E$, thought of as a $\mathscr{C}^r_M$-module. By $\mathscr{G}^r_{x,E}$ we denote the stalk of $\mathscr{G}^r_E$ at $x \in M$. We shall on occasion require the following lemma which is elementary in the smooth case, but is less elementary in the real analytic case.
1.2 Lemma: (Globally defined sections with a prescribed jet at a point) Let $r \in \{\infty, \omega\}$, let $\pi : E \to M$ be a $C^r$-vector bundle, let $x \in M$, let $k \in \mathbb{Z}_{\ge 0}$, and let $\Xi \in J^kE_x$. Then there exists $\xi \in \Gamma^r(E)$ such that $j_k\xi(x) = \Xi$.
Proof: Let $\mathscr{E}^k_x$ be the sheaf of sections of $E$ whose $k$-jets vanish at $x$. Thus the stalk of $\mathscr{E}^k_x$ at $y \in M$ is $\mathscr{G}^r_{y,E}$ if $y \ne x$, and is $\{[\xi]_x \in \mathscr{G}^r_{x,E} \mid j_k\xi(x) = 0\}$ if $y = x$. Consider the short exact sequence of sheaves
$$0 \to \mathscr{E}^k_x \to \mathscr{G}^r_E \to \mathscr{G}^r_E/\mathscr{E}^k_x \to 0.$$
One readily verifies that $H^0(\mathscr{G}^r_E/\mathscr{E}^k_x) \simeq J^kE_x$. This short exact sequence of sheaves gives rise to the long exact sequence for global sections
$$0 \to H^0(\mathscr{E}^k_x) \to \Gamma^r(E) \to J^kE_x \to H^1(\mathscr{E}^k_x) \to \cdots.$$
We claim that $H^1(\mathscr{E}^k_x) = 0$. We consider the smooth and real analytic cases separately.
1. In the smooth case, [Wells Jr. 2008, Proposition 3.11] (along with [Wells Jr. 2008, Examples 3.4(d, e)] and [Wells Jr. 2008, Proposition 3.5]), immediately gives the vanishing of $H^p(\mathscr{E}^k_x)$ for $p \in \mathbb{Z}_{>0}$.
2. In the real analytic case, first, by Oka's Theorem, $\mathscr{G}^\omega_E$ is coherent. By a standard argument using Hadamard's Lemma (cf. Lemma 1 from the proof of Proposition 4.3), one shows that $\mathscr{E}^k_x$ is locally finitely generated (by monomials, in coordinates). Thus $\mathscr{E}^k_x$ is coherent in the real analytic case by [Grauert and Remmert 1984, Example 2, pg 235]. Now, by Cartan's Theorem B in the real analytic case [Cartan 1957, Proposition 6], the sheaf cohomology of $\mathscr{E}^k_x$ vanishes in positive degree, particularly in degree 1.
As $H^1(\mathscr{E}^k_x) = 0$, we have the surjectivity of the mapping $\Gamma^r(E) \ni \xi \mapsto j_k\xi(x) \in J^kE_x$, which is what is to be proved.
List of symbols. For convenient reference we list the commonly used, but not necessarily commonplace, notation that we use, along with its place of definition.
Acknowledgements. Discussions with Connor Boyd and Bahman Gharesifard were helpful in the development of some ideas in the paper. The author thanks Mike Roth for helping to clarify the notions of invariance presented in Section 4.

Geometric preliminaries
In this section we develop the tools we need to state our main results. Some of the constructions we present are made for review and to present the notation we use. However, some of the developments are nonstandard.
Let us define the horizontal/vertical decomposition of a vector bundle associated with a connection $\Gamma$ in $\pi : E \to M$. For $x \in M$, $e \in E_x$, and $e_1 \in J^1_eE$, let $\xi \in \Gamma^r(E)$ be such that $\xi(x) = e$ and $j_1\xi(x) = e_1$. There then exists a unique and well-defined linear mapping One can verify that (1) $\ker(P^H_\Gamma) = \ker(T\pi)$ and (2) $TE = \ker(P^H_\Gamma) \oplus \mathrm{image}(P^H_\Gamma)$. We then denote the horizontal subbundle by $HE = \mathrm{image}(P^H_\Gamma)$ and the vertical subbundle by $VE = \ker(P^H_\Gamma)$. We denote by $P^V_\Gamma = \mathrm{id}_{TE} - P^H_\Gamma$ the projection onto $VE$. Just by definition, $P^V_\Gamma$ is a $C^r$-vector bundle mapping according to the diagram
continuous sections along locally absolutely continuous curves. Consider a continuous curve $\gamma : I \to M$ defined on an interval $I \subseteq \mathbb{R}$ and a continuous section $\eta : I \to E$ over $\gamma$, i.e., $\pi \circ \eta = \gamma$. Suppose that both $\gamma$ and $\eta$ are differentiable at $t \in I$. Let $X \in \Gamma^r(TM)$ be such that .
If γ and η are locally absolutely continuous, then we can define a section ∇ γ η over γ by a.e. t ∈ I.
Moreover, (2.2) implies that
2.2. Vector fields on the total space of a vector bundle. An essential rôle is played in our main results by certain vector fields defined on the total space of a vector bundle and the dual of a vector bundle. Let us define the types of vector fields that will arise on the total space of a vector bundle.
2.1 Definition: (Vector fields on the total space of a vector bundle) Let $r \in \{\infty, \omega\}$, let $\pi : E \to M$ be a $C^r$-vector bundle, and let $X_0 \in \Gamma^r(TM)$.
(i) A vector field $X \in \Gamma^r(TE)$ is a linear (resp. affine) vector field over $X_0$ if (a) it projects to $X_0$, i.e., the diagram
$$\begin{array}{ccc} E & \xrightarrow{\ X\ } & TE \\ {\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle T\pi} \\ M & \xrightarrow{\ X_0\ } & TM \end{array}$$
commutes, and (b) it is a $C^r$-vector bundle morphism (resp. affine bundle morphism) of the preceding diagram.
(iii) For $A \in \Gamma^r(\mathrm{End}(E))$, the vertical evaluation of $A$ is the vector field $A^e \in \Gamma^r(TE)$ defined by $A^e(e) = \mathrm{vlft}(A(e), e)$.
Additionally assume that $\nabla$ is a $C^r$-linear connection in $E$.
(iv) The horizontal lift of $X_0$ is the vector field $X_0^h \in \Gamma^r(TE)$ defined by $X_0^h(e) = \mathrm{hlft}(X_0 \circ \pi(e), e)$.
The following lemma assembles all of the above ingredients.
2.2 Lemma: (Linear and affine vector fields, and linear connections) Let $r \in \{\infty, \omega\}$, let $\pi : E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, and let $X_0 \in \Gamma^r(TM)$. Then the following statements hold:
(i) if $X^{\mathrm{lin}}$ is a linear vector field over $X_0$, then there exists $A_{X^{\mathrm{lin}}} \in \Gamma^r(\mathrm{End}(E))$ such that $X^{\mathrm{lin}} = X_0^h + A^e_{X^{\mathrm{lin}}}$;
(ii) if $X^{\mathrm{aff}}$ is an affine vector field over $X_0$, then there exist $A_{X^{\mathrm{aff}}} \in \Gamma^r(\mathrm{End}(E))$ and $b_{X^{\mathrm{aff}}} \in \Gamma^r(E)$ such that $X^{\mathrm{aff}} = X_0^h + A^e_{X^{\mathrm{aff}}} + b^v_{X^{\mathrm{aff}}}$.
Proof: (i) Since $\nabla$ is a linear connection, the vertical projection $\mathrm{ver}$, and therefore also the horizontal projection $\mathrm{hor}$, are vector bundle mappings with respect to the following diagram: [Kolář, Michor, and Slovák 1993, §11.10]. Therefore, since $X^{\mathrm{lin}}$ is a linear vector field over $X_0$, we have a vector bundle mapping $\mathrm{hor}(X^{\mathrm{lin}})$ determined by the following diagram: Thus we conclude two things: (1) $\mathrm{hor}(X^{\mathrm{lin}}) = X_0^h$ since both $\mathrm{hor}(X^{\mathrm{lin}})$ and $X_0^h$ are horizontal vector fields projecting to $X_0$; (2) $X_0^h$ is a linear vector field over $X_0$. Thus $X^{\mathrm{lin}} - X_0^h$ is a linear vector field over the zero vector field. This shows that $X^{\mathrm{lin}} - X_0^h$ is vertical and so we have that $\mathrm{ver}(X^{\mathrm{lin}})$ is a linear vector field. Thus we have the following vector bundle mapping: for $A_{X^{\mathrm{lin}}} \in \mathrm{End}(E)_x$, and this gives the assertion.
(ii) This follows from the observation that an affine map between vector spaces has the form of the sum of a linear map and a constant map.
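As a hedged illustration of these vector fields, consider the trivial bundle $E = M \times \mathbb{R}^k$ with the flat connection. This setting, the notation $(x, v)$ for points of $E$, and the identification of $T_{(x,v)}E$ with $T_xM \times \mathbb{R}^k$ are assumptions made here for illustration; the affine decomposition with a vertical lift $b^v$ is likewise assumed rather than quoted.

```latex
% Assumed setting: E = M x R^k, flat connection, T_{(x,v)}E identified with T_xM x R^k.
\begin{align*}
  X_0^{h}(x, v) &= (X_0(x), 0), &
  A^{e}(x, v) &= (0, A(x)v), &
  b^{v}(x, v) &= (0, b(x)), \\
  X^{\mathrm{lin}}(x, v) &= (X_0(x), A(x)v), &
  X^{\mathrm{aff}}(x, v) &= (X_0(x), A(x)v + b(x)).
\end{align*}
```

In this picture an integral curve $t \mapsto (x(t), v(t))$ of $X^{\mathrm{aff}}$ satisfies $\dot{x} = X_0(x)$ and $\dot{v} = A(x)v + b(x)$, so the fibre component solves a time-varying affine differential equation driven by the base curve.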
2.3. Flows of vector fields on the total space of a vector bundle. It will be useful to have at hand characterisations of the flows of the various vector fields considered above.

2.3 Lemma: (Flows of vector fields on the total space of a vector bundle) Let $r \in \{\infty, \omega\}$, let $\pi : E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $X_0 \in \Gamma^r(TM)$, let $X \in \Gamma^r(TE)$ be a linear vector field over $X_0$, let $\xi \in \Gamma^r(E)$, and let $A \in \Gamma^r(\mathrm{End}(E))$. Then the following statements hold: is the parallel transport of $e$ along the curve $t \mapsto \Phi^{X_0}_t(\pi(e))$.
This is a linear differential equation in $E_x$, where $x = \pi \circ \Upsilon(t)$ for all $t$, and so we have
(iv) This is the content of [Kobayashi and Nomizu 1963, §II.3].
The following characterisation of integral curves of affine vector fields will allow us to connect the formulation in this section to formulations that will arise in Section 7.
2.4 Lemma: (Covariant derivative characterisation of integral curves of affine vector fields) Let $r \in \{\infty, \omega\}$, let $\pi : E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $X_0 \in \Gamma^r(TM)$, let $A \in \Gamma^r(\mathrm{End}(E))$, and let $b \in \Gamma^r(E)$. For a curve $\Upsilon : I \to E$, the following are equivalent:
Proof: Let $\gamma = \pi \circ \Upsilon$ so that $\Upsilon$ is to be thought of as a section of $E$ along $\gamma$. Then $\Upsilon$ is an integral curve of Taking the vertical part of this equation and using the second of the equations in (2.3) gives the lemma.
The following adaptation of the variation of constants formula for linear ordinary differential equations will also be useful. In the statement and proof of the proposition we use some notation and results that we will introduce and prove, respectively, below.
2.5 Proposition: (Variation of constants formula for the flow of an affine vector field) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let X 0 ∈ Γ r (TM), let X lin ∈ Γ r (TE) be a C r -linear vector field over X 0 , and let b ∈ Γ r (E). Define Proof: We first make a calculation: By Lemma 2.10 below, we have (noting that, since F ∈ Lin r (E), F = λ e for some λ ∈ Γ r (E * )). We have since X aff projects to X 0 . Thus our initial calculation shows that the derivative of the linear function F as a function of time is the derivative of F with respect to X lin plus the derivative of F with respect to b v . Since the right-hand side of the asserted expression evaluates to F (e) at t = 0, we conclude that this right-hand side gives the evolution of F along integral curves of X aff , as claimed.
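For orientation, the classical formula being adapted can be recalled as follows. This is the standard finite-dimensional statement and not the bundle version of the proposition; the time-varying data $A(t)$, $b(t)$, the initial condition $v_0$, and the state transition matrix $\Phi(t, s)$ are notation assumed here for illustration.

```latex
% Classical variation of constants; A(t), b(t), v_0, and \Phi(t, s) are illustrative notation.
\begin{align*}
  \dot{v}(t) &= A(t)v(t) + b(t), \qquad v(0) = v_0, \\
  \partial_t \Phi(t, s) &= A(t)\Phi(t, s), \qquad \Phi(s, s) = I_n, \\
  v(t) &= \Phi(t, 0)v_0 + \int_0^t \Phi(t, s)b(s) \, ds, \\
  \dot{v}(t) &= A(t)\Phi(t, 0)v_0 + \Phi(t, t)b(t) + \int_0^t A(t)\Phi(t, s)b(s) \, ds
             = A(t)v(t) + b(t).
\end{align*}
```

The last line checks the formula by direct differentiation. In the bundle setting one expects the flow of the linear vector field to play the part of $\Phi(t, s)$ and the section $b$, evaluated along the base integral curve, to play the part of $b(s)$.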
2.4. The dual of a linear vector field. If $\pi : E \to M$ is a vector bundle, then $E^*$ is the set of vector bundle maps from $E$ to the trivial vector bundle $\mathbb{R}_M$. This is the dual vector bundle for $E$. We denote the canonical projection by $\pi^* : E^* \to M$, acknowledging the possible confusion of the projection $\pi^*$ with the pull-back by the projection $\pi$. Let $U \subseteq M$ be an open subset and let $\Phi : E|U \to E$ have the property that it is a vector bundle isomorphism onto its image over the map $\Phi_0 : U \to M$ which is a diffeomorphism onto its image. The dual of $\Phi$ is the map $\Phi^* : (\Phi(E|U))^* \to (E|U)^*$ defined by $\Phi^*|E^*_{\Phi_0(x)} = (\Phi|E_x)^*$. Associated with a linear vector field on $E$ is its dual, determined according to the following.
2.6 Definition: (Dual of a linear vector field) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let X 0 ∈ Γ r (TM), and let X ∈ Γ r (TE) be a linear vector field over X 0 . The dual vector field of X is the vector field X * on E * defined by We can prove some fundamental properties of the dual of a linear vector field. To do so, it is convenient to introduce some notation. For a vector bundle π : E → M with dual bundle π * : E * → M and for a linear vector field X on E, we denote by X × X * the vector field on E × E * defined by X × X * (e, α) = (X(e), X * (α)). We also note that the Whitney sum E ⊕ E * is the submanifold of E × E * given by With this notation we have the following result.
2.7 Lemma: (Properties of the dual vector field) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let X 0 ∈ Γ r (TM), and let X ∈ Γ r (TE) be a linear vector field over X 0 . Then the dual vector field X * has the following properties: Thus X * projects to X 0 . Since the flow of X * , by definition, consists of local isomorphisms of E * , it follows from [Kolář, Michor, and Slovák 1993, §47.9] that X * is a linear vector field.
(iv) In the proof of part (ii), we only used the fact that $X$ and $X^*$ are linear vector fields over $X_0$. Thus the proof applies to the linear vector field $Y$ over the same vector field $X_0$ as $X$.
Let us determine the dual of a linear vector field represented in the decomposition of Lemma 2.2.
2.8 Lemma: (Duals of decomposed linear vector fields) Let $r \in \{\infty, \omega\}$, let $\pi : E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, and let $X_0 \in \Gamma^r(TM)$. For $A \in \Gamma^r(\mathrm{End}(E))$, the dual of the linear vector field $X = X_0^h + A^e$ is $X^* = X_0^{h,*} - (A^*)^e$. Moreover, $X_0^{h,*}$ is the horizontal lift of $X_0$ corresponding to the dual linear connection in $E^*$.
By [Abraham, Marsden, and Ratiu 1988, Theorem 4.2.10], we have $\mathscr{L}_{-A^e \oplus (A^*)^e}f_E = 0$. By Lemma 2.7(iii) this gives The first assertion in the result follows from Lemma 2.7(v). For the final assertion, we make three observations from which the assertion follows:
1. the flows of the horizontal lifts $X_0^h$ and $X_0^{h,*}$ are given by parallel translation (Lemma 2.3(iv));
2. the flow of $X_0^{h,*}$ is the dual of the inverse flow of $X_0^h$ by definition;
3. the parallel translation by the dual of a linear connection is the dual of the inverse of parallel translation of the linear connection (by definition of the dual of a linear connection).
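The sign in Lemma 2.8 can be checked in the simplest possible case, sketched here under the assumption that $M$ is a single point, so that $E = V$ is a finite-dimensional vector space, $X_0 = 0$, and the linear vector field is $X(v) = Av$ for some $A \in \mathrm{End}(V)$. The curves $v(t)$ and $\alpha(t)$ are notation introduced for this illustration.

```latex
% Assumed setting: M a point, E = V, X(v) = Av; the dual flow is the dual of the inverse flow.
\begin{align*}
  \Phi^{X}_t(v) &= e^{tA}v, \qquad
  \Phi^{X^*}_t(\alpha) = \bigl(e^{-tA}\bigr)^{*}\alpha = e^{-tA^{*}}\alpha, \\
  X^{*}(\alpha) &= \frac{d}{dt}\Big|_{t=0} e^{-tA^{*}}\alpha = -A^{*}(\alpha), \\
  \langle \alpha(t); v(t) \rangle
    &= \langle e^{-tA^{*}}\alpha; e^{tA}v \rangle
     = \langle \alpha; e^{-tA}e^{tA}v \rangle
     = \langle \alpha; v \rangle .
\end{align*}
```

So the dual of $v \mapsto Av$ is $\alpha \mapsto -A^*\alpha$, matching the term $-(A^*)^e$ in the lemma, and the natural pairing is conserved along the product flow.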
2.5. Functions on the total space of a vector bundle. In this section we introduce some special classes of functions on vector bundles, and indicate how to differentiate these with respect to the special kinds of vector fields we introduced in the preceding sections. The functions we consider are the following.
2.9 Definition: (Functions on the total space of a vector bundle) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle.
Associated with these notions we introduce the notation with Aff r (E). We shall use the notation These isomorphisms will arise frequently in our presentation in multiple ways. Let us now see how to differentiate the special classes of affine functions just introduced with respect to the special classes of vector fields considered in Section 2.2.
(ii) Since f h is constant on fibres of π and ξ v is tangent to fibres, we have Differentiating with respect to t at t = 0 gives the result. (iii) Again, f h is constant on fibres and A e is tangent to fibres. Thus we have Differentiation with respect to t at t = 0 gives the result.
2.6. The symplectic structure of the tangent bundle of a Riemannian manifold. In Section 6 we shall relate constrained variational mechanics to sub-Riemannian geometry, and in doing so it will be convenient to have at hand some nice formulae for the pull-back of the canonical symplectic structure of the cotangent bundle to the tangent bundle by the metric-canonical diffeomorphism. We follow [Paternain 1999, §1.3.2, §1.4].
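Let $r \in \{\infty, \omega\}$. Let us first intrinsically describe the canonical symplectic structure of the cotangent bundle of a $C^r$-manifold $M$. We begin by describing a canonical one-form on the cotangent bundle. We define $\theta_0 \in \Gamma^r(T^*T^*M)$ by
$$\langle \theta_0(\alpha_x); X_{\alpha_x} \rangle = \langle \alpha_x; T_{\alpha_x}\pi_{T^*M}(X_{\alpha_x}) \rangle, \qquad \alpha_x \in T^*M,\ X_{\alpha_x} \in T_{\alpha_x}T^*M.$$
Let us name the one-form $\theta_0$ and define the canonical symplectic two-form.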
2.11 Definition: (Liouville one-form, symplectic two-form) For r ∈ {∞, ω} and for a C r -manifold M, (i) the one-form θ 0 is the Liouville one-form and (ii) the two-form ω 0 = −dθ 0 is the canonical symplectic two-form on T * M.
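Although the presentation here is coordinate-free, readers may find it useful to compare with the familiar coordinate expressions. The following sketch assumes canonical coordinates $(q^1, \dots, q^n, p_1, \dots, p_n)$ on $T^*M$, with the summation convention; these coordinates are introduced only for this illustration.

```latex
% Assumed canonical coordinates (q^i, p_i) on T^*M; summation over repeated indices.
\begin{align*}
  \theta_0 &= p_i \, dq^i, \\
  \omega_0 &= -d\theta_0 = -dp_i \wedge dq^i = dq^i \wedge dp_i .
\end{align*}
```

With the sign convention $\omega_0 = -d\theta_0$ of the definition, one recovers the form $dq^i \wedge dp_i$; the opposite convention $\omega_0 = d\theta_0$ used by some authors gives $dp_i \wedge dq^i$ instead.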
Now let $G$ be a $C^r$-Riemannian metric on $M$ and consider the vector bundle isomorphism $G^\flat : TM \to T^*M$. The following lemma describes the pull-back of $\omega_0$ to $TM$. We denote by $K_G$ the connector associated with the Levi-Civita connection for $G$ as in (2.1).
2.12 Lemma: (The canonical symplectic form on the tangent bundle of a Riemannian manifold) Let r ∈ {∞, ω} and let (M, G) be a C r -Riemannian manifold. Then as asserted.
(ii) This part of the lemma will follow if we can show that [Abraham, Marsden, and Ratiu 1988, Theorem 7.4.4].
Let v x ∈ TM and let X vx , Y vx ∈ T vx TM. Let X 0 , Y 0 , X 1 , Y 1 ∈ Γ r (TM) be such that Let us see how to differentiate functions such as this.
Proof: (i) Let v ∈ TM and let γ : [0, T ] → M be an integral curve of X through π TM (v). Let Υ : [0, T ] → TM be the vector field along γ defined by parallel translating v. Then, using Lemma 2.3(iv), as desired.
With these preliminaries and using [Abraham, Marsden, and Ratiu 1988, Proposition 7.4.11], we calculate which is the desired assertion. Paternain [1999] makes use of the Riemannian metric of Sasaki [1958] on TM to prove the preceding lemma, but, as we see, this is not necessary.
We shall denote

2.7. Varieties. In Section 4 we shall consider vector fields that leave subsets of manifolds and vector bundles invariant. It will be essential for our results to have the desired generality that we allow for these subsets to be more general than submanifolds and subbundles. In this section and the next three we present the sorts of objects we shall work with when discussing invariance.
First we consider subsets of manifolds we work with, generalising the notion of a submanifold.

2.13 Definition: (C r -variety) Let r ∈ {∞, ω} and let M be a C r -manifold. A subset S ⊆ M is a C r -variety if, for any x ∈ M, there exists a neighbourhood U of x and f 1 , . . . , f k ∈ C r (U) such that $S \cap U = \bigcap_{j=1}^{k} f_j^{-1}(0)$. •
In words, a C r -variety is a subset that is locally the intersection of the zero level sets of finitely many functions of class C r . Note that, in the case of r = ∞, the notion of a C ∞ -variety is equivalent to that of a closed set. The following lemma, which proves this, is well-known, but we could not find a reference for it.
2.14 Lemma: (C ∞ -varieties are precisely closed sets) If U is an open subset of a smooth manifold M, then there exists f ∈ C ∞ (M) such that f (x) ∈ R >0 for all x ∈ U and f (x) = 0 for all x ∈ M \ U.

Proof: We shall construct f as the limit of a sequence of smooth functions converging in the weak C ∞ -topology. We equip M with a Riemannian metric G. Let g ∈ C ∞ (M). If K ⊆ M is compact and if k ∈ Z ≥0 , we define where ‖·‖ G indicates the norm induced on tensors by the norm associated with the Riemannian metric. One readily sees that the family of seminorms ‖·‖ k,K , k ∈ Z ≥0 , K ⊆ M compact, defines a locally convex topology agreeing with other definitions of the weak topology. Thus, if a sequence (g j ) j∈Z >0 satisfies lim j→∞ ‖g − g j ‖ k,K = 0, k ∈ Z ≥0 , K ⊆ M compact, then g is infinitely differentiable [Michor 1980, §4.3].
We suppose that M is connected since, if it is not, we can construct f for each connected component, which suffices to give f on M. Since M is paracompact, connectedness allows us to conclude that M is second countable [Abraham, Marsden, and Ratiu 1988, Proposition 5.5.11]. Using Lemma 2.76 of [Aliprantis and Border 2006], we let (K j ) j∈Z >0 be a sequence of compact subsets of U such that K j ⊆ int(K j+1 ) for j ∈ Z >0 and such that ∪ j∈Z >0 K j = U. For j ∈ Z >0 , let g j : M → [0, 1] be a smooth function such that g j (x) = 1 for x ∈ K j and g j (x) = 0 for x ∈ M\K j+1 ; see [Abraham, Marsden, and Ratiu 1988, Proposition 5.5.8]. Let us define α j = ‖g j ‖ j,K j+1 and take ε j ∈ R >0 to satisfy $\epsilon_j < (\alpha_j 2^j)^{-1}$. We define f by $f(x) = \sum_{j=1}^{\infty} \epsilon_j g_j(x)$ and claim that f as defined satisfies the conclusions of the lemma.
First of all, since each of the functions g j takes values in [0, 1], we have and so f is well-defined and continuous by the Weierstrass M -test. If x ∈ U, then there exists N ∈ Z >0 such that x ∈ K N . Thus g N (x) = 1 and so f (x) ∈ R >0 . If x ∈ M \ U then g j (x) = 0 for all j ∈ Z >0 and so f (x) = 0. All that remains to show is that f is infinitely differentiable. Let x ∈ M, let m ∈ Z >0 , and let j ∈ Z ≥0 be such that j ≤ m. If x ∉ K m+1 then g m is zero in a neighbourhood of x, and so Thus, whenever j ≤ m we have Let K ⊆ M be compact, let r ∈ Z ≥0 , and let ε ∈ R >0 . Take N ∈ Z >0 sufficiently large that for m 1 , m 2 ≥ N with m 1 < m 2 , this being possible by convergence of $\sum_{j=1}^{\infty} 2^{-j}$. Then, for m 1 , m 2 ≥ N , Thus, for every k ∈ Z ≥0 and K ⊆ M compact, (f m ) m∈Z >0 is a Cauchy sequence in the seminorm ‖·‖ k,K . Completeness of the weak C ∞ -topology implies that the sequence (f m ) m∈Z >0 converges to a function that is infinitely differentiable.
The lemma implies that C ∞ -varieties are too general to expect to be able to say much about them. Indeed, in the smooth case we shall restrict ourselves to the consideration of submanifolds. However, in the real analytic case, we consider C ω -varieties that are not submanifolds.
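To make Lemma 2.14 concrete, the following is a minimal sketch of a one-dimensional instance, assuming M = R and U = (0, 1). It uses the standard closed-form function exp(−1/(x(1 − x))) rather than the series constructed in the proof; the name `bump` is a hypothetical choice made for illustration.

```python
import math


def bump(x: float) -> float:
    """Smooth function on R that is positive on (0, 1) and zero elsewhere.

    This is an explicit closed-form example, not the series from the proof.
    Very close to the endpoints the exact value is positive but underflows
    to 0.0 in floating-point arithmetic.
    """
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return math.exp(-1.0 / (x * (1.0 - x)))


if __name__ == "__main__":
    for x in (-0.5, 0.0, 0.25, 0.5, 1.0, 1.5):
        print(x, bump(x))
```

The zero set of this function is the closed set R \ (0, 1), which illustrates why every closed subset is a C ∞ -variety in the smooth setting.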
2.8. Generalised and cogeneralised subbundles. We shall encounter subsets of vector bundles that, like subbundles, are comprised of a union of fibres that are subspaces of the fibres of the vector bundle, but, unlike subbundles, the dimension of these fibres is not locally constant. To study these sorts of objects in a systematic way, some regularity assumptions must be made. In this section we present two natural forms of such regularity, both of which we shall use, and give some properties of these.
First let us make some initial definitions.
2.15 Definition: (Generalised subbundle, cogeneralised subbundle) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, and let F ⊆ E be such that, for each We call the sections (ξ i ) i∈Ix local generators for F on U x .
We shall adapt some usual notation for vector bundles to generalised or cogeneralised subbundles.
2.16 Definition: (Constructions with generalised or cogeneralised subbundles) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, and let F ⊆ E be a C r -generalised or a C r -cogeneralised subbundle.
(i) If S ⊆ M, the restriction of F to S is F|S = π −1 (S) ∩ F.
(ii) By Γ r (F) we denote the C r -sections of E taking values in F: Γ r (F) = {ξ ∈ Γ r (E) | ξ(x) ∈ F x , x ∈ M}.
The following result will be essential in our discussion of invariant subbundles.
2.17 Lemma: (Cogeneralised subbundles are closed) Let r ∈ {∞, ω} and let π : E → M be a C r -vector bundle. If F ⊆ E is a C r -cogeneralised subbundle, then F is closed.
Proof: We consider the smooth and real analytic cases separately. First, in the smooth case, we assume that M is connected, since, if it is not, the argument we give can be applied to each connected component. Note that Sussmann [2008] proves that a smooth generalised subbundle has a finite set of global generators. That is, if G ⊆ E is a generalised subbundle, then there are ξ 1 , . . . , ξ k ∈ Γ ∞ (E) such that G x = span R (ξ 1 (x), . . . , ξ k (x)), x ∈ M.
In this case, we see that G is the image of the vector bundle map Now, since F is a smooth cogeneralised subbundle, Λ(F) is a smooth generalised subbundle. Thus Λ(F) = image(Φ) for a vector bundle mapping Φ as above. Since Λ(image(Φ)) = ker(Φ * ), we have Thus F is the preimage of the zero section of a smooth vector bundle under a smooth vector bundle map. Thus F is closed since the zero section is closed.
In the real analytic case, one can show, with a great deal of work, that, for a real analytic generalised subbundle G and for x ∈ M, there exists a neighbourhood U of x and sections ξ 1 , . . . , ξ k ∈ Γ ω (E) such that G x = span R (ξ 1 (x), . . . , ξ k (x)), x ∈ U [Lewis 2012, Theorem 5.2]. Using this result and the same arguments as in the smooth case above, it follows that, for each x ∈ M, there is a neighbourhood U of x such that F|U is a closed subset of E|U. To show that F is then closed, let (e j ) j∈Z >0 be a sequence in F that converges to e ∈ E. There is then a neighbourhood U of π(e) such that e j ∈ π −1 (U) for j sufficiently large. Since F|U is closed (possibly after shrinking U), it follows that e ∈ F, and so F is closed.
For generalised subbundles, the proof of the preceding lemma immediately gives the following result.
2.18 Corollary: (The fibres of a generalised subbundle are generated by global sections) Let r ∈ {∞, ω} and let π : E → M be a C r -vector bundle. If F ⊆ E is a C r -generalised subbundle, then More specifically, for each x ∈ M, there exists a neighbourhood U of x and ξ 1 , . . . , ξ k ∈ Γ r (F) such that F y = span R (ξ 1 (y), . . . , ξ k (y)), y ∈ U.
For cogeneralised subbundles, the proof of the lemma gives the following.

2.19 Corollary: (Cogeneralised subbundles are varieties) Let r ∈ {∞, ω} and let π : E → M be a C r -vector bundle. If F ⊆ E is a C r -cogeneralised subbundle, then it is a C r -variety.
One could, therefore, say that cogeneralised subbundles are "linear varieties." We shall expand on this idea below when we discuss affine subbundle varieties.
An important feature of the definitions is the following characterisation of the set of regular points for a generalised or cogeneralised subbundle.
2.20 Lemma: (Regular points for generalised and cogeneralised subbundles) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, and let F be a C r -generalised or a C r -cogeneralised subbundle of E. Then there exists an open dense subset U ⊆ M such that F|U is a C r -subbundle of E|U.
Proof: We suppose that M is connected so that E has constant fibre dimension, say m. The lemma will follow in the general case by applying the proof here to each connected component of M.
First suppose that F is a C r -generalised subbundle. For j ∈ {0, 1, . . . , m + 1}, denote . . , m}, and U = ∪ m j=0 U j . We will show that U satisfies the conclusions of the lemma.
First we claim that V j is open for each j ∈ {0, 1, . . . , m + 1}. Let x ∈ M and suppose that dim R (F x ) ≥ j. Let U x be a neighbourhood of x and let (ξ i ) i∈Ix be local generators for F on U x . Then there exist i 1 , . . . , i j ∈ I x so that ξ i 1 (x), . . . , ξ i j (x) are linearly independent. By continuity, ξ i 1 (y), . . . , ξ i j (y) are linearly independent for y in some neighbourhood of x.
Let us show that V j \ bd(V j+1 ) is dense in V j . Let x ∈ V j and let N be a neighbourhood of x. We have three mutually exclusive cases.

If
. . , k}. We have U j = int(C j ). Since F|U j has rank j, F|U j is a C r -subbundle of E|U j since, from any local generators for F in a neighbourhood of x ∈ U j , we can find j of them that are a local basis for sections. Thus F|U is also a C r -subbundle of E|U.
It remains to show that U is open and dense in M. Being a union of the open sets U 0 , U 1 , . . . , U m , it is certainly open. Now let x ∈ M \ U. Then x ∈ C j for some j ∈ {0, 1, . . . , m}. This means that x ∈ bd(V j+1 ). Thus any neighbourhood of x intersects at least one of V j or V j+1 , whence it intersects at least one of U j or U j+1 .
This gives the lemma when F is a C r -generalised subbundle. If F is a C r -cogeneralised subbundle, then Λ(F) is a C r -generalised subbundle. Thus, as we just showed, there is an open dense subset U ⊆ M such that Λ(F)|U is a C r -subbundle of E * |U. Thus, for x ∈ U, there is a neighbourhood V ⊆ U of x and k ∈ {0, 1, . . . , m} such that dim R (Λ(F) y ) = k for y ∈ V. Therefore, dim R (F y ) = m − k for y ∈ V, and so F|U has locally constant rank, and so is a C r -subbundle of E|U.
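The content of Lemma 2.20 can be illustrated by a small sketch, under the assumption that M = R, E = R × R², and F is generated by the two sections ξ 1 (x) = (1, 0) and ξ 2 (x) = (0, x). These generators, and the helper names `rank`, `generators`, and `fibre_dim`, are hypothetical choices made only for this example. The fibre dimension drops at x = 0 and is constant on the open dense set R \ {0}.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(a) for a in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a row at or below position rk with a nonzero entry in column c.
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][c] / m[rk][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def generators(x):
    """Values at x of the assumed generators xi_1 = (1, 0), xi_2 = (0, x)."""
    return [(1, 0), (0, x)]


def fibre_dim(x):
    """Dimension of the fibre F_x spanned by the generators at x."""
    return rank(generators(x))


if __name__ == "__main__":
    for x in (Fraction(-1), Fraction(0), Fraction(1, 3)):
        print(x, fibre_dim(x))
```

In this example the sets where the fibre dimension is at least j are open, in line with the lower semicontinuity argument used in the proof.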
The following result gives an instance of generalised and cogeneralised subbundles.

2.21 Lemma: (The kernel and image of a vector bundle map are generalised and cogeneralised subbundles) Let r ∈ {∞, ω}, let π : E → M and θ : F → M be C r -vector bundles, and let Φ : E → F be a C r -vector bundle mapping. Then the following statements hold:
Proof: (i) Let U ⊆ M be an open set for which there exists a basis ξ 1 , . . . , ξ k ∈ Γ r (E|U) of sections of E over U. Then Φ • ξ 1 , . . . , Φ • ξ k are local generators for image(Φ|(E|U)).
2.9. Generalised and cogeneralised affine subbundles. By virtue of Lemma 2.21, one can think of the subbundles of Section 2.8 as being either sets of linear equations in vector bundles (in the case of generalised subbundles) or the sets of solutions of linear equations (in the cogeneralised case). In this section we extend this to sets of affine equations. In the next section we will consider sets of solutions to affine equations. Our first definition is the following.

2.22 Definition: (Generalised affine subbundle) Let r ∈ {∞, ω} and let π : We call the sections (ξ i ) i∈Ix local generators for B on U x . We denote B •
The following characterisation of generalised affine subbundles is one we shall frequently use.
2.23 Lemma: (Characterisation of generalised affine subbundles) Let r ∈ {∞, ω} and let π : E → M be a C r -vector bundle. For a subset B ⊆ E, the following statements are equivalent: (i) B is a C r -generalised affine subbundle; (ii) there exists ξ 0 ∈ Γ r (E) and a C r -generalised subbundle F ⊆ E such that Proof: We first prove a few simple linear algebraic facts. In the following, we shall take as our definition of an affine subspace of a vector space that by which an affine subspace is such that it contains the bi-infinite line passing through any two points.
1 Sublemma: Let V be a R-vector space. The following statements hold: (i) a subset A ⊆ V is an affine subspace if and only if there exist v 0 ∈ A and a subspace U ⊆ V such that A = v 0 + U; (ii) if, for an affine subspace A ⊆ V, we have A = v 0 + U = v 0′ + U′ for v 0 , v 0′ ∈ A and for subspaces U, U′ ⊆ V, then U = U′ and π U (v 0 ) = π U (v 0′ ), where π U : V → V/U is the canonical projection.
The result will be proved if we prove that U is a subspace. Let v − v 0 ∈ U for some v ∈ A and let a ∈ R. Then which gives the result after we notice that for some u′ ∈ U′, and so u ∈ U′. Thus U ⊆ U′. As the opposite inclusion is established similarly, we have U = U′. We also have as desired. In particular, v 0 + 0 = v 0′ + u for some u ∈ U and so v 0 − v 0′ ∈ U, as desired.
Now we proceed with the proof. Suppose that B is a C r -generalised affine subbundle. Let U = (U a ) a∈A be an open cover for M such that, for each a ∈ A, we have local generators (ξ ai ) i∈Ia for B on U a . For a ∈ A, fix i 0 ∈ I a and denote ξ a0 = ξ ai 0 . As in the first part of the sublemma, for x ∈ U a , we have By the second part of the sublemma, the subspace F x is well-defined, independently of the choice of a ∈ A for which x ∈ U a . Note that this then defines a C r -generalised subbundle F. If U a ∩ U b ≠ ∅, then the second part of the sublemma gives ξ a0 ( We will show that σ = π F (ξ 0 ), where ξ 0 ∈ Γ r (E) is such that and, by the second part of the sublemma, this will establish that B x = ξ 0 + F x . We will use constructions from sheaf and Čech cohomology, and we refer to [Ramanan 2005, §4.5] for the background notions.
To do this, we first claim that, for any open set U ⊆ M, the sheaf G r F |U is acyclic. In the smooth case, this follows from [Wells Jr. 2008, Proposition 3.11] (along with [Wells Jr. 2008, Examples 3.4(d, e)] and [Wells Jr. 2008, Proposition 3.5]). In the real analytic case, we note that G ω F is coherent by [Lewis 2012, Corollary 4.11]. Thus G ω F |U is acyclic in the real analytic case by Cartan's Theorem B [Cartan 1957, Proposition 6].
For the other, suppose that we have ξ 0 ∈ Γ r (E) and a C r -generalised subbundle F ⊆ E such that Then, by the first part of the sublemma, for each x ∈ M, there exists a neighbourhood U x of x such that where (ξ i ) i∈Ix are local generators for F on U x . Thus (ξ 0 |U x + ξ i ) i∈I are local generators for an affine subbundle which equals B.
The generalised subbundle F is called the linear part of the generalised affine subbundle B and is denoted by L(B).
Based on the lemma, we make the following definition.
2.24 Definition: (Cogeneralised affine subbundle) Let r ∈ {∞, ω} and let π : E → M be a C r -vector bundle. A subset B ⊆ E is a C r -cogeneralised affine subbundle if there exists ξ 0 ∈ Γ r (E) and a C r -cogeneralised subbundle F ⊆ E such that Many properties of generalised or cogeneralised affine subbundles are immediately deduced from the corresponding properties of their linear part, which is a generalised or cogeneralised subbundle. We shall freely use such properties.
We will require the analogue of Λ(F) for a generalised or cogeneralised subbundle. Let us first indicate this just on the level of linear algebra.
2.25 Lemma: (Affine functions whose zeros prescribe an affine subspace) Let V be a R-vector space and let B ⊆ V be an affine subspace. Let u 0 ∈ V be such that B = u 0 +L(B).
as claimed.
Based on the lemma, for a C r -generalised or a C r -cogeneralised affine subbundle B of a C r -vector bundle π : If λ ∈ Γ r (Λ(L(B))), then we let F λ : E → R be defined by F λ |E x = F λ(x) . Note that We then have the following analogue of Corollary 2.18.

2.27 Remark: (On cogeneralised affine subbundles I) We note that a cogeneralised affine subbundle is not quite the natural idea of something dual to a generalised affine subbundle. Indeed, while for generalised and cogeneralised subbundles, one has the equation/solution duality, if one thinks of generalised affine subbundles as being affine equations, then cogeneralised affine subbundles do not play the rôle of the corresponding solutions. We shall have more to say on this in Remark 2.34 below. •
The preceding remark notwithstanding, we shall discuss cogeneralised subbundles in some detail, since in Section 4 we shall present a theory for invariant subbundles. Since this is a subject that seems to not have received much consideration in the literature, it seems worthwhile to be comprehensive about this.
2.10. Affine subbundle varieties. With the closing Remark 2.27 of the preceding section in mind, let us turn to a discussion of what should be regarded as the object naturally dual to a generalised affine subbundle, and to a sort of object which features prominently in our results of Section 4 and of the application of these results in Section 7 to the problem of comparison of trajectories to nonholonomic and constrained variational systems.
To set the groundwork, let us consider a little linear algebra. In particular, we wish to recast the classical notion of a system of linear inhomogeneous algebraic equations. Thus let V be a finite-dimensional R-vector space, and let A ∈ End(V) and b ∈ V. We denote by the set of solutions to the corresponding system of inhomogeneous linear equations. Note that leading us to define the subspace Note that this subspace is distinguished by having positive codimension. The following lemma indicates the importance of this condition, and as well characterises the conditions for existence of solutions using subspaces of V * ⊕ R.

2.28 Lemma: (Systems of linear equations and subspaces of V * ⊕ R) For a finite-dimensional R-vector space V, the following statements concerning a subspace ∆ ⊆ V * ⊕ R are equivalent: Finally, For the parts (iii) and (iv) of the lemma, suppose that ∆, A, and b are as posited.
Finally, for part (v), we note that the existence assertion follows by taking ∆ = Sol * (A, b). For uniqueness, let us make a trivial general observation.
is simply the annihilator of S, and is a subspace uniquely prescribed by S. The conclusion here follows by taking We note that the subspace ∆ is uniquely defined by the set of solutions Sol(A, b), while this set of solutions does not uniquely define A and b. However, one can recover the important ingredients of A and b from ∆. Let us see how to do this. First note that Sol(A, b) is determined by ker(A) (since Sol(A, b) is an affine space with linear part equal to ker(A)) and by (2.11) With this notation, we have the following result.
(ii) We have With the above considerations in mind, and as indicated by (2.5), we identify the set of affine functions on a vector bundle π : E → M with sections of the vector bundle E * ⊕ R M . We shall notationally distinguish these things, however, by Aff r (E) and Γ r (E * ⊕ R M ) in order to attempt to clarify the ways in which we will think of what is effectively the same thing.
With the preceding as motivation, we make the following definition.
We shall write A = A(∆) when we wish to prescribe the defining subbundle giving rise to the affine subbundle variety A.
This shows that, given an affine subbundle variety A, any defining subbundle ∆ for which It is not the case, however, that the defining subbundle is uniquely determined at points not in S(A), of course. Note that it is often most practical to talk of defining subbundles in the absence of the associated affine subbundle varieties since the former is always a perfectly well-defined subbundle, while the latter may be the empty set.
Similarly to Corollary 2.19 for cogeneralised subbundles, we have the following result.
2.31 Corollary: (Affine subbundle varieties are varieties) Let r ∈ {∞, ω} and let π : E → M be a C r -vector bundle. If A ⊆ E is a C r -affine subbundle variety, then it is a C r -variety.
Proof: Let ∆ ⊆ E * ⊕ R M be a defining subbundle for A. Let e ∈ A and let x = π(e). By Lemma 2.23 and the proof of Lemma 2.17, there is a neighbourhood U of x and globally sections showing that A is indeed locally the intersection of the zeros of finitely many C r -functions.
Another consequence of all of this is the following.
2.32 Corollary: (The base variety of an affine subbundle variety is a variety) Let r ∈ {∞, ω} and let π : E → M be a C r -vector bundle. If A ⊆ E is a C r -affine subbundle variety, then S(A) is a C r -variety if it is nonempty.
Proof: Let x ∈ S(A). As in the proof of Corollary 2.31, let U be a neighbourhood of x and let F 1 , . . . , F k ∈ Aff r (E) be such that The conditions for y ∈ U to be in S(A) are expressed by the conditions that there exists v ∈ E y for which We can assume that E is trivialised over U via local sections ξ 1 , . . . , ξ m . We denote λ (2.14) If we define matrices the condition for the existence of v 1 , . . . , v s ∈ R satisfying (2.14) is equivalent to the ranks of A 0 (y) and A 1 (y) being equal. For m ∈ Z ≥0 , let These two subsets are submanifolds, as the following general lemma proves.
is a C r -variety whenever it is nonempty.
and note that S(A, ≥ k) is an open subset of M, Indeed, since the condition for membership in S(A, ≥ k) is that there be a k × k subdeterminant of A which is nonzero, continuity of determinants gives the desired openness. Now, if x ∈ S(A, ≥ k), then we can permute rows and columns of A(y) to arrive at a matrix B(y) of the form where B 11 ∈ R k×k satisfies rank(B 11 (x)) = k. Since B(y) is obtained from A(y) by mere swapping of rows and columns, we have rank(B(y)) = rank(A(y)) for all y ∈ M. By continuity, there is a neighbourhood U of x such that rank(B 11 (y)) = k for y ∈ U. Now consider the matrix which is invertible and is a C r -function of y ∈ U. We directly compute . Thus This last set is the intersection of the zeros of finitely many C r -functions (namely the functions defined by the entries of the matrix that is required to be zero), and this gives the result.
Motivated by this, we have the following lemma which is well-known, but for which we give a proof since typically proofs of these simple facts are embedded in a more complicated setting.
2 Lemma: Finite intersections and unions of C r -varieties are C r -varieties.
Proof: It suffices to consider the intersection and union of two C r -varieties. Let S, T ⊆ M be C r -varieties. Let x ∈ S ∩ T and let U be a neighbourhood of x such that showing that S ∩ T is a C r -variety. Next let x ∈ S ∪ T and let U be a neighbourhood of x such that Conversely, suppose that y ∈ U \ (S ∪ T). Then there exist i ∈ {1, . . . , k} and j ∈ {1, . . . , l} such that f i (y) ≠ 0 and g j (y) ≠ 0. Thus f i (y)g j (y) ≠ 0 and so $y \notin \bigcap_{i \in \{1,\dots,k\},\, j \in \{1,\dots,l\}} (f_i g_j)^{-1}(0)$, which gives the result.
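The product construction used for unions can be checked on a small example. The following sketch assumes M = R², with S = {(0, 0)} cut out by f 1 = x and f 2 = y, and T = {x = 1} cut out by g 1 = x − 1. These particular sets and all function names are hypothetical choices for illustration. The union S ∪ T should then be the common zero set of the products f i g j .

```python
from itertools import product

# Defining functions for S = {(0, 0)} and T = {x = 1} (assumed example data).
f = [lambda p: p[0], lambda p: p[1]]
g = [lambda p: p[0] - 1]


def in_S(p):
    """p lies in S when every f_i vanishes at p."""
    return all(fi(p) == 0 for fi in f)


def in_T(p):
    """p lies in T when every g_j vanishes at p."""
    return all(gj(p) == 0 for gj in g)


def in_product_zero_set(p):
    """p lies in the common zero set of all products f_i * g_j."""
    return all(fi(p) * gj(p) == 0 for fi in f for gj in g)


if __name__ == "__main__":
    grid = list(product(range(-2, 3), repeat=2))
    agree = all(in_product_zero_set(p) == (in_S(p) or in_T(p)) for p in grid)
    print("product zero set equals S union T on the grid:", agree)
```

Only integer sample points are tested here, so this is a sanity check of the construction rather than a proof.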
The corollary follows immediately from (2.15) and the lemma.
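The rank condition appearing in the proof of Corollary 2.32 is the classical criterion that a linear system has a solution exactly when its coefficient matrix and its augmented matrix have equal rank. The sketch below assumes that A 0 (y) plays the rôle of the coefficient matrix and A 1 (y) that of the augmented matrix; this reading, the one-parameter system used, and the helper names are assumptions made for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(a) for a in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][c] / m[rk][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def solvable(A, b):
    """True when A v = b has a solution: rank(A) equals rank([A | b])."""
    augmented = [list(row) + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)


def system(y):
    """Assumed example over M = R with fibre R: the equations v = 1, y v = 0."""
    return [[1], [y]], [1, 0]


if __name__ == "__main__":
    for y in (-1, 0, 2):
        A0, b = system(y)
        print(y, solvable(A0, b))
```

In this example the set of base points y over which a solution exists is {0}, the zero set of the function y, so the base set is cut out by a single C r -function.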
The picture one should have in one's mind concerning an affine subbundle variety A ⊆ E is encoded in the following diagram: The column on the left is comprised of C r -varieties while the column on the right is comprised of C r -manifolds. Thus one should think of A as being a "singular affine bundle" over the "singular manifold" S(A). Let us introduce some terminology that we will find useful.

2.34 Remark: (On cogeneralised affine subbundles II) In the case that a C r -defining subbundle ∆ is total, one might be inclined to say that the corresponding affine subbundle variety A(∆) should be the same thing as a C r -cogeneralised affine subbundle. This is not true, since it is generally not the case that one can find a C r -section ξ 0 of A(∆). Indeed, it may not even be the case that one can find a continuous section of A(∆). Such matters are discussed by Fefferman and Kollár [2013]. All that one can say with any generality is that, if the fibres of A(∆) have locally constant rank, then it is indeed a C r -cogeneralised subbundle, in fact a C r -subbundle. While true, this fact turns a blind eye to the interesting lack of correspondence between total defining subbundles and cogeneralised affine subbundles. •

2.11. Distributions on Riemannian manifolds. An essential rôle in our main results in Section 7 is played by certain constructions involving the interaction of distributions and Riemannian metrics. Our presentation here is derived in part from developments of Lewis [1998].
Let r ∈ {∞, ω}, let M be a C r -manifold, and let D ⊆ TM be a C r -subbundle. We shall sometimes call D a distribution on M. We shall need some particular constructions concerning the interaction of distributions and Riemannian geometry. Thus we additionally introduce a C r -Riemannian metric G and denote by D ⊥ the G-orthogonal complement to D. We denote by P D , P D ⊥ : TM → TM the G-orthogonal projections onto D and D ⊥ , respectively. Let us define a few objects that can be built from this data.
(i) The constrained connection for D is the affine connection (ii) The second fundamental form for D is the tensor field (iv) The geodesic curvature for D is the tensor field These definitions contain some implicit assertions that must be proved. We prove these, along with a few other facts, in the following lemma. In the statement of the lemma, we make use of the operation which is called the symmetric product by Lewis [1998].
2.36 Lemma: (Constructions for distributions on Riemannian manifolds) Let r ∈ {∞, ω}, let (M, G) be a C r -Riemannian manifold, and let D ⊆ TM be a C r -subbundle. Then the following statements hold: and, consequently, as claimed.
(ii) Using the computations from the first part of the proof, and the conclusion follows from this.
using part (i) and the fact that G ∇ is torsion-free. Part (v) follows similarly to part (iv) and part (vi) is immediate from the definitions.
The statement of part (iii) is a bit of an outlier since it essentially involves evaluating S D on an argument taking values in D ⊥ . However, S D is defined only for arguments taking values in D. Thus the statement is "improper," in some sense. However, we shall use this conclusion in the proof of Proposition 6.7, so we state it here. For We will also be interested in representations of these tensors with the orders of the arguments flipped. To this end, for Let us make some observations about the preceding constructions.

2.37 Remarks: (Constructions for distributions on Riemannian manifolds)
1. The constrained connection D ∇ is a C r -affine connection on M that restricts to a C r -linear connection in D. Of course, there are many affine connections on M that agree with D ∇ when restricted to D, though the one we give is arguably the most natural as it arises merely from the G-orthogonal decomposition of G ∇. The constrained connection seems to have originated in the work of Synge [1928], but the development we give is that of Lewis [1998].
2. The definition of the second fundamental form is a natural adaptation of the theory of Riemannian geometry for submanifolds [e.g., Lee 2018, Chapter 8].
3. It is clear that the Frobenius curvature of D vanishes if and only if D is integrable (noting that D is integrable if and only if it is involutive as it is a subbundle of TM [Abraham, Marsden, and Ratiu 1988, Theorem 4.3.3]).
4. An important difference between the theory for distributions as we present here and the theory for submanifolds concerns geodesic invariance. For submanifolds, one has the equivalence of the conditions (a) geodesic invariance of a submanifold (i.e., geodesics with initial conditions tangent to the submanifold remain in the submanifold), (b) the connection restricts to the submanifold, and (c) vanishing of the second fundamental form. For distributions, the second and third of these conditions are equivalent (as is clear), but they do not imply geodesic invariance (i.e., geodesics with initial conditions in the distribution have subsequent tangent vectors also in the distribution). In fact, Lewis [1998] shows that a distribution D is geodesically invariant for an affine connection ∇ if and only if From this we see that a distribution D is geodesically invariant if and only if its geodesic curvature vanishes. •

2.12. Characteristic subbundle of a distribution. In our study of the equivalence of the two equations for constrained motion, we shall encounter a geometric condition whose meaning will be helpful to understand. A basic concept in this understanding is the following.

2.38 Definition: (Vector fields and flows leaving distributions invariant)
For a distribution whose rank is not locally constant-sometimes called a "generalised distribution" to distinguish from the nice locally constant rank case-the relationship between infinitesimal invariance and invariance is that they agree when r = ω and they agree when r = ∞ under a finite generation hypothesis. Let us give here a full proof of the correspondence between these notions in the locally constant rank case.

2.39 Proposition: (Invariance and infinitesimal invariance of distributions under vector fields)
Proof: Suppose that D is invariant under X. Let x ∈ M and let V be a neighbourhood of x with the following properties: The existence of a neighbourhood V having the first property follows from the semicontinuity properties of the maximal interval of existence for integral curves of a vector field [Abraham, Marsden, and Ratiu 1988, Proposition 4.1.24]. The existence of a neighbourhood V having the second property follows in the smooth case since D is a subbundle, and using cutoff functions. In the real analytic case, the existence of such a neighbourhood V can be inferred by Cartan's Theorem A [Cartan 1957, Proposition 6], cf. Corollary 2.18. In any case, if The existence of such a neighbourhood U follows by continuity of the flow. Thus we can write [Abraham, Marsden, and Ratiu 1988 and let Ψ y : R → R m×m be the solution to the matrix initial value problem satisfy the same differential equation with the same initial condition. Thus they are equal. This gives Therefore, by [Abraham, Marsden, and Ratiu 1988, Theorem 4.2.19], we have With these notions of invariance, we can make the following definitions.
2.40 Definition: (Characteristic vector field, characteristic distribution) Let r ∈ {∞, ω}, let M be a C r -manifold, and let D ⊆ TM be a C r -subbundle.
The following characterisation of characteristic vector fields and of the characteristic distribution will be useful for us.

2.41 Lemma: (Characterisation of characteristic vector fields and the characteristic distribution) Let r ∈ {∞, ω}, let (M, G) be a C r -Riemannian manifold, and let D ⊆ TM be a C r -subbundle. Then the following statements hold:
Proof: Since the second assertion follows immediately from the first, we just prove the first. We have that X ∈ Γ r (D) is a characteristic vector field if and only if as claimed.
We shall turn the preceding constructions on their head a little, since it is this altered form in which we shall be interested. We are interested in an understanding of ker(F * D (u)) for u ∈ D x .
The point is that, if u ∈ D x is a characteristic vector, then every vector in D ⊥ x is cocharacteristic for u, whereas, if u is not a characteristic vector, then there are noncocharacteristic vectors for u.

Sobolev spaces of curves on a manifold
In this section we develop a framework for performing geometric analysis with the space of curves on a Riemannian manifold. This is typically carried out by exploiting the structure of an infinite-dimensional Hilbert manifold possessed by the space of such curves [e.g., Klingenberg 1995, §2.3]. Rather than working with infinite-dimensional nonlinear geometry, we reduce the problem to infinite-dimensional linear analysis by working with function evaluations. In this section we also put our framework to use to describe some special classes of curves that will be useful for our analysis in Section 5 for deriving the governing equations for nonholonomic mechanics and constrained variational mechanics.
Since we have a lot of constructions, definitions, and results in this section, let us provide a roadmap to help the reader understand how the story will unfold.
1. In Section 3.1 we introduce the various classes of curves we use in the paper. The basic player is the space H 1 ([t 0 , t 1 ]; M) of absolutely continuous curves on M that are square integrable with square integrable derivative. We characterise curves γ in this space by characterising f • γ for f ∈ C ∞ (M), and it is this idea that characterises our approach, in general. We consider, specially, curves in H 1 ([t 0 , t 1 ]; E), where E is the total space of a vector bundle π : E → M. Here curves are characterised by composition, not with general smooth functions, but with smooth fibre-affine functions, cf. Definition 2.9. This is how the particular structure of the vector bundle is accounted for in our framework. As part of this development of curves in the total space of a vector bundle, we consider curves that are to be thought of as sections over a curve in H 1 ([t 0 , t 1 ]; M). These vector spaces of sections over a curve will be important for us in a multitude of ways. Some of the classes of curves considered in this section have mechanical significance, such as curves with fixed endpoints and curves with tangent vectors in a distribution. These will be studied in greater detail in Section 5.1.

2. The topology of the space of curves is the subject of Section 3.2. Our definition of this space as a topological space relies only on the functions f • γ, and so gives a description of the topology that involves only the Hilbert space H 1 ([t 0 , t 1 ]; R). We prove, using this description of the topology, that several of the subsets of curves and subsets of sections along curves that we introduce are, in fact, closed subsets. This ensures that their relative topology is comparatively friendly. We postpone to Section 5.1 a discussion of the differentiable structure of some of these spaces of curves.

3. A key ingredient in providing a differentiable structure to spaces of curves is to have at hand a notion of calculus in our framework. In Section 3.3 we develop some calculus in H 1 ([t 0 , t 1 ]; M) by considering the calculus of curves in the space of curves H 1 ([t 0 , t 1 ]; M). These differentiable curves give us access to subsequent definitions for tangent vectors, etc. Consistent with our approach, we do this by reducing definitions to those involving only H 1 ([t 0 , t 1 ]; R), where standard differential calculus in Banach spaces suffices. We further simplify this approach by reducing questions of differentiability and derivatives of curves to questions involving elementary calculus of R-valued functions of two variables. These simple methods for determining differentiability and derivatives are the basis for the applicability of the methods we introduce.
4. Some tools in the calculus of variations are developed in Section 3.4. Specifically we define the notions of variation and infinitesimal variation of a curve γ ∈ H 1 ([t 0 , t 1 ]; M). We make full use of our simplified calculus developed in Section 3.3. These notions of variation and infinitesimal variation get us started towards defining the notion of a tangent vector in our approach.
5. Indeed, this notion of tangent vector, and the associated notion of tangent spaces, are described in Section 3.5. Using these notions we extend our calculus from curves with values in H 1 ([t 0 , t 1 ]; M) to mappings from H 1 ([t 0 , t 1 ]; M) to a manifold. Again, our methods enable one to reduce questions of differentiability to differentiability of R-valued functions of two real variables.
6. In Section 3.6 we consider mappings between spaces of curves, and we extend our calculus to such mappings. Once again, in our approach we are able to reduce the questions of differentiability to that of R-valued functions of two variables.
7. In Section 3.8 we define a technical device, the weak covariant derivative for distributional sections. This will come up in the proof of Lemma 1 from the proof of Theorem 5.22.
The preceding outline of what we do in this chapter to develop tools for nonlinear Sobolev-type analysis is a beginning of what ought to be possible. One should be able to develop a more comprehensive set of tools applicable to perform this analysis in far more general settings than we use here, while always reducing the analysis to that of scalar-valued functions. Some tools for working with the higher-order derivatives required for such analysis are presented by Jafarpour and Lewis [2014]. Ideas very much in line with what we describe here are given in the series of papers by Convent and Van Schaftingen [2016a, 2016b.
The constructions and results in this section do not depend on the regularity of manifolds, metrics, and connections, and for this reason we work in the smooth category in this section.
3.1. Curves and sections along curves. Let M be a smooth manifold. We first consider classes of curves on M. Let t 0 , t 1 ∈ R satisfy t 0 < t 1 and denote, for s ∈ Z ≥0 , Another way of introducing these classes of curve is to ask that, for each t ∈ [t 0 , t 1 ], there exists a chart (U, φ) about γ(t) such that the components of φ • γ are members of the usual Sobolev spaces H s (I t ; R) for some interval I t about t. Let us outline how this definition is equivalent to the one we give.
1. The coordinate definition is independent of coordinate chart because the classical Sobolev spaces are invariant under uniformly smooth changes of coordinate [Adams and Fournier 2003, Theorem 3.41].
The assumption of all derivatives being uniformly continuous will always hold if the domain is compact. This suffices in our case since our domain [t 0 , t 1 ] is compact, and so the image of a curve can be covered with finitely many relatively compact coordinate charts.
2. One may assume that coordinate functions are restrictions of globally defined smooth functions to the chart domain by use of bump functions. Equivalently, one can see this by using the Whitney Embedding Theorem [Whitney 1936]. This latter approach has the benefit of applying in any regularity category where embeddings in Euclidean space exist, e.g., the real analytic category [Grauert 1958].
3. A consequence of the preceding is that our definition of H s ([t 0 , t 1 ]; M) agrees with the standard one in the case of M = R.
Now suppose that we additionally have a smooth subbundle D ⊆ TM and that s ∈ Z >0 . Here we denote Next we consider a vector bundle π : E → M. We recall from (2.4) the notions of affine and linear functions on E. We use these functions to characterise curves in the total space of a vector bundle.
3.1 Lemma: (Sobolev spaces of curves in the total space of a vector bundle) If π : E → M is a smooth vector bundle and if s ∈ Z ≥0 , then Proof: This is a consequence of the fact that, about any point e ∈ E, one can choose a coordinate chart comprised of globally defined affine functions (e.g., the coordinates defined by a vector bundle chart).
We wish to think about sections of E along γ. For s ∈ Z ≥0 , we have that is, to characterise sections along a curve, it suffices to work with smooth affine functions, and not general smooth functions, just as in Lemma 3.1. In the case that s = 0, we replace the symbol "H 0 " with "L 2 ," this being sensible since the spaces are vector spaces.
The curve γ automatically inherits the regularity of the section.

Lemma: (Regularity of curves covered by regular sections) Let π : E → M be a smooth vector bundle and let t 0 , as claimed.
Similarly as was done for curves, for

Topology on spaces of curves and sections along curves.
Part of what we do in this work is develop a means of rigorously working with the spaces H s ([t 0 , t 1 ]; M), which are not vector spaces, without needing to introduce infinite-dimensional manifolds, which is the standard methodology one uses in these cases. The approach we describe here uses evaluation by functions to replace the nonlinear space H The use of these maps is obviously suggested by our very definition of the space H s ([t 0 , t 1 ]; M). Here we use these maps to render H s ([t 0 , t 1 ]; M) a topological space with the family of semimetrics The resulting topology is then easily verified to be the initial topology for the mappings ev f , f ∈ C ∞ (M). Let us make some comments on this topology.
1. If M is connected, M can be embedded in R N for some N ∈ Z >0 , and so there are finitely many functions f 1 , . . . , f N ∈ C ∞ (M) (see Lemma 1.1(ii)) so that the topology of H s ([t 0 , t 1 ]; M) is determined by the semimetrics ρ s a,f j , a ∈ {0, 1, . . . , s}, j ∈ {1, . . . , N }. This implies that the topology can be described by its convergent sequences, and we shall frequently make use of this fact. This observation then immediately carries over to the case when M is not connected by applying it to each connected component.
2. By the Sobolev Embedding Theorem [Adams and Fournier 2003, Theorem 4.12], we have a continuous embedding
Since the semimetrics
Let us verify that the subsets of H s ([t 0 , t 1 ]; M) specified in the preceding section are closed.

Lemma: (Closed subsets of curves)
Proof: (i) It suffices to show that the map and then by (3.2), This, in turn, implies that (γ j (t 0 )) j∈Z >0 converges to γ(t 0 ), giving continuity of ev t 0 .
(ii) Here we can show, similarly to the preceding part of the proof, that the mapping is continuous.
(iii) We shall first show that the mapping giving continuity of the mapping γ → γ′ from H s ([t 0 , t 1 ]; M) to H 0 ([t 0 , t 1 ]; TM). Now we show continuity of the mapping ξ → P D ⊥ • ξ from H 0 ([t 0 , t 1 ]; TM) to itself. As above, it suffices to show that, if (ξ j ) j∈Z >0 converges to ξ, then (F • P D ⊥ • ξ j ) j∈Z >0 converges to F • P D ⊥ • ξ for every F ∈ Lin ∞ (TM). However, this follows immediately since
Parts (iv) and (v) follow from the preceding parts of the proof since the intersection of closed sets is closed.
We shall require these subsets to have more regularity than being merely closed. However, we shall have to wait until we have some calculus at hand before we can make sense of such additional regularity.
We will make a few more purely topological constructions before we start doing calculus. We shall topologise the spaces H This topology can be defined by the family of seminorms Again, these definitions are suggested by our very definition of these spaces, along with the fact that, since we are considering sections over one fixed curve γ in M, the semimetrics ρ s a,π * f will always evaluate to zero; that is, we need only consider linear functions when defining the topology. If G is a fibre metric on E, this topology can equivalently, and more easily, be defined by a single inner product: We leave to the reader the quite simple exercise of showing that the two topologies agree. We shall also use the following semi-inner product for H 1 ([t 0 , t 1 ]; γ * E): called the Dirichlet semi-inner product. If ‖·‖ D denotes the corresponding seminorm, note that ‖ξ‖ D = 0 if and only if ξ is constant. Therefore, as a consequence, ⟨·, ·⟩ D is an inner product on H 1 ([t 0 , t 1 ]; γ * E; x 0 , x 1 ).
Let us verify that the subsets of H s ([t 0 , t 1 ]; γ * TM) specified in the preceding section are closed.
(ii) Here we can show that the map is continuous, rather as in the first part of the proof, and this gives the desired conclusion.
(iii) We shall first show that the mapping In particular, this holds if we replace "F " with "F • P D ⊥ ", and this then gives which establishes the desired continuity.
Parts (iv) and (v) follow from the preceding parts of the proof since the intersection of closed sets is closed.
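To make the Dirichlet semi-inner product concrete, here is a minimal numerical sketch in the simplest possible setting. We assume (this is our simplification, not the general construction) a trivial bundle with fibre R 2 and the flat connection, so that the covariant derivative along the curve is the ordinary derivative and the semi-inner product is the integral of the fibre inner product of the derivatives. The helper names `simpson` and `dirichlet` are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def dirichlet(dxi, deta, t0=0.0, t1=1.0):
    """Assumed flat-case form: integral over [t0, t1] of <xi'(t), eta'(t)>.

    The arguments are the derivatives of the two sections, given as
    functions returning tuples of fibre components.
    """
    return simpson(lambda t: sum(a * b for a, b in zip(dxi(t), deta(t))), t0, t1)

# Derivative of a constant section: the Dirichlet seminorm cannot see it.
d_const = lambda t: (0.0, 0.0)

# xi(t) = (sin(pi t), 0) vanishes at both endpoints; this is its derivative.
d_xi = lambda t: (math.pi * math.cos(math.pi * t), 0.0)

norm_const_sq = dirichlet(d_const, d_const)  # zero for a constant section
norm_xi_sq = dirichlet(d_xi, d_xi)           # strictly positive
```

In this flat model the squared seminorm of the nonconstant section is π²/2, while every constant section has seminorm zero, which is why one needs endpoint conditions to obtain an inner product.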
3.3. Calculus on spaces of curves I. We shall need to do calculus involving mappings whose domain and/or codomain is one of our spaces of curves. We build up this calculus piece by piece, starting in this section with differentiability of curves in the space of curves.
First we define a suitable version of differentiability for mappings with values in H s ([t 0 , t 1 ]; M).

Definition: (Differentiability for mappings with values in H
• This definition of differentiability is natural, given our definition of the Sobolev spaces of curves H s ([t 0 , t 1 ]; M). However, to prove that the definition is useful requires some analysis. To do this, let us introduce some notation. Given Φ : and, for y ∈ N, we defineΦ Let us determine the properties ofΦ that characterise differentiability. It is possible to do this in general, however, we are solely interested here in C 1 -mappings defined on an interval in R. We are also primarily interested in working with H s ([t 0 , t 1 ]; M) when s = 1. Therefore, we focus our analysis on this case. The workings of the general situation can easily be deduced from what we do by adding some notation. Our first result is the following.

Lemma: (Curves of class C
; R) be of class C 1 (in the usual sense of a mapping between open subsets of Banach spaces), and let Dσ : J → H 1 ([t 0 , t 1 ]; R) be the derivative. Then the following statements hold: (i) σ̂ s is absolutely continuous for every s ∈ J; (v) the mixed partial derivatives ∂ 1 ∂ 2 σ̂ and ∂ 2 ∂ 1 σ̂ exist almost everywhere and agree almost everywhere on J × [t 0 , t 1 ].

Proof: (i) This follows just because σ takes values in H
For s ∈ [s 0 , s 1 ], let h ∈ R be such that |h| ∈ (0, δ) and such that s + h ∈ [s 0 , s 1 ]. By the Mean Value Theorem [Abraham, Marsden, and Ratiu 1988, Proposition 2.4.8], there exists a between s and s + h such that This all shows that, for ε ∈ R >0 , there exists δ ∈ R >0 such that As this holds for any compact subinterval [s 0 , s 1 ] ⊆ J, we obtain the conclusions (ii)-(iv) of the lemma. ( This implies that, in particular, By continuity of the inclusion this means that, possibly by suitably modifying M , we have This then gives By [Minguzzi 2015, Theorem 7], we conclude that ∂ 1 ∂ 2 σ̂(s, t) exists and that for almost every (
(i) f • σ̂ s is absolutely continuous for every s ∈ J, (v) the mixed partial derivatives ∂ 1 ∂ 2 (f • σ̂) and ∂ 2 ∂ 1 (f • σ̂) exist almost everywhere and agree almost everywhere on J × [t 0 , t 1 ].
Proof: This follows immediately from the definition of σ being class C 1 and from Lemma 3.6.
As with our definition of H s ([t 0 , t 1 ]; M), we should verify that our definition of derivative agrees with standard coordinate versions. This follows along the lines of our brief discussion following (3.1), and noting that (1) the mapping between Sobolev spaces induced by changes of coordinate is continuous and (2) the definition of derivative for Banach space-valued functions is independent of equivalent norms. 6
Let us now investigate what one can say about continuously differentiable curves in H 1 ([t 0 , t 1 ]; M) by working with the above definitions and constructions using postcomposition with a smooth function. We note that From this expression, we note that the knowledge of ∂ 2 (f • σ)(s, t) for every f ∈ C ∞ (M) uniquely determines (σ̂ s )′(t) ∈ T σ̂(s,t) M. Let us denote νσ(s)(t) = (σ̂ s )′(t). In like manner, we denote δσ(s)(t) = (σ̂ t )′(s). We shall use all manner of notation associated with these constructions. It is our intention that the notation appear natural, even if it is a bit cumbersome. With apologies out of the way, we have the following notation: νσ(s)(t) = νσ(s, t) = νσ s (t) = νσ t (s), δσ(s)(t) = δσ(s, t) = δσ s (t) = δσ t (s).
In Figure 1 we illustrate how one should envision these quantities.
We next indicate where νσ and δσ take their values.
3.8 Lemma: (The derivatives of a C 1 -curve in H 1 ([t 0 , t 1 ]; M)) Let M be a smooth manifold, let J ⊆ R be an interval, and let t 0 , t 1 ∈ R satisfy t 0 < t 1 . For a continuously differentiable mapping σ : for all s ∈ J.
Figure 1. A depiction of νσ and δσ. Note that νσ s is the tangent vector field for σ̂ s and δσ t is the tangent vector field for σ̂ t .
Proof: Throughout the proof, we fix s ∈ J.
The lemma then makes sense of the following definition.
3.9 Definition: (Derivative for mappings into spaces of curves) Let M be a smooth manifold, let J ⊆ R be an interval, and let t 0 , t 1 ∈ R satisfy t 0 < t 1 . For a continuously differentiable mapping σ : J → H 1 ([t 0 , t 1 ]; M), the derivative of σ is the mapping is, of course, easy: one merely throws away conditions and conclusions from our development above that involve " d dt ." We shall use this adaptation without further discussion when we require it.
2. It is also possible to extend the above development to higher-order Sobolev spaces of curves, H s ([t 0 , t 1 ]; M), s ≥ 2. To do so without using coordinates requires working carefully with jet bundles. Jafarpour and Lewis [2014] give some constructions along these lines that are useful.
(ii) An infinitesimal variation of γ is an element of H 1 ([t 0 , t 1 ]; γ * TM). •
One would like to think of an infinitesimal variation as being the "derivative" of a variation, and the following result makes this clear.
Proof: (i) This follows from Lemma 3.8.
(ii) Let G be a smooth Riemannian metric on M. Since δ is absolutely continuous, it is continuous. Thus it is bounded on the compact domain [t 0 , t 1 ], i.e., Therefore, by a compactness argument, there exists a ∈ R >0 such that exp(sδ(t)) is defined for s ∈ (−a, a) and t ∈ [t 0 , t 1 ]. Define σ̂(s, t) = exp(sδ(t)) for s ∈ (−a, a) and t ∈ [t 0 , t 1 ]. We claim that, if σ is defined as usual, i.e., σ(s)(t) = σ̂(s, t), then it is a variation with the desired properties. Let f ∈ C ∞ (M). We must show that , giving the first of our three desired conclusions.
To obtain continuity of s → f • σ̂ s , we must show that one can make both of the integrals small by making s 1 and s 2 close. For the first, let ε ∈ R >0 . Continuity of (s, t) → f • σ̂(s, t) gives the existence of δ ∈ R >0 such that, if |s 1 − s 2 | < δ, then We then let δ 2 ∈ R >0 be such that, if |s 1 − s 2 | < δ 2 , then |s 1 − s 2 | 2 < 2M . By (3.5), we then have The preceding arguments give the second of our three desired conclusions. Finally, we compute since s → exp(sδ(t)) is the geodesic with initial velocity δ(t). Thus the definition of δσ gives δσ(0)(t) = δ(t), giving the last of the three desired conclusions.
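The construction in the proof, a variation obtained by applying the exponential map to a scaled infinitesimal variation, can be checked numerically in a concrete case. The following is a minimal sketch, assuming M is the unit sphere in R 3 with the round metric (where the exponential map has a closed form), γ is the equator, and δ is a northward field along γ. All function names are hypothetical, and the derivative in s is approximated by a central difference rather than computed exactly.

```python
import math

def exp_sphere(p, v):
    """Riemannian exponential on the unit sphere: exp_p(v) for v tangent at p."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-15:
        return tuple(p)
    return tuple(math.cos(n) * pc + math.sin(n) * vc / n for pc, vc in zip(p, v))

def gamma(t):
    """The base curve: the equator of the sphere."""
    return (math.cos(t), math.sin(t), 0.0)

def delta(t):
    """An infinitesimal variation along gamma; tangent since it is orthogonal to gamma(t)."""
    return (0.0, 0.0, math.sin(t))

def sigma_hat(s, t):
    """The two-parameter map (s, t) -> exp(s * delta(t)) based at gamma(t)."""
    return exp_sphere(gamma(t), tuple(s * c for c in delta(t)))

def delta_sigma_at_zero(t, h=1e-6):
    """Central-difference approximation of the s-derivative of sigma_hat at s = 0."""
    plus, minus = sigma_hat(h, t), sigma_hat(-h, t)
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))
```

Evaluating `delta_sigma_at_zero(t)` recovers `delta(t)` up to discretisation error, in agreement with the last conclusion of the proof. Since `delta(0.0)` vanishes, the variation also fixes the endpoint `gamma(0.0)`.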
We can now define tangent vectors and the tangent spaces for H 1 ([t 0 , t 1 ]; M). We have not established-and will not establish-a manifold structure for H 1 ([t 0 , t 1 ]; M), and so we cannot really think of what we define as being a tangent space in the strictest sense of the word. However, our definitions obviously so closely represent the usual notions that we do not feel guilty when we eliminate the quotation marks around geometric names for objects that do not have their usual strict geometric meaning.
With this round of apologies out of the way, we proceed with definitions.
(i) A tangent vector at γ is an element of H 1 ([t 0 , t 1 ]; γ * TM).
We close this section by proving an important result about swapping the order of covariant derivatives along σ̂ s and σ̂ t . This can be seen as the geometric consequence of the equality of mixed partial derivatives in Lemma 3.6(v). Typically this sort of lemma is proved by using the Lie bracket of the vector fields defining the "time" and "variation" parameters. However, this is problematic, in general, since these vector fields are not regular enough to allow this. Our use of smooth function evaluations, by contrast, allows us to give a geometric proof without needing Lie brackets of things that do not have Lie brackets.
by [Abraham, Marsden, and Ratiu 1988, Proposition 7.4.11]. We also have Combining the above formulae, and using the fact that G ∇ is torsion-free, gives the sublemma.
3.5. Calculus on spaces of curves II. We now extend our calculus from Section 3.3 from curves with values in spaces of curves to mappings from spaces of curves to manifolds. We do this by making use of the notion of tangent space from the preceding section.
We begin with the following definition.
The rôle of the post-composition with the smooth function g seems less necessary-indeed, possibly seems intrusive-in the present framework. However, in the next section we shall need a more sophisticated form of differentiability, and in this setting the post-composition by a smooth function will allow for simpler definitions, and will be consistent with our definition above.
Note that, for g ∈ C ∞ (N) and for a variation σ of γ ∈ H 1 ([t 0 , t 1 ]; M), we have
With this notation, we can further refine our notion of differentiability.
We call T γ Φ the derivative of Φ at γ. • We comment that this notion of derivative does not yet agree with the usual notion of Fréchet derivative, as for the latter one needs continuity of T γ Φ with respect to γ, cf. [Abraham, Marsden, and Ratiu 1988, Corollary 2.4.10]. While this is something we could impose and verify in all cases where we use the derivative, we pull up short of this since the technicalities would take us far afield.
3.6. Calculus on spaces of curves III. Now we extend our calculus to work for mappings with domain and codomain both being spaces of curves. Following Remark 3.10-1, we can extend the analysis to higher-order Sobolev spaces of curves, but we shall here use the lower-order space H 0 ([t 0 , t 1 ]; M), as this is the case we shall predominantly use.
3.17 Definition: (Variational differentiability for mappings with domain H 1 ([t 0 , t 1 ]; M) and codomain H 0 ([t 0 , t 1 ]; N)) Let M and N be smooth manifolds, and let t 0 , t 1 ∈ R satisfy t 0 < t 1 . Let Φ : We adopt our usual notation and write Then, according to our constructions preceding Lemma 3.8, we define by requiring that If δσ(0) = ξ, then we denote which we call the variational derivative of Φ at γ in the direction of ξ = δσ(0). With this notation, we make the following definitions.
We call T γ Φ the derivative of Φ at γ. ; M) assigns a topological space to a smooth manifold. We see that morphisms in the category of smooth manifolds induce morphisms in the category of topological spaces, and this allows us to think of the assignment as a functor. In this section we explore some attributes of this functor. We begin by establishing the natural way of assigning a mapping of curves to a mapping of manifolds.

Lemma: (The mapping of curves associated with a mapping of manifolds) Let M and N be smooth manifolds, let Φ ∈ C ∞ (M; N), and let t 0 , t 1 ∈ R satisfy t 0 < t 1 . Then the mapping converges to Φ • γ and this gives the desired continuity.

It is clear that H
This then, indeed, defines a (covariant) functor H 1 ([t 0 , t 1 ]; •) from the category of smooth manifolds to the category of topological spaces.
Let us also prove that the morphism H 1 ([t 0 , t 1 ]; Φ) is differentiable, with differentiability as in Definition 3.18. This will require the reader to adapt Definition 3.18 from mappings into H 0 to mappings into H 1 . This is easily done and gives the following result.
3.8. Weak covariant derivatives along curves. Let π : E → M be a smooth vector bundle and let ∇ be a linear connection in E. For γ ∈ H 1 ([t 0 , t 1 ]; M), we have the operator We wish to extend this to an operator on "distributional sections" acting on "test sections" defined along γ. The difficulty in doing this geometrically is that the curve γ is generally not smooth, so the notion of a smooth compactly supported section along γ is problematic. Thus we here develop the framework for doing this. First we give the definition of the space of test sections. For a smooth vector bundle π : E → M, we denote by the sections of E with compact support. We topologise this space in the usual way, by requiring that a sequence (Ξ j ) j∈Z >0 in D (E) converges to zero if there exists a compact K ⊆ M such that supp(Ξ j ) ⊆ K, j ∈ Z >0 , and if (Ξ j ) j∈Z >0 and all of its jets converge uniformly to zero.
3.21 Definition: (Test sections along a curve) For a smooth vector bundle π : E → M and for γ ∈ H 1 ([t 0 , t 1 ]; M), the test sections of E along γ consist of the sections By γ * D (E) we denote the set of test sections of E over γ. We topologise γ * D (E) by requiring that a sequence (γ * Ξ j ) j∈Z >0 converges to zero if (i) there exists a compact K ⊆ M with K ∩ γ([t 0 , t 1 ]) ⊆ γ((t 0 , t 1 )) such that supp(Ξ j ) ⊆ K, j ∈ Z >0 , and (ii) (Ξ j ) j∈Z >0 and all of its jets converge uniformly to zero. •
Now we can define distributional sections of a vector bundle.
3.22 Definition: (Distributional sections of a vector bundle) Let π : E → M be a smooth vector bundle and let γ ∈ H 1 ([t 0 , t 1 ]; M). A distributional section of E over γ is a continuous mapping from γ * D (E * ) to R. We denote the set of distributional sections of E over γ by γ * D ′(E). We use the weak topology for γ * D ′(E); explicitly, we define the topology by requiring that a sequence (θ j ) j∈Z >0 in γ * D ′(E) converges to zero if (θ j (γ * λ)) j∈Z >0 converges to zero for every γ One can verify easily that the mapping ξ → θ ξ is continuous.
Now we define the covariant derivative of a distributional section along a curve. Thus we suppose that E possesses a vector bundle connection ∇. We note that, if ξ ∈ H 1 ([t 0 , t 1 ]; γ * E) and if γ * λ ∈ γ * D (E * ), then we have using the fact that λ • γ(t 0 ) = 0 and λ • γ(t 1 ) = 0. Motivated by this, for θ ∈ γ * D ′(E), we define its covariant derivative along γ by Showing that ∇ γ θ ∈ γ * D ′(E) amounts to showing that (∇ γ γ * λ j ) j∈Z >0 converges to zero if (γ * λ j ) j∈Z >0 converges to zero. This, however, is obvious by definition of convergence to zero in γ * D (E * ).
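The integration by parts identity that motivates the weak covariant derivative can be illustrated numerically. The sketch below assumes the flat scalar case (a trivial line bundle over an interval with the trivial connection), so covariant derivatives are ordinary derivatives; it also assumes the weak derivative is defined by moving the derivative onto the test section with a sign change, which is the usual distributional convention. The names `simpson`, `lam`, `mu`, and `theta` are hypothetical, and the test functions here vanish at the endpoints without being compactly supported in the interior.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

t0, t1 = 0.0, 1.0

# A smooth section xi and its derivative.
xi = lambda t: math.sin(3 * t) + t
dxi = lambda t: 3 * math.cos(3 * t) + 1

# A test function vanishing at both endpoints, and its derivative.
lam = lambda t: (t * (1 - t)) ** 2
dlam = lambda t: 2 * t * (1 - t) * (1 - 2 * t)

# Integration by parts: the boundary terms vanish because lam does.
lhs = simpson(lambda t: lam(t) * dxi(t), t0, t1)
rhs = -simpson(lambda t: dlam(t) * xi(t), t0, t1)

# Weak derivative of a section with a kink, paired with another test function.
theta = lambda t: abs(t - 0.5)
mu = lambda t: t * t * (1 - t)
dmu = lambda t: 2 * t - 3 * t * t
weak_value = -simpson(lambda t: theta(t) * dmu(t), t0, t1)
```

Here `lhs` and `rhs` agree to numerical precision, and `weak_value` equals the integral of sign(t − 1/2) against `mu`, which one can compute by hand to be 1/32.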

Invariant generalised or cogeneralised subbundles and affine subbundles, and affine subbundle varieties
In Section 7.1 we will lift the equation for constrained variational mechanics from the base space of a vector bundle to the total space of the vector bundle. The resulting equation is an affine equation in the vector bundle. Moreover, in Section 7.2 we shall see that it is interesting to consider when these affine equations leave invariant a cogeneralised subbundle obtained as the kernel of a certain vector bundle mapping. In this section we consider this setup in a general framework. A great deal of the required complication arises from the fact that we need to consider generalised and cogeneralised subbundles that have nonconstant rank. Moreover, as we are unaware of existing results of this nature, we are a little more comprehensive in our approach than is required by our subsequent use of these results.
In this section we carefully distinguish between smooth and real analytic regularity, as this plays an essential rôle in our results.

Varieties invariant under vector fields.
Before we specialise to generalised and cogeneralised subbundles, to generalised and cogeneralised affine bundles, and to affine subbundle varieties that are invariant under linear and affine vector fields, it is illustrative to first introduce our notions of invariance in a more general setting. To this end, we consider vector fields that leave invariant some subset of a manifold, allowing the case where the subset may not be a submanifold. We do require, however, that the subset have some structure, namely that of a variety as given in Section 2.7.
For a general subset S ⊆ M, we shall make the following construction.
With this terminology, we can make a definition of what we mean for a vector field to leave a subset invariant. To make the definition, we note that, if X ∈ Γ r (TM), we have a sheaf morphism (in the category of R-vector spaces) , This is, of course, simply the sheaf version of the ordinary Lie derivative of functions with respect to a vector field; thus we are guilt-free in using the same notation.

Definition: (Invariant subsets for C r -vector field) Let r ∈ {∞, ω} and let M be a C r -manifold. Let X ∈ Γ r (TM) and let S ⊆ M.
While the preceding definitions are made for arbitrary subsets, they are really only useful for C r -varieties as it is only for varieties that one can usefully connect the notions of invariance and flow-invariance.

Proposition: (Relationship between invariance and flow-invariance) Let r ∈ {∞, ω} and let M be a C r -manifold. For X ∈ Γ r (TM) the following statements hold:
Proof: (i) Let U ⊆ M be open, let f ∈ I S (U), and let x ∈ U. Let T ∈ R >0 be such that Φ X t (x) exists and is in U for t ∈ (−T, T ). Since S is flow-invariant under X and since f ∈ I S (U), f • Φ X t (x) = 0 for every t ∈ (−T, T ). Thus, by [Abraham, Marsden, and Ratiu 1988, Theorem 4.2.10], and so L X f ∈ I S (U).
(ii) Let x ∈ M and let U be a neighbourhood of x. Let T ∈ R >0 be such that Φ X t (x) exists and is in U for t ∈ (−T, T ). By [Sontag 1998, Proposition C.3.12], the mapping Again by induction, if f ∈ I S (U), then Therefore, for f ∈ I S (U), we have, possibly after shrinking T , Let us show that the above arguments show that S is flow-invariant under X under the current hypotheses. Suppose that it is not. Then there exists x ∈ S and t ∈ R such that Φ X t (x) is defined, but Φ X t (x) ∉ S. We can assume for concreteness that t ∈ R >0 . Let Since S is closed and since t → Φ X t (x) is continuous, Φ X t * (x) ∈ S. The argument above then gives T ∈ R >0 such that, for t ∈ (−T, T ), in contradiction with the definition of t * .
(iii) Let x ∈ S, and let U be a neighbourhood of x in M and f 1 , . . . , f k ∈ C r (U) be such that Since S is a submanifold, we can assume that df 1 (x), . . . , df k (x) are linearly independent. By shrinking U we can ensure that df 1 , . . . , df k are linearly independent on U. We now make use of a lemma.
1 Lemma: Let U ⊆ R n be a neighbourhood of 0, let S ⊆ R n be the subspace and let f ∈ C ∞ (U) satisfy f (x) = 0 for all x ∈ S ∩ U. Let pr S : R n → S be the natural projection onto the first k-components. Then there exists a neighbourhood V ⊆ U of 0 and g 1 , . . . , g k ∈ C ∞ (V) such that
Proof: The hypothesis is that f vanishes on S ∩ U. Let W ⊆ S be a neighbourhood of 0 and let ε ∈ R >0 be such that B(ε, y) ⊆ U for all y ∈ W, possibly after shrinking W. Let We calculate By standard theorems on interchanging derivatives and integrals [Jost 2005, Theorem 16.11], we can conclude that g 1 , . . . , g k are smooth since f is smooth.
We conclude from this that X|U is tangent to S ∩ U. As this holds in some neighbourhood of any point in S, we conclude that X is tangent to S. From this we conclude that S is flow-invariant under X. 7
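The relationship between invariance (the Lie derivative preserves the ideal of functions vanishing on S) and flow-invariance can be seen in a small example. The following is a minimal sketch, assuming M = R 2 , S the unit circle cut out by f(x, y) = x² + y² − 1, the rotation field X = (−y, x), and the translation field Y = (1, 0) as a contrast. The Lie derivative is approximated by central differences and the flow by a classical Runge-Kutta scheme; all function names are hypothetical.

```python
import math

def f(p):
    """Defining function of the unit circle S."""
    x, y = p
    return x * x + y * y - 1.0

def X(p):
    """Rotation field, tangent to S."""
    x, y = p
    return (-y, x)

def Y(p):
    """Translation field, not tangent to S."""
    return (1.0, 0.0)

def lie_derivative(V, g, p, h=1e-6):
    """Central-difference approximation of (L_V g)(p) = dg(p) . V(p)."""
    x, y = p
    vx, vy = V(p)
    dgx = (g((x + h, y)) - g((x - h, y))) / (2 * h)
    dgy = (g((x, y + h)) - g((x, y - h))) / (2 * h)
    return vx * dgx + vy * dgy

def flow(V, p, t, steps=1000):
    """Approximate the time-t flow of V from p with the classical RK4 scheme."""
    h = t / steps
    x, y = p
    for _ in range(steps):
        k1 = V((x, y))
        k2 = V((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = V((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = V((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

p0 = (math.cos(0.7), math.sin(0.7))  # a point of S
```

For X, `lie_derivative(X, f, p0)` vanishes and `flow(X, p0, 1.0)` stays on the circle. For Y, the Lie derivative of f is 2x, which does not vanish on S, and the flow leaves S, as the proposition leads one to expect.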

Generalised and cogeneralised subbundles invariant under linear vector fields.
We wish to consider linear vector fields on the total space of a vector bundle π : E → M that leave invariant a generalised or cogeneralised subbundle F ⊆ E, allowing the case where F may not be a subbundle. The following elementary lemma will be frequently called upon. We remind the reader of the notions of vertical evaluation introduced in Definition 2.1 and of the annihilator of a generalised subbundle introduced in Definition 2.15.

Lemma: (Vertical evaluations and the ideal sheaf of a cogeneralised subbundle) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, and let F ⊆ E be a C r -cogeneralised subbundle. Define a sheaf morphism (of C r M -modules) Then (G r Λ(F) ) e ⊆ I F and, moreover, for e ∈ F, there is a neighbourhood V ⊆ E of e such that F ∩ V = {e′ ∈ V | λ e (e′) = 0, λ ∈ G r Λ(F) (π(V))}.
Proof: The only not completely obvious assertion is the final one, but this follows from Corollary 2.18.
The idea of the lemma is that, to carve out the cogeneralised subbundle $F$, it suffices to use vertical evaluations of sections of $\Lambda(F)$. We note that, as a consequence of this, $C^r$-cogeneralised subbundles are $C^r$-varieties (stated as Corollary 2.19). However, $C^r$-generalised subbundles are not generally $C^r$-varieties, e.g., they are not generally closed. Nonetheless, we will give useful theories of invariance and flow-invariance for both generalised and cogeneralised subbundles.
As a first step towards this, we now introduce the notions of invariance in which we shall be interested. To do so we first note that, as a consequence of Lemma 2.2(i) and parts (iv) and (vi) of Lemma 2.10, we have whenever $X$ is a linear vector field on $\pi \colon E \to M$.

⁷ We assume the well-known and "obvious" fact that, if a vector field is tangent to a submanifold, then the submanifold is flow-invariant under the vector field.

Remark: (Notions of invariance for subbundles)
There is an issue that must be addressed here. Note that, if $F \subseteq E$ is a $C^r$-cogeneralised subbundle, by virtue of being a $C^r$-variety of $E$ (by Corollary 2.19) it has an ideal sheaf $\mathcal{I}_F$. One can then ask whether the notion of invariance of $F$ under a linear vector field agrees with that of Definition 4.2. This question boils down to whether $(\mathcal{G}^r_{\Lambda(F)})^{\mathrm{e}}$ generates $\mathcal{I}_F$ as a $\mathcal{C}^r_E$-module. This is certainly true when $F$ is a subbundle, but we were not able to prove this when $F$ is not regular. However, our approach and Lemma 4.4 obviate the need to know this, and give the results that we want. Nonetheless, this does leave open an interesting question. •
Let us state a more or less obvious result.

Lemma: (Correspondence between flow-invariance of F and Λ(F))
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $F \subseteq E$ be a $C^r$-generalised or a $C^r$-cogeneralised subbundle, let $X_0 \in \Gamma^r(TM)$, and let $X \in \Gamma^r(TE)$ be a linear vector field over $X_0$. Then $F$ is flow-invariant under $X$ if and only if $\Lambda(F)$ is flow-invariant under $X^*$.
Proof: Suppose that F is invariant under X, let α ∈ Λ(F), and let x = π * (α). Let t ∈ R be such that Φ X 0 t (x) exists and compute, for e ∈ F Φ X t (π * (x)) , The proof of the other implication is carried out similarly.
Let us explore the relationship between subbundles that are invariant and those that are flow-invariant.

Proposition: (Relationship between invariance and flow-invariance under linear vector fields)
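Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $F \subseteq E$ be a $C^r$-generalised or a $C^r$-cogeneralised subbundle, let $X_0 \in \Gamma^r(TM)$, and let $X^{\mathrm{lin}} \in \Gamma^r(TE)$ be a linear vector field over $X_0$. Consider the following statements: (i) $F$ is flow-invariant under $X^{\mathrm{lin}}$; (ii) $F$ is invariant under $X^{\mathrm{lin}}$. Then (i) $\implies$ (ii) and, if either (1) $r = \omega$ or (2) $r = \infty$ and $F$ is a subbundle, then (ii) $\implies$ (i).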
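Proof: (i) $\implies$ (ii) First we suppose that $F$ is a $C^r$-cogeneralised subbundle. Let $U \subseteq M$ be open, let $\lambda \in \mathcal{G}^r_{\Lambda(F)}(U)$, and let $e \in F|U$. Let $T \in \mathbb{R}_{>0}$ be such that $\Phi^{X^{\mathrm{lin}}}_t(e)$ exists and is in $\pi^{-1}(U)$ for $t \in (-T, T)$. Since $F$ is flow-invariant under $X^{\mathrm{lin}}$ and since $\lambda^{\mathrm{e}} \in (\mathcal{G}^r_{\Lambda(F)})^{\mathrm{e}}(U)$, $\lambda^{\mathrm{e}} \circ \Phi^{X^{\mathrm{lin}}}_t(e) = 0$ for every $t \in (-T, T)$. Thus, by [Abraham, Marsden, and Ratiu 1988, Theorem 4.2.10], $\mathcal{L}_{X^{\mathrm{lin}}}\lambda^{\mathrm{e}}(e) = \frac{\mathrm{d}}{\mathrm{d}t}\big|_{t=0} \lambda^{\mathrm{e}} \circ \Phi^{X^{\mathrm{lin}}}_t(e) = 0$, and so $\mathcal{L}_{X^{\mathrm{lin}}}\lambda^{\mathrm{e}} \in (\mathcal{G}^r_{\Lambda(F)})^{\mathrm{e}}(U)$ by Lemmata 2.2(i), and 2.10(iv) and (vi).
Now let $F$ be a $C^r$-generalised subbundle that is flow-invariant under $X^{\mathrm{lin}}$. By Lemma 4.7 and the first part of the proof, $\Lambda(F)$ is invariant under $X^{\mathrm{lin},*}$. By definition, this is the same thing as $F$ being invariant under $X^{\mathrm{lin}}$.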
(ii) =⇒ (i) Let us first suppose that F is a C ω -cogeneralised subbundle. Let e ∈ E and let U be a neighbourhood of π(e). Let T ∈ R >0 be such that Φ X lin t (e) exists and is in π −1 (U) for t ∈ (−T, T ). By [Sontag 1998, Proposition C.3.12], the mapping t → Φ X lin t (e) is real analytic. Thus, for λ ∈ Γ ω (E|U), t → λ e • Φ X lin t (e) is real analytic. Moreover, by an elementary induction on k ∈ Z >0 ,

Again by induction and by the current hypotheses
by virtue of (4.1). Therefore, for $\lambda \in \mathcal{G}^r_{\Lambda(F)}(U)$, we have, possibly after shrinking $T$, By Corollary 2.18, we conclude that, if $e \in F$, then $\Phi^{X^{\mathrm{lin}}}_t(e) \in F$.
Let us verify that the above arguments show that $F$ is flow-invariant under $X^{\mathrm{lin}}$ under the current hypotheses. Suppose that it is not. Then there exist $e \in F$ and $t \in \mathbb{R}$ such that $\Phi^{X^{\mathrm{lin}}}_t(e)$ is defined, but $\Phi^{X^{\mathrm{lin}}}_t(e) \notin F$. We can assume for concreteness that $t \in \mathbb{R}_{>0}$. Let Since $F$ is closed by Lemma 2.17 and since $t \mapsto \Phi^{X^{\mathrm{lin}}}_t(e)$ is continuous, $\Phi^{X^{\mathrm{lin}}}_{t_*}(e) \in F$. The argument above then gives $T \in \mathbb{R}_{>0}$ such that, for $t \in (-T, T)$, in contradiction with the definition of $t_*$.
The preceding gives this part of the proposition when $F$ is a $C^\omega$-cogeneralised subbundle. Next suppose that $F$ is a $C^\omega$-generalised subbundle invariant under $X^{\mathrm{lin}}$. Then, by definition, $\Lambda(F)$ is a $C^\omega$-cogeneralised subbundle that is invariant under $X^{\mathrm{lin},*}$. By the first half of this part of the proof, $\Lambda(F)$ is flow-invariant under $X^{\mathrm{lin},*}$. By Lemma 4.7, it follows that $F$ is flow-invariant under $X^{\mathrm{lin}}$.
We conclude that X lin is tangent to F and so F is flow-invariant.
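The need for an extra hypothesis in the smooth case can be illustrated by a standard flat-function construction. The following sketch is ours, not the paper's, and it assumes that $\mathcal{G}^\infty_{\Lambda(F)}$ denotes the sheaf of smooth sections of $E^*$ taking values in $\Lambda(F)$ and that invariance means that the Lie derivative maps vertical evaluations of such sections to vertical evaluations of such sections. The bundle, connection, and function $\varphi$ are all hypothetical choices.

```latex
% Hypothetical example: M = R, E = R x R (trivial line bundle, flat connection),
% X_0 = d/dx, A = 0, so that X^lin = X_0^h = d/dx on E.
\[
\varphi(x) = \begin{cases} e^{-1/x}, & x > 0,\\ 0, & x \le 0, \end{cases}
\qquad
\Lambda(F)_x = \operatorname{span}\{\varphi(x)\},
\qquad
F_x = \begin{cases} \mathbb{R}, & x \le 0,\\ \{0\}, & x > 0. \end{cases}
\]
% A smooth section lambda of Lambda(F) vanishes on (-infty, 0], hence so does lambda':
\[
\mathcal{L}_{X^{\mathrm{lin}}} \lambda^{\mathrm{e}} = (\nabla_{X_0}\lambda)^{\mathrm{e}} = (\lambda')^{\mathrm{e}},
\qquad
\lambda|_{(-\infty, 0]} = 0 \implies \lambda'|_{(-\infty, 0]} = 0,
\]
% so F is invariant in the infinitesimal sense. However, the flow is a translation:
\[
\Phi^{X^{\mathrm{lin}}}_t(x, v) = (x + t, v),
\qquad
(-1, 1) \in F,
\qquad
\Phi^{X^{\mathrm{lin}}}_2(-1, 1) = (1, 1) \notin F .
\]
```

Here $F$ is a smooth cogeneralised subbundle that is not a subbundle (its rank drops at $x = 0$), and $\varphi$ is not real analytic, so the example is consistent with both sufficient conditions in the proposition.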
In the next section we shall give conditions for invariance of generalised and cogeneralised subbundles under linear vector fields as a special case of such conditions for affine vector fields.

Generalised and cogeneralised subbundles invariant under affine vector fields.
In this section we extend the analysis of the preceding section to generalised and cogeneralised subbundles invariant under affine vector fields. Thus the situation we consider is as follows. Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let X 0 ∈ Γ r (TM), and let X aff ∈ Γ r (TE) be an affine vector field over X 0 . If we suppose that E is equipped with a C r -linear connection, then we can write X aff = X h 0 + A e + b v for A ∈ Γ r (End(E)) and b ∈ Γ r (E), as in Lemma 2.2. Our conditions in this section then extend those from the previous section where b = 0.
We first have the following result concerning invariance of cogeneralised subbundles under affine vector fields.

Proposition: (Cogeneralised subbundles invariant under an affine vector field)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $F \subseteq E$ be a $C^r$-cogeneralised subbundle, let $X_0 \in \Gamma^r(TM)$, let $A \in \Gamma^r(\mathrm{End}(E))$, and let $b \in \Gamma^r(E)$. Consider the following statements: (i) $F$ is flow-invariant under the affine vector field $X^{\mathrm{aff}} \triangleq X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$; (ii) the following conditions hold: Then (i) $\implies$ (ii) and, if either (1) $r = \omega$ or (2) $r = \infty$ and $F$ is a subbundle, then (ii) $\implies$ (i).
Proof: (i) =⇒ (ii) We shall prove each of the three conditions in sequence.
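1 Lemma: Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $E' \subseteq E$ be a $C^r$-subbundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $X_0 \in \Gamma^r(TM)$, and let $A \in \Gamma^r(\mathrm{End}(E))$.
Thus we have $X_0^{\mathrm{h}}(e') + A^{\mathrm{e}}(e') \in T_{e'}E' \implies \mathrm{ver}(X_0^{\mathrm{h}}(e') + A^{\mathrm{e}}(e')) \in \mathrm{ver}(T_{e'}E') \implies A^{\mathrm{e}}(e') \in V_{e'}E'$.
Since $V_{e'}E' \simeq E'_{\pi(e')}$, this gives $A^{\mathrm{e}}(E'_x) \subseteq E'_x$ for every $x \in M$. The result follows by definition of $A^{\mathrm{e}}$.
Now, for $\lambda \in \Gamma^r(\Lambda(F))$, we have $\mathcal{L}_{X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}}\lambda^{\mathrm{e}} \in \mathcal{I}_F(M)$, $\mathcal{L}_{b^{\mathrm{v}}}\lambda^{\mathrm{e}} \in \mathcal{I}_F(M)$, the first by hypothesis and by Proposition 4.3(i), and the second by (4.2). Therefore,
(ii)(c) Now we can assume that $b(x) \in F_x$ and $A(F_x) \subseteq F_x$ for every $x \in M$. As in (4.2), Then, for $e \in \pi^{-1}(U)$, we have $\mathcal{L}_{A^{\mathrm{e}}}\lambda^{\mathrm{e}}(e) = (A^*\lambda)^{\mathrm{e}}(e) = \langle A^*\lambda(\pi(e)); e\rangle = \langle \lambda(\pi(e)); A(e)\rangle$.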

Therefore, for V ⊆ E open,
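Thus we have Now, for $U \subseteq M$ open and for $\lambda \in \mathcal{G}^r_{\Lambda(F)}(U)$, we have $\lambda^{\mathrm{e}} \in \mathcal{I}_F(\pi^{-1}(U))$. Therefore, with the current hypotheses, we have $\mathcal{L}_{X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}}\lambda^{\mathrm{e}} \in \mathcal{I}_F(\pi^{-1}(U))$, $\mathcal{L}_{A^{\mathrm{e}}}\lambda^{\mathrm{e}} \in \mathcal{I}_F(\pi^{-1}(U))$, $\mathcal{L}_{b^{\mathrm{v}}}\lambda^{\mathrm{e}} \in \mathcal{I}_F(\pi^{-1}(U))$, the first by hypothesis and by Proposition 4.3(i), the second by (4.4), and the third by (4.2). Therefore, by Lemma 2.10, we have Using Lemma 2.10, we have We point out that this Lie derivative not only vanishes on $F$, but it is everywhere zero. Therefore, $\mathcal{L}_{X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}}((\mathcal{G}^r_{\Lambda(F)})^{\mathrm{e}}) = \mathcal{L}_{X_0^{\mathrm{h}} + A^{\mathrm{e}}}((\mathcal{G}^r_{\Lambda(F)})^{\mathrm{e}})$, simply by Lemma 2.10 and since Now suppose that $F \cap V = \emptyset$. In this case we immediately have Next suppose that $F \cap V \neq \emptyset$. In a similar manner, $\mathcal{L}_{A^{\mathrm{e}}}\lambda^{\mathrm{e}}(e) = \langle \lambda(\pi(e)); A(e)\rangle = 0$, $e \in F \cap V$.
Finally, $\mathcal{L}_{X_0^{\mathrm{h}}}\lambda^{\mathrm{e}}(e) = (\nabla_{X_0}\lambda)^{\mathrm{e}}(e) = \langle \nabla_{X_0}\lambda(\pi(e)); e\rangle = 0$, $e \in F \cap V$. Thus Using this fact, the proof of this part of the proposition can be carried out just as are the corresponding parts of Proposition 4.8.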
The analogous result for generalised subbundles is the following.

Proposition: (Generalised subbundles invariant under an affine vector field)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $F \subseteq E$ be a $C^r$-generalised subbundle, let $X_0 \in \Gamma^r(TM)$, let $A \in \Gamma^r(\mathrm{End}(E))$, and let $b \in \Gamma^r(E)$. Consider the following statements: (i) $F$ is flow-invariant under the affine vector field $X^{\mathrm{aff}} \triangleq X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$; (ii) the following conditions hold: (c) $\nabla_{X_0}(\mathcal{G}^r_F) \subseteq \mathcal{G}^r_F$. Then (i) $\implies$ (ii) and, if either (1) $r = \omega$ or (2) $r = \infty$ and $F$ is a subbundle, then (ii) $\implies$ (i).
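As a sanity check on the shape of such conditions, consider the degenerate case in which $M$ is a single point. Then $E$ is a finite-dimensional vector space, written $V$ below, a subbundle is a subspace $F \subseteq V$, the connection plays no role, and the affine vector field is $e \mapsto Ae + b$. This reduction and the symbol $V$ are our own illustrative assumptions and are not a restatement of the proposition.

```latex
% Degenerate case M = {pt}: E = V, F a subspace, X^aff(e) = A e + b (illustration only).
\[
\Phi_t(e) = e^{tA} e + \int_0^t e^{(t-s)A}\, b \,\mathrm{d}s .
\]
% Claim: F is flow-invariant if and only if b lies in F and A preserves F.
\[
\Phi_t(F) \subseteq F \ \text{for all } t
\iff
b \in F \ \text{and}\ A(F) \subseteq F .
\]
% (=>) A curve in the subspace F has velocity in F, so A e + b is in F for e in F;
%      e = 0 gives b in F, and then A e = (A e + b) - b is in F.
% (<=) If A(F) is contained in F then e^{tA}(F) is contained in F, and with b in F
%      the integrand e^{(t-s)A} b stays in F, hence so does Phi_t(e) for e in F.
```

In this setting the derivative condition on sections is vacuous, which is why only the two algebraic conditions survive.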

Generalised and cogeneralised affine subbundles invariant under affine vector fields.
Our next collection of subbundle invariance results concerns the invariance of affine subbundles under the flow of affine vector fields. As with the preceding section, we separately consider the cases of generalised and cogeneralised affine subbundles.
First we give an affine analogue of Lemma 4.4, recalling the notation of (2.5).

4.11 Lemma: (Vertical evaluations and the ideal sheaf of a cogeneralised affine subbundle)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, and let $B \subseteq E$ be a $C^r$-cogeneralised affine subbundle given by $B = \xi_0 + L(B)$. Define a sheaf morphism (of $\mathcal{C}^r_M$-modules)
Proof: Given Lemma 2.25, the only not completely obvious assertion is the final one, but this follows from Lemma 2.26.
Let us begin with a characterisation of an affine subbundle that is invariant under an affine vector field.
4.12 Lemma: (Characterisations of (co)generalised affine subbundles invariant under affine vector fields)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $B \subseteq E$ be a $C^r$-generalised or a $C^r$-cogeneralised affine subbundle given by $B = \xi_0 + L(B)$ for $\xi_0 \in \Gamma^r(E)$, let $X_0 \in \Gamma^r(TM)$, and let $X^{\mathrm{aff}} \in \Gamma^r(TE)$ be an affine vector field over $X_0$. Write $X^{\mathrm{aff}} = X^{\mathrm{lin}} + b^{\mathrm{v}}$ for a linear vector field $X^{\mathrm{lin}}$ and for $b \in \Gamma^r(E)$. Then the following statements are equivalent:
Proof: We note that, by Proposition 2.5, is an affine mapping with linear part equal to $\Phi^{X^{\mathrm{lin}}}_t$. Therefore, for $e \in E$, $(\xi_0(\pi(e)))$.
We shall use this formula in our proof.
(i) =⇒ (ii) Let u ∈ L(B) and let t ∈ R be such that Φ X 0 t (π(u)) is defined. We then have since B is flow-invariant under X aff and u + ξ 0 (π(u)), ξ 0 (π(u)) ∈ B. As a part of this argument, we have used the fact that Φ X aff t • ξ 0 (x) ∈ B, just since B is flow-invariant under X aff .
First we consider the case of a cogeneralised affine subbundle.

Proposition: (Cogeneralised affine subbundles invariant under an affine vector field)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $B \subseteq E$ be a $C^r$-cogeneralised affine subbundle given by $B = \xi_0 + L(B)$ for $\xi_0 \in \Gamma^r(E)$, let $X_0 \in \Gamma^r(TM)$, let $A \in \Gamma^r(\mathrm{End}(E))$, and let $b \in \Gamma^r(E)$. Consider the following statements: (i) $B$ is flow-invariant under the affine vector field $X^{\mathrm{aff}} \triangleq X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$; (ii) the following conditions hold: Then (i) $\implies$ (ii) and, if either (1) $r = \omega$ or (2) $r = \infty$ and $L(B)$ is a subbundle, then (ii) $\implies$ (i).
Thus, by Proposition 4.9 (applied to linear vector fields), parts (ii)(a) and (ii)(b) hold. Now let λ ∈ Γ r (Λ(L(B))) and recall from (2.9) the notation F λ . By Proposition 4.3 and Lemma 2.26, L X aff F λ ∈ I B . Using the decompositions and Lemma 2.10, we compute This, however, implies that by Corollary 2.18. Thus (ii)(c) holds as well.
For generalised affine subbundles, the invariance result is the following.
4.14 Proposition: (Generalised affine subbundles invariant under an affine vector field)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $B \subseteq E$ be a $C^r$-generalised affine subbundle given by $B = \xi_0 + L(B)$ for $\xi_0 \in \Gamma^r(E)$, let $X_0 \in \Gamma^r(TM)$, let $A \in \Gamma^r(\mathrm{End}(E))$, and let $b \in \Gamma^r(E)$. Consider the following statements: (i) $B$ is flow-invariant under the affine vector field $X^{\mathrm{aff}} \triangleq X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$; (ii) the following conditions hold: Then (i) $\implies$ (ii) and, if either (1) $r = \omega$ or (2) $r = \infty$ and $L(B)$ is a subbundle, then (ii) $\implies$ (i).
Proof: (i) =⇒ (ii) From Lemma 4.12 we conclude that, under the current hypotheses, L(B) is flow-invariant under X lin . By Proposition 4.10, parts (ii)(a) and (ii)(b) hold, and we shall use the fact that these conditions hold in the rest of this part of the proof.
By Lemma 2.20, let U ⊆ M be an open and dense subset such that L(B)|U is a subbundle. Let α ∈ Λ(L(B))|U and note that, by Proposition 4.13, we have If α ∈ Λ(L(B)), let (α j ) j∈Z >0 be a sequence in Λ(L(B))|U converging to α, this being possible by Lemma 2.17. Then Thus we also have (ii)(c).
(ii) =⇒ (i) By Proposition 4.10 we have that L(B) is flow-invariant under X lin . Let x ∈ M and let t ∈ R be such that Φ X 0 t (x) is defined. Note that

) is a vertical curve and so is tangent to L(B) if and only if it is tangent to the fibres of L(B).
Since the tangent vector to this vertical curve in the fibre E x is This part of the result now follows from Lemma 4.12.
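The role of the base section $\xi_0$ in the affine case is easiest to see when $M$ is a single point, so that $E$ is a vector space $V$, the affine subbundle is an affine subspace $\xi_0 + L$, and the vector field is $e \mapsto Ae + b$. The reduction, the symbols $V$ and $L$, and the numerical matrices below are our own assumptions, chosen only to illustrate the kind of condition involved.

```latex
% Degenerate case M = {pt}: B = xi_0 + L is an affine subspace of V (illustration only).
% Tangency of e -> A e + b to B at every point of B:
\[
A(\xi_0 + u) + b \in L \ \text{for all } u \in L
\iff
A(L) \subseteq L \ \text{and}\ A\xi_0 + b \in L .
\]
% Concrete instance in V = R^2 with L = span{(1,0)} and xi_0 = (0,1):
\[
A = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix},
\qquad
b = \begin{pmatrix} 0 \\ -1 \end{pmatrix},
\qquad
A \begin{pmatrix} x \\ 1 \end{pmatrix} + b
= \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 0 \\ -1 \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \end{pmatrix} \in L .
\]
% Integral curves starting on the line y = 1 are (x_0 + t, 1), so they remain on B.
```

Note that neither $\xi_0$ nor $b$ lies in $L$ here; only the combination $A\xi_0 + b$ does.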

Affine subbundle varieties and defining subbundles invariant under affine vector fields.
Now we turn to characterising invariance of affine subbundle varieties and their defining subbundles, as defined in Section 2.9. To do this, consistent with Lemmata 4.4 and 4.11, we first characterise the ideal sheaf we use.

4.15 Lemma: (The ideal sheaf of an affine subbundle variety)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, and let $\mathsf{A} \subseteq E$ be a nonempty affine subbundle variety with defining subbundle $\Delta \subseteq E^* \oplus \mathbb{R}_M$. Then, for $e \in \mathsf{A}$, there is a neighbourhood $V \subseteq E$ of $e$ such that $\mathsf{A} \cap V = \{e' \in V \mid (\lambda, f)^{\mathrm{e}}(e') = 0,\ (\lambda, f) \in \mathcal{G}^r_\Delta(\pi(V))\}$.
Proof: Since $\Delta$ is a $C^r$-generalised subbundle, this follows from Corollary 2.18.
Since the definition of an affine subbundle variety is made only relative to a defining subbundle, one expects that there should be a connection between the invariance properties of an affine subbundle variety and that of a defining subbundle. In order to talk about invariance properties of defining subbundles, we need to ascertain the relevant vector field on E * ⊕ R M with respect to which we discuss invariance. Let us set this up.
We let $r \in \{\infty, \omega\}$, and let $\pi \colon E \to M$ be a $C^r$-vector bundle with $\nabla$ a $C^r$-linear connection in $E$. We consider an affine vector field $X^{\mathrm{aff}} = X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$ for $X_0 \in \Gamma^r(TM)$, $A \in \Gamma^r(\mathrm{End}(E))$, and $b \in \Gamma^r(E)$. Define a connection $\hat{\nabla}$ on $E \oplus \mathbb{R}_M$ by This is the connection obtained as a direct sum of $\nabla$ and the canonical flat connection on $\mathbb{R}_M$. We then define a linear vector field on the vector bundle $E \oplus \mathbb{R}_M$ by where the horizontal lift in the first term on the right is that associated with the connection $\hat{\nabla}$ and, for the second term on the right,
The following lemma explains this definition of the linear vector field $\hat{X}^{\mathrm{aff}}$, noting that there is a correspondence between $\mathrm{Lin}^r(E \oplus \mathbb{R}_M)$ and $\mathrm{Aff}^r(E)$. Let us be explicit about this. Let $(\lambda, g) \in \Gamma^r(E^* \oplus \mathbb{R}_M)$. This defines a linear function $\hat{F}_{(\lambda,g)}$ on $E \oplus \mathbb{R}_M$ by $\hat{F}_{(\lambda,g)}(e, a) = \langle \lambda(\pi(e)); e \rangle + a\, g(\pi(e))$ and an affine function $F_{(\lambda,g)}$ on $E$ in the usual way: $F_{(\lambda,g)}(e) = \langle \lambda(\pi(e)); e \rangle + g(\pi(e))$.
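When $M$ is a single point the construction just described reduces to the familiar homogenisation of an affine system. The following sketch works in that setting, with $E$ a vector space $V$ and $g$ a constant. The explicit formula $\widehat{X}(e, a) = (Ae + ab, 0)$ is our reading of the construction in this degenerate case and should be treated as an assumption.

```latex
% Degenerate case M = {pt}: X^aff(e) = A e + b on V, lifted to V (+) R (illustration only).
\[
\widehat{X}(e, a) = (A e + a\, b,\ 0),
\qquad
\widehat{X} \;\longleftrightarrow\; \begin{pmatrix} A & b \\ 0 & 0 \end{pmatrix} .
\]
% Linear function on V (+) R and affine function on V attached to (lambda, g):
\[
\widehat{F}_{(\lambda, g)}(e, a) = \langle \lambda; e \rangle + a\, g,
\qquad
F_{(\lambda, g)}(e) = \langle \lambda; e \rangle + g .
\]
% Lie derivative along the linear vector field, then restriction to the slice a = 1:
\begin{align*}
\mathcal{L}_{\widehat{X}} \widehat{F}_{(\lambda, g)}(e, a)
  &= \langle \lambda; A e + a\, b \rangle
   = \widehat{F}_{(A^*\lambda,\ \langle \lambda; b \rangle)}(e, a),\\
\mathcal{L}_{\widehat{X}} \widehat{F}_{(\lambda, g)}(e, 1)
  &= \langle \lambda; A e + b \rangle
   = \mathcal{L}_{X^{\mathrm{aff}}} F_{(\lambda, g)}(e) .
\end{align*}
```

Since the last component of $\widehat{X}$ is zero, the slice $a = 1$ is invariant, and integral curves of $\widehat{X}$ in that slice project to integral curves of $X^{\mathrm{aff}}$. The induced action on pairs is $(\lambda, g) \mapsto (A^*\lambda, \langle \lambda; b \rangle)$.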
4.16 Lemma: (Linear vector fields on $E \oplus \mathbb{R}_M$ correspond to affine vector fields on $E$)
We let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle with $\nabla$ a $C^r$-linear connection in $E$. If $X^{\mathrm{aff}} = X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$ is an affine vector field on $E$ for $A \in \Gamma^r(\mathrm{End}(E))$ and $b \in \Gamma^r(E)$, then the linear vector field $\hat{X}^{\mathrm{aff}}$ on $E \oplus \mathbb{R}_M$ satisfies $\mathcal{L}_{\hat{X}^{\mathrm{aff}}} \hat{F}_{(\lambda,g)}(e, 1) = \mathcal{L}_{X^{\mathrm{aff}}} F_{(\lambda,g)}(e)$, $e \in E$, $(\lambda, g) \in \Gamma^r(E^* \oplus \mathbb{R}_M)$.
Proof: Using Lemma 2.10, we calculate Thus the lemma holds by making the identification $\mathrm{Lin}^r(E \oplus \mathbb{R}_M) \simeq \mathrm{Aff}^r(E)$ indicated before the statement of the lemma.
Now, since there is not a unique correspondence between defining subbundles and their associated affine subbundle varieties (many defining subbundles might give rise to the same affine subbundle variety), we shall characterise the invariance properties of an affine subbundle variety according to the following definition.

Definition: (Affine subbundle varieties and defining subbundles invariant under affine vector fields)
Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let X 0 ∈ Γ r (TM), and let X aff ∈ Γ r (TE) be an affine vector field over X 0 .
a defining subbundle ∆ such that A = A(∆) and such that ∆ is invariant under X aff .
A remark similar to Remark 4.6 can be made here concerning the possible conflicting notions of invariance. While the issues raised by this are interesting, we sidestep them in our approach by virtue of Lemma 4.15.
Let us prove a basic result regarding the relationship of the flow-invariance of an affine subbundle variety versus that of a defining subbundle.

Lemma: (Correspondence between flow-invariance of ∆ and A(∆))
Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let ∆ ⊆ E * ⊕ R be a C r -defining subbundle, and let A be a C r affine subbundle variety. Let X 0 ∈ Γ r (TM), let A ∈ Γ r (End(E)), let b ∈ Γ r (E), and consider the affine vector field Then the following statements hold: Proof: (i) The definition of X aff shows that t → (Υ(t), α(t)) is an integral curve for X aff if and only if Thus we see that t → Υ(t) is an integral curve for X aff if and only if t → (Υ(t), 1) is an integral curve for X aff , giving this part of the result.
(ii) By part (i) it is sufficient to show that, under the given hypothesis, the set Now let e ∈ A(∆), let x = π(e), and let t ∈ R be such that using (4.6). Using the characterisation of the integral curves of X aff from the proof of part (i), we have (4.6). This shows that, if ∆ is flow-invariant under X aff, * , then the set is flow-invariant under X aff , as desired.
(iii)(a) Let $x \in S(\mathsf{A})$ and let $t \in \mathbb{R}$ be such that $\Phi^{X_0}_t(x)$ exists. Let $e \in \mathsf{A}_x$. Then $e' \triangleq \Phi^{X^{\mathrm{aff}}}_t(e) \in \mathsf{A}$. Therefore, using (4.6). Thus
Note that, since defining subbundles are generalised subbundles, their characterisation is made by reference to Proposition 4.10, specialising to the case of the proposition when the vector field is linear. To do this, it is convenient to define, for a defining subbundle $\Delta \subseteq E^* \oplus \mathbb{R}_M$ and for $x \in M$, subspaces $\Delta_{1,x}$ and $\Delta_{0,x}$ of $E^*_x$ and $E^*_x \oplus \mathbb{R}$, respectively, as in (2.10) and (2.11). Note that Lemma 2.29(ii) associates $\Delta_{1,x}$ with the linear part of the affine subspace fibre $\mathsf{A}(\Delta)_x$ for $x \in S(\mathsf{A}(\Delta))$.
We then arrive at the following result concerning flow-invariant defining subbundles.

4.19 Proposition: (Defining subbundles invariant under an affine vector field)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $\Delta \subseteq E^* \oplus \mathbb{R}_M$ be a $C^r$-defining subbundle, let $X_0 \in \Gamma^r(TM)$, let $A \in \Gamma^r(\mathrm{End}(E))$, and let $b \in \Gamma^r(E)$. Consider the affine vector field $X^{\mathrm{aff}} = X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$ and the following statements: (i) $\Delta$ is flow-invariant under $X^{\mathrm{aff}}$; (ii) the following conditions hold:
Proof: In the proof, let us abbreviate subbundles of $E^* \oplus \mathbb{R}_M$ by We shall make the identification $\Lambda_1 \simeq (E^* \oplus \mathbb{R}_M)/\Lambda_0$. Note that
(i) $\implies$ (ii) By definition, $\Delta$ is (flow-)invariant under the affine vector field $X^{\mathrm{aff}}$ if $\Delta$ is (flow-)invariant under the linear vector field $X^{\mathrm{aff},*}$. By Proposition 4.10 (specialised to linear vector fields), since $\Delta$ is flow-invariant under $X^{\mathrm{aff},*}$ we conclude that Note that, for an open set $U \subseteq M$, Thus $\{(0, 0)\}$, $x \in S(\mathsf{A}(\Delta))$.
By Lemma 2.29(i), we have G r ∆ 0 I S(A(∆)) . By Lemma 4.18(ii), A(∆) is flow-invariant under X aff . By Lemma 4.18(iii) Thus ∇ X 0 descends to a sheaf morphism on the quotient Moreover, under this identification, the morphism on the quotient sheaf is ∇ X 0 . Thus condition 2 above gives Note that the preceding result says nothing about whether the defining subbundle is total, partial, or null. The following result, all of whose conclusions follow from already proven results, summarises how one should approach the matter of determining the properties of a defining subbundle that is flow-invariant under X aff . This should be regarded as providing a list of constraints that can be enforced to determine whether one can find a partial defining subbundle that is flow-invariant under X aff .
4.20 Proposition: (Total, partial, and null defining subbundles invariant under an affine vector field) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let ∇ be a C r -linear connection in E, let ∆ ⊆ E * ⊕ R M be a C r -defining subbundle, let X 0 ∈ Γ r (TM), let A ∈ Γ r (End(E)), and let b ∈ Γ r (E). Consider the affine vector field and assume that ∆ is flow-invariant under X aff . Then Moreover, if ∆ is partial or total, then (iv) L X 0 (I S(A(∆)) ) ⊆ I S(A(∆)) .
Proof: We adopt the notation from the proof of Proposition 4.19.
(i) Note that and so Λ 0 is invariant under (A, b) * . Thus (A, b) * descends to an endomorphism on the quotient Under the identification, this endomorphism is A * . This gives this part of the result.
(ii) This follows from Lemma 2.29(i).
(iii) By Lemma 2.29(i), for $x \in S(\mathsf{A}(\Delta))$ we have $\Delta_{0,x} = \{(0, 0)\}$. In this case, for $x \in S(\mathsf{A}(\Delta))$, we have This gives this part of the result, in particular.
The preceding results enable us to identify flow-invariant defining subbundles. Having identified one of these, by Lemma 4.18 one automatically gets a flow-invariant affine subbundle variety, at least when the defining subbundle is not null. What the result does not do is answer the question of whether, given a flow-invariant affine subbundle variety, one can find a corresponding flow-invariant defining subbundle. Fortunately, we are not required to answer this question since, as part of our constructions of the next section, we will be naturally led first to a flow-invariant defining subbundle.

Invariant affine subbundle varieties contained in subbundles.
We now revisit the discussion of the preceding sections from a different angle. The question in which we are interested is the following.

4.21 Question: (Integral curves of affine vector fields that leave invariant a cogeneralised subbundle)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $F \subseteq E$ be a $C^r$-cogeneralised subbundle, let $X_0 \in \Gamma^r(TM)$, let $b \in \Gamma^r(E)$, and let $A \in \Gamma^r(\mathrm{End}(E))$. Denote $X^{\mathrm{aff}} = X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$. With this data, the basic question we consider is: Are there integral curves of $X^{\mathrm{aff}}$ that leave $F$ invariant?
The question has a few different components to it. First of all, it is an existential question. As well, assuming that the existential question has been answered in the affirmative, one can then ask about the character of all integral curves of $X^{\mathrm{aff}}$ that leave $F$ invariant. •
We address the above question by first understanding the structure of all integral curves of an affine vector field that leave invariant a cogeneralised subbundle.
We start by considering the case of linear vector fields.

4.22 Theorem: (The largest invariant cogeneralised subbundle of a cogeneralised subbundle)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $F \subseteq E$ be a $C^r$-cogeneralised subbundle, let $X_0 \in \Gamma^r(TM)$, and let $A \in \Gamma^r(\mathrm{End}(E))$. Denote $X^{\mathrm{lin}} = X_0^{\mathrm{h}} + A^{\mathrm{e}}$. Then the following statements hold:
Proof: Just as we shall show in the proof of the more general Theorem 4.23 below, we have
Let us deduce an alternative characterisation of $L(F, X^{\mathrm{lin}})$. Let and let $\Phi^{X_0} \colon D(X_0) \to M$ be the flow, i.e., $\Phi^{X_0}(t, x) = \Phi^{X_0}_t(x)$. By Lemma 2.3(i), Denote by $\Phi^{X^{\mathrm{lin}}} \colon D(X^{\mathrm{lin}}) \to E$ the flow, i.e., $\Phi^{X^{\mathrm{lin}}}(t, e) = \Phi^{X^{\mathrm{lin}}}_t(e)$. If $\lambda \in \Gamma^r(E^*)$, then there is the associated section $\Phi^*_{X_0}\lambda$ of the pull-back bundle We claim that
First let $e \in L(F, X^{\mathrm{lin}})_x$, let $\lambda \in \Gamma^r(\Lambda(F))$, and let $t \in \mathbb{R}$ be such that $(t, x) \in D(X_0)$. Then, since $e \in L(F, X^{\mathrm{lin}})$ and $\lambda \in \Gamma^r(\Lambda(F))$, it follows that Now let $e \in E_x$ be such that $\Phi^*_{X^{\mathrm{lin}}}\lambda^{\mathrm{e}}(e, (t, x)) = 0$ for all $\lambda \in \Gamma^r(\Lambda(F))$ and $t$ such that $(t, x) \in D(X_0)$. Then we have By Corollary 2.18 we conclude that $\Phi^{X^{\mathrm{lin}}}_t(e) \in F$ and so we have verified our claim (4.7). As a consequence of this, is a subspace for every $\lambda \in \Gamma^r(E^*)$ and $t \in \mathbb{R}$ such that $(t, x) \in D(X_0)$, we see that $L(F, X^{\mathrm{lin}})_x$ is a subspace of $E_x$.
We now turn our attention to the regularity of $L(F, X^{\mathrm{lin}})$ by defining a subsheaf $\mathcal{L}^*(F, X^{\mathrm{lin}})$ with the property asserted in (iii). Define a subsheaf $\mathcal{L}^*(F, X^{\mathrm{lin}})$ of $(\mathcal{G}^r_{\Lambda(F)})^{\mathrm{e}}$ by (4.8) Then, by virtue of (4.7), for $U \subseteq M$ open, i.e., $L(F, X^{\mathrm{lin}})$ is the zero set for the local sections of $\mathcal{L}^*(F, X^{\mathrm{lin}})$. This establishes (iii).
Next we prove part (iv), considering two cases. First we consider the case of $r = \infty$. In this case, we have that, in the terminology of [Lewis 2012], $\mathcal{L}^*(F, X^{\mathrm{lin}})$ is patchy.
Therefore, by virtue of Proposition 3.23 of [Lewis 2012] and the definition of patchy sheaves, we conclude that, for each $x \in M$, there is a neighbourhood $U$ of $x$ and local sections $(\lambda_i)_{i \in I_x}$ from $\mathcal{L}^*(F, X^{\mathrm{lin}})(U)$ generating $\mathcal{L}^*(F, X^{\mathrm{lin}})(U)$ as a $\mathcal{C}^\infty_M(U)$-module. This shows that $\Lambda(L(F, X^{\mathrm{lin}}))$ is a $C^\infty$-generalised subbundle, and so $L(F, X^{\mathrm{lin}})$ is a $C^\infty$-cogeneralised subbundle by (4.9).
Now we consider the case when $r = \omega$ and $X_0$ is complete. In this case, for $U \subseteq M$ open, the sections are generators for $\mathcal{L}^*(F, X^{\mathrm{lin}})(U)$ as a $\mathcal{C}^\omega_M(U)$-module. Thus, for each $x \in M$, there is a neighbourhood $U$ of $x$ such that $\mathcal{L}^*(F, X^{\mathrm{lin}})(U)$ is generated, as a $\mathcal{C}^\omega_M(U)$-module, by some family of sections of $E^*$ restricted to $U$. Thus $\Lambda(L(F, X^{\mathrm{lin}}))$ is a $C^\omega$-generalised subbundle, and so $L(F, X^{\mathrm{lin}})$ is a $C^\omega$-cogeneralised subbundle by (4.9).
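For intuition about the object being constructed, it may help to look at the case where $M$ is a single point, so that the linear vector field is just a matrix $A$ acting on a vector space and $F$ is a subspace. The following small computation is an illustration under that assumption. The matrix, the covector $\lambda$, and the notation $L(F, A)$ are our own choices.

```latex
% Degenerate case M = {pt}: V = R^3, F = ker(lambda) with lambda = (0,0,1) (illustration only).
% A vector e stays in F under e^{tA} iff lambda A^k e = 0 for all k >= 0
% (all t-derivatives of lambda(e^{tA} e) at t = 0 vanish, and the function is analytic).
\[
A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},
\qquad
\lambda = (0, 0, 1),
\qquad
\lambda A = (0, 1, 0),
\qquad
\lambda A^2 = (0, 0, 0) .
\]
\[
L(F, A) = \bigcap_{k \ge 0} \ker(\lambda A^k)
        = \ker \lambda \cap \ker(\lambda A)
        = \operatorname{span}\{e_1\} .
\]
% Check: A e_1 = 0, so span{e_1} is A-invariant, while A e_2 = (1,0,1)^T is not in F,
% so F itself is not invariant.
```

The covectors $\lambda, \lambda A, \lambda A^2, \dots$ span the smallest $A^*$-invariant subspace containing the annihilator of $F$, and $L(F, A)$ is the subspace it annihilates.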
In the proof we constructed a subsheaf $\mathcal{L}^*(F, X^{\mathrm{lin}})$ of $\Gamma^r(E^*)$ by (4.8). We define a subset $\Lambda(F, X^{\mathrm{lin}})$ of $E^*$ by Under the technical hypotheses of part (iv), $\Lambda(F, X^{\mathrm{lin}})$ is a $C^r$-generalised subbundle and the associated $C^r$-cogeneralised subbundle is $L(F, X^{\mathrm{lin}})$. We should think (1) of $\Lambda(F, X^{\mathrm{lin}})$ as being the smallest subbundle of $E^*$ that annihilates $F$ and is flow-invariant under $X^{\mathrm{lin},*}$, and (2) of $L(F, X^{\mathrm{lin}})$ as being the largest subbundle of $E$ that is contained in $F$ and is flow-invariant under $X^{\mathrm{lin}}$.
Now we consider the general case of affine vector fields.

4.23 Theorem: (The largest invariant affine subbundle variety contained in a cogeneralised subbundle)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $F \subseteq E$ be a $C^r$-cogeneralised subbundle, let $X_0 \in \Gamma^r(TM)$, let $b \in \Gamma^r(E)$, and let $A \in \Gamma^r(\mathrm{End}(E))$. Denote $X^{\mathrm{aff}} = X_0^{\mathrm{h}} + A^{\mathrm{e}} + b^{\mathrm{v}}$. Then the following statements hold: (vi) if $B \subseteq F$ is a $C^r$-affine subbundle variety that is flow-invariant under $X^{\mathrm{aff}}$, then $B \subseteq \mathsf{A}(F, X^{\mathrm{aff}})$.
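Proof: The conclusion (i) follows by definition of $\mathsf{A}(F, X^{\mathrm{aff}})$. For (ii), let $e \in \mathsf{A}(F, X^{\mathrm{aff}})$ and let $t \in \mathbb{R}$ be such that $\Phi^{X_0}_t(\pi(e))$ exists. Then $e' \triangleq \Phi^{X^{\mathrm{aff}}}_t(e) \in F$ by definition of $\mathsf{A}(F, X^{\mathrm{aff}})$. To show that $e' \in \mathsf{A}(F, X^{\mathrm{aff}})$, let $t' \in \mathbb{R}$ be such that $\Phi^{X_0}_{t'}(\pi(e'))$ is defined. Then since $e \in \mathsf{A}(F, X^{\mathrm{aff}})$, giving $e' \in \mathsf{A}(F, X^{\mathrm{aff}})$, as desired.
For (iii), let $x \in S(F, X^{\mathrm{aff}})$ and let $t \in \mathbb{R}$ be such that $\Phi^{X_0}_t(x)$ exists. Let $e \in \mathsf{A}(F, X^{\mathrm{aff}}) \cap E_x$. Then we showed above that $e' \triangleq \Phi^{X^{\mathrm{aff}}}_t(e) \in \mathsf{A}(F, X^{\mathrm{aff}})$. Therefore, as desired.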
To show (vi), let B be as stated and let e ∈ B. Then, because B is flow-invariant under X aff and since B ⊆ F, Φ X aff t (e) ∈ F for all t ∈ R such that Φ X 0 t (π(e)) is defined. That is to say, e ∈ A(F, X aff ).
It thus remains to prove (iv) and (v).
Now let e ∈ E x be such that Φ * X aff λ e (e, (t, x)) = 0 for all λ ∈ Γ r (Λ(F)) and t such that (t, x) ∈ D(X 0 ). Then we have By Corollary 2.18 we conclude that Φ X aff t (e) ∈ F and so we have verified our claim (4.10). Note that, because the flow of X aff is not linear but affine, we have that (Φ X aff t ) * λ e is an affine function for λ ∈ Γ r (E * ). Thus we can regard (Φ X aff t ) * λ e as a section of E * ⊕ R M via (2.5). With this in mind, define a subsheaf A * (F, (4.11) Then, by virtue of (4.10), for U ⊆ M open, (4.12) i.e., A(F, X aff ) is the zero set for the local sections of A * (F, X aff ). This proves (iv). We consider the two cases for part (v). The first case we consider is that of r = ∞. In this case, we have that, in the terminology of [Lewis 2012], A * (F, X aff ) is patchy. Therefore, by virtue of Proposition 3.23 of [Lewis 2012] and the definition of patchy sheaves, we conclude that, for each x ∈ M, there is a neighbourhood U of x and local sections ((λ i , f i )) i∈Ix from A * (F, X aff )(U) generating A * (F, X aff )(U) as a C ∞ M (U)-module. This shows that A(F, X aff ) has a defining bundle that is a C ∞ -generalised subbundle, and so A(F, X aff ) is a C ∞ -affine subbundle variety by (4.12).
Next we suppose that r = ω and that X 0 is complete. In this case, for U ⊆ M open, the sections are generators for A * (F, X aff )(U) as a C ω M (U)-module. Thus, for each x ∈ M, there is a neighbourhood U of x such that A * (F, X aff )(U) is generated, as a C ω M (U)-module, by some family of sections of E * ⊕ R M restricted to U. Thus A(F, X aff ) has a defining bundle that is a C ω -generalised subbundle, and so A(F, X aff ) is a C ω -affine subbundle variety by (4.12).
In the proof we constructed a subsheaf A * (F, X aff ) of G r E * ⊕R M by (4.11). We define a subset ∆(F, Under the technical hypotheses of part (v), ∆(F, X aff ) is a C r -defining subbundle and the associated C r -affine subbundle variety is A(F, X aff ). The definition of ∆(F, X aff ) and Lemma 4.16 ensure that ∆(F, X aff ) is invariant under X aff, * . We should thus think (1) of ∆(F, X aff ) as being the smallest subbundle of E * ⊕ R M whose associated affine functions annihilate F and that is flow-invariant under X aff, * , and (2) of A(F, X aff ) as being the largest affine subbundle variety contained in F that is flow-invariant under X aff .
Let us see how to use Theorem 4.23 to answer Question 4.21, leaving aside the technicalities of when certain sheaves are sheaves of sections of generalised subbundles. We do this in the general case of affine vector fields, with linear vector fields being an easier special case.
1. Determine the smallest generalised subbundle $\Delta(F, X^{\mathrm{aff}})$ consisting of affine functions that vanish on $F$ and which is invariant under the flow of $X^{\mathrm{aff},*}$. Explicitly,
2. Define the corresponding affine subbundle variety $\mathsf{A}(F, X^{\mathrm{aff}})$.

We then have
That is to say, A(F, X aff ) consists of all initial conditions through which integral curves of X aff remain in F.
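The two steps can be carried out by hand when $M$ is a single point, where everything is real analytic and "vanishing along the flow" is equivalent to vanishing of all time derivatives at $t = 0$. The sketch below makes that assumption. The matrix $A$, the vector $b$, and the row-vector convention $A^*\lambda = \lambda A$ are our own illustrative choices.

```latex
% Degenerate case M = {pt}: V = R^2, F = {e_2 = 0} = ker(lambda_0), lambda_0 = (0,1),
% X^aff(e) = A e + b (illustration only).
\[
A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\qquad
b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix},
\qquad
(\lambda, g) \longmapsto (\lambda A,\ \lambda b) .
\]
% Step 1: iterate on the generator (lambda_0, 0) of the annihilator of F.
\[
\big((0,1),\, 0\big) \mapsto \big((1,0),\, b_2\big) \mapsto \big((0,1),\, b_1\big) \mapsto \big((1,0),\, b_2\big) \mapsto \cdots
\]
% The associated affine functions are e_2, e_1 + b_2, e_2 + b_1.
% Step 2: take their common zero set.
\[
\mathsf{A}(F, X^{\mathrm{aff}}) =
\begin{cases}
\{(-b_2,\ 0)\}, & b_1 = 0,\\
\emptyset, & b_1 \neq 0 .
\end{cases}
\]
% Check for b_1 = 0: A e + b = (e_2 + b_1, e_1 + b_2) = (0, 0) at e = (-b_2, 0),
% so this point is an equilibrium and its (constant) integral curve stays in F.
```

When $b_1 \neq 0$ the span of the iterates contains the pair $(0, b_1)$, i.e. a nonzero constant function, which is exactly how an empty affine subbundle variety shows up at the level of the defining subbundle.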
Of course, the preceding "algorithm" is not practical, relying as it does on knowing the flow of the affine vector field X aff . The following two results give the corresponding associated infinitesimal conditions.
We begin with the case of invariance under linear vector fields.

Theorem: (Cogeneralised subbundles invariant under a linear vector field and contained in a cogeneralised subbundle)
Let $r \in \{\infty, \omega\}$, let $\pi \colon E \to M$ be a $C^r$-vector bundle, let $\nabla$ be a $C^r$-linear connection in $E$, let $F \subseteq E$ be a $C^r$-cogeneralised subbundle, let $L$ be a $C^r$-cogeneralised subbundle, let $X_0 \in \Gamma^r(TM)$, and let $A \in \Gamma^r(\mathrm{End}(E))$. Denote $X^{\mathrm{lin}} = X_0^{\mathrm{h}} + A^{\mathrm{e}}$ and suppose that $L$ is flow-invariant under $X^{\mathrm{lin}}$. Consider the following statements: (i) $L \subseteq F$; (ii) the following conditions hold: Then (i) $\implies$ (ii) and, if either (1) $r = \omega$ or (2) $r = \infty$ and $F$ is a subbundle, then (ii) $\implies$ (i).
Since this holds for every λ ∈ Γ r (Λ(F)), from Proposition 4.9 we conclude that all integral curves of X lin with initial conditions in L remain in F. Since L is flow-invariant under X lin , we conclude that L ⊆ F.
Next we consider the case of affine vector fields. Here we wish to obtain conditions on a defining subbundle ∆ that ensure that its corresponding affine subbundle variety A(∆) remains in a given subbundle F. However, because A(∆) may be empty, we would like instead to make the problem into one that always has a solution, and then leave the matter of checking whether A(∆) is nonempty to something one can do afterwards. To this end, we note that, if A(∆) ⊆ E is flow-invariant under the affine vector field X aff , then is flow-invariant under X aff , according to Lemma 4.18. Clearly Therefore, we seek conditions on a defining bundle ∆ ⊆ E * ⊕ R M that is flow-invariant under X aff (meaning, by definition, that it is flow-invariant under X aff, * ) and satisfies The following result gives conditions to this end, recalling from (2.10) the definition of ∆ 1 pr 1 (∆).

4.25
Theorem: (Defining subbundles invariant under an affine vector field and annihilating a cogeneralised subbundle) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let ∇ be a C r -linear connection in E, let F ⊆ E be a C r -cogeneralised subbundle, let ∆ be a C r -defining subbundle, let X 0 ∈ Γ r (M), let b ∈ Γ r (E), and let A ∈ Γ r (End(E)). Denote X aff = X h 0 + A e + b v and suppose that ∆ is flow-invariant under X aff . Consider the following statements: (ii) the following conditions hold: and, if either (1) r = ω or (2) r = ∞ and F is a subbundle, then (ii) =⇒ (i).
Proof: Let us first explore the linear algebra associated with the objects in the statement of the theorem.
1 Lemma: The following statements hold: Proof: (i) We note that This part of the result then follows from Lemma 2.26.
By letting λ = 0 and g be arbitrary, we see that this implies that a = 1 (of course). Thus we arrive at the conclusion that condition (i) is equivalent to as claimed.
Note that Λ(∆) is flow-invariant under X aff by Lemma 4.7, since ∆ is flow-invariant under X aff, * . Since E×{1} is flow-invariant under X aff (by Lemma 4.18(ii), taking ∆ = {0} and so A(∆) = E), we conclude that the cogeneralised affine subbundle Λ(∆) is flow-invariant under X aff , being the intersection of two flow-invariant sets. Thus, by Proposition 4.13 (specialised to linear vector fields), we have .
We conclude, from Proposition 4.13, that, when r = ω or when r = ∞ and F is a subbundle, that all integral curves of X aff with initial conditions in Λ(∆) remain in F. Since Λ(∆) is flow-invariant under X aff (as we pointed out in the preamble to the proof), this implies (i).
One can combine the previous results with Proposition 4.13 to obtain the following procedure for finding invariant affine subbundles contained in a given subbundle.
We first consider the linear case.

4.26
Remark: (Finding invariant cogeneralised subbundles contained in a cogeneralised subbundle) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let ∇ be a C r -linear connection in E, let F ⊆ E be a C r -cogeneralised subbundle, let X 0 ∈ Γ r (M) be complete, and let A ∈ Γ r (End(E)). Denote Find a flow-invariant cogeneralised subbundle L ⊆ F satisfying the following algebraic/differential conditions: We shall say that L satisfying these conditions is (X lin , F)-admissible. The resulting cogeneralised subbundle L is then flow-invariant under X lin and is contained in F. • In the affine case, we have the following.

4.27
Remark: (Finding invariant affine subbundle varieties contained in a cogeneralised subbundle) Let r ∈ {∞, ω}, let π : E → M be a C r -vector bundle, let ∇ be a C r -linear connection in E, let F ⊆ E be a C r -cogeneralised subbundle, let X 0 ∈ Γ r (M) be complete, let b ∈ Γ r (E), and let A ∈ Γ r (End(E)). Denote Find a flow-invariant defining subbundle ∆ ⊆ E * ⊕ R M satisfying the following algebraic/differential conditions: The resulting affine subbundle variety A(∆) is then flow-invariant under X aff and is contained in F. •

The methodology outlined in the preceding constructions involves some interesting partial differential equations with algebraic constraints. With some effort, it might be possible to apply the integrability theory for partial differential equations of, e.g., Goldschmidt [1967a, 1967b] to arrive at the obstructions to solving these equations. An application of the resulting conditions to the setup of Section 7 would doubtless lead to some interesting answers to the central questions of this paper.

Nonholonomic and constrained variational mechanics
In this section we derive the two sets of equations whose correspondences we study. The equations we produce here are derived by Kupka and Oliva [2001], and we fill in some of the missing steps in their proofs. Additionally, we provide intrinsic proofs for some steps that are carried out using coordinates by Kupka and Oliva. For the most part, however, our derivations are intended to be an illustration of the methodology of Section 3 for working with spaces of curves.
We begin in Section 5.1 by characterising the tangent spaces to various classes of curves, using our constructions from Section 3.6. In Section 5.2 we introduce the energy functions we consider in the paper and indicate how to differentiate these, using the calculus from Section 3.5. The equations of nonholonomic mechanics are derived in Section 5.3, and we reiterate here that it is these equations of Section 5.3 that correspond in physics to the Newton-Euler equations. This is developed in a general and geometric setting by Lewis [2017]. By contrast, the constrained variational equations developed in Section 5.4 do not generally produce equations that correspond to the physical equations of motion; instead, this setting does reproduce the equations for extremals in sub-Riemannian geometry, as we explore in Section 6.
For subsequent brevity, let us make a definition encompassing the data in which we shall be interested in this section.

Definition:
(Constrained simple mechanical system) Let r ∈ {∞, ω}. A C r -constrained simple mechanical system is a quadruple Σ = (M, G, V, D) where (i) M is a C r -manifold (the configuration manifold), (ii) G is a C r -Riemannian metric on M (the kinetic energy metric), (iii) V is a C r -function (the potential energy function), and (iv) D ⊆ TM is a C r -subbundle (the constraint distribution). •

5.2 Remark: (The subbundle assumption for the velocity constraints) In some physical systems, the velocity constraints do not describe a subbundle. In the physical systems of which we are aware, the failure of the velocity constraints to be a subbundle is a result of constraint forces aligning, and so the annihilator of the velocity constraints drops rank in such configurations. As a consequence, when the velocity constraints fail to be a subbundle, they are instead a cogeneralised subbundle (not a generalised subbundle). Moreover, we are not aware of any means of determining the equations of motion when the velocity constraints are not a subbundle. It seems that there is something in the Laws of Nature that is yet to be understood for nonholonomic mechanical systems. The upshot of the above discussion is the following two points: 1. the assumption that the velocity constraints describe a subbundle is made with loss of physical generality; 2. we are not aware of any way of overcoming this loss of generality. •

5.1. Submanifolds of curves and their tangent spaces. In this section we consider how some of the subsets of curves from Section 3.1 can be thought of as submanifolds of H 1 ([t 0 , t 1 ]; M). We shall also describe the tangent spaces to these submanifolds. To do this, we shall use reasoning closely resembling the usual Implicit Function Theorem arguments, using our notions of derivative and tangent space from Sections 3.5 and 3.6.
5.1.1. Curves with endpoint constraints. We first consider the subsets We recall from the proof of Lemma 3.3 that we had defined the two evaluation maps and and had shown that they are continuous. Here we consider their differentiability and their derivatives.

5.1.2.
Curves with derivatives in a distribution. Now we consider the addition to the data of the subbundle D ⊆ TM. We first wish to prove that H 1 ([t 0 , t 1 ]; M; D) is a submanifold of H 1 ([t 0 , t 1 ]; M), and we do this in a manner similar to that in the preceding section, with an argument reminiscent of the Implicit Function Theorem, using the calculus from Section 3.6.
We recall from the proof of Lemma 3.3 the mapping where we evidently have introduced a Riemannian metric G on M. Let us consider the differentiability properties of this map.

Lemma:
(Regular points for the projection onto a distribution) Let (M, G) be a smooth Riemannian manifold, let D ⊆ TM be a smooth subbundle, and let t 0 , t 1 ∈ R satisfy t 0 < t 1 . Then the following statements hold: Given this, to show that P D ⊥ is continuous, it will suffice to show that, for F ∈ Aff ∞ (TM), the function (s, t) → F • P D ⊥ • νσ(s, t) defines a differentiable curve in H 0 ([t 0 , t 1 ]; TM), keeping in mind that, because we are working with H 0 and not H 1 , we can ignore the conditions for differentiability that depend on t-derivatives, cf. Remark 3.10-1. First let f ∈ C ∞ (M) and compute (5.1) As we argued in the proof of Lemma 3.3(iii) (making use of Lemma 1.1(i)), knowledge of an estimate for the right-hand side of this previous expression gives a corresponding estimate for F ∈ Lin ∞ (TM). Thus it suffices to show that, for any f ∈ C ∞ (M) and any variation σ is continuous. Now we turn to β. We first prove a technical sublemma.
It is clear that, if F 1 , . . . , F N ∈ Lin ∞ (R N M ) comprise the standard dual basis, then Thus we conclude that Σ 1 , . . . , Σ N : are continuously differentiable. This construction applies, in particular, to Σ = νσ, in which case we get the assertion of the sublemma by projecting νσ(s, t) =Σ 1 (s, t) X 1 (σ(s, t)) + · · · +Σ N (s, t) X N (σ(s, t)) onto TM. Moreover, the same construction applies if Σ takes values in H 1 ([t 0 , t 1 ]; R N M ), in which case we can apply the result to Σ = δσ, noting in this case that δσ is not continuously differentiable, but continuous.
Given the sublemma, we see that we can write To show that β is continuous, let s 0 ∈ J and let (s j ) j∈Z >0 be a sequence in J converging to s 0 . Let K ⊆ J be a compact subinterval such that s j ∈ K, j ∈ Z ≥0 .
Let us denote where ζ : M → D ⊥ is the zero section and where we are making reference to H 1 ([t 0 , t 1 ]; ζ) as a functor as in Section 3.7. Since ζ is an injective immersion, by Lemma 3.20 we can assert that Z 0 ([t 0 , t 1 ]; D ⊥ ) is a submanifold of H 0 ([t 0 , t 1 ]; D ⊥ ). Since we clearly have and since the preceding lemma shows that points in H 1 ([t 0 , t 1 ]; M; D) are regular points for P D ⊥ , we can assert that
giving the desired conclusion.
The lemma allows us to assert that H 1 ([t 0 , t 1 ]; M; D; x 0 ) is a submanifold of H 1 ([t 0 , t 1 ]; M; D) with tangent space at γ ∈ H 1 ([t 0 , t 1 ]; M; D; x 0 ) given by Let us now consider curves in the distribution fixing the endpoint at t 1 as well. Here we must consider whether the mapping is differentiable at γ ∈ H 1 ([t 0 , t 1 ]; M; D; x 0 ) and whether the derivative is surjective. In this case, one verifies, just as in the proof of Lemma 5.5, that ev t 1 is differentiable. However, it is not generally the case that T γ ev t 1 is surjective. Because of this, we introduce the following terminology.

5.6
Definition: (D-regular curve, D-singular curve) Let (M, G) be a smooth Riemannian manifold, let D ⊆ TM be a smooth subbundle, and let t 0 , t 1 ∈ R satisfy t 0 < t is surjective and is (ii) D-singular if it is not D-regular.
• Let us consider some singular curves to show that the above classification has content. A simple situation where singular curves arise is given in the following result.

5.7 Proposition: (The tangent space of H 1 ([t 0 , t 1 ]; M; D; x 0 ) when D is integrable) Let (M, G) be a Riemannian manifold and let D ⊆ TM be a smooth subbundle. Then the following two statements are equivalent: (i) D is integrable; (ii) for every x 0 ∈ M, for every t 0 , t 1 ∈ R with t 0 < t 1 , and for every γ ∈ . Let Λ(D, x 0 ) be the leaf of the foliation associated with D through x 0 [Abraham, Marsden, and Ratiu 1988, Theorem 4.4.7]. Since γ is absolutely continuous with derivative almost everywhere in D, γ(t) ∈ Λ(D, x 0 ) for all t ∈ [t 0 , t 1 ]. By Lemma 3.12, let σ be a variation of γ for which ξ = δσ. By Lemma 3.7(ii), the curve s → σ(s, τ ) is a differentiable curve whose derivative at s = 0 is not in D γ(t) . Thus, for small s 0 , σ(s 0 , τ ) ∉ Λ(D, x 0 ). Since σ(s 0 , t 0 ) = x 0 ∈ Λ(D, x 0 ) and since the absolutely continuous curve t → σ(s 0 , t) has tangent vector in D for almost every t ∈ [t 0 , t 1 ], it follows that σ(s 0 , τ ) ∈ Λ(D, x 0 ). This contradiction means that we must have where exp is the exponential map associated with the constrained connection giving integrability of D. x 0 ) be a variation of γ satisfying the distribution and endpoint constraints. As we saw above, we have σ(s, t) ∈ Λ(D, x 0 ) for all (s, t) ∈ (−a, a) × [t 0 , t 1 ]. By Lemma 3.12, if ξ ∈ T γ H 1 ([t 0 , t 1 ]; M; D; x 0 ), we have ξ = δσ(0) for a variation σ of γ as just described. Thus This precludes T γ ev t 1 , restricted to T γ H 1 ([t 0 , t 1 ]; M; D; x 0 ), from being surjective. Thus γ is a D-singular curve.
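As a hedged illustration of why integrability forces singular curves, consider the following example, which is our own and not from the source: M = R^2 with the Euclidean metric and D spanned by ∂/∂x. The coordinates (x, y), the base point (a, b), and the component name γ^1 are assumptions made only for this sketch.

```latex
% Assumed example (not from the source):
%   M = \mathbb{R}^2,  D = \mathrm{span}\{\partial/\partial x\},  x_0 = (a, b).
% D is integrable, with leaves the horizontal lines.
\begin{align*}
  \gamma \in H^1([t_0, t_1]; M; D; x_0)
    &\implies \gamma(t) = (\gamma^1(t), b), \quad t \in [t_0, t_1], \\
  \Lambda(D, x_0)
    &= \{ (x, b) \mid x \in \mathbb{R} \}, \\
  \mathrm{image}\bigl( T_\gamma \mathrm{ev}_{t_1} \bigr)
    &\subseteq D_{\gamma(t_1)}
     = \mathrm{span}\{\partial/\partial x\}
     \subsetneq T_{\gamma(t_1)} \mathbb{R}^2.
\end{align*}
% The first line holds because the y-component of \gamma is absolutely
% continuous with derivative zero almost everywhere, hence constant.
```

Every admissible curve in this example ends on the line y = b, so the derivative of the endpoint map cannot be surjective and every such curve is D-singular.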
However, even when the sections of D satisfy the bracket generating condition, it might still be the case that T γ ev t 1 is not surjective.

Example:
(A curve whose right endpoint mapping does not have surjective derivative [Liu and Sussmann 1994, §2.3]) We take M = R 3 and let D be the subbundle One can readily verify that this distribution is bracket generating. We consider the curve γ : [0, 1] → R 3 defined by γ(t) = (0, t, 0). Thus we take t 0 = 0, t 1 = 1, x 0 = (0, 0, 0) and x 1 = (0, 1, 0). To describe the tangent space T γ H 1 ([0, 1]; R 3 ; D), we let G be the Euclidean metric for R 3 and compute the matrix representative of P D ⊥ to be Note that the vector field Y = ∂/∂y has the property that Y (γ(t)) = γ′(t) for t ∈ [0, 1]. Therefore, we compute We also compute We then see that It follows that and so γ is not a regular point for ev t 1 . •

Finally, we can give a differential equation that characterises D-singular curves.

5.9
Proposition: (A characterisation of D-singular curves) Let (M, G) be a smooth Riemannian manifold, let D ⊆ TM be a smooth subbundle, and let t 0 , t 1 ∈ R satisfy t 0 < t 1 .
For γ ∈ H 1 ([t 0 , t 1 ]; M; D), the following statements are equivalent: Proof: Let us make some preliminary computations first.
If we write ξ = ξ + ξ ⊥ where ξ = P D • ξ and ξ ⊥ = P D ⊥ • ξ, then we have Thus we see that ξ can be freely chosen, with ξ (t 0 ) = 0, and then ξ ⊥ is obtained as the solution to the initial value problem and since ξ(t 0 ) = 0, Now we proceed with the two implications of the proof. Lemma 2.36(ii), we see that λ is indeed a section of D ⊥ over γ. Now, for ξ ∈ T γ H 1 ([t 0 , t 1 ]; M; D; γ(t 0 )), we have From (5.7), and noting the definition of λ, we have )(λ(t)))) dt.
As long as we take v ⊥ 1 ≠ 0, it follows that λ is also nowhere zero since it is a solution to a linear differential equation with a nonzero final condition.

A summary of classes of curves and their tangent spaces.
We can summarise the preceding constructions with the following definitions.

5.10
Definition: (Tangent space to spaces of curves in a distribution with endpoint constraints) Let (M, G) be a smooth Riemannian manifold, let D ⊆ TM be a smooth subbundle, let t 0 , t 1 ∈ R with t 0 < t 1 , and let x 0 , x 1 ∈ M.
(v) If γ is additionally a D-regular curve, then the tangent space to

Energies and their derivatives.
We shall work with the natural kinetic energy function associated with a Riemannian metric as a Lagrangian, as well as the simpler potential energy function. We shall also need to characterise an appropriate derivative of the actions associated with these energy functions using our derivative from Definition 3.16. First we consider kinetic energy.

5.11
Definition: (Kinetic energy function, kinetic energy action, constrained kinetic energy action) Let (M, G) be a smooth Riemannian manifold and let t 0 , t 1 ∈ R satisfy t 0 < t 1 .
(i) The kinetic energy function is the mapping (ii) The kinetic energy action is the mapping If, additionally, D ⊆ TM is a smooth subbundle, (iii) the constrained kinetic energy action is the mapping We should ensure that the kinetic energy action is well-defined. Proof: To show that A G is well-defined on H 1 ([t 0 , t 1 ]; M), note that, since γ ∈ H 0 ([t 0 , t 1 ]; TM), there are X 1 , . . . , X N ∈ Γ ∞ (TM) such that we can write for γ 1 , . . . , γ N ∈ L 2 ([t 0 , t 1 ]; R), cf. Sublemma 1 from the proof of Lemma 5.4. Then we have Since the function x → G(X l (x), X m (x)) is smooth, it is bounded on any compact subset of M containing image(γ). Thus is integrable since the product of square integrable functions is integrable by Hölder's inequality. This shows that A G is well-defined. To see that it is continuous, let (γ j ) j∈Z >0 be a sequence in H 1 ([t 0 , t 1 ]; M) converging to γ. First we calculate, for l, m ∈ {1, . . . , N }, Since the sequences (γ l j ) j∈Z >0 , l ∈ {1, . . . , N }, are bounded, we arrive at the conclusion that giving the desired continuity.
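The integrability estimate in the preceding proof can be written out as follows. This is a sketch under stated assumptions rather than a reproduction of the elided displays: it assumes the usual kinetic energy (1/2)G(v, v), writes the derivative of γ in terms of the vector fields X_1, …, X_N with coefficients γ^l in L^2, and introduces the constants C_{lm} and the compact set K purely for illustration.

```latex
% Assumptions (labelled, not from the source):
%   kinetic energy \tfrac12 G(v, v);
%   \gamma'(t) = \sum_{l=1}^N \gamma^l(t) X_l(\gamma(t)) with \gamma^l \in L^2;
%   K \subseteq M compact with \mathrm{image}(\gamma) \subseteq K;
%   C_{lm} = \sup_{x \in K} |G(X_l(x), X_m(x))| < \infty.
\begin{align*}
  A_G(\gamma)
    &= \frac12 \int_{t_0}^{t_1} G(\gamma'(t), \gamma'(t)) \, dt
     = \frac12 \sum_{l,m=1}^N \int_{t_0}^{t_1}
       G\bigl(X_l(\gamma(t)), X_m(\gamma(t))\bigr) \,
       \gamma^l(t) \gamma^m(t) \, dt, \\
  |A_G(\gamma)|
    &\le \frac12 \sum_{l,m=1}^N C_{lm}
       \int_{t_0}^{t_1} |\gamma^l(t)| \, |\gamma^m(t)| \, dt
     \le \frac12 \sum_{l,m=1}^N C_{lm} \,
       \|\gamma^l\|_{L^2} \, \|\gamma^m\|_{L^2} < \infty.
\end{align*}
% The last inequality is Hölder's inequality with exponents p = q = 2.
```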
The following result characterises the derivative of A G .

5.15
Lemma: (Derivative of potential energy action) Let (M, G) be a smooth Riemannian manifold, let V ∈ C ∞ (M) be a potential energy function, let t 0 , t 1 ∈ R satisfy t 0 < t 1 , let γ ∈ H 1 ([t 0 , t 1 ]; M), and let ξ ∈ T γ H 1 ([t 0 , t 1 ]; M). Then A V is differentiable at γ and Proof: Let σ be a variation of γ such that δσ(0) = ξ. Using our definitions of variational derivative, we calculate Thus A V is differentiable with the asserted derivative.
As with the derivative of the kinetic energy action, we denote Now we can combine the two energies, and we record a formula for the derivative in the direction of fixed endpoint variations in the following result.
Having now shown that γ ∈ H 2 ([t 0 , t 1 ]; M), it is straightforward to complete the proof of this part of the theorem. By assumption we have d(A G − A V ); δ = 0 for every δ ∈ H 1 ([t 0 , t 1 ]; γ * D; x 0 , x 1 ). Thus, for δ ∈ H 1 ([t 0 , t 1 ]; γ * D; x 0 , x 1 ), by Lemmata 5.13 and 5.15, and by Sublemma 1 from the proof of Lemma 5.16, we have showing the existence of λ as asserted in part (iii) (iii) Note that part (ii), along with the condition that γ (t) ∈ D γ (t) for t ∈ [t 0 , t 1 ], can be expressed as . Differentiating the second of these equalities gives From the first of the above equalities we have Thus, by Lemma 2.36(ii), which is the desired conclusion. (iii) =⇒ (i) Lemma 2.36(ii) and the fact that Thus , δ(t)) dt = 0 for every δ ∈ H 1 ([t 0 , t 1 ]; γ * D; x 0 , x 1 ). Using Sublemma 1 from the proof of Lemma 5.16 gives . This part of the theorem now follows by Lemmata 5.13 and 5.15.
Based on the theorem, let us extend our notion of nonholonomic trajectories to arbitrary intervals.

5.19
Definition: (Nonholonomic trajectory on general interval) Let r ∈ {∞, ω} and let Σ = (M, G, V, D) be a C r -constrained simple mechanical system. Let I ⊆ R be an interval. A curve γ : I → M is a nonholonomic trajectory for Σ if it satisfies Of course, nonholonomic trajectories for a C r -constrained simple mechanical system are of class C r .
5.20 Remark: (Nonholonomic trajectories and geodesics) We note that, when the potential energy function is zero, the nonholonomic trajectories are geodesics of the constrained connection, restricted to initial conditions in D. This observation seems to have been first made by Synge [1928], and further observations are made by Lewis [1998]. Now we can state a few equivalent characterisations of a constrained variational trajectory.

5.22
Theorem: (Characterisation of constrained variational trajectories) Let r ∈ {∞, ω} and let Σ = (M, G, V, D) be a C r -constrained simple mechanical system. Let t 0 , t 1 ∈ R with t 0 < t 1 and let x 0 , x 1 ∈ M. For γ ∈ H 1 ([t 0 , t 1 ]; M; D; x 0 , x 1 ), the following statements are equivalent: (i) γ is a constrained variational trajectory; (ii) at least one of the following conditions holds: (iii) at least one of the following conditions holds: Proof: (i) =⇒ (ii) The proof of this implication has two cases.
Case II: γ is a D-regular curve As we saw in Lemma 5.4, the kernel of ∆ γ D is the tangent space at γ to H 1 ([t 0 , t 1 ]; M; D; x 0 , x 1 ) as a submanifold of H 1 ([t 0 , t 1 ]; M; x 0 , x 1 ).
Let us now state a couple of technical lemmata. The relevance of these is only understood when they are used in the last part of the proof of this part of the theorem. Thus they are best referred back to, rather than read in order.
(ii) ⇐⇒ (iii) We shall consider two cases, the D-singular case and the D-regular case. In both cases, we shall compute the projection of the equations from part (ii) to both D and D ⊥ . Clearly the original equations hold if and only if both of the projected equations hold.
Case I: γ is a D-singular curve. The equation we work with in this case is G ∇ γ λ + S * D (γ )(λ) = 0, (5.11) for nowhere zero λ ∈ H 1 ([t 0 , t 1 ]; γ * D ⊥ ). Let us take the projection of this equation onto D. Let X, Y ∈ Γ ∞ (D) and α ∈ Γ ∞ (D ⊥ ). We then have We may then compute This gives , which is the first of the equations from part (iii)(a). Now let us compute the projection of (5.11) onto D ⊥ . First, by Lemma 2.36(ii), we have Now, for X ∈ Γ ∞ (D) and α, β ∈ Γ ∞ (D ⊥ ), we have This can then be used to compute and so we conclude that This gives Combining this with (5.13) gives the second of the equations from part (iii)(a).
Case II: γ is a D-regular curve. In this case, the equation we work with is for λ ∈ H 1 ([t 0 , t 1 ]; γ * D ⊥ ). Let us first project this equation onto D. An application of P D to (5.14) gives using Lemma 2.36(ii). As we saw in the proof of the D-singular case above, Combining the preceding two equations gives the first of the equations from part (iii)(b). Next, an application of P D ⊥ to (5.14) gives using Lemma 2.36(ii). By Lemma 2.36(i) and (vi), we have As we saw in the proof of the D-singular case above, This, combined with the preceding two equations, gives the second of the equations from part (iii)(b).
Case I: γ is a D-singular curve In this case, the conclusion follows immediately from Proposition 5.9 and the definition of D-singular curves.

5.24
Definition: (Adjoint field of a constrained variational trajectory) If Σ = (M, G, V, D) is a C r -constrained simple mechanical system, r ∈ {∞, ω}, and if the pair (γ, λ), γ : I → M and λ : I → D ⊥ , satisfy the conditions for γ to be a (either D-singular or D-regular) constrained variational trajectory, then λ is called the adjoint field along γ. • A D-regular constrained variational trajectory for a C r -constrained simple mechanical system is of class C r . However, there is no a priori requirement that D-singular constrained variational trajectories be anything but absolutely continuous.

5.25
Remark: (Constrained variational trajectories that are both D-singular and D-regular) Note that it is possible that a constrained variational trajectory will simultaneously satisfy both of the conditions (ii)(a) and (ii)(b) (or, equivalently, conditions (iii)(a) and ( 2. It may be the case that some constrained variational trajectories are both D-singular and D-regular, but others are just D-singular or just D-regular.
3. It may be the case that all constrained variational trajectories are both D-singular and D-regular. For example, one can see that this situation arises when This suggests introducing the notion of a curve γ ∈ H 1 ([t 0 , t 1 ]; M; D) being strictly D-singular, meaning that it is D-singular but not D-regular. This is indeed an interesting notion to explore, but we shall not do so here. •

Typically one assumes that M is connected and that D is bracket generating since this ensures that, given x 0 , x 1 ∈ M, H 1 ([t 0 , t 1 ]; M; D; x 0 , x 1 ) ≠ ∅ [Chow 1940/1941]. We can then define the length action by Then we make M into a metric space by the metric We say that γ ∈ H 1 ([t 0 , t 1 ]; M; D) is a sub-Riemannian geodesic if there is a partition To determine sub-Riemannian geodesics, one first determines the extremals which are the critical points of the length function G D on H 1 ([t 0 , t 1 ]; M; D). One can imagine, by comparing the length action to the kinetic energy action, that there might be some correspondence between sub-Riemannian geodesics and constrained variational trajectories. For now, we restrict to the case of sub-Riemannian geodesics, and define the energy action in this setting by We then have the following result.
6.1 Lemma: (Relationship between minimisers of the length action and of the energy action) Let (M, D, G D ) be a sub-Riemannian manifold, let t 0 , t 0 , t 1 , t 1 ∈ R satisfy t 0 < t 1 and t 0 < t 1 , and let x 0 , x 1 ∈ M be distinct. Then the following statements hold: (ii) for γ ∈ H 1 ([t 0 , t 1 ]; M; D; x 0 , x 1 ), the following statements are equivalent:

Proof: (i) Note that γ is absolutely continuous if and only if f •γ is absolutely continuous for every f ∈ C ∞ (M). Therefore, γ•τ is absolutely continuous if and only if f •γ •τ is absolutely continuous. Thus the absolute continuity of γ • τ follows from the fact that the composition of an absolutely continuous and Lipschitz function is absolutely continuous [cf. Ziemer 1989, Theorem 2.1.11]. Since τ is continuous, we also have f • γ • τ ∈ L 2 ([t 0 , t 1 ]; R). Since τ is Lipschitz, by Rademacher's Theorem [Federer 1969, Theorem 3.1.5] τ is differentiable almost everywhere, and its derivative is bounded by any Lipschitz constant for τ . Thus, Thus γ • τ ∈ H 1 ([t 0 , t 1 ]; M; D). Finally,

6.4 Remarks: (Properties of Pontryagin extremals) 1. When µ 0 = 1, then, according to Lemma 6.2, the maximisation condition from part (ii) of the definition uniquely determines ξ by Thus ξ is not time-varying in this case, and the Hamiltonian from part (iii) of the definition is simply the associated maximum Hamiltonian from Lemma 6.2: The Pontryagin extremals in this case we call normal . A consequence is that normal extremals are smooth. Note that we can then extend the notion of a normal extremal as a curve γ : I → M defined on an arbitrary interval that is a projection to M of an integral curve of the Hamiltonian vector field associated to the maximum Hamiltonian in this case.
2. When µ 0 = 0, then it is required that µ be nowhere zero along γ. More importantly, however, the maximisation condition from part (ii) of the definition requires that µ(t) ∈ Λ(D), according to Lemma 6.2. Also by Lemma 6.2, the maximum Hamiltonian is zero, and so places no constraints on the time-varying section ξ of ρ * D,µ 0 D. Thus one obtains no information about the velocity γ along the extremal γ directly from the maximisation condition. Indeed, one must look elsewhere, beyond the conditions for Pontryagin extremals, to get useful conditions on velocities. One such condition will arise in Corollary 6.9. Such conditions are also studied in detail in Chapter 12 of [Agrachev, Barilari, and Boscain 2018]. We note, however, that part (iii) of the definition is not vacuous, and gives conditions on the curve µ in Λ(D). The Pontryagin extremals in this case we call abnormal . As with normal extremals, we can extend the notion of an abnormal extremal to arbitrary intervals. Thus, an abnormal extremal in this case is the projection to M of an integral curve of the restriction to Λ(D) of the Hamiltonian vector field X ξ associated with the Hamiltonian H ξ (t, α) = α; ξ(t, π T * M (α)) , where ξ is a time-varying D-valued vector field with appropriate regularity, i.e., in L 2 Γ ∞ (ρ * D,0 D). An important open question in sub-Riemannian geometry is whether there are abnormal sub-Riemannian geodesics that are not smooth. At present, there is no general proof of this, but there are also no counterexamples.
• 6.2. The connection between sub-Riemannian geometry and constrained variational mechanics. In this section we establish the connections between the theory of sub-Riemannian geometry, as described in the preceding section, and the theory of constrained variational trajectories, as described in Section 5.4. For the purposes of the current presentation, when we say "constrained simple mechanical system," we shall always take the potential function V to be zero, so the data is a triple (M, G, D). Let us begin by showing how the data of a sub-Riemannian manifold arises from restricting the data of a constrained simple mechanical system. 6.5 Lemma: (Sub-Riemannian manifolds from constrained simple mechanical systems) If (M, D, G D ) is a sub-Riemannian manifold, then there exists a Riemannian metric G on M such that G D = G|D.
Proof: Let M be properly embedded in R N for sufficiently large N . Let D ⊥ ⊆ TM be the orthogonal complement to D ⊆ TM with respect to the Euclidean Riemannian metric in R N . Then D has the fibre metric G D while we can equip D ⊥ with the restriction of the Euclidean Riemannian metric, for example, which we denote by G D ⊥ . We can then define An immediate consequence of the lemma is that the energy action A G D in sub-Riemannian geometry is the same as the restriction of the energy action A G for a constrained simple mechanical system to H 1 ([t 0 , t 1 ]; M; D), provided that the kinetic energy is chosen so as to agree with the energy of the sub-Riemannian manifold on D. Note that this correspondence relies on the regularity of D; if D is not regular, then there may not be an extension of G D from D to a Riemannian metric on M. But in the regular case, the sub-Riemannian extremals and the constrained variational trajectories agree, being critical points of the same actions. However, somewhat more than this is true, as we shall now explore.
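The construction in the preceding proof can be sketched explicitly. The first display below is our reading of how the two fibre metrics are combined, using the projections P_D and P_{D^⊥} for the splitting TM = D ⊕ D^⊥; it is a plausible reconstruction, not a quotation. The example on R^3 is our own choice, and it uses the complement spanned by ∂/∂z rather than the Euclidean orthogonal complement, since the restriction property only needs some complement to D.

```latex
% Sketch of the extension (a reconstruction, labelled as an assumption):
\[
  G(u, v) = G_D(P_D u, P_D v) + G_{D^\perp}(P_{D^\perp} u, P_{D^\perp} v),
  \qquad u, v \in T_x M.
\]
% For u, v \in D_x we have P_D u = u and P_{D^\perp} u = 0, so G(u, v) = G_D(u, v).
%
% Assumed example (Heisenberg-type, not from the source): M = \mathbb{R}^3,
\begin{align*}
  X_1 &= \frac{\partial}{\partial x} - \frac{y}{2} \frac{\partial}{\partial z}, &
  X_2 &= \frac{\partial}{\partial y} + \frac{x}{2} \frac{\partial}{\partial z}, &
  D &= \mathrm{span}\{X_1, X_2\}, &
  G_D(X_i, X_j) &= \delta_{ij}.
\end{align*}
% Here [X_1, X_2] = \partial/\partial z, so D is bracket generating.
% One Riemannian metric restricting to G_D on D is
\[
  G = dx \otimes dx + dy \otimes dy + \theta \otimes \theta,
  \qquad \theta = dz + \frac{y}{2}\, dx - \frac{x}{2}\, dy,
\]
% since \theta(X_1) = \theta(X_2) = 0 and \theta(\partial/\partial z) = 1, which makes
% \{X_1, X_2, \partial/\partial z\} a G-orthonormal frame.
```

Different choices of complement or of G_{D^⊥} give different Riemannian metrics G, but all of them restrict to the same G_D on D.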
Let us begin by clarifying how the Riemannian metric G relates to the objects associated with G D . Given a smooth Riemannian manifold (M, G) and a smooth subbundle, denote Note that both G D and G D ⊥ define smooth (0, 2)-tensor fields on M. We shall, of course, think of (M, D, G D ) as a sub-Riemannian manifold. As such, associated with G D is the vector bundle mapping G D : T * M → TM and the (2, 0)-tensor field G −1 D on M, as described above. Let us describe these in terms of the Riemannian metric G. To do so, we denote by G −1 the vector bundle metric on T * M associated to G and we define subbundles Λ(D) and Λ(D ⊥ ) of T * M by We note that Λ(D) and Λ(D ⊥ ) are G −1 -orthogonal. We denote Note that both G −1 Λ(D) and G −1 Λ(D ⊥ ) define smooth (2, 0)-tensor fields on M. We then have the following elementary result.
Then we immediately deduce that γ is an integral curve of ξ. We then have, by (2.3) and (6.6), G ∇ γ λ = −S * D (γ )(λ), and this gives this part of the result, according to Theorem 5.22(ii). This result is asserted, but not proved, by Kupka and Oliva [2001]; they likely had a coordinate proof in mind since a coordinate proof is straightforward, if messy. Langerok [2003] gives a related result, proved partly in coordinates. We give, for what we believe is the first time, a coordinate-independent proof of an affine connection characterisation of extremals in sub-Riemannian geometry. Moreover, directly from Theorem 5.22 (iii) we have the following characterisation of sub-Riemannian extremals. (b) there exists a section λ : I → D ⊥ over γ so that γ and λ together satisfy (ii) the following statements are equivalent: (a) γ is an abnormal Pontryagin extremal; (b) there exists a nowhere zero section λ : I → D ⊥ over γ so that γ and λ together satisfy There are a few interesting conclusions one can draw from the preceding results. First of all, in the normal case, our result shows that extremals are projections of integral curves of a smooth linear vector field, and so are smooth. Second of all, we can use our geometric structure to give the following characterisation of abnormal extremals.
6.9 Corollary: (Property of abnormal extremals in sub-Riemannian geometry) Let $(M, D, G_D)$ be a sub-Riemannian manifold. If $\gamma : I \to M$ is an abnormal extremal with adjoint field $\lambda : I \to D^\perp$ over $\gamma$, then $\lambda(t) \in \ker(F_D^*)_{\gamma'(t)}$ for every $t \in I$. In particular, if $F_D^*$ is injective on fibres of $\pi_D^* D^\perp$, then there are no abnormal extremals.

When are nonholonomic trajectories variational (and vice versa)?
In this section we address the question of when nonholonomic and constrained variational trajectories for a constrained simple mechanical system $\Sigma = (M, G, V, D)$ coincide in some way. We recall from Theorems 5.18 and 5.22 that the equations governing nonholonomic trajectories are while the equations governing constrained variational trajectories are in the $D$-singular case and
7.1. Pulling back equations to D.
Throughout the following discussion, we let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, V, D)$ be a $C^r$-constrained simple mechanical system. In order to compare the nonholonomic and $D$-regular constrained variational equations, we shall pull the equations back from $M$ to $D$. As we shall see, doing this will allow us to make use of the methods of Section 4 for determining subbundles invariant under an affine vector field. In particular, these pulled-back equations will be described by an affine vector field. In [Langerok 2003] the constructions we give here are presented in the context of "connections over bundle maps." Note that we have the following vector bundles over $M$: $\pi_{TM} : TM \to M$, $\pi_D : D \to M$, $\pi_{D^\perp} : D^\perp \to M$.
Explicitly and to fix notation, We can pull back $v_x \in T_x M$ to $u_x \in D_x$ according to the formula In particular, if $X \in \Gamma^r(TM)$, we can pull this back to a section $\pi_D^* X$ of $\pi_D^* TM$ by $\pi_D^* X(v) = (X(\pi_{TM}(v)), v)$, $v \in D$.
Finally, if $\gamma : I \to M$ is such that $\gamma'$ is locally absolutely continuous and satisfies $P_{D^\perp} \circ \gamma' = 0$, and if $\xi : I \to TM$ is a vector field along $\gamma$, then we can pull back $\xi$ to a section over $\gamma'$ according to $\pi_D^* \xi(t) = (\xi(t), \gamma'(t)) \in \pi_D^* TM$. Note that $\pi_D^* \xi$ is locally absolutely continuous. Similar constructions hold, of course, for $\pi_D^* D$ and $\pi_D^* D^\perp$.
We wish to pull back some of the bundle maps from Section 2.11 to $D$. We remind the reader of the tensor constructions following Lemma 2.36 and, in particular, that there is a (minor) difference between the superscript $*$ and the second, similar superscript used there. Keeping this in mind, we have vector bundle maps The vector bundle map $\hat{F}_D^*$, particularly its kernel, is important to us. Indeed, we give this kernel a name, referring to Definition 2.42 for the background for the terminology.
We can now give the form of the evolution of the adjoint field λ in the constrained variational equations. First we consider the case of D-singular curves.
Therefore, we also have a.e. t ∈ I.
The proposition now follows immediately.
Next we consider the evolution of the adjoint field in the case of $D$-regular curves for the constrained variational equations.
7.4 Proposition: (Lifted $D$-regular constrained variational equations) Let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, V, D)$ be a $C^r$-constrained simple mechanical system. Let $\gamma : I \to M$ be such that $\gamma'$ is locally absolutely continuous and such that $P_{D^\perp} \circ \gamma' = 0$. Denote $\Upsilon = \gamma'$. Then, for a locally absolutely continuous $\lambda : I \to D^\perp$ satisfying $\pi_{D^\perp} \circ \lambda = \gamma$, the following are equivalent: where $\hat\lambda : I \to \pi_D^* D^\perp$ is defined by $\hat\lambda(t) = (\lambda(t), \Upsilon(t))$.
Proof: Here, in addition to the computations from the proof of Proposition 7.3, we note that $\pi_D^* G_D(\Upsilon, \Upsilon) = (G_D(\gamma', \gamma'), \gamma')$ and $\pi_D^* P_{D^\perp} \circ \operatorname{grad} V \circ \gamma = (P_{D^\perp} \circ \operatorname{grad} V \circ \gamma, \gamma')$, and from this, combined with the computations from the proof of Proposition 7.3, the result follows.
Now let us make these constructions "global," rather than concentrating on a single curve. As ever, let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, V, D)$ be a $C^r$-constrained simple mechanical system. We let $X_D^{\mathrm{nh}} \in \Gamma^r(TD)$ be the vector field on $D$ whose integral curves are curves $\Upsilon = \gamma'$, where $\gamma : I \to M$ is a nonholonomic trajectory: $\overset{D}{\nabla}_{\gamma'} \gamma' + P_D \circ \operatorname{grad} V \circ \gamma = 0$, $\gamma'(t_0) \in D_{\gamma(t_0)}$, for some $t_0 \in I$. That is, $X_D^{\mathrm{nh}}$ is the restriction of $Z_{G,D} + (P_D \circ \operatorname{grad} V)^{\mathrm{v}}$ to $D \subseteq TM$, where $Z_{G,D}$ is the geodesic spray of $\overset{D}{\nabla}$. This makes sense since $D$ is geodesically invariant under $\overset{D}{\nabla}$, cf. [Bullo and Lewis 2004, Theorem 4.87]. We denote by $(X_D^{\mathrm{nh}})^{\mathrm{h}} \in \Gamma^r(T(\pi_D^* D^\perp))$ the horizontal lift of $X_D^{\mathrm{nh}} \in \Gamma^r(TD)$ to $\pi_D^* D^\perp$ by the connection $\overset{D^\perp}{\nabla}{}^*$. Now define $b_D \in \Gamma^r(\pi_D^* D^\perp)$ and $A_D \in \Gamma^r(\operatorname{End}(\pi_D^* D^\perp))$ by for $u_x \in D$ and $(\alpha_x, u_x) \in \pi_D^* D^\perp$. We can then define the linear vector field $X_D^{\mathrm{sing}} \in \Gamma^r(T(\pi_D^* D^\perp))$ by $X_D^{\mathrm{sing}}$ in the $D$-singular case and the affine vector field $X_D^{\mathrm{reg}} \in \Gamma^r(T(\pi_D^* D^\perp))$ in the $D$-regular case.
Let us record the significance of the vector fields $X_D^{\mathrm{sing}}$ and $X_D^{\mathrm{reg}}$, starting with the $D$-singular case. Note that, since we are considering constrained variational trajectories that project to nonholonomic trajectories, we can assume all curves to be of class $C^r$, $r \in \{\infty, \omega\}$.
7.5 Proposition: ($D$-singular constrained variational trajectories along nonholonomic trajectories) Let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, V, D)$ be a $C^r$-constrained simple mechanical system. For a $C^r$-curve $\hat\lambda : I \to \pi_D^* D^\perp$, write $\hat\lambda(t) = (\lambda(t), \Upsilon(t))$. Then the following statements are equivalent:
(i) $\Upsilon$ is an integral curve of $X_D^{\mathrm{nh}}$ and $\lambda$ is such that $\gamma = \pi_D \circ \Upsilon$ and $\lambda$ together satisfy the conditions for a $D$-singular constrained variational trajectory from Theorem 5.22(ii)(a) (or (iii)(a));
(ii) the following conditions hold:
(a) $\lambda(t) \neq 0$ for every $t \in I$;
(b) $\hat\lambda(t) \in \ker(F_D^*)_{\Upsilon(t)}$ for every $t \in I$;
(c) $\hat\lambda$ is an integral curve of $X_D^{\mathrm{sing}}$.
Proof: Note that (i) is equivalent to the three equations along with the condition that $\lambda$ be nowhere zero. By Lemma 2.4, the conditions of part (ii) are equivalent to the three equations $F_D^*(\gamma')(\lambda) = 0$, along with the condition that $\lambda$ be nowhere zero. The proposition follows immediately from the definition of $A_D$.
Now let us consider the $D$-regular case.
Proof: Note that (i) is equivalent to the three equations By Lemma 2.4, the conditions of part (ii) are equivalent to the three equations The proposition follows immediately from the definitions of $A_D$ and $b_D$.
7.2. Main results.
Now we assemble the preceding developments of the paper to prove the main results, giving answers to the questions posed at the beginning of this section.

When are all constrained variational trajectories also nonholonomic trajectories?
Our first result is one that has been observed by many authors, sometimes for more general Lagrangians than we consider here [e.g., Cortés, de León, Martín de Diego, and Martínez 2002, Fernandez and Bloch 2008, Jóźwikowski and Respondek 2019, Kupka and Oliva 2001, Lewis and Murray 1995, Terra 2018]. The result does not rely on our results about invariant cogeneralised distributions or affine subbundle varieties from Section 4. Instead, it is proved just by direct comparison of the nonholonomic and constrained variational equations. The result is the following.
7.7 Theorem: (When all $D$-regular constrained variational trajectories are nonholonomic trajectories) Let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, V, D)$ be a $C^r$-constrained simple mechanical system. Then the following statements are equivalent:
(i) every $D$-regular constrained variational trajectory is a nonholonomic trajectory;
(ii) $D$ is integrable.
Proof: (i) $\implies$ (ii) Suppose that every $D$-regular constrained variational trajectory is a nonholonomic trajectory and that $D$ is not integrable, i.e., that $F_D$ is nonzero, cf. Remark 2.37-3. Then there exist $x \in M$ and $u_x, v_x \in D_x$ such that $F_D(u_x, v_x) \neq 0$. Thus there exists $\alpha_x \in D_x^\perp$ that is not $G$-orthogonal to $F_D(u_x, v_x)$. Then we have $F_D^*(v_x)(\alpha_x) \neq 0$. Therefore, if $\gamma : I \to M$ is a $D$-regular constrained variational trajectory satisfying $\gamma'(t_0) = v_x$ for some $t_0 \in I$ and if $\lambda : I \to D^\perp$ is the corresponding section over $\gamma$ satisfying $\lambda(t_0) = \alpha_x$, then, by continuity, we have $F_D^*(\gamma'(t))(\lambda(t)) \neq 0$ for $t$ sufficiently close to $t_0$. This then prohibits $\gamma$ from being a nonholonomic trajectory.
(ii) $\implies$ (i) If $D$ is integrable, then the Frobenius curvature $F_D$ vanishes, as above. It then follows from Theorem 5.22(iii) that, if $\gamma$ is a $D$-regular constrained variational trajectory, then $\overset{D}{\nabla}_{\gamma'} \gamma' + P_D \circ \operatorname{grad} V \circ \gamma = 0$, and so $\gamma$ is a nonholonomic trajectory by Theorem 5.18(iii).
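Condition (ii) of Theorem 7.7 can be probed numerically in simple cases. The following is a minimal sketch, not part of the paper's development: it assumes $M = \mathbb{R}^3$ and a rank-2 distribution spanned by two given vector fields, and tests at a point whether their Lie bracket leaves the span, which is the failure of involutivity that the proof above expresses as $F_D \neq 0$. The Heisenberg-type fields `heis_X`, `heis_Y`, the fields `flat_X`, `flat_Y`, and all function names are illustrative assumptions.

```python
# Illustrative sketch only: pointwise involutivity test for a rank-2 distribution on R^3.
# The example fields and all names below are assumptions, not taken from the paper.

def jacobian(field, p, h=1e-5):
    """Central-difference Jacobian J[i][j] = d(field_i)/d(x_j) at the point p."""
    n = len(p)
    cols = []
    for j in range(n):
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        f_hi, f_lo = field(hi), field(lo)
        cols.append([(f_hi[i] - f_lo[i]) / (2 * h) for i in range(n)])
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def lie_bracket(X, Y, p):
    """Lie bracket [X, Y](p) = DY(p) X(p) - DX(p) Y(p)."""
    JX, JY = jacobian(X, p), jacobian(Y, p)
    x, y = X(p), Y(p)
    n = len(p)
    return [sum(JY[i][j] * x[j] - JX[i][j] * y[j] for j in range(n)) for i in range(n)]

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def bracket_leaves_distribution(X, Y, p, tol=1e-6):
    """True if [X, Y](p) is not in span{X(p), Y(p)}, so involutivity fails at p."""
    return abs(det3(X(p), Y(p), lie_bracket(X, Y, p))) > tol

# Heisenberg-type distribution: not integrable.
def heis_X(p): return [1.0, 0.0, -p[1] / 2]
def heis_Y(p): return [0.0, 1.0, p[0] / 2]

# Distribution tangent to the surfaces z - x^2 = const: integrable.
def flat_X(p): return [1.0, 0.0, 2 * p[0]]
def flat_Y(p): return [0.0, 1.0, 0.0]

p = [0.3, -0.7, 1.2]
print(bracket_leaves_distribution(heis_X, heis_Y, p))
print(bracket_leaves_distribution(flat_X, flat_Y, p))
```

A single-point test of this kind can only detect non-integrability; confirming integrability requires the bracket condition at every point, so the second example is checked here only at the sample point.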
Note that the question of when all $D$-singular constrained variational trajectories are nonholonomic trajectories does not make sense in our context, since the condition (SCV) for $D$-singular variational trajectories does not determine conditions for $\gamma$. It is possible that there are constrained simple mechanical systems, all of whose $D$-singular constrained variational trajectories are nonholonomic trajectories, but determining conditions for this would require studying higher-order conditions beyond the essentially first-order conditions yielded by Theorem 5.22. This is not something we do here.
7.8 Theorem: (When some nonholonomic trajectories are $D$-regular constrained variational trajectories) Let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, V, D)$ be a $C^r$-constrained simple mechanical system. Consider the following statements:
(i) some nonholonomic trajectories are $D$-regular constrained variational trajectories;
(ii) there exists a $C^r$-affine subbundle variety $A \subseteq \ker(F_D^*)$ that is flow-invariant under $X_D^{\mathrm{reg}}$;
(iii) there exists a partial $(X_D^{\mathrm{reg}}, \ker(F_D^*))$-admissible $C^r$-defining subbundle $\Delta \subseteq (\pi_D^* D^\perp)^* \oplus \mathbb{R}_D$.
Then
Proof: If (i) holds, then, by Proposition 7.6, there is an integral curve $\hat\Upsilon$ of $X_D^{\mathrm{reg}}$ over $\Upsilon$ for which $\operatorname{image}(\hat\Upsilon) \subseteq \ker(F_D^*)$. If $r = \infty$ or if $X_D^{\mathrm{nh}}$ is complete, by Theorem 4.23 we conclude that there is a $C^r$-cogeneralised affine subbundle of $\ker(F_D^*)$ that is flow-invariant under $X_D^{\mathrm{reg}}$. This shows that (ii) holds. If (ii) holds, then there is an integral curve $\hat\Upsilon$ of $X_D^{\mathrm{reg}}$ over $X_D^{\mathrm{nh}}$ with values in $\ker(F_D^*)$. By Proposition 7.6, it follows that this integral curve is a $D$-regular constrained variational trajectory, showing that (i) holds.
The conditions of part (iii) are just those of Theorem 4.25 that are equivalent to the existence of a $C^r$-defining subbundle $\Delta \subseteq \Lambda(\ker(F_D^*))$ that is flow-invariant under the linear vector field on $(\pi_D^* D^\perp)^* \oplus \mathbb{R}_D$ associated with the affine vector field $X_D^{\mathrm{reg}}$. Since $\Delta$ is partial, by Proposition 4.20, $A(\Delta)$ is nonempty. By Lemma 4.18(ii), $A(\Delta)$ is flow-invariant. This gives (ii).
Now we consider when some nonholonomic trajectories are $D$-singular constrained variational trajectories.
7.9 Theorem: (When some nonholonomic trajectories are $D$-singular constrained variational trajectories) Let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, V, D)$ be a $C^r$-constrained simple mechanical system. Consider the following statements:
(i) some nonholonomic trajectories are $D$-singular constrained variational trajectories;
(ii) there exists a $C^r$-cogeneralised subbundle $L \subseteq \ker(F_D^*)$ that is flow-invariant under $X_D^{\mathrm{sing}}$;
(iii) there exists a $(X_D^{\mathrm{sing}}, \ker(F_D^*))$-admissible $C^r$-cogeneralised subbundle $L \subseteq \pi_D^* D^\perp$.
Then
Proof: The proof follows from Theorems 4.22 and 4.24, and Proposition 7.5, in the same way as Theorem 7.8 follows from Theorems 4.23 and 4.25, and Proposition 7.6, noting that $X_D^{\mathrm{sing}}$ is a linear vector field over $X_D^{\mathrm{nh}}$.
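To give some intuition for condition (ii) of Theorem 7.9, the following sketch treats a drastically simplified caricature that is not the setting of the paper: a single fibre $\mathbb{R}^n$, a constant-coefficient linear vector field $x \mapsto Ax$ standing in for $X_D^{\mathrm{sing}}$, and a subspace $\ker C$ standing in for $\ker(F_D^*)$. Under these assumptions, the largest subspace of $\ker C$ invariant under the flow is $\bigcap_{k=0}^{n-1} \ker(C A^k)$, a standard linear-algebra fact. The matrices and function names are hypothetical.

```python
# Illustrative sketch only: a constant-coefficient, single-fibre caricature of finding
# the largest flow-invariant subspace inside a kernel. All names and data are assumptions.
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def nullspace(M, n):
    """Basis of {x in Q^n : M x = 0}, by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in M]
    pivots = []
    r = 0
    for c in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        lead = rows[r][c]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [vi - f * vr for vi, vr in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for fc in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            x[pc] = -rows[row_idx][fc]
        basis.append(x)
    return basis

def largest_invariant_subspace(A, C):
    """Basis of the largest A-invariant subspace contained in ker(C)."""
    n = len(A)
    stacked, block = [], [list(row) for row in C]
    for _ in range(n):          # stack C, CA, ..., CA^(n-1)
        stacked.extend(block)
        block = matmul(block, A)
    return nullspace(stacked, n)

A = [[0, 1, 0], [0, 0, 0], [0, 0, 2]]
C = [[1, 0, 0]]
print(largest_invariant_subspace(A, C))
```

In this toy setting an empty basis means that only the zero subspace is invariant, which loosely mirrors the absence of a nowhere zero adjoint field; the genuine statements involve bundles over $D$ and the base dynamics of $X_D^{\mathrm{nh}}$, which this sketch ignores.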
7.2.3. When is a given nonholonomic trajectory also a constrained variational trajectory?
We now consider the matter of when single nonholonomic trajectories are constrained variational trajectories. One can certainly make use of the general constructions in the preceding section, asking whether the initial condition for the nonholonomic trajectory is covered by a suitable initial condition for the constrained variational trajectory. However, because we are focussing on a single trajectory, the problem can be reduced, and so this should be done. We start with the $D$-regular case. Some words about this are printed in [Terra 2018], but a conclusive statement such as we now give is not quite given by Terra.
Proof: We shall use Theorems 4.23 and 4.25, after pulling all data back from D to I by Υ. We begin by performing all of the required pull-backs, and giving the properties of these.
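10. If $\hat\Upsilon : I \to E$ is a curve for which $\pi \circ \hat\Upsilon = \Upsilon$, then $\operatorname{image}(\Upsilon^* \hat\Upsilon) \subseteq \Upsilon^* F$ if and only if $\operatorname{image}(\hat\Upsilon) \subseteq F$.
We now revert to the unabbreviated notation. Note that $\Upsilon^* A = A_\Upsilon$ and $\Upsilon^* b = b_\Upsilon$. If (i) holds, then, by Proposition 7.6, there is an integral curve $\hat\Upsilon$ of $X_D^{\mathrm{reg}}$ over $\Upsilon$ for which $\operatorname{image}(\hat\Upsilon) \subseteq \ker(F_D^*)$. As pointed out in 10, this implies that $\operatorname{image}(\Upsilon^* \hat\Upsilon) \subseteq \Upsilon^* \ker(F_D^*)$. If $r = \infty$ or if $X_D^{\mathrm{nh}}$ is complete, by Theorem 4.23 we conclude that there is a $C^r$-cogeneralised affine subbundle of $\Upsilon^* \ker(F_D^*)$ that is flow-invariant under $\Upsilon^* X_D^{\mathrm{reg}}$. This shows that (ii) holds.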
If (ii) holds, then there is an integral curve $\Upsilon^* \hat\Upsilon$ of $\Upsilon^* X_D^{\mathrm{reg}}$ over $\tau$ with values in $\Upsilon^* \ker(F_D^*)$. By the observation of 10, this implies that $\hat\Upsilon$ is an integral curve of $X_D^{\mathrm{reg}}$ with values in $\ker(F_D^*)$. By Proposition 7.6, it follows that this integral curve is a $D$-regular constrained variational trajectory.
The conditions of part (iii) are just those of Theorem 4.25 that are equivalent to the existence of a $C^r$-defining subbundle $\Delta \subseteq \Upsilon^* \Lambda(\ker(F_D^*))$ that is flow-invariant under the linear vector field on $\Upsilon^*((\pi_D^* D^\perp)^* \oplus \mathbb{R}_D)$ associated with the affine vector field $\Upsilon^* X_D^{\mathrm{reg}}$. Since $\Delta$ is partial, by Proposition 4.20, $A(\Delta)$ is nonempty. By Lemma 4.18(ii), $A(\Delta)$ is flow-invariant. This gives (ii).
Now we consider the $D$-singular case, of which there is no discussion in the existing literature.
Proof: The proof follows from Theorems 4.22 and 4.24, and Proposition 7.5, in the same way as Theorem 7.10 follows from Theorems 4.23 and 4.25, and Proposition 7.6, noting that $X_D^{\mathrm{sing}}$ is a linear vector field over $X_D^{\mathrm{nh}}$.
7.2.4. When are all nonholonomic trajectories also constrained variational trajectories?
Now we consider the situation where all nonholonomic trajectories are constrained variational trajectories. The results here follow easily along the lines of Theorems 7.8 and 7.9. We consider first the $D$-regular case.
7.14 Corollary: ([Favretti 1998, Theorem 3.2]) Let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, 0, D)$ be a $C^r$-constrained simple mechanical system and suppose that $D$ is the orthogonal distribution for a foliation $\mathcal{F}$ of $M$ for which $G$ is bundle-like. Then every nonholonomic trajectory is a $D$-regular constrained variational trajectory.
Proof: Under the hypothesis that $D$ is geodesically invariant for $\overset{G}{\nabla}$ and that $V = 0$, we have $b_D = 0$. We can then apply Theorem 7.12 with $A$ the zero section.
Note that Favretti actually requires more than is needed, since there is no need for $D$ to be the orthogonal distribution of a foliation for which $G$ is bundle-like. All we require is that $D$ be geodesically invariant for $\overset{G}{\nabla}$. Our next result has to do with certain nonholonomic trajectories that are also constrained variational trajectories.
Thus $b_\Upsilon = 0$, with $\Upsilon = \gamma'$. We can then apply Theorem 7.10 with $A$ the zero section.
The next result we state is one that has been observed by many authors in many different ways. It is an essentially obvious result, but it is worth pointing to a few occurrences of it in order to make connections between various approaches. One such statement is given by Fernandez and Bloch [2008], and requires substantial translation to get from the stated result to something in our terminology. Crampin and Mestdag [2010] give a version of the result as their Corollary 1, although their setup is rather different from ours. Another occurrence is in the paper of Langerok [2003], and is given in a setting more reminiscent of our approach. The result of Fernandez and Bloch is stated for general Lagrangians, while the result of Langerok is given in the setting of sub-Riemannian geometry, and so applies only to kinetic energy Lagrangians. Fernandez and Bloch also attribute the result to Rumiantsev [1978], but we could not locate such a statement in Rumiantsev's paper.
7.16 Corollary: (e.g., [Langerok 2003, Proposition 37], [Fernandez and Bloch 2008, Proposition 2]) Let $r \in \{\infty, \omega\}$ and let $\Sigma = (M, G, V, D)$ be a $C^r$-constrained simple mechanical system. Then a nonholonomic trajectory $\gamma : I \to M$ is a $D$-regular constrained variational trajectory if and only if there exists a smooth section $\lambda : I \to D^\perp$ along $\gamma$ that satisfies for some $(w_x, u_x) \in \pi_D^* D$, and the claim follows by definition of $F_D$. Thus $\ker(F_D^*)^\perp \subseteq \pi_D^* D^{(1)} \implies \pi_D^* (D^{(1)})^\perp \subseteq \ker(F_D^*)$.
Thus, if $\pi_D^* (D^{(1)})^\perp$ is flow-invariant under $X_D^{\mathrm{reg}}$, then so too is $\ker(F_D^*)$. Thus $\ker(F_D^*)$ contains a $C^r$-cogeneralised affine subbundle that is invariant under $X_D^{\mathrm{reg}}$. Since a $C^r$-cogeneralised affine subbundle is a special example of a $C^r$-affine subbundle variety whose base variety is $M$, the result then follows by Theorems 4.24 and 7.12.
Our hypotheses are not the same as those of Terra, but are equivalent to them by Proposition 4.13. Note that we are able to relax the assumption of Terra that $D^{(1)}$ be a subbundle.
Our final result is simply Corollary 7.14, stripped of the extraneous requirement that D be orthogonal to a foliation for which G is bundle-like. We note that all of the results quoted above are either reformulations of the "obvious" condition of Corollary 7.16 or they give conditions under which the purely algebraic conditions of Theorems 7.10 or 7.12 apply, without needing to resort to the differential conditions. It would be interesting to have physical examples-or even mathematical examples-of constrained simple mechanical systems for which every nonholonomic trajectory is a constrained variational trajectory, but for which a verification of this requires one to use the differential conditions of Theorem 7.12.