Constructions and Bounds for Mixed-Dimension Subspace Codes

Codes in finite projective spaces equipped with the subspace distance have been proposed for error control in random linear network coding. The resulting so-called \emph{Main Problem of Subspace Coding} is to determine the maximum size $A_q(v,d)$ of a code in $\operatorname{PG}(v-1,\mathbb{F}_q)$ with minimum subspace distance $d$. Here we completely resolve this problem for $d\ge v-1$. For $d=v-2$ we present some improved bounds and determine $A_q(5,3)=2q^3+2$ (all $q$), $A_2(7,5)=34$. We also provide an exposition of the known determination of $A_q(v,2)$, and a table with exact results and bounds for the numbers $A_2(v,d)$, $v\leq 7$.


Introduction
For a prime power q > 1 let F q be the finite field with q elements and F v q the standard vector space of dimension v ≥ 0 over F q . The set of all subspaces of F v q , ordered by the incidence relation ⊆, is called (v − 1)-dimensional (coordinate) projective geometry over F q and denoted by PG(v − 1, F q ). It forms a finite modular geometric lattice with meet X ∧ Y = X ∩ Y and join X ∨ Y = X + Y .
The study of geometric and combinatorial properties of PG(v − 1, F q ) and related structures forms the subject of Galois Geometry, a mathematical discipline with a long and renowned history of its own but also with links to several other areas of discrete mathematics and important applications in contemporary industry, such as cryptography and error-correcting codes. For a comprehensive introduction to the core subjects of Galois Geometry readers may consult the three-volume treatise [28][29][30]. More recent developments are surveyed in [3].
It has long been recognized that classical error-correcting codes, which were designed for point-to-point communication over a period of now more than 60 years, can be studied in the Galois Geometry framework. Recently, through the seminal work of Koetter, Kschischang and Silva [36,44,45], it was discovered that essentially the same is true for the network-error-correcting codes developed by Cai, Yeung, Zhang and others [25,47,48]. Information in packet networks with underlying packet space F v q can be transmitted using subspaces of PG(v − 1, F q ) as codewords and secured against errors (both random and adversarial errors) by selecting the codewords subject to a lower bound on their mutual distance in a suitable metric on PG(v − 1, F q ), resembling the classical block code selection process based on the Hamming distance properties.
Accordingly, we call any set $\mathcal{C}$ of subspaces of $\mathbb{F}_q^v$ a $q$-ary subspace code of packet length $v$. Two widely used distance measures for subspace codes (motivated by an information-theoretic analysis of the Koetter-Kschischang-Silva model) are the so-called subspace distance
$$d_S(X,Y) = \dim(X+Y) - \dim(X\cap Y) = \dim(X) + \dim(Y) - 2\dim(X\cap Y)$$
and injection distance
$$d_I(X,Y) = \max\{\dim(X),\dim(Y)\} - \dim(X\cap Y).$$
With this the minimum distance in the subspace metric of a subspace code $\mathcal{C}$ containing at least two codewords is defined as
$$d_S(\mathcal{C}) = \min\{d_S(X,Y);\ X,Y\in\mathcal{C},\ X\neq Y\},$$
and the minimum distance $d_I(\mathcal{C})$ in the injection metric is defined analogously.
A subspace code $\mathcal{C}$ is said to be a constant-dimension code (or Grassmannian code) if all codewords in $\mathcal{C}$ have the same dimension over $\mathbb{F}_q$. Since $d_S(X,Y) = 2\cdot d_I(X,Y)$ whenever $X$ and $Y$ are of the same dimension, we need not care about the specific metric ($d_S$ or $d_I$) used when dealing with constant-dimension codes. Moreover, $d_S(\mathcal{C}) = 2\cdot d_I(\mathcal{C})$ in this case and hence the minimum subspace distance of a constant-dimension code is always an even integer.
As in the classical case of block codes, the transmission rate of a network communication system employing a subspace code $\mathcal{C}$ is proportional to $\log(\#\mathcal{C})$. Hence, given a lower bound on the minimum distance $d_S(\mathcal{C})$ or $d_I(\mathcal{C})$ (providing, together with other parameters such as the physical characteristics of the network and the decoding algorithm used, a specified data integrity level), we want the code size $M = \#\mathcal{C}$ to be as large as possible. Constant-dimension codes are usually not maximal in this respect and, as a consequence, we need to look at general mixed-dimension subspace codes for a rigorous solution of this optimization problem.
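The two distance measures are easy to evaluate by machine once subspaces are given by spanning sets. The following is a minimal sketch for the binary case $q = 2$ only; the representation of vectors of $\mathbb{F}_2^v$ as integer bitmasks and all function names are illustrative assumptions, not notation from the paper. It uses $\dim(X\cap Y) = \dim(X) + \dim(Y) - \dim(X+Y)$ to avoid computing intersections.

```python
def rank_gf2(vectors):
    """Rank over F_2 of vectors given as integer bitmasks (assumed encoding)."""
    basis = []
    for vec in vectors:
        for b in basis:
            vec = min(vec, vec ^ b)  # cancels b's leading bit if vec has it
        if vec:
            basis.append(vec)
    return len(basis)


def subspace_distance(X, Y):
    """d_S(X, Y) = 2*dim(X + Y) - dim(X) - dim(Y); X, Y are spanning lists."""
    return 2 * rank_gf2(X + Y) - rank_gf2(X) - rank_gf2(Y)


def injection_distance(X, Y):
    """d_I(X, Y) = dim(X + Y) - min(dim(X), dim(Y))."""
    return rank_gf2(X + Y) - min(rank_gf2(X), rank_gf2(Y))


# Subspaces of F_2^4 spanned by standard basis vectors e1, ..., e4.
X = [0b0001, 0b0010]   # <e1, e2>
Y = [0b0010, 0b0100]   # <e2, e3>
Z = [0b0100, 0b1000]   # <e3, e4>
P = [0b0001]           # <e1>
print(subspace_distance(X, Y), injection_distance(X, Y))
print(subspace_distance(X, P), injection_distance(X, P))
```

For the equidimensional pairs $(X, Y)$ and $(X, Z)$ the subspace distance is twice the injection distance, as noted above; for the pair $(X, P)$ of different dimensions both distances are 1.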
In the remaining part of this article we will restrict ourselves to the subspace distance $d_S$. From a mathematical point of view, any $v$-dimensional vector space $V$ over $\mathbb{F}_q$ is just as good as the standard space $\mathbb{F}_q^v$ (since $V\cong\mathbb{F}_q^v$), and it will sometimes be convenient to work with non-standard spaces (for example with the extension field $\mathbb{F}_{q^v}/\mathbb{F}_q$, in order to exploit additional structure). Hence we fix the following terminology:
The purpose of this paper is to advance the knowledge in the mixed-dimension case and determine the numbers $A_q(v,d)$ for further parameters $q$, $v$, $d$. In this regard we build upon previous work of several authors, as is nicely surveyed by Etzion in [15, Sect. 4]. In the parlance of Etzion's survey, our contribution partially solves Research Problem 20.
The Gaussian binomial coefficients $\genfrac{[}{]}{0pt}{}{v}{k}_q$ give the number of $k$-dimensional subspaces of $\mathbb{F}_q^v$ (and of any ambient space $V\cong\mathbb{F}_q^v$) and satisfy
$$\genfrac{[}{]}{0pt}{}{v}{k}_q = \prod_{i=0}^{k-1}\frac{q^{v-i}-1}{q^{k-i}-1}. \tag{5}$$
Since these numbers grow very quickly, especially for $k\approx v/2$, the exact determination of $A_q(v,d)$ appears to be an intricate task, except for some special cases. Even more challenging is the refined problem of enumerating the isomorphism types of the corresponding optimal subspace codes (i.e. those of size $A_q(v,d)$). In some cases such an exhaustive enumeration is currently infeasible due to the large number of isomorphism types or due to computational limitations. In this context we regard the determination of certain structural restrictions as a precursor to an exhaustive classification.
The remaining part of this paper is structured as follows. In Section 2 we provide further terminology and a few auxiliary concepts and results, which have proved useful for the subsequent subspace code optimization/classification. In Section 3 we determine the numbers $A_q(v,d)$ for general $q$, $v$ and some special values of $d$.
Finally, in Section 4 we further discuss the binary case q = 2 and determine the numbers A 2 (v, d), as well as the corresponding number of isomorphism classes of optimal codes, for v ≤ 7 and all but a few hard-to-resolve cases. For the remaining cases we provide improved bounds for A 2 (v, d) using a variety of methods. On the reader's side we will assume at least some rudimentary knowledge of subspace coding, which e.g. can be acquired by reading the survey [17] or its predecessor [15].

Preliminaries
2.1. The automorphism group of PG(V ), d S . Let us start with a description of the automorphism group of the metric space PG(v−1, F q ) relative to the subspace distance. Since a general v-dimensional ambient space V is isomorphic to F v q as a vector space over F q (and hence isometric to (F v q , d S )), this yields a description of all automorphism groups Aut(V, d S ) and also of all isometries between different ambient spaces (V 1 , d S ) and (V 2 , d S ).
It is clear that the linear group GL(v, F q ) acts on PG(v − 1, F q ) as a group of F q -linear isometries. If q is not prime then there are additional semilinear isometries arising from the Galois group Aut(F q ) = Aut(F q /F p ) in the obvious way (component-wise action on F v q ). Moreover, mapping a subspace X ⊆ F v q ("linear code of length v over F q ") to its dual code X ⊥ (with respect to the standard inner product) respects the subspace distance and hence yields a further automorphism π of the metric space PG(v − 1, F q ). The map π also represents a polarity (correlation of order 2) of the geometry PG(v − 1, F q ).
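That the map $X\mapsto X^\perp$ respects the subspace distance can be confirmed by exhaustive search in a small case. The sketch below does this for $\mathbb{F}_2^4$; the encoding of a subspace as the frozenset of its vectors (integer bitmasks) and the function names are assumptions made for this illustration only.

```python
def all_subspaces(v):
    """All subspaces of F_2^v, each as a frozenset of integer bitmasks."""
    zero = frozenset([0])
    seen, frontier = {zero}, [zero]
    while frontier:
        s = frontier.pop()
        for x in range(1, 1 << v):
            if x not in s:
                t = frozenset(s | {y ^ x for y in s})  # span of s and x
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
    return seen


def dim(s):
    """Dimension of a subspace with 2^dim elements."""
    return len(s).bit_length() - 1


def dual(s, v):
    """Orthogonal complement with respect to the standard inner product."""
    return frozenset(
        x for x in range(1 << v)
        if all(bin(x & y).count("1") % 2 == 0 for y in s)
    )


def d_S(x, y):
    """Subspace distance dim(X) + dim(Y) - 2*dim(X ∩ Y)."""
    return dim(x) + dim(y) - 2 * dim(x & y)


v = 4
subspaces = all_subspaces(v)
is_isometry = all(
    d_S(x, y) == d_S(dual(x, v), dual(y, v))
    for x in subspaces for y in subspaces
)
print(len(subspaces), is_isometry)
```

The search visits all 67 subspaces of $\mathbb{F}_2^4$ (1 + 15 + 35 + 15 + 1) and compares every pair with its image under dualization.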
Theorem 2.1. Suppose that v ≥ 3. 8 The automorphism group G of PG(v − 1, F q ), viewed as a metric space with respect to the subspace distance, is generated by GL(v, F q ), Aut(F q ) and π. More precisely, G is the semidirect product of the projective general semilinear group PΓL(v, F q ) with a group of order 2 acting by matrix transposition on PGL(v, F q ) and trivially on Aut(F q ).
Most of this theorem is already contained in [46], but we include a complete proof for convenience.
Proof. Let $f$ be an automorphism of $\operatorname{PG}(v-1,\mathbb{F}_q)$. Then either $f$ interchanges $\{0\}$ and $\mathbb{F}_q^v$ or leaves both subspaces invariant (using the fact that $\{0\}$, $\mathbb{F}_q^v$ are the only subspaces with a unique complementary subspace). Moreover, if $f$ fixes $\{0\}$, $\mathbb{F}_q^v$ then it preserves the dimension of subspaces and hence represents a collineation of the geometry $\operatorname{PG}(v-1,\mathbb{F}_q)$. By the Fundamental Theorem of Projective Geometry (here we use the assumption $v\ge 3$), a collineation is represented by an element of $\operatorname{P\Gamma L}(v,\mathbb{F}_q)$. Since $\pi$ interchanges $\{0\}$ and $\mathbb{F}_q^v$, either $f$ or $f\circ\pi$ stabilizes $\{0\}$, $\mathbb{F}_q^v$ and belongs to $\operatorname{P\Gamma L}(v,\mathbb{F}_q)$. This proves the first assertion and shows that $\operatorname{P\Gamma L}(v,\mathbb{F}_q)$ has index 2 in $G$. Finally, denoting by $\phi$ the Frobenius automorphism of $\mathbb{F}_q$ (over its prime field $\mathbb{F}_p$), we have $\phi(X^\perp) = \phi(X)^\perp$ for every subspace $X$, so that $\pi$ commutes with $\phi$. Since the adjoint map (with respect to the standard inner product on $\mathbb{F}_q^v$) of $x\mapsto Ax$ is $y\mapsto A^{\mathsf T}y$, the second assertion follows and the proof is complete.
In effect, Theorem 2.1 reduces the isomorphism problem for subspace codes to the determination of the orbits of $\operatorname{GL}(v,\mathbb{F}_q)$, respectively $\operatorname{\Gamma L}(v,\mathbb{F}_q)$, on subsets of $\operatorname{PG}(v-1,\mathbb{F}_q)$. In the most important case $q=2$ the semilinear part is void, which further simplifies the problem. As a word of caution we remark that, in view of the presence of the polarity $\pi$, the dimension distribution of subspace codes is not an isomorphism invariant. Rather, if $\delta(\mathcal{C}) = (\delta_0,\delta_1,\dots,\delta_v)$, then a code $\mathcal{C}'\cong\mathcal{C}$ may also have the reverse distribution $\delta(\mathcal{C}') = (\delta_v,\delta_{v-1},\dots,\delta_0)$.
The formula in (5) for the Gaussian binomial coefficients, which can be put into the form reflects the group-theoretical fact that GL(v, F q ) acts on the set of k-dimensional subspaces of F v q transitively and with a stabilizer isomorphic to We can go further and ask for a description of the orbits of GL(v, F q ) in its induced action on ordered pairs (X, Y ) of subspaces. Such a description has significance for modeling the transmission of subspaces in the Koetter-Kschischang-Silva model by a discrete memoryless (stationary) channel, which in essence amounts to specifying time-independent transition probabilities p(Y |X).
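The orbit-stabilizer interpretation of the Gaussian binomial coefficients can be checked numerically. The sketch below assumes the standard product formula $\prod_{i=0}^{k-1}(q^{v-i}-1)/(q^{k-i}-1)$ for the number of $k$-dimensional subspaces and the standard order $q^{k(v-k)}\cdot\#\operatorname{GL}(k,\mathbb{F}_q)\cdot\#\operatorname{GL}(v-k,\mathbb{F}_q)$ for the stabilizer of a $k$-dimensional subspace; both are well-known facts stated here as assumptions, since the corresponding displays are not legible in the text above. The function names are illustrative.

```python
def gaussian_binomial(v, k, q):
    """Number of k-dimensional subspaces of F_q^v (product formula)."""
    if k < 0 or k > v:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (v - i) - 1
        den *= q ** (k - i) - 1
    return num // den  # the quotient is always an integer


def gl_order(n, q):
    """Order of GL(n, F_q): number of ordered bases of F_q^n."""
    order = 1
    for i in range(n):
        order *= q ** n - q ** i
    return order


def subspaces_via_orbit(v, k, q):
    """Orbit length |GL(v)| / |Stab| for the action on k-subspaces."""
    stabilizer = gl_order(k, q) * gl_order(v - k, q) * q ** (k * (v - k))
    return gl_order(v, q) // stabilizer


print(gaussian_binomial(4, 2, 2), subspaces_via_orbit(4, 2, 2))
print(gaussian_binomial(6, 3, 2), subspaces_via_orbit(6, 3, 2))
```

Both counts agree, for instance 35 planes through the origin in $\mathbb{F}_2^4$ and 1395 three-dimensional subspaces of $\mathbb{F}_2^6$.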

Lemma 2.2.
For any integer triple a, b, c satisfying 0 ≤ a, b ≤ v and max{0, a + b − v} ≤ c ≤ min{a, b} the group GL(v, q) acts transitively on ordered pairs of subspaces 10 Moreover, each such integer triple gives rise to an orbit of GL(v, F q ) on ordered pairs of subspaces of F v q with length Proof. The restrictions on a, b, c are obviously necessary, since dim( It remains to show that GL(v, q) acts transitively on those pairs of subspaces and compute the orbit lengths. Transitivity is an immediate consequence of the fact that the corresponding sequences of a + b − c linearly independent vectors, defined as above, can be isomorphically mapped onto each other. The stabilizer of (X, Y ) in GL(v, F q ) has the form which leads to the stated formula for the orbit length after a short computation. 11 2.2. Basic properties of the numbers A q (v, d; T ). In this subsection we collect some elementary but useful properties of the numbers A q (v, d; T ) and consider briefly the growth of k → A q (v, d; k) (the constant-dimension case). Henceforth V will denote a v-dimensional vector space over F q , if not explicitly stated otherwise, and we will use the abbreviations V T for the set of all subspaces X ⊆ V with dim(X) ∈ T ( V k in the constant-dimension case T = {k}) and C T = C ∩ V T for subspace codes C with ambient space V (with the usual convention Note that the following properties apply in particular to t q , and the unique optimal code in this if min |t − t |; t ∈ T, t ∈ T (i.e., the distance between T and T in the Euclidean metric) is at least d; where d = min |s + t − v|; s, t ∈ T (the distance between v and T +T ⊂ R in the Euclidean metric). 12 Proof. Only (iv), (v) and (vi) require a proof.
In (iv) we may assume For the proof of (v) assume V = F v q and use the fact that the map π : X → X ⊥ represents an automorphism of the metric space PG (This is clearly true if s + t ≤ v, and the case s + t > v can be reduced to the former by setting s = v − s, t = v − t and using (v).) In particular, the diameter of V s is 2 min{s, v − s}. Assertion (vi) now follows from the observation that min{s + t, 2v − s − t} = v − |s + t − v|.
Next we discuss the growth of the numbers A q (v, d; k) as a function of k ∈ {0, 1, . . . , ⌊v/2⌋}. 13 While not directly applicable to the mixed-dimension case, this analysis provides some useful information also for this case, since mixed-dimension codes are composed of constant-dimension "layers".
Since the minimum distance of a constant-dimension code is an even integer, we need only consider the case d = 2δ ∈ 2Z.
Note that $C(q,\delta)$ is independent of $v$, $k$ and satisfies $\lim_{q\to\infty} C(q,\delta) = 1$. In fact our proof of the lemma will show that for $\delta\ge 2$ the number $C(q,\delta) = 1 - q^{-1}$ may be replaced by the larger quantity $\prod_{i=\delta}^{\infty}(1-q^{-i})$, which is even closer to 1.
Proof. First we consider the case $\delta = 1$, in which the numbers $A_q(v,2\delta;k) = A_q(v,2;k) = \genfrac{[}{]}{0pt}{}{v}{k}_q$ are already known. Here the assertion follows from
$$\frac{\genfrac{[}{]}{0pt}{}{v}{k}_q}{\genfrac{[}{]}{0pt}{}{v}{k-1}_q} = \frac{q^{v-k+1}-1}{q^{k}-1} > q^{v-2k+1},$$
using $v-k+1 > k$ for the last inequality.
Now assume $\delta\ge 2$. The lifting construction produces $(v, q^{(k-\delta+1)(v-k)}, 2\delta; k)_q$ constant-dimension codes ("lifted MRD codes") and gives the bound $A_q(v,d;k)\ge q^{(k-\delta+1)(v-k)}$, which is enough for our present purpose. On the other hand, every $(v,M,2\delta;k)_q$ code $\mathcal{C}$ satisfies $d_S(X,X') = 2k - 2\dim(X\cap X')\ge 2\delta$ or, equivalently, $\dim(X\cap X')\le k-\delta$ for any two distinct codewords $X, X'\in\mathcal{C}$. This says that $(k-\delta+1)$-dimensional subspaces of $V$ are contained in at most one codeword of $\mathcal{C}$ and gives by double-counting the upper bound
$$M \le \genfrac{[}{]}{0pt}{}{v}{k-\delta+1}_q \Big/ \genfrac{[}{]}{0pt}{}{k}{k-\delta+1}_q$$
for such codes. Simplifying we get $M < q^{(k-\delta+1)(v-k)}/C(q,\delta,k)$ with $C(q,\delta,k) = \prod_{i=\delta}^{k}(1-q^{-i})$. Replacing $k$ by $k-1$ turns this into an upper bound for $A_q(v,2\delta;k-1)$ and, together with the previously derived lower bound for $A_q(v,d;k)$, gives the estimate
$$\frac{A_q(v,2\delta;k)}{A_q(v,2\delta;k-1)} > C(q,\delta,k-1)\cdot q^{v-2k+\delta}. \tag{7}$$
Hence (and using δ ≥ 2), the quotient in (7) is q , as claimed. The remaining assertions of the lemma are clear.
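The two bounds used in this proof are simple to tabulate. The sketch below evaluates the lifted MRD lower bound $q^{(k-\delta+1)(v-k)}$ and the double-counting upper bound (the number of $(k-\delta+1)$-dimensional subspaces of $V$ divided by the number of those contained in a fixed codeword), assuming $\delta\le k\le v/2$; the function names are illustrative assumptions.

```python
def gaussian_binomial(v, k, q):
    """Number of k-dimensional subspaces of F_q^v (product formula)."""
    if k < 0 or k > v:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (v - i) - 1
        den *= q ** (k - i) - 1
    return num // den


def lifted_mrd_lower_bound(q, v, k, delta):
    """Size of a lifted MRD code with parameters (v, M, 2*delta; k)_q."""
    return q ** ((k - delta + 1) * (v - k))


def double_counting_upper_bound(q, v, k, delta):
    """Each (k-delta+1)-subspace lies in at most one codeword."""
    t = k - delta + 1
    return gaussian_binomial(v, t, q) // gaussian_binomial(k, t, q)


# Example with q = 2, v = 6, k = 3, d = 2*delta = 4.
lower = lifted_mrd_lower_bound(2, 6, 3, 2)
upper = double_counting_upper_bound(2, 6, 3, 2)
print(lower, upper)
```

For these parameters the sketch brackets the value $A_2(6,4;3) = 77$ quoted from [32] in Section 3.3 between 64 and 93.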
From the lemma, the numbers A q (v, d; k) grow fast as a function of k in the range 0 ≤ k ≤ v/2. This implies that the following simple estimates yield quite a good approximation to A q (v, d).
Theorem 2.5. Suppose that for some parameters q, v, d we already know all of the and this constitutes the best bound for A q (v, d) that does not depend on information about the cross-distance distribution between different layers V k and V l . Proof. First note that 2δ, where δ = d/2 , is the smallest even integer ≥ d and hence A q (v, d; k) = A q (v, 2δ; k). Now the upper bound follows from the observation that the two (isomorphic) metric spaces consisting of all subspaces of V of dimension < δ (respectively, > v−δ) have diameter < d and thus contain at most one codeword of any (v, M, d) q code.
The lower bound follows from the inequality d S (X, Y ) ≥ |dim(X) − dim(Y )| and remains valid if we replace v/2 by an arbitrary integer r. In order to show that the lower bound is maximized for r = d/2 , let σ r denote the sum of all numbers A q (v, 2δ; k) with k ∈ [0, v] and k ≡ r mod d. Since σ r is d-periodic and satisfies σ r = σ v−r for 0 ≤ r ≤ v, it suffices to show σ r > σ r−1 for For r in the indicated range, σ r − σ r−1 is a sum of terms of the form where 0 ≤ t ≤ r/d and the convention A q (v, 2δ; k) = 0 for k ∉ [0, v] has been used. From Lemma 2.4 we have 14 From this (and q ≥ 2) we can certainly conclude
2.3. Shortening and puncturing subspace codes. In [15,36] two different constructions of (v − 1, M′, d′) q subspace codes from (v, M, d) q subspace codes were defined and both referred to as "puncturing subspace codes". Whereas the construction in [36] usually has M′ = M (as is the case for puncturing block codes), the construction in [15] satisfies M′ < M apart from trivial cases and behaves very much like the shortening construction for block codes. For this reason, we propose to change its name to "shortening subspace codes". We will now give a simple, coordinate-free definition of the shortening construction and generalize the puncturing construction of [36] to incorporate simultaneous point-hyperplane puncturing.
Definition 2.6. Let C be a subspace code with ambient space V , H a hyperplane and P a point of PG(V ). The shortened codes of C in H, P and the pair P, H are defined as C| H = {X ∈ C; X ⊆ H}, C| P = {X/P ; X ∈ C, P ⊆ X}, with ambient spaces H, V /P and H, respectively.
Note that the operations C → C| P and C → C| H are dual to each other in the sense that they are switched by the polarity π. Simultaneous point-hyperplane shortening C → C| P H glues these parts together by means of the projection map X → (X + P ) ∩ H. The puncturing construction in [15] is equivalent to C → C| P H with the additional assumption that P and H are not incident. This assumption implies C| P ∩ C| H = ∅ and that X → (X + P ) ∩ H maps C| P isomorphically onto the subspace code {Y ∩ H; Y ∈ C| P }. 15 Shortening in point-hyperplane pairs (P, H) with P ⊆ H seems of little value and will not be considered further in this paper.
Definition 2.7. Let C be a subspace code with ambient space V , H a hyperplane and P a point of PG(V ). The punctured codes of C in H, P are defined as C H = {X ∩ H; X ∈ C}, C P = (X + P )/P ; X ∈ C with ambient spaces H and V /P , respectively. Moreover, the punctured code of C in P, H with respect to a splitting C = C 1 C 2 is defined as Here mutatis mutandis the same remarks as on the shortening constructions apply. The original puncturing operation in [36] is C → C H , with attention restricted to constant-dimension codes and the following modification: If C has constantdimension k then C H can be turned into a code of constant-dimension k − 1 by replacing each subspace X ∈ C with X ∩ H = X (i.e. X ⊆ H) by some (k − 1)dimensional subspace contained in X. Simultaneous point-hyperplane puncturing has been defined to round off the construction principles and will not be used in later sections.
The next lemma provides general information about the parameters of shortened and punctured subspace codes. The lemma makes reference to the degree of a point P or a hyperplane H with respect to a subspace code C, which are defined as deg(P ) = {X ∈ C; P ⊆ X} = # C| P and dually as deg(H) = {X ∈ C; X ⊆ H} = # (C| H ), respectively. 16 The same is true of the punctured code (C 1 , C 2 ) P H with respect to any splitting C = C 1 C 2 satisfying d S (C 1 , C 2 ) ≥ d + 1.
The strong bound d′ ≥ d − 1 in Part (i) of the lemma accounts for the significance of the shortening construction, as mentioned in [15].
The usefulness of the bounds in Part (ii), which are weaker, is less clear. The first assertion in (ii) was already observed in [36] (for the code C H ). The condition d S (C 1 , C 2 ) ≥ d + 1 in the second assertion is required, since cross-distances can decrease by 3 during puncturing. Alternatively we could have assumed d ≥ 4 and replaced d′ ≥ d − 2 by d′ ≥ d − 3 in the conclusion.
Proof of the lemma. (i) Since P H, the codes C| H and C| P are disjoint, and since d ≥ 2, the same is true of C| H and {Y ∩ H; Y ∈ C| P }. Hence we have #C| P H = # (C| H ) + # C| P = deg(H) + deg(P ). Since Y → Y ∩ H defines an isometry from PG(V /P ) onto PG(H), we need only check "cross-distances" d S (X, Y ∩ H) with X ∈ C| H , Y ∈ C| P . In this case we have 16 Incidences with the trivial spaces {0}, V (if they are in C) are thus not counted. and the assertion regarding d follows.
(ii) As in the proof of (i) one shows This implies the assertion about C P , and that about C H follows by duality. For the last assertion we need only check cross represented by a linearized polynomial of symbolic degree at most k − δ. 18 The code G forms a geometrically quite regular object. The most significant property, shared by all lifted MRD codes with the same parameters, is that G forms an exact 1-cover of the set of all (k − δ)-flats of PG(V ) that are disjoint from the special flat S = {0} × F q n . 19 A further regularity property, which we will need later, is that every point P / ∈ S has degree q (k−δ)(v−k) with respect to G. Indeed, P = F q (a, b) ∈ Γ f if and only if f (a) = b (using a = 0), which reduces f to a linear map on a (k − 1)-dimensional subspace of W . 20 From now on we assume that n = k (or v = 2k, the "square" case) and hence W = F q k . In this case the codes G = G 2k,k,δ , 1 ≤ δ ≤ k, are invariant under a correlation of PG(F q k × F q k ) fixing S, as our next theorem shows. In particular, every hyperplane H of PG(V ) with H S contains, dually so-to-speak, precisely q (k−δ)(v−k) codewords of G, This property will be needed later in Section 3.3. Before stating the theorem, let us remark that a non-degenerate bilinear form on The symbol ⊥ will denote orthogonality with respect to this bilinear form. Hence hyperplanes of 17 The distance actually drops by 3 in the case P ⊆ X + Y ∧ X ∩ (P + Y ) ⊆ H, and nontrivial examples of P, H, X, Y that satisfy these conditions are easily found. 18 Recall that every Fq-linear endomorphism of F q n is represented by a unique linearized polynomial of symbolic degree ≤ n − 1. Restriction to W then gives a canonical representation of Fq-linear maps f : W → F q n by linearized polynomials of symbolic degree ≤ k − 1. 19 Since the codewords of G are disjoint from S (since they are graphs of linear maps), it is clear that only flats disjoint from S are covered. 
The exact cover property is a consequence of Delsarte's characterization of MRD codes (cf. Footnote 20) and is proved in [32,Lemma 6], for example. 20 Here the following property of Gabidulin codes (or lifted MRD codes in general) due to Delsarte [10] simplifies the view considerably: Every Fq-linear map g : U → F q n , defined on an arbitrary (k − δ + 1)-dimensional subspace U of W , extends uniquely to a linear map f ∈ G. This property also gives #G immediately.
Theorem 2.9. The q-ary Gabidulin codes G = G 2k,k,δ are linearly isomorphic to their duals G ⊥ = {X ⊥ ; X ∈ G} and hence invariant under a correlation of PG(v − 1, F q ). Any correlation κ fixing G fixes also S. 21 The last assertion follows from the fact that S is the unique (k − 1)-flat complementary to all codewords of G. Remark 1. The automorphism group Aut(G) of G = G 2k,k,δ obviously contains all collineations of PG(V ), V = F q k × F q k , induced by linear maps of the form (x, y) → ax, by + f (x) with a, b ∈ F × q k and f (x) as above. These collineations form a subgroup of Aut(G), which has two orbits on the point set P of PG(V ), viz. S and P \ S. 22 From this and Theorem 2.9 we have that for any point P / ∈ S and any hyperplane H S there exists a correlation κ ∈ Aut(G) satisfying κ(P ) = H.

2.5. A second-order bound for the size of the union of a finite set family. A recurring theme in subsequent sections of this paper will be the determination of the best lower bound for the size of a union $A_1\cup\dots\cup A_\delta$ of a family of $t$-sets $A_1,\dots,A_\delta$ with a prescribed upper bound $s$ on the size of their pairwise intersections $A_i\cap A_j$, $1\le i<j\le\delta$. For example, we may ask for the least number of points covered by $\delta$ $k$-dimensional subspaces of $\mathbb{F}_q^v$ at pairwise subspace distance $\ge 2k-2$, in which case $t = 1+q+\dots+q^{k-1}$ and $s=1$. 23 Assuming that $A_1,\dots,A_\delta\subseteq A$ for some "universal" set $A$, we define $a_i$ as the number of "points" $a\in A$ contained in exactly $i$ members of $A_1,\dots,A_\delta$.
21 This also shows that the square of $\kappa$ is the collineation induced by $\phi^2\colon \mathbb{F}_q(a,b)\mapsto\mathbb{F}_q(a^q,b^q)$.
22 Clearly this subgroup is also transitive on $\mathcal{G}$, but this fact won't be used in the sequel.
23 It goes (well, almost) without saying that subspaces of $\mathbb{F}_q^v$ are identified with the sets of points (1-dimensional subspaces) they contain.
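Our assumptions and easy double-counting arguments give the following "standard equations/inequalities" for the first three binomial moments of the sequence $a_0, a_1, a_2, \dots$:
$$\mu_0 = \sum_{i\ge 0} a_i = \#A,\qquad \mu_1 = \sum_{i\ge 1} i\,a_i = \sum_{j=1}^{\delta}\#A_j = \delta t,\qquad \mu_2 = \sum_{i\ge 2}\binom{i}{2}a_i = \sum_{1\le i<j\le\delta}\#(A_i\cap A_j)\le s\binom{\delta}{2}. \tag{8}$$
From this we infer that $\#(A_1\cup\dots\cup A_\delta) = \sum_{i\ge 1}a_i \ge \sum_{i\ge 1}\bigl(i-\binom{i}{2}\bigr)a_i = \mu_1-\mu_2 \ge \delta t - s\binom{\delta}{2}$. This is a special case of the classical second-order Bonferroni Inequality in Probability Theory [23]. Our setting usually permits $a_i > 0$ for fairly large $i$, so that we need a stronger bound. In the following lemma we state a bound suitable for our purposes. In essence this result is also known from Probability Theory [23]. We have adapted it to the present combinatorial setting and provide a self-contained proof, which may be of independent interest.
Lemma 2.10. Suppose that $A_1,\dots,A_\delta$ are finite nonempty sets with moments $\mu_1$, $\mu_2$ as defined in (8). 24 Then
$$\#(A_1\cup\dots\cup A_\delta) \ge \frac{2\mu_1}{i_0+1} - \frac{2\mu_2}{i_0(i_0+1)},\qquad\text{where } i_0 = \lfloor 2\mu_2/\mu_1\rfloor + 1. \tag{9}$$
Moreover, (9) remains valid if $\mu_1$, $\mu_2$ are replaced by any known bounds $\underline{\mu}_1\le\mu_1$ and $\overline{\mu}_2\ge\mu_2$.
Proof. Using the representation $p(X) = p_0 + p_1X + p_2X(X-1)/2$, we can evaluate $\sum_{i\ge 0}p(i)a_i$ in terms of $\mu_0$, $\mu_1$, $\mu_2$ for every polynomial $p(X)\in\mathbb{R}[X]$ of degree at most 2. If $p(0)>0$ and $p(i)\ge 0$ for $i = 1,2,\dots$, this will give an upper bound for $a_0$ and hence a lower bound for $\#(A_1\cup\dots\cup A_\delta) = \sum_{i\ge 1}a_i = \mu_0-a_0$. 25 The polynomials $p_i(X) = (X-i)(X-i-1)$, $i = 1,2,\dots$, have this property, since they are convex and vanish at two successive integers. We obtain
$$i(i+1)\,a_0 \le \sum_{j\ge 0}p_i(j)\,a_j = i(i+1)\mu_0 - 2i\mu_1 + 2\mu_2,$$
which is equivalent to
$$\#(A_1\cup\dots\cup A_\delta) \ge \frac{2\mu_1}{i+1} - \frac{2\mu_2}{i(i+1)}. \tag{10}$$
The best (largest) of these bounds can be determined by applying standard calculus techniques to the function $f(x) = \frac{2\mu_1}{x+1} - \frac{2\mu_2}{x(x+1)}$. At integral arguments $i\ge 2$ we have $f(i)-f(i-1) = \frac{2}{i(i+1)}\bigl(\frac{2\mu_2}{i-1}-\mu_1\bigr)$, which is nonnegative if and only if $i-1\le 2\mu_2/\mu_1$. Hence the maximum value of $f$ at integral arguments is obtained at $i_0 = \lfloor 2\mu_2/\mu_1\rfloor + 1$. 26 For a proof of the last assertion note that the family of bounds (10) also holds for $\underline{\mu}_1$, $\overline{\mu}_2$ in place of $\mu_1$, $\mu_2$. The rest of the proof is (mutatis mutandis) the same.
24 The definition of $\mu_1$, $\mu_2$ does not depend on the choice of $A$.
25 Exhibiting a quadratic polynomial $p(X)$ suitable for a particular problem in Combinatorics is sometimes referred to as the "variance trick"; cf. [7, p. 6].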
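The second-order bound can be evaluated mechanically. The following sketch assumes the bound in the form $\#(A_1\cup\dots\cup A_\delta)\ge 2\mu_1/(i_0+1) - 2\mu_2/(i_0(i_0+1))$ with $i_0 = \lfloor 2\mu_2/\mu_1\rfloor + 1$, where $\mu_1$ is the sum of the set sizes and $\mu_2$ the sum of the sizes of all pairwise intersections, as derived in the proof above. The function names and the Fano plane test family are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def moments(sets):
    """Return (mu1, mu2): sum of sizes and sum of pairwise intersection sizes."""
    mu1 = sum(len(a) for a in sets)
    mu2 = sum(len(a & b) for a, b in combinations(sets, 2))
    return mu1, mu2


def union_lower_bound(mu1, mu2):
    """Second-order lower bound for the size of the union, with optimal i0."""
    i0 = (2 * mu2) // mu1 + 1
    return Fraction(2 * mu1, i0 + 1) - Fraction(2 * mu2, i0 * (i0 + 1))


# The seven lines of the Fano plane: 3-sets meeting pairwise in one point.
fano = [frozenset(l) for l in
        ({1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
         {2, 5, 7}, {3, 4, 7}, {3, 5, 6})]
mu1, mu2 = moments(fano)
print(mu1, mu2, union_lower_bound(mu1, mu2), len(frozenset().union(*fano)))
```

For the Fano plane the Bonferroni bound $\mu_1-\mu_2$ is 0, whereas the bound with $i_0 = 3$ gives 7, the true size of the union.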

Classification results for general parameter sets
In this section we present old and new results on optimal subspace codes in the mixed-dimension case for general q, v, and d. We start with the largest possible minimum distances (i.e. d ≈ v) and later switch to small d. Whenever possible, we determine the numbers A q (v, d), the dimension distributions realized by the corresponding optimal codes, and a classification of the different isomorphism types. In order to avoid trivialities, we assume from now on v ≥ 3 and 2 ≤ d ≤ v.
3.1. Subspace distance v. Apart from the trivial case d = 1 covered already by Lemma 2.3(i), the case d = v is the easiest to settle. For the statement of Part (ii) of the classification result recall that the largest size of a (2k, M, 2k; k) q constant-dimension code is M = A q (2k, 2k; k) = q k + 1 and that optimal (2k, q k + 1, 2k; k) q codes are the same as (k−1)-spreads in PG(2k−1, F q ), i.e. sets of mutually disjoint (k − 1)-flats (or k-dimensional subspaces of F 2k q ) partitioning the point set of PG(2k − 1, F q ). The number of isomorphism classes of such spreads or, equivalently, the number of equivalence classes of translation planes of order q k with kernel containing F q under the equivalence relation generated by isomorphism and transposition [11,34], is generally unknown (and astronomically large even for modest parameter sizes).
(v, q k + 1, v) q subspace code has constant dimension k. The exact number of isomorphism classes of such codes is known in the following cases:

q   v   # isomorphism classes
2   4   1
2   6   1
2   8   7
3   4   2
3   6   7
4   4   3
5   4   20
7   4   973

The numbers A q (v, v) have also been determined in [22, Sect. 5].
26 If 2µ 2 /µ 1 ∈ Z then there are two optimal solutions, viz. 2µ 2 /µ 1 and 2µ 2 /µ 1 + 1. The (unique)
Since v is odd, subspaces of the same dimension cannot be complementary, excluding the existence of three mutually complementary subspaces. This implies A q (v, v) = 2. The classification of optimal (v, 2, v) q codes is then immediate.
(ii) Suppose that C is an arbitrary (2k, M, 2k) q code. If C contains a codeword of dimension i = k then all other codewords must have dimension 2k − i = i and hence #C ≤ 2. Certainly C cannot be optimal in this case. Hence C has constant dimension k and size M = q k + 1.
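The correspondence between optimal $(2k, q^k+1, 2k; k)_q$ codes and spreads can be made concrete in the smallest case $q = 2$, $k = 2$. The sketch below builds the classical (Desarguesian) line spread of $\operatorname{PG}(3,\mathbb{F}_2)$ as the set of multiplicative cosets of $\mathbb{F}_4$ in $\mathbb{F}_{16}$. This is a standard construction given as an illustration under stated assumptions: the model $\mathbb{F}_{16} = \mathbb{F}_2[x]/(x^4+x+1)$, the encoding of field elements as 4-bit integers and the function names are choices made here, not taken from the paper.

```python
def gf16_mul(a, b):
    """Multiply in F_16 = F_2[x]/(x^4 + x + 1); elements are 4-bit ints."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011  # reduce modulo x^4 + x + 1
    return result


def subfield_f4():
    """The subfield F_4 = {0, 1, w, w^2} with w = x^5 of order 3."""
    w = 1
    for _ in range(5):
        w = gf16_mul(w, 0b0010)
    return frozenset({0, 1, w, gf16_mul(w, w)})


def desarguesian_spread():
    """All sets a*F_4 with a != 0: 2-dimensional F_2-subspaces of F_16."""
    f4 = subfield_f4()
    return {frozenset(gf16_mul(a, c) for c in f4) for a in range(1, 16)}


spread = desarguesian_spread()
print(len(spread), sorted(sorted(line) for line in spread))
```

The five resulting subspaces meet pairwise only in the zero vector, so any two of them are at subspace distance $2 + 2 - 0 = 4 = v$, and together they cover all 15 points, in agreement with $A_2(4,4) = 2^2+1 = 5$.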
Determining the isomorphism classes of the optimal (2k, q k + 1, 2k; k) q codes in the table amounts to classifying the translation planes of order ≤ 49 up to isomorphism and polarity. This has been done in a series of papers [9,12,13,26,40], from which we have collected the relevant information; cf. also [42, Sect. 5]. 27
We remark that it is easy to obtain the numbers A q (v, v; T ) for arbitrary subsets
3.2. Subspace distance v − 1. Here we can no longer expect that optimal subspace codes have constant dimension, since for example in a (2k, q k + 1, 2k; k) q constant-dimension code replacing any codeword by an incident (k − 1)- or (k + 1)-dimensional subspace produces a subspace code with d = v − 1. However, it turns out that the largest constant-dimension codes satisfying d ≥ v − 1 are still optimal among all (v, M, v − 1) q codes and that there are only few possibilities for the dimension distribution of an optimal (v, M, v − 1) q code.
Before stating the classification result for $d = v-1$, let us recall that in the case of odd length $v = 2k+1$ the optimal constant-dimension codes in the two largest layers (dimensions $k$ and $k+1$, which are isomorphic as metric spaces) correspond to maximal partial $(k-1)$-spreads in $\operatorname{PG}(2k,\mathbb{F}_q)$ and their duals. 28 The maximum size of a partial $(k-1)$-spread in $\operatorname{PG}(2k,\mathbb{F}_q)$ is $q^{k+1}+1$, as determined by Beutelspacher [4, Th. 4.1]; cf. also [14, Th. 2.7]. This gives $A_q(2k+1,2k;k) = A_q(2k+1,2k;k+1) = q^{k+1}+1$. Moreover, there are partial spreads $\mathcal{S}$ of the following type: The $q^k$ holes (uncovered points) of $\mathcal{S}$ form the complement of a $k$-dimensional subspace $X_0$ in a $(k+1)$-dimensional subspace $Y_0$, and $X_0\in\mathcal{S}$. We may call $X_0$ the "moving subspace" of $\mathcal{S}$, since it can be replaced by any other $k$-dimensional subspace of $Y_0$ without destroying the spread property of $\mathcal{S}$.
All optimal subspace codes contain, apart from codewords of dimension $k$, at most one codeword of each of the dimensions $k-1$ and $k+1$. The dimension distributions realized by optimal subspace codes are
In (ii) it is necessary to exclude the case $v = 3$, since $A_q(3,2) = q^2+q+2$; cf. Section 3.4. Some results on the numbers $A_q(v,v-1)$ can also be found in [22, Sect. 5].
Proof. (i) Let $\mathcal{C}$ be an optimal $(2k, M, 2k-1)_q$ code. Since the set of all subspaces of $V$ of dimension $<k$ has diameter $2k-2$, at most one codeword of dimension $<k$ can occur in $\mathcal{C}$, and similarly for dimension $>k$. This and Theorem 3.1(ii) give $q^k+1\le M\le q^k+3$. Clearly there must be codewords of dimension $k$, and hence none of dimensions $<k-1$ or $>k+1$.
27 Uniqueness of the projective planes of orders 4 and 8 gives the uniqueness of the $(4,5,4;2)_2$ and $(6,9,6;3)_2$ codes. The 8 translation planes of order 16 include 1 polar pair (the Lorimer-Rahilly and Johnson-Walker planes), accounting for 7 isomorphism classes of $(8,17,8;4)_2$ codes. The 2 translation planes of order 9 ($\operatorname{PG}(2,\mathbb{F}_3)$ and the Hall plane) are both self-polar, accounting for 2 isomorphism classes of $(4,10,4;2)_3$ codes. The 7 translation planes of order 27 are all self-polar, accounting for 7 isomorphism classes of $(6,28,6;3)_3$ codes. Among the translation planes of order 16, three planes ($\operatorname{PG}(2,\mathbb{F}_{16})$, the Hall plane and one of the two semifield planes) have a kernel of order 4. All three planes are self-polar, accounting for 3 isomorphism classes of $(4,17,4;2)_4$ codes. Finally, there are 21 translation planes of order 25 including 1 polar pair (the two Foulser planes) and 1347 translation planes of order 49 including 374 polar pairs, accounting for the remaining two table entries.
28 A partial spread is a set of mutually disjoint subspaces of the same dimension which does not necessarily cover the whole point set of the geometry.
If there exists X 0 ∈ C with dim(X 0 ) = k − 1 then X 0 ∩ Z = ∅ for all other codewords Z ∈ C k . Similarly, if there exists Y 0 ∈ C with dim(Y 0 ) = k + 1 then Y 0 ∩ Z = P is a point for all Z ∈ C k , and if both X 0 and Y 0 exist then they must be complementary subspaces of V .
3.3. Subspace distance $v-2$. The case $d=v-2$ is yet more involved, and we are still far from being able to determine the numbers $A_q(v,v-2)$ in general. For even $v=2k$ the problem almost certainly includes the determination of the numbers $A_q(2k,2k-2;k)$, which are known so far only in a single nontrivial case, viz. $A_2(6,4;3)=77$ [32]. On the other hand, we will present rather complete information on the odd case $v=2k+1$, for which the corresponding numbers $A_q(2k+1,2k-1;k)=A_q(2k+1,2k;k)=q^{k+1}+1$, equal to the size of a maximal partial $(k-1)$-spread in $\operatorname{PG}(2k,\mathbb{F}_q)$, are known; cf. the references in Section 3.2. Our results are collected in Theorem 3.3 below.

For the proof of the theorem we will need the fact that a maximal partial $(k-1)$-spread $\mathcal{S}$ in $\operatorname{PG}(2k,\mathbb{F}_q)$ covers each hyperplane at least once. This (well-known) fact may be seen as follows: If a hyperplane $H$ of $\operatorname{PG}(2k,\mathbb{F}_q)$ contains $t$ members of $\mathcal{S}$, it intersects each of the remaining $q^{k+1}+1-t$ members in a $(k-1)$-dimensional space and hence
\[
t(1+q+\dots+q^{k-1})+(q^{k+1}+1-t)(1+q+\dots+q^{k-2})\le\#H=1+q+\dots+q^{2k-1},
\]
or $tq^{k-1}\le q^{k-1}+q^k$. The difference $q^{k-1}+q^k-tq^{k-1}=q^{k-1}(q+1-t)$ gives the number of holes of $\mathcal{S}$ in $H$, which must be $\le q^k$ (the total number of holes of $\mathcal{S}$). This implies $1\le t\le q+1$, as asserted.

34 The last point $\delta=q^{k+1}$ is best viewed as the right endpoint of the hole $[q^{k+1}-1,q^{k+1}]$, since the function $f(q;\delta)$ does not really matter there. Alternatively, one could check the strict inequality at $\delta=2$ and $\delta=q^{k+1}-1$, respectively.

35 This is not unexpected, since in these cases the optimal codes have size $q^{k+1}+1$ and the sets $A$, $B$ partition the point set of $\operatorname{PG}(2k,\mathbb{F}_q)$.

36 The last computation could be replaced by another convexity argument involving the function $i_0\mapsto f\bigl(i_0;1+(i_0-1)(1+q+\dots+q^k)\bigr)$.
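The counting argument for hyperplanes can be checked numerically. The following is a minimal sketch that evaluates the point counts from the text for small $q$ and $k$; the function names are ours and purely illustrative.

```python
def points(dim, q):
    """Number of points (1-dimensional subspaces) of a dim-dimensional F_q-space."""
    return sum(q**i for i in range(dim))  # 1 + q + ... + q^(dim-1), and 0 for dim = 0


def total_holes(q, k):
    """Points of PG(2k, F_q) not covered by a partial (k-1)-spread of size q^(k+1) + 1."""
    return points(2 * k + 1, q) - (q**(k + 1) + 1) * points(k, q)


def holes_in_hyperplane(q, k, t):
    """Holes inside a hyperplane H that contains exactly t spread members."""
    size = q**(k + 1) + 1
    # members inside H contribute all their points, the others a (k-1)-dimensional trace
    covered = t * points(k, q) + (size - t) * points(k - 1, q)
    return points(2 * k, q) - covered  # #H = 1 + q + ... + q^(2k-1)


def admissible_t(q, k):
    """Values of t for which the hole count in H is between 0 and the total hole count."""
    return [t for t in range(q**(k + 1) + 2)
            if 0 <= holes_in_hyperplane(q, k, t) <= total_holes(q, k)]


if __name__ == "__main__":
    for q in (2, 3, 4, 5):
        for k in (2, 3):
            assert total_holes(q, k) == q**k
            assert admissible_t(q, k) == list(range(1, q + 2))
            print(q, k, admissible_t(q, k))
```

For every tested pair the admissible values are exactly $1\le t\le q+1$, in agreement with the argument above.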
We also see that the number of holes of $\mathcal{S}$ in every hyperplane of $\operatorname{PG}(2k,\mathbb{F}_q)$ is of the form $sq^{k-1}$ with $s\in\{0,1,\dots,q\}$. Further, since the average number of members of $\mathcal{S}$ in a hyperplane is
\[
\frac{(q^{k+1}+1)(1+q+\dots+q^{k})}{1+q+\dots+q^{2k}}=\frac{1+q+\dots+q^{2k+1}}{1+q+\dots+q^{2k}}>q,
\]
there exists at least one hyperplane containing $q+1$ members, and hence no holes of $\mathcal{S}$. The latter says that the set of holes of $\mathcal{S}$ does not form a blocking set with respect to hyperplanes and implies in particular that no line consists entirely of holes of $\mathcal{S}$. This fact will be needed later in Remark 4.
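For a concrete illustration, the sketch below builds one particular maximal partial line spread of size 9 in $\operatorname{PG}(4,\mathbb{F}_2)$ and tabulates, for each of the 31 hyperplanes, the number $t$ of spread lines it contains and the number of holes it contains. The construction (row spaces of $(I_2\mid A_a)$ with $A_a$ derived from multiplication in $\mathbb{F}_8=\mathbb{F}_2[x]/(x^3+x+1)$, plus one line in the complementary plane), the bit encoding of vectors and all names are our own choices for illustration.

```python
def times_alpha(a):
    """Multiply a in F_8 = F_2[x]/(x^3 + x + 1), stored as a 3-bit integer, by the class of x."""
    a <<= 1
    return a ^ 0b1011 if a & 0b1000 else a


def build_spread():
    """Nine pairwise disjoint lines of F_2^5; vectors are 5-bit integers.

    Bits 3 and 4 hold the first two coordinates, bits 0-2 the last three.
    """
    lines = []
    for a in range(8):
        u = 0b01000 | a               # row (1 0 | a)
        w = 0b10000 | times_alpha(a)  # row (0 1 | a * alpha)
        lines.append(frozenset({u, w, u ^ w}))
    lines.append(frozenset({1, 2, 3}))  # one line inside the plane {1, ..., 7}
    return lines


def hyperplane(h):
    """Nonzero vectors x of F_2^5 with <x, h> = 0."""
    return {x for x in range(1, 32) if bin(x & h).count("1") % 2 == 0}


def hyperplane_profile(lines):
    """Map each hyperplane h to (number of lines inside it, number of holes inside it)."""
    covered = set().union(*lines)
    profile = {}
    for h in range(1, 32):
        H = hyperplane(h)
        t = sum(1 for line in lines if line <= H)
        profile[h] = (t, len(H - covered))
    return profile


if __name__ == "__main__":
    spread = build_spread()
    assert len(set().union(*spread)) == 27  # 9 disjoint lines, 4 holes
    prof = hyperplane_profile(spread)
    print(sorted(set(prof.values())))
```

In this example every hyperplane contains between 1 and $q+1=3$ lines, the number of holes in it equals $2(3-t)$, and the total $\sum t=63$ over 31 hyperplanes gives an average above $q=2$.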
Our strategy now is to bound the size $\#C_k$ of the "middle layer" in terms of $t=\dim(X_0)$. If this leads to a sharp upper bound for $\#C$ which conflicts with the best known lower bound for $A_q(2k,2k-2;k)$, we can conclude $C=C_k$, and hence $A_q(2k,2k-2)=A_q(2k,2k-2;k)$.
Since any two codewords in $C_k$ span at least a $(2k-1)$-dimensional space, any $(2k-2)$-dimensional subspace of $V$ contains at most one codeword of $C_k$. Conversely, every codeword of $C_k$, being disjoint from $X_0$, is contained in a $(2k-2)$-dimensional subspace intersecting $X_0$ in a subspace of the smallest possible dimension, viz. $\max\{t-2,0\}$. 38 Denoting by $\mathcal{S}$ the set of all such $(2k-2)$-dimensional subspaces and by $r$ the (constant) degree of the codewords in $C_k$ with respect to $\mathcal{S}$, we get the bound $\#C_k\le\#\mathcal{S}/r$. It is easily seen that

37 The bounds for $A_2(v,v-2)$ were already established in [15, Th. 5] and $A_2(5,3)=18$ in [18, Th. 14].

38 The condition $\dim(S\cap X_0)=t-2$ is of course equivalent to $S+X_0=V$, but we need the former for the counting argument.
These bounds are sufficient to conclude the proof in the case where at most one codeword of dimension $\ne k$ exists. But for the case $\delta_{k-1}\ge2$ and its dual, and for several cases having $\delta_{k-2}+\delta_{k-1}=\delta_{k+1}+\delta_{k+2}=1$, we need better bounds.
First we do the case $\delta_{k-1}\ge2$. Let $X_1$, $X_2$ be two distinct codewords in $C_{k-1}$. Then $X_1$, $X_2$ are disjoint, and every $X\in C_k$ is simultaneously disjoint from both $X_1$ and $X_2$. In this case we can bound $\#C_k$ in the same way as above, using for $\mathcal{S}$ the set of $(2k-2)$-dimensional subspaces $S$ of $V$ satisfying $\dim(S\cap X_1)=\dim(S\cap X_2)=k-3$. Since the number of simultaneous complements of two disjoint lines in $\operatorname{PG}(n-1,\mathbb{F}_q)$ is $q^{2n-7}(q^2-1)(q-1)$, 39 we obtain

The degree $r$ of $X\in C_k$ with respect to $\mathcal{S}$ is equal to the number of $(k-2)$-dimensional subspaces of the $k$-dimensional space $V/X$ meeting the (not necessarily distinct) hyperplanes $H_1=(X_1+X)/X$ and $H_2=(X_2+X)/X$ in a $(k-3)$-dimensional space. By duality, $r$ is also equal to the number of lines in $\operatorname{PG}(k-1,\mathbb{F}_q)$ off two points $P_1$, $P_2$ (which may coincide or not), and hence

where the last inequality follows from a straightforward computation. 40 This new bound is sufficient for the range $2\le\delta_{k-1}\le\frac12q^{k+1}$ (since we may obviously assume $\delta_{k+1}\le\delta_{k-1}$), but there remains a gap to the known upper bound $\delta_{k-1}\le q^{k+1}+q^2$ for $k\ge4$, respectively, $\delta_2\le q^4+q^2+1$ for $k=3$. However, for $\delta_{k-1}>\frac12q^{k+1}$ the standard method to bound $\#C_k$ in terms of the point degrees can be used: Since the $\delta_{k-1}(1+q+\dots+q^{k-2})$ points covered by the codewords in $C_{k-1}$ must have degree 0 in $C_k$, we obtain

The factor of $\delta_{k-1}$ is $\ge2$ in all cases except $q=2$, $k=3$, leading to the desired contradiction $\#C\le\#C_k+2\delta_{k-1}\le q^{2k}$. In the exceptional case we have $\#C\le64+1+\frac17\delta_{k-1}\le68$, which also does the job.

It remains to consider the cases with $\delta_{k-2}+\delta_{k-1}=\delta_{k+1}+\delta_{k+2}=1$. We may assume $k\ge4$ and need only improve the previously established bound $\#C_k\le q^{2k}$ by one. We denote the unique codewords of dimensions $t<k$ and $u>k$ by $X_0$ and $Y_0$, respectively. From the proof we have $\#C_k=q^{2k}$ if and only if every

Then $\dim(S+Y_0)\le2k-1$ and hence $S+Y_0$, and a fortiori $S$, cannot contain a codeword of $C_k$. 41 This gives $\#C_k\le q^{2k}-1$ and $\#C\le q^{2k}+1$, as desired.

39 This is probably well-known and perhaps most easily established by counting triples $(L_1,L_2,U)$ of mutually skew subspaces of $\mathbb{F}_q^n$ with $\dim(L_1)=\dim(L_2)=2$, $\dim(U)=n-2$ in two ways: Using canonical matrices, the number $\#(L_1,L_2,U)=\#(U,L_1,L_2)$ of such triples is easily found to be $\binom{n}{n-2}_q\,q^{2(n-2)}(q^{n-2}-1)(q^{n-2}-q)$. Dividing this number by $\#(L_1,L_2)=\binom{n}{2}_q\binom{n-2}{2}_q\,q^4$ gives $q^{2n-7}(q^2-1)(q-1)$, as asserted.

40 The inequality is sharp precisely in the case $k=3$, all $q$.
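The count $q^{2n-7}(q^2-1)(q-1)$ of simultaneous complements of two disjoint lines can be confirmed by brute force for $q=2$ and small $n$. The sketch below is only an illustration: vectors of $\mathbb{F}_2^n$ are encoded as integers with XOR as addition, the two lines are fixed as the spans of $e_1,e_2$ and $e_3,e_4$, and all function names are ours.

```python
from itertools import combinations


def span(gens):
    """Nonzero vectors of the F_2-span of gens (vectors are ints, addition is XOR)."""
    s = {0}
    for g in gens:
        s |= {x ^ g for x in s}
    return frozenset(s - {0})


def subspaces(n, k):
    """All k-dimensional subspaces of F_2^n, each given by its set of nonzero vectors."""
    found = set()
    for gens in combinations(range(1, 2**n), k):
        sp = span(gens)
        if len(sp) == 2**k - 1:  # the k generators are linearly independent
            found.add(sp)
    return found


def simultaneous_complements(n):
    """Number of (n-2)-dimensional subspaces of F_2^n disjoint from two fixed disjoint lines."""
    line1 = span([1, 2])
    line2 = span([4, 8])
    return sum(1 for u in subspaces(n, n - 2)
               if u.isdisjoint(line1) and u.isdisjoint(line2))


def formula(n, q=2):
    return q**(2 * n - 7) * (q * q - 1) * (q - 1)


if __name__ == "__main__":
    for n in (4, 5):
        print(n, simultaneous_complements(n), formula(n))
```

For $n=4$ this reproduces the familiar fact that exactly 6 lines of $\operatorname{PG}(3,\mathbb{F}_2)$ are skew to two given skew lines.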
It remains to construct codes meeting the upper bound for k = 2, all q and for k = 3, q = 2.
In those cases where $A_q(v,v-2)=A_q(v,v-2;k)$ one may ask whether all optimal codes must have constant dimension. The parameter set $(v,d;k)_q=(8,6;4)_2$ illustrates the difficulties in answering this question: Presently it is only known that $257\le A_2(8,6)=A_2(8,6;4)\le289$, with the corresponding Gabidulin code $G=G_{8,4,3}$ of size 256 accounting for the lower bound. If the true value turns out to be 257 then there are both constant-dimension and mixed-dimension codes attaining the bound, since $G$ can be extended by any at least 2-dimensional subspace of its special solid $S$; cf. Section 2.4. 47

44 Another construction for $(5,2q^3+2,3)_q$ codes was recently found by Cossidente, Pavese and Storme [8].

45 In fact, we succeeded in a complete classification of codes with these parameters. The total number of equivalence classes is 20. Details will be given in [33].

46 For $q=2$ this is known, as we stated in the theorem.

Remark 4. The dimension distributions realized by $(2k+1,2q^{k+1}+2,2k-1)_q$ codes, provided that codes with these parameters actually exist, can be completely determined.
In the case $k=2$ these are all four distributions that have "survived" the proof of Theorem 3.3(ii), viz. $(0,0,q^3+1,q^3+1,0,0)$, $(0,1,q^3+1,q^3,0,0)$, $(0,0,q^3,q^3+1,1,0)$ and $(0,1,q^3,q^3,1,0)$. 48 This can be seen as follows: The shortening construction used in the proof yields a code $C$ in $\operatorname{PG}(H)$ with $\delta_2=\delta_3=q^3+1$ and such that the layers $C_2$ and $C_3$ are (dual) partial spreads of the type discussed before Theorem 3.2. Let $E_1$, $L_1$ be the special plane (containing the holes) and the moving line of $C_2$, and $L_2$, $E_2$ the special line (meet of the dual holes) and moving plane of $C_3$. Then, using the notation in the proof of Theorem 3.3, E 1 = (S + P ) ∩ H, L 1 = E ∩ H, L 2 = L , E 2 = E . In other words, $L_1$, $L_2$ meet in a point (the point L ∩ L ∈ S), $E_1=L_1+L_2$, and $E_2$ is some other plane through $L_2$. Replacing the plane $E_2\in C$ by any point $Q\in L_2\setminus L_1$ preserves the minimum distance 3, since $Q\in E_1\setminus L_1$ is a hole of $C_2$ and $E_2$ is the only plane in $C_3$ containing $Q$, 49 and hence gives a $(5,2q^3+2,3)_q$ code with dimension distribution $(0,1,q^3+1,q^3,0,0)$. Similarly, replacing $L_1$ by any solid $T$ containing $E_1$ but not $E_2$ produces a $(5,2q^3+2,3)_q$ code with dimension distribution $(0,0,q^3,q^3+1,1,0)$. Finally, since $d_S(Q,T)=3$, the code $\{Q\}\cup(C_2\setminus\{L_1\})\cup(C_3\setminus\{E_2\})\cup\{T\}$ has parameters $(5,2q^3+2,3)_q$ as well and dimension distribution $(0,1,q^3,q^3,1,0)$.

For $k\ge3$ the only possible dimension distribution is $\delta_k=\delta_{k+1}=q^{k+1}+1$. In order to see this, we may suppose by duality that $(\delta_{k-1},\delta_k,\delta_{k+1},\delta_{k+2})=(1,q^{k+1}+1,q^{k+1},0)$ or $(1,q^{k+1},q^{k+1},1)$ and must reduce this ad absurdum. In the first case, the codeword of dimension $k-1$ must be disjoint from the codewords in $C_k$, which form a maximal partial spread in $\operatorname{PG}(2k,\mathbb{F}_q)$. This is impossible, since $k-1\ge2$ but the set of holes of $C_k$ cannot contain a line.
In the second case, let $X\in C_{k-1}$, $Y\in C_{k+2}$ be the unique codewords of their respective dimensions, and note that the codewords in $C_{k-1}\cup C_k$ are mutually disjoint and meet $Y$ in at most a point. Since $\delta_k=q^{k+1}$ is one less than the size of a maximal partial spread, $C_k$ has $1+q+\dots+q^k$ holes. The set of holes contains $X$ and at least $\#Y-(q^{k+1}+1)=q+q^2+\dots+q^k$ further points from $Y$. This gives the inequality
\[
1+2(q+q^2+\dots+q^{k-2})+q^{k-1}+q^k\le1+q+\dots+q^k,
\]
which is impossible for $k\ge3$.

Question 1. By our preceding considerations, any $(2k+1,2q^{k+1}+2,2k-1)_q$ code has the form $C=\mathcal{S}_1\cup\mathcal{S}_2^\perp$, where $\mathcal{S}_1$, $\mathcal{S}_2$ are maximal partial $(k-1)$-spreads in $\operatorname{PG}(2k,\mathbb{F}_q)$. To construct a $(2k+1,2q^{k+1}+2,2k-1)_q$ code in this way, one has to pick $\mathcal{S}_1$, $\mathcal{S}_2$ in such a way that the intersection of any element of $\mathcal{S}_1$ with any element of $\mathcal{S}_2^\perp$ is at most a point. We refer to this as the doubling construction. The natural question is: Which $q$ and $k$ admit a doubling construction?

47 Apart from such extensions, it is also possible to extend $G$ by any 5- or 6-dimensional space containing $S$ and by a solid meeting $S$ in a plane. Extensions by more than one codeword are not possible (i.e. the minimum distance would necessarily be $<6$).

48 Recall from the proof that $\delta_1=1$ forces $\delta_3\le q^3$, and similarly for $\delta_4=1$.

49 For the latter note that the points of degree 1 with respect to $C_3$ are those on $L_2$, since they are contained in $q^2$ dual holes of $C_3$.
Apart from k = 2, where a doubling construction is possible for every q, the only decided case is (q, k) = (2, 3); cf. Theorem 3.3(ii). On the other hand, by the usual interpretation of q = 1 as the binary Hamming space, the case q = 1 corresponds to a (2k + 1, 4, 2k − 1) binary block code. This code exists for k ≤ 2, but it does not exist for k ≥ 3. This might be a hint in the direction that the doubling construction is not possible for all combinations of q ≥ 2 and k.
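The claim about the analogous binary block codes is easy to confirm by exhaustive search. The sketch below is illustrative only: words are encoded as integers, the zero word is assumed to be a codeword (no loss of generality, since Hamming distance is translation invariant), and the function name is ours.

```python
from itertools import combinations


def exists_binary_code(n, d, size=4):
    """Search for a binary block code of length n with `size` words and minimum distance >= d.

    The zero word is taken as one codeword, so every other codeword needs weight >= d.
    """
    candidates = [w for w in range(1, 2**n) if bin(w).count("1") >= d]
    for rest in combinations(candidates, size - 1):
        if all(bin(a ^ b).count("1") >= d for a, b in combinations(rest, 2)):
            return True
    return False


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(k, exists_binary_code(2 * k + 1, 2 * k - 1))
```

The search succeeds for $k=1,2$ and fails for $k=3,4$, consistent with the Plotkin-type averaging bound $d\le 2n/3$ for four codewords of length $n$.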
3.4. Subspace distance 2. The projective geometry PG(v−1, F q ) or, in the vector space view, the set of F q -subspaces of F v q under set inclusion, forms a finite modular geometric lattice. In particular PG(v − 1, F q ) is a ranked poset with rank function X → dim(X). The theory of finite posets can be used to determine the numbers A q (v, 2) and the corresponding optimal codes, as outlined in [1]. The proof uses a result of Kleitman [35] on finite posets with the so-called LYM property, of which the geometries PG(v−1, F q ) are particular examples. Partial results on the numbers A q (v, 2) can also be found in [22,Sect. 4].
The classification of optimal $(v,M,2)_q$ subspace codes is stated as Theorem 3.4 below. We will provide a self-contained proof of the theorem. The underlying idea is to use information on the intersection patterns of a $(v,M,2)_q$ code with the various maximal chains of subspaces of $\mathbb{F}_q^v$ for a bound on the code size $M$. Recall that a maximal chain in a poset is a totally ordered subset which is maximal with respect to set inclusion among all such subsets. The maximal chains in PG If we assign to a subspace $X$ as weight $w(X)$ the reciprocal of the number of maximal chains containing $X$, we can express the code size as

Since $w(X)=1/n_i$ depends only on $i=\dim(X)$, the inner sums are all alike, and it turns out that they attain a simultaneous maximum at some subspace code, which then of course must be optimal. For other parameters the same method could in principle be applied using suitably chosen families of subsets of the lattice $\operatorname{PG}(v-1,\mathbb{F}_q)$, but it seems difficult to find families producing tight bounds. 50

The unique (as a set of subspaces) optimal code in $\operatorname{PG}(v-1,\mathbb{F}_q)$ consists of all subspaces $X$ of $\mathbb{F}_q^v$ with $\dim(X)\equiv k\bmod2$, and thus of all even-dimensional subspaces for $v\equiv0\bmod4$ and of all odd-dimensional subspaces for $v\equiv2\bmod4$.

(ii) If $v=2k+1$ is odd then

and there are precisely two distinct optimal codes in $\operatorname{PG}(v-1,\mathbb{F}_q)$, containing all even-dimensional and all odd-dimensional subspaces of $\mathbb{F}_q^v$, respectively. Moreover these two codes are isomorphic.

50 Usually much is lost through the fact that no subspace code can maximize all inner sums simultaneously.
Proof. Since the collineation group of $\operatorname{PG}(v-1,\mathbb{F}_q)$ is transitive on subspaces of fixed dimension, the number $n_i$ of maximal chains through a subspace $X$ of dimension $i$ does not depend on the choice of $X$. Further, since each maximal chain passes through a unique subspace of dimension $i$, we must have $n_i=n/\binom{v}{i}_q$, where $n$ denotes the total number of maximal chains. Hence (12) can be rewritten as

In order to maximize one of the inner sums in (13), the best we can do (remember the constraint $d\ge2$) is to choose $C$ such that either $C\cap K=\{X_0,X_2,X_4,\dots\}$ or $C\cap K=\{X_1,X_3,X_5,\dots\}$, depending on which of the sums i even

Hence the inner sums are maximized by either choice, and for simultaneous maximization of all inner sums it is necessary and sufficient that the code $C$ consists either of all even-dimensional subspaces or of all odd-dimensional subspaces of $\mathbb{F}_q^v$. 51

For even $v=2k$ we use that the Gaussian binomial coefficients satisfy $\binom{v}{i}_q>q\binom{v}{i-1}_q$ for $1\le i\le k$; cf. the proof of Lemma 2.4. Together with symmetry this v i q and that the unique subspace code $C$ simultaneously maximizing all inner sums in (13) consists of all subspaces $X$ of $\mathbb{F}_q^v$ with $\dim(X)\equiv k\bmod2$.
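The sizes of the codes described in the theorem can be tabulated directly. The sketch below assumes only that such a code consists of all subspaces whose dimension has a fixed parity, so that its size is a sum of Gaussian binomial coefficients; the helper names are ours.

```python
def gaussian_binomial(v, i, q):
    """Number of i-dimensional subspaces of a v-dimensional F_q-space."""
    if i < 0 or i > v:
        return 0
    num = den = 1
    for j in range(i):
        num *= q**(v - j) - 1
        den *= q**(j + 1) - 1
    return num // den  # the quotient is always an integer


def parity_layer_size(v, q, parity):
    """Number of subspaces of F_q^v whose dimension is congruent to `parity` mod 2."""
    return sum(gaussian_binomial(v, i, q) for i in range(v + 1) if i % 2 == parity)


if __name__ == "__main__":
    for v in range(3, 8):
        even, odd = parity_layer_size(v, 2, 0), parity_layer_size(v, 2, 1)
        print(v, even, odd)
```

For odd $v$ the two parity classes have the same size (for example $q^2+q+2$ when $v=3$), while for even $v=2k$ the class of dimensions congruent to $k$ modulo 2 is the larger one, e.g. 37 versus 30 for $v=4$, $q=2$.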

Bounds and classification results for small parameters
In this section we present the best currently known bounds for the numbers $A_2(v,d)$, $v\le7$. In most of the cases where the numbers $A_2(v,d)$ are known, we also provide the classification of the optimal subspace codes.
The remainder of this section is devoted exclusively to the case $q=2$. With a few notable exceptions, the numbers $A_2(v,d)$ are known for $v\le7$; see Table 2.

Table 2. $A_2(v,d)$ and isomorphism types of optimal codes for $v\le7$

The exact values in the table come from Section 3. The number of isomorphism types of optimal $(v,A_2(v,d),d)_2$ codes is given in parentheses. The remaining entries in the table will be derived in the rest of this section, except for the number of isomorphism types in the two cases $(v,d)=(7,5)$ and $(7,6)$, which will be derived in a subsequent paper [33].
Regarding the classification of the optimal codes corresponding to the exact values in the table, we have that the codes for $d=2$ are unique (Theorem 3.4) and those for $d=v\in\{3,5,7\}$ are classified into 2, 3 and 4 isomorphism types, respectively (Theorem 3.1(i)). The codes for $(v,d)=(4,4)$, $(6,6)$ are unique, since they correspond to the unique line spread in $\operatorname{PG}(3,\mathbb{F}_2)$, respectively, the unique plane spread in $\operatorname{PG}(5,\mathbb{F}_2)$; cf.

For the proof of this assertion we use that by Theorem 3.2(i) the dimension distribution of an optimal $(4,5,3)_2$ code up to duality must be one of $(\delta_1,\delta_2,\delta_3)=(0,5,0)$, $(1,4,0)$, $(1,3,1)$ and that the corresponding codes form a single $\operatorname{GL}(4,\mathbb{F}_2)$-orbit. The latter has already been noted for the line spread and can be seen for the other two configurations as follows: The (element-wise) stabilizer in $\operatorname{GL}(4,\mathbb{F}_2)$ of 3 pairwise skew lines $L_1$, $L_2$, $L_3$, which may be taken as the row spaces of

53 The 6 points have the form $\mathbb{F}_2(x,y)$ with $x,y\in\mathbb{F}_2^2$ nonzero and distinct, so that regularity follows from the doubly-transitive action of $\operatorname{GL}(2,\mathbb{F}_2)$ on $\mathbb{F}_2^2\setminus\{0\}$.

54 This follows from the fact that $L_4\cup L_5$ is uniquely a union of two lines; in other words, the spread containing $L_1$, $L_2$, $L_3$ is uniquely determined.

55 Using coordinates as above and writing $P=\mathbb{F}_2(x|y)$, we have M = A A , where $A\in\operatorname{GL}(2,\mathbb{F}_2)$ is the "transposition" satisfying $xA=y$, $yA=x$.
The matrix M cannot fix all three points on L 3 (otherwise it would fix all points in the plane L 3 + P and hence also L 1 , L 2 ). Since the three planes containing L 5 are transversal to L 3 and one of them (the plane containing P ) is fixed by M, the other two planes must be switched by M. This shows that the code with dimension distribution (1, 3, 1) is unique as well.
4.1. The case (v, d) = (7,4). This case is in a sense the most challenging, since its exact resolution will very likely encompass an answer to the existence question for a 2-analogue of the Fano plane. The lower bound is realized by adding the whole space V = F 7 2 to the best currently known (7, 329, 4; 3) 2 constant-dimension code [5,31,38]. 56 If a 2-analogue of the Fano plane exists, the same construction yields a (7, 382, 4) 2 code.
(ii) δ 3 + δ 4 ≤ 381; equality can hold only in the constant-dimension case (i.e., for a 2-analogue of the Fano plane and its dual).
Part (ii) of Lemma 4.2 will not be used in the sequel, but has been included since it seems interesting in its own right, showing $A_2(7,4;\{3,4\})\le381$ and characterizing the corresponding extremal case.
Proof of the lemma. The codewords in $C_3$ cover each line of $\operatorname{PG}(6,\mathbb{F}_2)$ at most once. Codewords in $C_4$ may cover a line multiple times (up to 9 times, the size of a maximal partial spread in $\operatorname{PG}(4,\mathbb{F}_2)$), but at least they cannot cover the same line as a codeword in $C_3$. Denoting by $c(\delta)$ the minimum number of lines covered by $\delta$ solids $S_1,\dots,S_\delta$ in $\operatorname{PG}(6,\mathbb{F}_2)$ at mutual distance $\ge4$, we have the bound $7\delta_3+c(\delta_4)\le7\cdot381=2667$, the total number of lines in $\operatorname{PG}(6,\mathbb{F}_2)$. Thus $\delta_3\le381-c(\delta_4)/7$, and any lower bound on $c(\delta)$ will yield a corresponding upper bound for $\delta_3$.

56 Note added in proof: Recently a $(7,333,4;3)_2$ constant-dimension code has been found [27]. This improves the lower bound for $A_2(7,4)$ to 334.
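The line counts entering this bound can be reproduced by direct enumeration. In the sketch below, points of $\operatorname{PG}(6,\mathbb{F}_2)$ are the nonzero 7-bit integers and a line is a set $\{a,b,a\oplus b\}$; the helper `delta3_bound` is a hypothetical name for the integer form of the inequality $7\delta_3+c\le2667$.

```python
from itertools import combinations

# Points of PG(6, F_2): nonzero vectors of F_2^7 encoded as integers 1..127.
POINTS = range(1, 128)

# Every pair of distinct points spans a line {a, b, a ^ b}; the set removes the triple counting.
LINES = {frozenset({a, b, a ^ b}) for a, b in combinations(POINTS, 2)}

# One particular plane: the span of the first three unit vectors, i.e. the points 1..7.
PLANE = frozenset(range(1, 8))
LINES_IN_PLANE = [line for line in LINES if line <= PLANE]


def delta3_bound(c):
    """Largest integer delta_3 with 7 * delta_3 + c <= total number of lines."""
    return (len(LINES) - c) // 7


if __name__ == "__main__":
    print(len(LINES), len(LINES_IN_PLANE), delta3_bound(0), delta3_bound(35))
```

With no solids present ($c=0$) the bound is 381, and a single solid, which covers its 35 lines, already lowers it to 376.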
For lower-bounding c(δ) we use Lemma 2.10. If b i is the number of lines contained in exactly i solids then where e denotes the number of edges of the distance-4 graph of {S 1 , . . . , S δ }. Since the degree of S i in the distance-4 graph is at most 7 · (21 − 1) = 140, 57 a reasonable upper bound for the second binomial moment of b 0 , b 1 , b 2 , . . . is Lemma 2.10 gives Together with the bound δ 3 ≤ 381 − c(δ 4 )/7 this proves Part (i).
For the next lemma recall that the size of a maximal partial line spread in PG(6, F 2 ) is 41 (leaving 127 − 3 · 41 = 4 = q 2 holes, the smallest number one can achieve for odd v; cf. [4,14]). This implies δ 2 ≤ 41 (and, by duality, also δ 5 ≤ 41). Proof of the lemma. The δ 2 codewords in C 2 form a partial line spread in PG(6, F 2 ). No plane E ∈ C 3 can meet any line L ∈ C 2 , since this would result in d S (L, E) ∈ {1, 3}. Hence, if g(δ) denotes an upper bound for the number of planes at pairwise subspace distance ≥ 4 and disjoint from any set L 1 , . . . , L δ of δ pairwise disjoint lines, we get the bound δ 3 ≤ g(δ 2 ).
We have found three such bounds. Each of them turns out to be stronger than the other two in a certain subinterval of {0, 1, . . . , 41}; cf. the subsequent Remark 5.
The first bound g 1 (δ) is derived from the observation that no line of PG(6, F 2 ) that meets some line L i can be covered by a plane in C 3 . Hence, denoting by m(δ) any lower bound for the number of such lines, we can set g 1 (δ) = 381 − m(δ)/7 .
A good lower bound $m(\delta)$ is obtained as follows. The $\delta$ lines cover $3\delta$ points. Denoting by $b_i$ the number of lines in $\operatorname{PG}(6,\mathbb{F}_2)$ containing exactly $i$ of these points,

57 The bound is the same as for the distance-4 graph of a set of planes at mutual distance $\ge4$.

58 Going backwards, $\delta_3+\delta_4=381$ implies $c(\delta_4)=7\delta_4$ and $e=70\delta_4$.
Remark 5. All three individual bounds are needed for the optimal bound $g(\delta)$. In fact, $g(\delta)$ coincides with $g_1(\delta)$ in the range $1\le\delta\le21$, with $g_2(\delta)$ in the ranges $22\le\delta\le28$, $39\le\delta\le41$, and with $g_3(\delta)$ in the range $28\le\delta\le41$. 60

and jumps (i.e., $h(\delta)>h(\delta+1)$) precisely at the corresponding arguments. 61

Proof. First we consider a 4-flat $U\in C_5$ and denote by $a_i=a_i(U)$ the number of planes $E\in C_3$ intersecting $U$ in a subspace of dimension $i$, $1\le i\le3$. Since $d_S(E,U)\ge4$, we must have $a_3=0$, and $a_1$, $a_2$ are subject to the following restrictions:

This follows from counting the lines $L$ with $\dim(L\cap U)=i\in\{0,1,2\}$ covered by $C_3$ in two ways, observing that the number of such lines cannot exceed $2^{10}$ for $i=0$, $(63-15)\cdot31=1488$ for $i=1$, and $\binom{5}{2}_2=155$ for $i=2$. Viewed as a maximization problem for $\delta_3$, (14) has the unique optimal solution $a_1^*=256$, $a_2^*=120$ with objective value $\delta_3^*=376$. This accounts for $h(1)=376$ in the table. If a plane subspace code with these parameters actually exists, it must leave exactly 35 lines in $U$ uncovered (and cover all lines not contained in $U$).
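The stated optimum can be reproduced by a small exhaustive search. Since the system (14) is not displayed here, the sketch below works under an explicit assumption of ours about its form: a plane meeting $U$ in a point has 4 lines disjoint from $U$ and 3 lines meeting $U$ in a point, while a plane meeting $U$ in a line has 6 lines meeting $U$ in a point and 1 line inside $U$. Together with the three line counts from the text this gives $4a_1\le2^{10}$, $3a_1+6a_2\le1488$, $a_2\le155$. The function names are illustrative.

```python
def feasible(a1, a2):
    """Assumed form of the restrictions (14); see the stated assumptions in the lead-in."""
    return (4 * a1 <= 2**10            # lines disjoint from U
            and 3 * a1 + 6 * a2 <= 1488  # lines meeting U in exactly a point
            and a2 <= 155)               # lines contained in U


def maximize_delta3():
    """Return the maximum of a1 + a2 and the list of all maximizers."""
    pairs = [(a1, a2) for a1 in range(257) for a2 in range(156) if feasible(a1, a2)]
    best = max(a1 + a2 for a1, a2 in pairs)
    return best, [(a1, a2) for a1, a2 in pairs if a1 + a2 == best]


def best_with_uncovered(t):
    """Maximum of a1 + a2 when exactly t lines of U stay uncovered, i.e. a2 = 155 - t."""
    a2 = 155 - t
    return max(a1 + a2 for a1 in range(257) if feasible(a1, a2))


if __name__ == "__main__":
    print(maximize_delta3())
    print([best_with_uncovered(t) for t in (0, 10, 35)])
```

Under these assumed constraints the search returns the value 376 with the single maximizer $(a_1,a_2)=(256,120)$, which leaves $155-120=35$ lines of $U$ uncovered, matching the numbers quoted above.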
Since the 4-flats in C 5 form a dual partial line spread, any two of them intersect in a plane and have at most 7 lines in common. This implies that for δ 5 > 1 it is impossible to leave simultaneously exactly 35 lines uncovered in each 4-flat in C 5 , and enables us to derive subsequently a stronger bound for δ 3 .
It is easily checked that a feasible solution of (14) with $a_2=155-t$ must have $\delta_3\le341+t$. For any $t\in\{0,1,\dots,35\}$ either at least $t$ lines remain uncovered in each of the $\delta_5$ 4-flats in $C_5$ or the bound $\delta_3\le h_1(t)=340+t$ holds. In the first case we will use a lower bound $c(\delta,t)$ for the size of the union of $\delta$ line sets of size $t$ pairwise intersecting in at most 7 lines (with a slight modification for large $\delta$, see below) to get the bound $\delta_3\le h_2(\delta_5,t)$ with $h_2(\delta,t)=381-c(\delta,t)/7$. Putting the two bounds together and minimizing over $t$ gives the final bound $\delta_3\le h(\delta_5)$ with

A suitable bound $c(\delta,t)$ can again be obtained with the aid of Lemma 2.10. Suppose $U_1,\dots,U_\delta$ are 4-flats in $\operatorname{PG}(6,\mathbb{F}_2)$ pairwise intersecting in a plane and $\mathcal{L}_1,\dots,\mathcal{L}_\delta$ line sets with $\#\mathcal{L}_j=t$ and $L\subset U_j$ for $L\in\mathcal{L}_j$. Denote by $b_i$ the number of lines of $\operatorname{PG}(6,\mathbb{F}_2)$ contained in exactly $i$ of the line sets. Then

Applying Lemma 2.10 gives the lower bound

A different lower bound,

is obtained from the observation that every line $L\in\mathcal{L}_1\cup\dots\cup\mathcal{L}_\delta$ can be contained in at most 9 of the 4-flats $U_j$, since the 4-flats $U_j$ containing $L$ form a dual partial line spread in $\operatorname{PG}(6,\mathbb{F}_2)/L\cong\operatorname{PG}(4,\mathbb{F}_2)$. Defining $c(\delta,t)$ as the maximum of the two bounds, we obtain the final bound $h(\delta)$ shown in the table. 62

With these lemmas at hand we can finish the proof of the theorem.
The codes realizing the remaining two dimension distributions contain a partial line spread $\mathcal{S}_8$ of size 8 as a subcode. Up to equivalence, there are 9 types of $\mathcal{S}_8$, all contained in some maximal partial line spread $\mathcal{S}_9$ of size 9 [24, Sect. 5.2].

For the dimension distribution $(0,0,8,1,0,0)$, the plane $Y_0$ represented by the unique codeword of dimension 3 must be disjoint from each of the 8 lines in $\mathcal{S}_8$. Thus, these codes $C$ are exactly the partitions of $V$ into 8 lines and a single plane. Extending $\mathcal{S}_8$ by a line $X_0\subset Y_0$, we get an $\mathcal{S}_9$ having a moving line $X_0$ in the sense of Section 3.2 or, using the terminology of [24], an $\mathcal{S}_9$ of regulus type X with the 4 reguli sharing the line $X_0$. Thus the $\mathcal{S}_8$ contained in $C$ is the unique regulus-free partial spread (regulus type O in [24]). It follows that up to equivalence there is a unique code with dimension distribution $(0,0,8,1,0,0)$. It is given by the lifted Gabidulin code $G_{5,2,2}$ together with its special plane.

Now let $C$ be a subspace code with dimension distribution $(0,0,8,0,1,0)$. It has the form $C=\mathcal{S}_8\cup\{H\}$ with a hyperplane (solid) $H$. The code $C$ has minimum distance 4 if and only if $\dim(L\cap H)=1$ for each line $L\in\mathcal{S}_8$. Consider a maximal partial spread $\mathcal{S}_9$ containing $\mathcal{S}_8$. Since $H$ contains at most one line of $\mathcal{S}_9$, it must be one of the 3 solids containing the special plane $Y_0$ of $\mathcal{S}_9$ and contain exactly one line $L$ of $\mathcal{S}_9$. If $L$ is contained in $Y_0$, $\mathcal{S}_8$ has regulus type O and $C=\mathcal{S}_8\cup\{H\}$ has the type mentioned in the proof of Theorem 3.2(ii) with $H=Y$. Moreover, it is readily checked that the 3 possible choices for $Y$ yield equivalent codes. If $L$ is not contained in $Y_0$, then $H=L+Y_0$ and $L$ is contained in 2 reguli of $\mathcal{S}_9$. This implies that $\mathcal{S}_8$ has regulus type II and is again uniquely determined [24]. Since $H$ is determined by $\mathcal{S}_8$ (for example, as the span of the 7 holes of $\mathcal{S}_8$), $C$ is uniquely determined as well.
In all we have seen that up to equivalence there are 2 subspace codes realizing the dimension distribution (0, 0, 8, 0, 1, 0).
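The first of these configurations can be examined by computer. The sketch below builds one concrete partition of $\mathbb{F}_2^5$ into 8 lines and a plane (a lifted Gabidulin-type construction using $\mathbb{F}_8=\mathbb{F}_2[x]/(x^3+x+1)$; the coordinates, encoding and names are our own choices), checks the minimum subspace distance of $\mathcal{S}_8\cup\{Y_0\}$, and lists the hyperplanes $H$ for which $\mathcal{S}_8\cup\{H\}$ still has minimum distance 4. It covers only this particular $\mathcal{S}_8$, not the second isomorphism type.

```python
from itertools import combinations


def times_alpha(a):
    """Multiply a in F_8 = F_2[x]/(x^3 + x + 1), stored as a 3-bit integer, by the class of x."""
    a <<= 1
    return a ^ 0b1011 if a & 0b1000 else a


def dim(nonzero_vectors):
    """Dimension of an F_2-subspace given by its set of nonzero vectors (size 2^d - 1)."""
    return (len(nonzero_vectors) + 1).bit_length() - 1


def subspace_distance(x, y):
    """d_S(X, Y) = dim X + dim Y - 2 dim(X ∩ Y)."""
    return dim(x) + dim(y) - 2 * dim(x & y)


def min_distance(code):
    return min(subspace_distance(x, y) for x, y in combinations(code, 2))


def hyperplane(h):
    """Nonzero vectors x of F_2^5 with <x, h> = 0."""
    return frozenset(x for x in range(1, 32) if bin(x & h).count("1") % 2 == 0)


# Eight lines: row spaces of (I_2 | A_a); bits 3, 4 are the identity part, bits 0-2 the rest.
S8 = []
for a in range(8):
    u, w = 0b01000 | a, 0b10000 | times_alpha(a)
    S8.append(frozenset({u, w, u ^ w}))

Y0 = frozenset(range(1, 8))  # the plane complementary to all eight lines

GOOD_HYPERPLANES = [h for h in range(1, 32) if min_distance(S8 + [hyperplane(h)]) >= 4]

if __name__ == "__main__":
    print(min_distance(S8 + [Y0]), GOOD_HYPERPLANES)
```

In this example exactly 3 hyperplanes qualify, and each of them contains the plane $Y_0$, in line with the description of the admissible solids above.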