Locally repairable codes with high availability based on generalised quadrangles

Locally Repairable Codes (LRC's) based on generalised quadrangles were introduced by Pamies-Juarez, Hollmann and Oggier in \cite{PaHoOg2013}, and bounds on the repairability and availability were derived. In this paper, we determine the values of the repairability and availability of such LRC's for a large portion of the currently known generalised quadrangles. In order to do so, we determine the minimum weight of the codes of translation generalised quadrangles and characterise the codewords of minimum weight.


Introduction
Locally recoverable/repairable codes (LRC's) have been designed to provide reliability with small repair traffic in systems like cloud storage and distributed computing where high data availability is necessary [5,14]. The dual codes of partial geometries were used to construct LRC's with high availability in [3], where bounds on the repairability and the availability of these codes were derived. In the same paper (see [3, Theorem 1 and Remark 1]) it was shown that the maximum rate of a balanced locally repairable binary code from a partial geometry is achieved when the partial geometry is a generalised quadrangle. In this paper, we determine the exact values of the repair degree and the repair availability of any LRC constructed from a classical or translation generalised quadrangle or a generalised quadrangle $T_2^*(O)$, $O$ a hyperoval (see Theorem 5.4). In order to do so, we study the minimum weight of the linear codes generated by the incidence matrix of generalised quadrangles, focusing on the case of translation generalised quadrangles (see Theorem 4.5).
The study of minimum weight codewords in linear codes is a classical problem in coding theory. Within this topic, the codes generated by the incidence matrix of points and blocks of certain incidence structures are of special interest. The codes generated by the incidence matrix of points and subspaces of affine and projective spaces have been well understood since the 1970's. For partial geometries, the duals of these codes have attracted attention, not only as LRC's, but also because of their properties when seen as Low Density Parity Check codes (LDPC codes). The minimum weight has been determined in several cases [6,8,12]. However, when the partial geometry is not fully embedded in an affine or projective space, far less is known. In this paper, we study this problem for large families of generalised quadrangles, extending the results from [2] for the generalised quadrangles $W(3,q)$ and $H(3,q^2)$.

Preliminaries
A partial geometry with parameters (s, t, α) is an incidence structure of points and lines satisfying the following properties.
(1) Through every two distinct points there is at most one line and every two distinct lines meet in at most one point.
(2) Every line contains $s+1$ points.
(3) Every point lies on $t+1$ lines.
(4) For every point $P$ and every line $L$ not through $P$, there are $\alpha$ lines through $P$ which intersect $L$.
A partial geometry with $\alpha = 1$ is called a generalised quadrangle. Generalised quadrangles form an important class of rank two geometries which are fundamental in the theory of buildings developed by J. Tits.
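The four axioms above can be checked mechanically on a small example. The following is a minimal sketch, assuming the standard duad/syntheme model of the generalised quadrangle of order $(2,2)$: points are the 2-subsets of a 6-element set and lines are the partitions of that set into three 2-subsets. The function names `doily` and `partial_geometry_parameters` are illustrative only and do not come from the paper.

```python
from itertools import combinations

def doily():
    """Points: 2-subsets of {0,...,5}; lines: partitions of {0,...,5} into three 2-subsets."""
    points = [frozenset(d) for d in combinations(range(6), 2)]
    lines = [frozenset(t) for t in combinations(points, 3)
             if len(frozenset().union(*t)) == 6]
    return points, lines

def partial_geometry_parameters(points, lines):
    """Return (s, t, alpha) if axioms (1)-(4) hold, otherwise raise AssertionError."""
    # (1) two distinct lines share at most one point (equivalent to the dual statement)
    assert all(len(L & M) <= 1 for L, M in combinations(lines, 2))
    # (2) every line has the same number s + 1 of points
    sizes = {len(L) for L in lines}
    assert len(sizes) == 1
    # (3) every point lies on the same number t + 1 of lines
    degrees = {sum(P in L for L in lines) for P in points}
    assert len(degrees) == 1
    # (4) for P not on L, the number of lines through P meeting L is a constant alpha
    alphas = {sum(1 for M in lines if P in M and M & L)
              for L in lines for P in points if P not in L}
    assert len(alphas) == 1
    return sizes.pop() - 1, degrees.pop() - 1, alphas.pop()

points, lines = doily()
print(len(points), len(lines), partial_geometry_parameters(points, lines))
```

The check is a brute-force loop over all point-line pairs, which is only sensible for very small geometries; it is meant to make the definition concrete rather than to be efficient.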
This research was made possible by an Erskine Fellowship from the University of Canterbury, Christchurch, New Zealand. The first author acknowledges the support of The Scientific and Technological Research Council of Turkey, TÜBİTAK (project no. 118F159).
The classical finite generalised quadrangles arise from quadratic and sesquilinear forms and are typically denoted by $Q(4,q)$, $W(3,q)$, $Q(5,q)$, $H(3,q^2)$ and $H(4,q^2)$ (see [11] for more information). However, other families of generalised quadrangles are known. A large family of examples is given by the translation generalised quadrangles. The classical generalised quadrangles $Q(4,q)$ and $Q(5,q)$ can be described as translation generalised quadrangles (see Remark 2.1). Each translation generalised quadrangle can be constructed from a certain set of subspaces in a projective space, called an egg. This correspondence will be used extensively in our proofs and is detailed below. A projective (affine) space of dimension $n$ over the finite field with $q$ elements is denoted by $\mathrm{PG}(n,q)$ ($\mathrm{AG}(n,q)$).
An egg $\mathcal{E}_{n,m}$ is a set of $q^m+1$ subspaces of $\mathrm{PG}(2n+m-1,q)$, each of dimension $n-1$, such that any three different elements of $\mathcal{E}_{n,m}$ span a $(3n-1)$-dimensional subspace, and each element $E$ of $\mathcal{E}_{n,m}$ is contained in an $(n+m-1)$-dimensional subspace $T_E$ which is disjoint from every element of $\mathcal{E}_{n,m} \setminus \{E\}$. The subspace $T_E$ is called the tangent space of $\mathcal{E}_{n,m}$ at $E$. The only known examples of eggs are for parameters satisfying $m = n$ and $m = 2n$. Examples with these parameters can be constructed by applying field reduction (see [10]) to the set of points of an oval (for $m = n$) or an ovoid (for $m = 2n$). The examples constructed in this way are called elementary. There exist examples of non-elementary eggs for $m = 2n$, but all known examples of eggs for $m = n$ are elementary. For more information about eggs, we refer to [11,9]. A complete list of the known examples can be found in [9, Section 3.8].
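Each translation generalised quadrangle is isomorphic to the incidence structure $T(\mathcal{E})$ constructed from some egg $\mathcal{E}$ in a $(2n+m-1)$-dimensional projective space $\mathrm{PG}(2n+m-1,q)$ over a finite field $\mathbb{F}_q$, $q = p^h$, $p$ prime (see e.g. [11, Theorem 8.7.1]). In order to construct $T(\mathcal{E})$, embed $\mathrm{PG}(2n+m-1,q)$ as a hyperplane $H_\infty$ in $\Pi = \mathrm{PG}(2n+m,q)$.
The points of $T(\mathcal{E})$ are of three types:
(i) the points of $\Pi \setminus H_\infty$;
(ii) the $(n+m)$-dimensional subspaces of $\Pi$ intersecting $H_\infty$ in a tangent space $T_E$ of $\mathcal{E}$;
(iii) the symbol $(\infty)$.
The lines of $T(\mathcal{E})$ are of two types:
(a) the $n$-dimensional subspaces of $\Pi$ intersecting $H_\infty$ in an element of $\mathcal{E}$;
(b) the elements of $\mathcal{E}$.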
Incidence is defined as follows. The point $(\infty)$ is incident with all lines of type (b) and with no line of type (a). A point of type (ii) is incident with the lines of type (a) contained in it, and with the unique line of type (b) contained in it. A point of type (i) is incident with all lines of type (a) containing it.
It easily follows from the definition that the generalised quadrangle $T(\mathcal{E})$ has parameters $(s,t) = (q^n, q^m)$.
Remark 2.1. The generalised quadrangle $T(\mathcal{E})$ is isomorphic to $Q(4,q)$ if and only if $\mathcal{E}$ is an elementary egg obtained from a conic, and isomorphic to $Q(5,q)$ if $\mathcal{E}$ is an elementary egg obtained from an elliptic quadric in a 3-dimensional projective space.

Linear codes from affine and projective spaces
Consider the incidence matrix $G$ of points versus $t$-spaces of $\mathrm{PG}(n,q)$, $q = p^h$, $p$ prime, where we index the rows by $t$-spaces and the columns by points. The vector space generated by the rows of $G$ over the finite field $\mathbb{F}_p$ is denoted by $C_t(\mathrm{PG}(n,q))$. Similarly, the $p$-ary code generated by the incidence matrix of points versus $t$-spaces (also called $t$-flats) of $\mathrm{AG}(n,q)$ is denoted by $C_t(\mathrm{AG}(n,q))$. Note that in this paper, we only consider the $p$-ary codes where $p$ is the prime such that $q = p^h$. We will make use of the following classical results (see e.g. [1, Theorem 5.7.9]).
Result 3.1.
(1) The minimum weight of $C_t(\mathrm{PG}(n,q))$ is $\frac{q^{t+1}-1}{q-1}$ and the codewords of minimum weight are the scalar multiples of the incidence vectors of $t$-spaces.
(2) The minimum weight of $C_t(\mathrm{AG}(n,q))$ is $q^t$ and the codewords of minimum weight are the scalar multiples of the incidence vectors of $t$-flats.
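For very small parameters, Result 3.1(2) can be checked by exhaustive enumeration. The sketch below does this for the ternary code $C_1(\mathrm{AG}(2,3))$ of points and lines of the affine plane of order 3. The helper names (`incidence_vector`, `row_space`, `weight`) are hypothetical and chosen for this illustration; the enumeration of the whole row space is only feasible because the code is tiny.

```python
from itertools import product

p = 3
points = list(product(range(p), repeat=2))        # the 9 points of AG(2, 3)
directions = [(1, 0), (0, 1), (1, 1), (1, 2)]     # one direction per parallel class
lines = {frozenset(((x + k * dx) % p, (y + k * dy) % p) for k in range(p))
         for (x, y) in points for (dx, dy) in directions}

def incidence_vector(line):
    """0/1 vector of a line, indexed by the points of AG(2, 3)."""
    return tuple(int(P in line) for P in points)

def row_space(rows, p):
    """All vectors in the F_p-span of rows, built by adjoining one row at a time."""
    span = {(0,) * len(rows[0])}
    for r in rows:
        if r not in span:   # r is independent of the rows seen so far
            span = {tuple((a + k * b) % p for a, b in zip(v, r))
                    for v in span for k in range(p)}
    return span

def weight(c):
    """Hamming weight: number of nonzero entries."""
    return sum(1 for x in c if x)

code = row_space([incidence_vector(L) for L in lines], p)
min_weight = min(weight(c) for c in code if any(c))
min_words = {c for c in code if weight(c) == min_weight}
# Nonzero scalar multiples of incidence vectors of lines (t-flats with t = 1)
line_multiples = {tuple((k * x) % p for x in incidence_vector(L))
                  for L in lines for k in range(1, p)}
print(min_weight, len(min_words), min_words == line_multiples)
```

The final comparison tests exactly the characterisation in Result 3.1(2): the set of minimum weight codewords is compared with the set of nonzero scalar multiples of incidence vectors of lines.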
Result 3.1(2) was established by Delsarte, Goethals and MacWilliams [4] by describing $C_t(\mathrm{AG}(n,q))$ as a subfield subcode of the generalised Reed-Muller code $R_q((n-t)(q-1), n)$. This fact allows us to deduce some further properties of $C_t(\mathrm{AG}(n,q))$. In 2010, Rolland [13] determined the second weight of the generalised Reed-Muller codes in almost all cases, including $R_q((n-t)(q-1), n)$. Applied to this case, he obtained the following result.
For $q = 2$, the next-to-minimum weight was already determined in 1974.
As a corollary, we see that there is a gap in the weight enumerator of $C_t(\mathrm{AG}(n,q))$; in particular, we find that there are no codewords of weight $q^t + 1$ in this code.
Corollary 3.4. There are no codewords of weight $q^t + 1$ in $C_t(\mathrm{AG}(n,q))$.
We end this section with a lemma concerning codewords of the dual code of $C_n(\mathrm{PG}(2n+m,q))$. Two vectors $v = (v_1,\ldots,v_m)$ and $w = (w_1,\ldots,w_m)$ in $(\mathbb{F}_p)^m$ are said to be orthogonal if their dot product $(v,w) = v_1w_1 + \cdots + v_mw_m$ is zero. The dual code of $C_t(\mathrm{PG}(n,q))$, denoted by $C_t(\mathrm{PG}(n,q))^\perp$, is the set of all vectors that are orthogonal to all codewords of $C_t(\mathrm{PG}(n,q))$. Observe that a vector $c$ is contained in $C_t(\mathrm{PG}(n,q))^\perp$ if and only if $(c,v) = 0$ for all rows $v$ of $G$, where $G$ is the incidence matrix of points versus $t$-spaces of $\mathrm{PG}(n,q)$.
Lemma 3.5. Consider a hyperplane $H_\infty$ of $\Pi = \mathrm{PG}(2n+m,q)$. Let $U$ and $T$ be two $(m+n-1)$-dimensional subspaces of $H_\infty$, and let $r \in \Pi \setminus H_\infty$. Let $a$ be the incidence vector of $\langle U, r\rangle$ and $b$ the incidence vector of $\langle T, r\rangle$. Then $a - b$ is a codeword of $C_n(\mathrm{PG}(2n+m,q))^\perp$.
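Proof. Let $\pi$ be an $n$-dimensional subspace of $\mathrm{PG}(2n+m,q)$. Since $\langle U, r\rangle$ and $\langle T, r\rangle$ have dimension $m+n$, the subspace $\pi$ meets each of them in a nonempty subspace, hence in $1 \bmod p$ points. Let $v$ be the incidence vector of $\pi$. Then $(a,v) = (b,v) = 1$ in $\mathbb{F}_p$, so $(a-b,v) = 0$. Since the incidence vectors of the $n$-dimensional subspaces generate $C_n(\mathrm{PG}(2n+m,q))$, the statement follows.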
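Lemma 3.5 can be illustrated in the smallest case $n = m = 1$, $q = p = 2$, where $\Pi = \mathrm{PG}(3,2)$ and the $n$-dimensional subspaces are the 35 lines. The sketch below is an assumption-laden toy check: the coordinates, the choice of $H_\infty$ as the plane $x_3 = 0$, and the helper names `add` and `span` are all chosen for this illustration.

```python
from itertools import combinations, product

def add(u, v):
    """Sum of two vectors over F_2."""
    return tuple((a + b) % 2 for a, b in zip(u, v))

def span(*gens):
    """Subspace of PG(3, 2) spanned by gens, as a frozenset of nonzero vectors."""
    vecs = {(0, 0, 0, 0)}
    for g in gens:
        vecs |= {add(v, g) for v in vecs}
    return frozenset(v for v in vecs if any(v))

points = [v for v in product(range(2), repeat=4) if any(v)]   # 15 points of PG(3, 2)
lines = {span(u, v) for u, v in combinations(points, 2)}      # all lines (n = 1)

# H_inf is the plane x_3 = 0; U and T are two lines of H_inf, r is a point off H_inf.
U = span((1, 0, 0, 0), (0, 1, 0, 0))
T = span((0, 1, 0, 0), (0, 0, 1, 0))
r = (0, 0, 0, 1)
plane_U, plane_T = span(*U, r), span(*T, r)                   # <U, r> and <T, r>

# The vector a - b over F_2, indexed by the points of PG(3, 2)
diff = [(int(P in plane_U) - int(P in plane_T)) % 2 for P in points]
# Dot products of a - b with the incidence vector of every line
dot_products = {sum(d for P, d in zip(points, diff) if P in L) % 2 for L in lines}
print(len(lines), dot_products)
```

Every line of $\mathrm{PG}(3,2)$ meets a plane in 1 or 3 points, which is the "$1 \bmod p$" step of the proof; the set `dot_products` collects the resulting dot products of $a - b$ with all generators of the code.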

Linear codes from generalised quadrangles
4.1. Linear codes from generalised quadrangles embedded in an affine or projective space. Let $G$ be a generalised quadrangle. Consider the incidence matrix $N$ of points versus lines, where we index the rows by lines and the columns by points. The vector space generated by the rows of $N$ over the finite field $\mathbb{F}_p$ is called the $p$-ary code of $G$ and is denoted by $C(G)$.
The following result is a corollary of the results for codes from affine and projective spaces.
Corollary 4.1. Let $G$ be a generalised quadrangle of order $(s,t)$ which is fully embedded in an affine or projective space of order $q$, $q = p^h$, $p$ prime. Then the minimum weight of the $p$-ary code $C(G)$ is $s+1$ and the codewords of minimum weight are precisely the scalar multiples of the incidence vectors of the lines of $G$.
Proof. The code of the embedded generalised quadrangle is a subcode of the code of points and lines of the ambient space. The corollary follows from Result 3.1.
Remark 4.2. The above corollary shows that the code of the generalised quadrangle $T_2^*(O)$ (which is not a translation generalised quadrangle), where $O$ is a hyperoval, has minimum weight $s + 1 = q$, since $T_2^*(O)$ has order $(s,t) = (q-1, q+1)$, and that all codewords of minimum weight are scalar multiples of incidence vectors of lines of $T_2^*(O)$. The dual of this code has been studied as an LDPC code; the minimum weight was determined in [12].
In order to determine the minimum weight of the code of a translation generalised quadrangle $T(\mathcal{E})$, we put the incidence matrix of $T(\mathcal{E})$ in a particular form. Order the points of $T(\mathcal{E})$ such that the points of type (i) come first, then the points of type (ii) and finally the point $(\infty)$. Order the lines of $T(\mathcal{E})$ such that the lines of type (a) come before the lines of type (b). With this ordering of the points and lines of $T(\mathcal{E})$ the incidence matrix $N$ has the form
$$N = \begin{pmatrix} A & B & \mathbf{0} \\ O & D & \mathbf{1} \end{pmatrix},$$
where the matrix $A$ is the incidence matrix of the points of type (i) with lines of type (a); $B$ is the incidence matrix of points of type (ii) with lines of type (a); $O$ is the all-zero matrix (a point of type (i) and a line of type (b) are never incident with each other); $D$ is the incidence matrix of points of type (ii) and lines of type (b); and the last column consists of the all-zero column $\mathbf{0}$ concatenated with the all-one column $\mathbf{1}$. Note that $N$ has the following properties:
(1) each row of $A$ has weight $q^n$;
(2) each row of $B$ has weight 1;
(3) each row of $D$ has weight $q^n$ and each two rows of $D$ have disjoint supports.
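The block structure of $N$ and its three properties can be seen concretely in the smallest case. The following sketch builds $T(\mathcal{E})$ for $n = m = 1$ and $q = 2$, assuming the egg is an oval of three non-collinear points in the plane $H_\infty : x_3 = 0$ of $\mathrm{PG}(3,2)$. The coordinates, the shortcut used for the tangent lines (valid for $q = 2$ only) and all names are assumptions made for this illustration.

```python
from itertools import product

def add(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

def span(*gens):
    """Subspace of PG(3, 2) spanned by gens, as a frozenset of nonzero vectors."""
    vecs = {(0, 0, 0, 0)}
    for g in gens:
        vecs |= {add(v, g) for v in vecs}
    return frozenset(v for v in vecs if any(v))

# The egg: an oval in H_inf (x_3 = 0), i.e. three non-collinear points.
egg = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
# For q = 2 the tangent line at E joins E to the sum of the other two oval points.
tangent = {E: span(E, add(*[F for F in egg if F != E])) for E in egg}

type_i = [v for v in product(range(2), repeat=4) if v[3] == 1]       # affine points
type_ii = sorted({span(*tangent[E], r) for E in egg for r in type_i}, key=sorted)
lines_a = sorted({span(E, r) for E in egg for r in type_i}, key=sorted)
lines_b = egg
gq_points = type_i + type_ii + ["inf"]

def incident(point, line):
    if line in lines_b:                       # line of type (b): an egg element
        return point == "inf" or (isinstance(point, frozenset) and line in point)
    if isinstance(point, frozenset):          # point of type (ii): plane containing the line
        return line <= point
    return point != "inf" and point in line   # point of type (i)

N = [[int(incident(P, L)) for P in gq_points] for L in lines_a + lines_b]
ni, nii, na = len(type_i), len(type_ii), len(lines_a)
A = [row[:ni] for row in N[:na]]
B = [row[ni:ni + nii] for row in N[:na]]
D = [row[ni:ni + nii] for row in N[na:]]
checks = {
    "rows of A have weight q^n": all(sum(r) == 2 for r in A),
    "rows of B have weight 1": all(sum(r) == 1 for r in B),
    "rows of D have weight q^n": all(sum(r) == 2 for r in D),
    "rows of D have disjoint supports": all(sum(col) <= 1 for col in zip(*D)),
    "O is the zero block": all(sum(row[:ni]) == 0 for row in N[na:]),
    "last column is (0,...,0,1,...,1)": [row[-1] for row in N] == [0] * na + [1] * len(lines_b),
}
print(ni, nii, na, checks)
```

Points of type (ii) are represented as frozensets (planes), points of type (i) as coordinate tuples and the point $(\infty)$ as the string `"inf"`, so the type of a point can be read off from its Python type.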
Theorem 4.4. The minimum distance of the $p$-ary code of points and lines of a translation generalised quadrangle of order $(s,t)$, with $s$ a power of the prime $p$, is $s + 1$.
Proof. Recall that a translation generalised quadrangle has order $(s,t) = (q^n, q^m)$ for some $n$, $m$ and $q = p^h$, $p$ prime. Let $c$ be a codeword in $C(T(\mathcal{E}))$, say
$$c = \sum_i \lambda_i u_i + \sum_i \mu_i v_i, \tag{1}$$
where the $u_i$'s are incidence vectors of lines $m_i$ of type (a) and the $v_i$'s are incidence vectors of lines $\ell_i$ of type (b) of $T(\mathcal{E})$. Suppose $c$ has weight $\leq q^n$.
(i) All $\mu_i$'s are zero.
Let $\hat{c}$ denote the codeword in the code $C_n(\mathrm{AG}(2n+m,q))$ of points and $n$-dimensional subspaces of $\mathrm{AG}(2n+m,q)$, consisting of the entries of $c$ in the positions corresponding to the points of type (i) of $T(\mathcal{E})$. Then $\hat{c} = \sum_i \lambda_i \hat{u}_i$, where the $\hat{u}_i$'s are incidence vectors of $n$-dimensional subspaces $\hat{m}_i$ of $\mathrm{AG}(2n+m,q)$. If $\mathrm{wt}(\hat{c}) > 0$ then $\mathrm{wt}(c) = \mathrm{wt}(\hat{c}) = q^n$, since the minimum distance of the code $C_n(\mathrm{AG}(2n+m,q))$ is $q^n$ (see Result 3.1(2)). Moreover, by the characterisation of codewords of minimum weight in $C_n(\mathrm{AG}(2n+m,q))$, $\hat{c}$ is a scalar multiple of the incidence vector $v_\pi$ of an $n$-dimensional subspace $\pi$ of $\mathrm{AG}(2n+m,q)$. Since $\mathrm{wt}(c) = q^n$,
(*) for each point $x$ of type (ii), the sum of the coefficients $\lambda_i$ over those $i$ for which $m_i$ is a line of $T(\mathcal{E})$ through $x$ is $0 \bmod p$.
Let $\bar{u}_i$ denote the incidence vector of the projective completion $\bar{m}_i$ of the affine subspace $\hat{m}_i$. By property (*) the sum $\sum_i \lambda_i \bar{u}_i$ is a codeword, in the code $C_n(\Pi)$ of the points and $n$-dimensional subspaces of the projective space $\Pi = \mathrm{PG}(2n+m,q)$, of weight $q^n$. This is a contradiction, since the minimum weight of $C_n(\Pi)$ is $\frac{q^{n+1}-1}{q-1} > q^n$ (see Result 3.1(1)).
So $\mathrm{wt}(\hat{c}) = 0$. With the same notation as above, it follows that the linear combination $c = \sum_i \lambda_i \bar{u}_i$ is a codeword in $C_n(\Pi)$ which is a linear combination of the incidence vectors of a subset $S$ of $\mathcal{E}$. Since $\mathrm{wt}(c) \leq q^n$ there are at most $q^n$ elements in $S$ (ignore $(n-1)$-dimensional subspaces whose incidence vector has a zero coefficient in $\sum_i \lambda_i \bar{u}_i$). Let $E \in S$ and $F \in \mathcal{E} \setminus S$, and $r \in \Pi \setminus H_\infty$. Let $a$ be the incidence vector of $\langle T_E, r\rangle$ and $b$ the incidence vector of $\langle T_F, r\rangle$. Then, by Lemma 3.5, $a - b$ is a codeword of the dual code of $C_n(\Pi)$. This contradicts $c \in C_n(\Pi)$, since by construction the dot product in $\mathbb{F}_p$ of $c$ with $a$ is zero, while the dot product of $c$ with $b$ is nonzero.
(ii) All $\lambda_i$'s are zero.
Since the rows of $N$ containing a row of $D$ have weight $q^n + 1$, this implies that there is a linear combination of the rows of $D$, with at least two nonzero coefficients, which results in a codeword of weight $\leq q^n$, contradicting property (3) of $N$.
(iii) Finally assume that at least one $\lambda_i$ and at least one $\mu_i$ is nonzero.
If $\mathrm{wt}(\hat{c}) > 0$ then, as before, $\mathrm{wt}(\hat{c}) = q^n$, contradicting the fact that at least one of the $\mu_i$'s is nonzero. Hence $\mathrm{wt}(\hat{c})$ must be zero. Let $\mathcal{E} = \{E_0, \ldots, E_{q^m}\}$. For each $j \in \{0, \ldots, q^m\}$ denote by $\alpha_j \in \mathbb{F}_p$ the coefficient of the incidence vector of $E_j \in \mathcal{E}$ in the linear combination $\sum_i \lambda_i \bar{u}_i$. As in case (i), let $S \subseteq \mathcal{E}$ denote the set of egg elements $E_j$ for which $\alpha_j \neq 0$. If $S = \mathcal{E}$ then $\mathrm{wt}(\sum_i \lambda_i u_i) \geq q^m + 1$ and $c$ has at least one nonzero entry in each of the sets of columns corresponding to the partition defined by the supports of the rows of $D$ (cf. property (3) of $N$). In order for the codeword $c$ to have weight $\leq q^n$, for at least one such set $T$ of columns of $N$ the entries in $\sum_i \lambda_i \bar{u}_i$ in the positions corresponding to the columns in $T$ must be a nonzero constant, since otherwise (again using property (3) of $N$) no linear combination of the rows of $D$ can make that part of the codeword zero. This means that the coefficient $\alpha_j = 0$ (it is a multiple of $q^n$), where $E_j$ is the element of $\mathcal{E}$ corresponding to $T$. This contradicts $S = \mathcal{E}$. It follows that $S$ is a proper subset of $\mathcal{E}$ and we can apply Lemma 3.5 and the same argument as in (i), using a codeword $a - b$ in the dual of the code $C_n(\Pi)$, to obtain a contradiction.
This shows that the minimum weight of $C(T(\mathcal{E}))$ is at least $q^n + 1$. Since each row of $N$ has weight $q^n + 1$, the statement follows.
Theorem 4.5. The minimum weight codewords of the $p$-ary code of points and lines of a translation generalised quadrangle $T(\mathcal{E})$ of order $(s,t)$, with $s$ a power of the prime $p$, are the scalar multiples of the incidence vectors of lines of $T(\mathcal{E})$.
Proof. We will use the same notation as in the proof of Theorem 4.4. Let $c$ be a codeword of $C(T(\mathcal{E}))$ of weight $q^n + 1$ as in (1).
(i) All $\mu_i$'s are zero.
If $\mathrm{wt}(\hat{c}) > 0$ then the weight of $\hat{c}$ must be $q^n$, since the code $C_n(\mathrm{AG}(2n+m,q))$ does not contain any codewords of weight $q^n + 1$ by Corollary 3.4. As in the proof of Theorem 4.4, the codeword $\hat{c}$ is a scalar multiple of the incidence vector $v_\pi$ of an $n$-dimensional subspace $\pi$ of $\mathrm{AG}(2n+m,q)$, say $\hat{c} = \lambda v_\pi$. Also, the codeword $c$ has exactly one nonzero entry indexed by a point $x_0$ of type (ii) of $T(\mathcal{E})$; equivalently,
(**) the sum of the coefficients $\lambda_i$ over those $i$ for which $m_i$ is a line of $T(\mathcal{E})$ through a point $x$ of type (ii) is $0 \bmod p$ for $x \neq x_0$ and nonzero $\bmod\ p$ for $x = x_0$.
Consider the projective completion of the subspaces $\hat{m}_i$ and define $w$ by setting $\lambda w = \sum_i \lambda_i \bar{u}_i$. Then $w$ is a codeword of the code $C_n(\Pi)$. The affine part of the support of $w$ coincides with $\pi$, and by (**) the part of the support of $w$ in $H_\infty$ coincides with $E_0 \in \mathcal{E}$, which is the unique line of type (b) incident with the point $x_0$. Hence $\lambda w$ is a codeword of $C_n(\Pi)$ of minimum weight and must therefore be a scalar multiple of the incidence vector of an $n$-dimensional subspace of $\Pi$ (see Result 3.1(1)). It follows that $E_0$ is contained in the projective completion of $\pi$ and that $c$ is a scalar multiple of the incidence vector of a line of type (a) in $T(\mathcal{E})$.
If $\mathrm{wt}(\hat{c}) = 0$ then, as in part (i) of the proof of Theorem 4.4, $c$ is a codeword of $C_n(\Pi)$ which is a linear combination of the incidence vectors of a subset $S$ of $\mathcal{E}$.
If $n < m$ then the same argument as in part (i) of the proof of Theorem 4.4 applies to obtain a contradiction, by using a codeword $a - b$ in the dual of the code $C_n(\Pi)$ as in Lemma 3.5.
If $n = m$ then consider two elements $E, F \in \mathcal{E}$ and a point $r$ of $\Pi \setminus H_\infty$. Let $a$ be the incidence vector of $\langle T_E, r\rangle$ and $b$ the incidence vector of $\langle E, F, r\rangle$. Then again, by Lemma 3.5, $a - b$ is a codeword in the dual code of $C_n(\Pi)$, contradicting the fact that $c$ is a codeword of $C(T(\mathcal{E}))$.
(ii) All $\lambda_i$'s are zero.
In this case it easily follows from property (3) of the incidence matrix $N$ that $c$ must be a scalar multiple of the incidence vector of a line of type (b) of $T(\mathcal{E})$.
(iii) At least one $\lambda_i$ and at least one $\mu_i$ is nonzero.
If $\mathrm{wt}(\hat{c}) \neq 0$ then $\mathrm{wt}(\hat{c}) \geq q^n$ and $\hat{c}$ is the incidence vector of an $n$-dimensional subspace $m$. Let $a$ be the codeword of $C(T(\mathcal{E}))$ corresponding to the line of type (a) defined by $m$. Then $a - c \in C(T(\mathcal{E}))$ has weight at most 2, contradicting Theorem 4.4. Hence $\mathrm{wt}(\hat{c}) = 0$.
As before, let $\alpha_j \in \mathbb{F}_p$ denote the coefficient of the incidence vector of $E_j \in \mathcal{E}$ in the linear combination $\sum_i \lambda_i \bar{u}_i$, and let $S$ be the set of egg elements $E_j$ for which $\alpha_j \neq 0$.
If $n = m$ then consider two elements $E, F \in \mathcal{E}$ and a point $r$ of $\Pi \setminus H_\infty$. Let $a$ be the incidence vector of $\langle T_E, r\rangle$ and $b$ the incidence vector of $\langle E, F, r\rangle$. Then $a - b$ is a codeword in the dual code of $C_n(\Pi)$, see Lemma 3.5, contradicting the fact that $c$ is a codeword of $C(T(\mathcal{E}))$.
If $n < m$ and $S = \mathcal{E}$ then $\mathrm{wt}(\sum_i \lambda_i u_i) \geq q^m + 1$ and $c$ has at least one nonzero entry in each of the sets of columns corresponding to the partition defined by the supports of the rows of $D$ (cf. property (3) of $N$). In order for the codeword $c$ to have weight $q^n + 1$, for at least one such set $T$ of columns of $N$ the entries in $\sum_i \lambda_i \bar{u}_i$ in the positions corresponding to the columns in $T$ must be a nonzero constant, since otherwise (again using property (3) of $N$) no linear combination of the rows of $D$ can make that part of the codeword zero. This means that the coefficient $\alpha_j = 0$, where $E_j$ is the element of $\mathcal{E}$ corresponding to $T$. This contradicts $S = \mathcal{E}$.
It follows that $S$ is a proper subset of $\mathcal{E}$ and we can apply the same argument as in part (i) of the proof of Theorem 4.4, using a codeword $a - b$ in the dual of the code $C_n(\Pi)$, to obtain a final contradiction.
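Theorems 4.4 and 4.5 can be sanity-checked by brute force in the smallest case $(s,t) = (2,2)$, $p = 2$. The sketch below assumes the standard fact that the generalised quadrangle of order $(2,2)$ is unique up to isomorphism, so that the duad/syntheme model used here is isomorphic to $T(\mathcal{E})$ for an oval in $\mathrm{PG}(2,2)$. The helper `row_space_gf2` is a hypothetical name, and full enumeration of the code is feasible only because the code is tiny.

```python
from itertools import combinations

# GQ(2, 2): points are the 2-subsets of {0,...,5}, lines are the perfect matchings.
points = [frozenset(d) for d in combinations(range(6), 2)]
lines = [t for t in combinations(points, 3) if len(frozenset().union(*t)) == 6]
rows = [tuple(int(P in L) for P in points) for L in lines]   # the rows of N

def row_space_gf2(rows):
    """All codewords of the binary code spanned by rows (tiny codes only)."""
    span = {(0,) * len(rows[0])}
    for r in rows:
        if r not in span:   # r is independent of the rows seen so far
            span |= {tuple(a ^ b for a, b in zip(v, r)) for v in span}
    return span

code = row_space_gf2(rows)
s = len(lines[0]) - 1
min_weight = min(sum(c) for c in code if any(c))
min_words = {c for c in code if sum(c) == min_weight}
print(s, min_weight, len(min_words), min_words == set(rows))
```

Over $\mathbb{F}_2$ the only nonzero scalar is 1, so the comparison `min_words == set(rows)` is the binary instance of the statement that the minimum weight codewords are the scalar multiples of incidence vectors of lines.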

Locally repairable codes
In [3], the authors study locally repairable codes (LRC's) arising from partial geometries. We use the following definitions from [3]. Let $C$ be a code. For every position $i$, the set $\Omega(i)$ is defined to be the set of all parity-check vectors repairing the $i$-th symbol, i.e. $\Omega(i) = \{v \in C^\perp : v_i \neq 0\}$.
Definition 5.1. The repair degree for the $i$-th symbol is $r(i) = \min\{\mathrm{wt}(v) - 1 : v \in \Omega(i)\}$ and the overall repair degree $r$ of a linear code of length $m$ is its maximum repair degree: $r = \max_{1 \leq i \leq m} r(i)$.
It was shown in [3, Theorem 1 and Remark 1] that the maximum rate of a balanced locally repairable binary code from a partial geometry is achieved when the partial geometry is a generalised quadrangle.
As before, consider the incidence matrix $N$ of points and lines of a partial geometry where rows represent lines and columns represent points. In [3], the authors define a pg-BLRC code as the dual code of the binary code generated by $N$.
Result 5.3. [3, Lemma 1] The repair degree $r$ and the repair availability $a$ of a binary pg-BLRC code $C$ of a partial geometry with parameters $(s, t, \alpha)$ satisfy $r \leq s$ and $a \geq t + 1$.
Using the results of this paper we can derive the exact values of $r$ and $a$ in the case that the partial geometry is a classical or translation generalised quadrangle or a generalised quadrangle $T_2^*(O)$, $O$ a hyperoval.
Theorem 5.4. Let $G$ be one of the following generalised quadrangles: a classical generalised quadrangle, i.e. $W(3,q)$, $Q(4,q)$, $Q(5,q)$, $H(3,q^2)$, or $H(4,q^2)$; $T_2^*(O)$, where $O$ is a hyperoval in $\mathrm{PG}(2,q)$, $q$ even; or a translation generalised quadrangle of order $(q^n, q^m)$. Let $(s,t)$ be the order of $G$. The dual code $C$ of the $p$-ary code of $G$, where $q = p^h$, $p$ prime, has repair degree $r = s$ and repair availability $a = (p-1)(t+1)$.
Proof. We have that $C^\perp$ is the $p$-ary code of the generalised quadrangle, and we have shown in Corollary 4.1 and Theorem 4.5 that the minimum weight of $C^\perp$ is $s + 1$. This implies that $r(i) = s$ for all $i$, and hence $r = s$. Now $\Omega(i)$ consists of the set of all codewords of weight $s + 1$ through the point $P_i$ corresponding to the $i$-th column, which is, again by Corollary 4.1 and Theorem 4.5, the set of scalar multiples of incidence vectors of the lines through $P_i$. There are $t + 1$ lines through $P_i$, each giving rise to $p - 1$ distinct codewords. Hence $a(i) = (p-1)(t+1)$ for all $i$, and therefore $a = (p-1)(t+1)$.
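The values in Theorem 5.4 can be recomputed directly from the definitions in the binary case $(s,t) = (2,2)$. The sketch below is an illustration under two stated assumptions: the generalised quadrangle of order $(2,2)$ is taken in its duad/syntheme model, and the availability $a(i)$ is counted, following the proof above, as the number of vectors of weight $r(i) + 1$ in $\Omega(i)$, with $a$ taken as the minimum of the $a(i)$. The function names are hypothetical.

```python
from itertools import combinations

# GQ(2, 2): points are the 2-subsets of {0,...,5}, lines are the perfect matchings.
points = [frozenset(d) for d in combinations(range(6), 2)]
lines = [t for t in combinations(points, 3) if len(frozenset().union(*t)) == 6]
rows = [tuple(int(P in L) for P in points) for L in lines]

def row_space_gf2(rows):
    """All vectors of the binary span of rows (tiny codes only)."""
    span = {(0,) * len(rows[0])}
    for r in rows:
        if r not in span:
            span |= {tuple(a ^ b for a, b in zip(v, r)) for v in span}
    return span

# C is the dual of the line code, so C^perp is the line code itself.
dual = row_space_gf2(rows)

def repair_parameters(dual, i):
    """Return (r(i), a(i)): repair degree and number of minimum weight repair vectors."""
    omega = [v for v in dual if v[i] != 0]            # Omega(i)
    w = min(sum(v) for v in omega)                    # least weight of a repair vector
    return w - 1, sum(1 for v in omega if sum(v) == w)

params = [repair_parameters(dual, i) for i in range(len(points))]
r = max(ri for ri, _ in params)    # overall repair degree (Definition 5.1)
a = min(ai for _, ai in params)    # assumed convention for the overall availability
s, t, p = 2, 2, 2
print(r, a, (s, (p - 1) * (t + 1)))
```

Because every position gives the same pair $(r(i), a(i))$ in this example, the choice between minimum and maximum when passing from $a(i)$ to $a$ does not affect the outcome here.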

Remark 5.5. Theorem 5.4 shows that the bounds of Result 5.3 are sharp for the binary codes of classical and translation generalised quadrangles of even order and for $T_2^*(O)$, $O$ a hyperoval.