CHANNEL DECOMPOSITION FOR MULTILEVEL CODES OVER MULTILEVEL AND PARTIAL ERASURE CHANNELS

C. MAYER, K. HAYMAKER AND C. A. KELLEY

Abstract. We introduce the Multilevel Erasure Channel (MEC) for binary extension field alphabets. The channel model is motivated by applications such as non-volatile multilevel read storage channels. Like the recently proposed $q$-ary partial erasure channel (QPEC), the MEC is designed to capture partial erasures. The partial erasures addressed by the MEC are determined by erasures at the bit level of the $q$-ary symbol representation. In this paper we derive the channel capacity of the MEC and give a multistage decoding scheme on the MEC using binary codes. We also present a low complexity multistage $p$-ary decoding strategy for codes on the QPEC when $q = p^k$. We show that for appropriately chosen component codes, capacity on the MEC and QPEC may be achieved.

1. Introduction. Non-volatile memories are prevalent in current computer memory technologies such as flash memory, ferroelectric RAM, and magnetic storage systems. In some applications, symbols may be prone to error events in which some partial information can be obtained about the symbol at the receiving end. For example, both NAND flash memory and phase change memory are susceptible to retention errors and read errors [1,2]. Cohen and Cassuto [3] introduced the $q$-ary Partial Erasure Channel (QPEC) to model partial erasure events, and provided a thorough analysis of iterative decoding of LDPC codes on the QPEC. Their model is motivated by measurement channels, where information is read by assessing the level of charge in a cell or memory unit. Partial erasure events are characterized by incomplete read processes, such as when readouts of symbols from the channel terminate prematurely. Consequently, in their model, partially erased symbols are regarded as sets of a fixed cardinality $M$ of possible symbols.
In this paper, we introduce another channel model that captures partial erasure events that may happen in practice. For example, in applications using $q$-ary symbols, typically from GF($2^k$), the symbols may be stored or transmitted as binary $k$-tuples. This prompts the study of erasure events at the bit level. A symbol that has an erased bit (or bits) may also have partial information at the receiving end. To address these types of errors, we introduce the Multilevel Erasure Channel (MEC) model in which a partially erased symbol may belong to a set of size $2^j$, where $j$ ranges from 1 to $k$. Thus, a symbol that is erased has partial information available at the receiver, provided $j < k$. When $j = k$, the symbol is fully erased and may be any one of the $q$ field elements. The MEC model may be applied, for example, when an erasure event is caused by a symbol readout terminating before completion.

Moreover, the MEC is tolerant to varying bit error probabilities that are inherent to the flash storage medium and possibly other storage media. Multilevel cell (MLC) flash and triple-level cell (TLC) flash are two common settings: in the MLC case, 4-ary symbols are stored as two bits, and in the TLC case, 8-ary symbols are stored as three bits. In the storage architecture, each bit of a symbol is stored on a different "page," and the bits on each page are prone to different bit error probabilities. Thus, the channels characterizing bit transmission on each page are different. In [4], [5], the authors explain this phenomenon for MLC and TLC flash, and also examine the dependence among positions of error events. In [6], bit-interleaved schemes were analyzed to determine the effect on decoding performance. Our proposed MEC model is relevant for these applications in that it can allow for different bit error probabilities within each symbol. Codes for the MEC have similarities to codes designed with unequal error protection (e.g., [7,8]).

Due to their simplicity, low complexity encoding/decoding, and capacity-achieving performance, Low-Density Parity-Check (LDPC) codes have become ubiquitous in modern technologies [10,11,12]. Low-density codes over the QPEC are also amenable to iterative and linear programming decoding [3]; however, the analysis of the exact behavior of the decoder for the QPEC relates to the subset sum problem in group theory, and is a difficult problem [13]. Motivated by this, we examine the achievable rates of codes over the MEC and QPEC using multistage decoding. Imai and Hirakawa introduced multilevel coding in [14], leading to much work in the area of multilevel codes and multistage decoding [8,15,16]. In [17], a multilevel coding approach was proposed for magnetic storage applications.

While the idea of multilevel coding is not new, this paper gives the first application of multilevel coding to partial erasure channels. We determine that the MEC over GF($2^k$) is equivalent to $k$ independent binary erasure channels (BECs) with transition probabilities given by the MEC. Prompted by this result, we examine the breakdown of the QPEC into knowledge per bit, and establish a multistage decoding scheme for codes on the QPEC over GF($2^k$) using binary codes on the BEC. Although the BEC subchannels are no longer independent, we determine their erasure probabilities in terms of the parameters of the QPEC. When $M = 2$, we prove that capacity can be achieved by using binary codes optimized for channels with these erasure probabilities. Moreover, we show that for the QPEC with other values of $M$ and $q$, multistage decoding may still be applied, but the subchannels may not be simple erasure channels.

This paper is organized as follows. Basic background and notation are given in Section 2. In Section 3 we present the multilevel erasure channel and derive the capacity of the channel. In Section 4 we briefly discuss coding techniques for LDPC codes on the MEC and QPEC. In Section 5, we analyze multistage decoding on the MEC and QPEC and determine when low complexity decoders may be used. Conclusions are given in Section 6.

2. Preliminaries. We will focus on codes over nonbinary alphabets, particularly binary extension fields. Let $\alpha$ be a root of a primitive polynomial of degree $k$ over GF(2). Since field elements can be represented as binary $k$-tuples or as powers of $\alpha$, we will use the following mapping between these representations:
$$(a_{k-1}, \ldots, a_1, a_0) \;\longleftrightarrow\; a_{k-1}\alpha^{k-1} + \cdots + a_1\alpha + a_0,$$
where $a_i \in$ GF(2). When the context is clear, we will denote $(a_{k-1}, \ldots, a_1, a_0)$ by $a_{k-1}a_{k-2}\ldots a_1a_0$. We will also let $[n]$ denote the set of integers $\{1, 2, \ldots, n\}$.
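The correspondence between powers of $\alpha$ and binary $k$-tuples can be tabulated mechanically. The following is a minimal sketch for GF(8); the primitive polynomial $x^3 + x + 1$ and the helper names `gf_power_table` and `to_tuple` are assumptions chosen for illustration, not choices made in this paper.

```python
def gf_power_table(k, prim_poly):
    """Return [alpha^0, ..., alpha^(2^k - 2)] as integers whose bits are (a_{k-1}, ..., a_0).

    prim_poly is the primitive polynomial written as a bit mask, including its x^k term.
    """
    table, x = [], 1
    for _ in range(2**k - 1):
        table.append(x)
        x <<= 1                  # multiply the current element by alpha
        if x & (1 << k):         # degree reached k: reduce modulo the polynomial
            x ^= prim_poly
    return table


def to_tuple(value, k):
    """Binary k-tuple (a_{k-1}, ..., a_1, a_0) of an integer-coded field element."""
    return tuple((value >> i) & 1 for i in reversed(range(k)))


# GF(8) built from x^3 + x + 1 (bit mask 0b1011); this polynomial is an assumed example.
table = gf_power_table(3, 0b1011)
for exponent, value in enumerate(table):
    print(f"alpha^{exponent} <-> {to_tuple(value, 3)}")
```

Because the polynomial is primitive, the table visits every nonzero element exactly once before the powers of $\alpha$ repeat.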
While there is no restriction on which types of codes may be used on the MEC, QPEC, or multilevel settings, we will assume low-density parity-check (LDPC) codes and focus on regular LDPC code ensembles [10]. An LDPC code over GF($q$) is defined by a parity-check matrix $H$ that is sparse in the number of nonzero entries, and is $(d_v, d_c)$-regular if there are $d_v$ nonzero elements in each column and $d_c$ nonzero elements in each row. A Tanner graph is a bipartite graph with a variable node (vertex) for each column of $H$, and a check node for each row of $H$. There is an edge from variable node $v_i$ to check node $c_l$ with label $\psi$ precisely when $\psi \in$ GF($q$) is entry $(l, i)$ in the parity-check matrix. The sparsity of $H$ leads to a sparse graph representation, making these codes amenable to efficient graph-based iterative decoding algorithms [11].
Introduced by Cohen and Cassuto in [3], the $q$-ary partial erasure channel (QPEC) is a generalization of the binary erasure channel (BEC). Given a $q$-ary input, the QPEC outputs the original input with probability $1 - \varepsilon$, or a partial erasure with probability $\varepsilon$. A partial erasure is a set containing the original input as well as $M - 1$ other $q$-ary symbols. Each of the $\binom{q-1}{M-1}$ possible erasures for a symbol occurs with equal probability. An example of the QPEC for $q = 2^2$ and $M = 2$ is given in Figure 1. The capacity of the QPEC over GF($q$) with partial erasures of size $M$ is $1 - \varepsilon \log_q M$ [3].
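The QPEC description above translates directly into a small simulation. The sketch below is illustrative only: symbols are coded as the integers $0, \ldots, q-1$, and the function names are hypothetical rather than taken from [3].

```python
import math
import random
from itertools import combinations


def qpec_capacity(q, M, eps):
    """Capacity in q-ary symbols per channel use: 1 - eps * log_q(M)."""
    return 1 - eps * math.log(M, q)


def partial_erasure_sets(x, q, M):
    """All M-subsets of {0, ..., q-1} that contain the transmitted symbol x."""
    others = [s for s in range(q) if s != x]
    return [frozenset((x,) + extra) for extra in combinations(others, M - 1)]


def qpec_transmit(x, q, M, eps, rng=random):
    """Return {x} with probability 1 - eps, else a uniformly chosen partial erasure set."""
    if rng.random() >= eps:
        return frozenset({x})
    return rng.choice(partial_erasure_sets(x, q, M))


# For q = 4 and M = 2 each symbol has binom(3, 1) = 3 possible partial erasures.
print(len(partial_erasure_sets(0, 4, 2)), qpec_capacity(4, 2, 0.3))
```

Enumerating the erasure sets explicitly is only practical for small $q$ and $M$; it is done here to mirror the definition rather than for efficiency.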
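3.1. Channel Model. A $2^k$-ary symbol $x$ sent across the multilevel erasure channel (MEC) with erasure probability $\varepsilon$ will either be received without error with probability $1 - \varepsilon$, or an erasure event will occur with probability $\varepsilon$. An erasure event may consist of any combination of bits in the binary representation of $x$ being erased. For $i = 1, \ldots, k$, we will use $\gamma_i$ to denote the probability that bit $i$ is erased. For each $j = 1, \ldots, k$, there are $\binom{k}{j}$ sets of symbols whose binary representations differ from the binary representation of $x$ in at most $j$ selected bits. Using similar notation as in [3], we define the super-symbol $?_B^x$ to be the set of $2^k$-ary symbols differing from $x$ in at most the bits given by the index set $B \subseteq [k]$.

For a $2^k$-ary symbol $x$ in which each bit has the same probability, $\gamma$, of being erased, the transition probability of the MEC is given by
$$\Pr(Y = y \mid X = x) = \begin{cases} 1 - \varepsilon, & y = x, \\ \varepsilon_j, & y = {?_B^x} \text{ with } |B| = j, \end{cases}$$
for $j = 1, \ldots, k$, and where $\varepsilon_j = \gamma^j(1-\gamma)^{k-j}$ and $1 - \varepsilon = (1-\gamma)^k$. (In the accompanying figures, the binary representation of each 4-ary symbol is shown.)

Example 1. For an 8-ary symbol in which each bit has the same probability, $\gamma$, of being erased, the transition probability of the MEC is given by
$$\Pr(Y = y \mid X = x) = \begin{cases} 1 - \varepsilon, & y = x, \\ \varepsilon_j, & y = {?_B^x} \text{ with } |B| = j, \ j = 1, 2, 3, \end{cases}$$
where $\varepsilon_1 = \gamma(1-\gamma)^2$, $\varepsilon_2 = \gamma^2(1-\gamma)$, $\varepsilon_3 = \gamma^3$, and $1 - \varepsilon = (1-\gamma)^3$. For example, assume $x = (0, 0, 0)$ is transmitted, corresponding to the symbol $0 \in$ GF(8). Then the possible output sets and their transition probabilities are
$$\begin{array}{ll}
\{000\} & (1-\gamma)^3, \\
\{000, 001\},\ \{000, 010\},\ \{000, 100\} & \gamma(1-\gamma)^2 \text{ each}, \\
\{000, 001, 010, 011\},\ \{000, 001, 100, 101\},\ \{000, 010, 100, 110\} & \gamma^2(1-\gamma) \text{ each}, \\
\mathrm{GF}(8) & \gamma^3.
\end{array}$$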
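The MEC over GF($2^k$) that has the same erasure probability, $\gamma$, for each bit in the $2^k$-ary symbol will be referred to as the constrained MEC. The general case of the MEC where bit $i$ of the $2^k$-ary symbol has erasure probability $\gamma_i$, for $i = 1, 2, \ldots, k$, will be referred to as the unconstrained MEC. We will now consider the unconstrained MEC case.

For $q = 2^k$, the transition probability of the unconstrained MEC is given by
$$\Pr(Y = {?_B^x} \mid X = x) = \prod_{i \in B} \gamma_i \prod_{i \in [k] \setminus B} (1 - \gamma_i), \qquad B \subseteq [k],$$
where $B = \emptyset$ corresponds to $x$ being received without error.

Example 2. Assume again that $x = (0, 0, 0)$ is transmitted, corresponding to the symbol $0 \in$ GF(8), and take bit $i$ to be the $i$-th coordinate from the left. Then the possible output sets and their transition probabilities are
$$\begin{array}{ll}
\{000\} & (1-\gamma_1)(1-\gamma_2)(1-\gamma_3), \\
\{000, 100\} & \gamma_1(1-\gamma_2)(1-\gamma_3), \\
\{000, 010\} & (1-\gamma_1)\gamma_2(1-\gamma_3), \\
\{000, 001\} & (1-\gamma_1)(1-\gamma_2)\gamma_3, \\
\{000, 010, 100, 110\} & \gamma_1\gamma_2(1-\gamma_3), \\
\{000, 001, 100, 101\} & \gamma_1(1-\gamma_2)\gamma_3, \\
\{000, 001, 010, 011\} & (1-\gamma_1)\gamma_2\gamma_3, \\
\mathrm{GF}(8) & \gamma_1\gamma_2\gamma_3.
\end{array}$$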
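The output sets and transition probabilities of the unconstrained MEC can be enumerated for any input. The sketch below assumes that symbols are written as bit tuples $(x_1, \ldots, x_k)$ with bit 1 leftmost and that the bit erasures are independent; the function name `mec_outputs` is hypothetical.

```python
from itertools import combinations, product


def mec_outputs(x, gammas):
    """Map each MEC output set (a frozenset of bit tuples) to its transition probability.

    x is the transmitted tuple (x_1, ..., x_k); gammas[i - 1] is the erasure
    probability of bit i. Bit erasures are assumed to be independent.
    """
    k = len(x)
    outputs = {}
    for j in range(k + 1):
        for erased in combinations(range(k), j):       # positions of the erased bits
            prob = 1.0
            for i in range(k):
                prob *= gammas[i] if i in erased else 1 - gammas[i]
            members = set()                            # the super-symbol for this pattern
            for fill in product((0, 1), repeat=j):
                y = list(x)
                for pos, bit in zip(erased, fill):
                    y[pos] = bit
                members.add(tuple(y))
            outputs[frozenset(members)] = prob
    return outputs


outputs = mec_outputs((0, 0, 0), (0.1, 0.2, 0.3))
for members, prob in sorted(outputs.items(), key=lambda item: len(item[0])):
    print(sorted(members), round(prob, 4))
```

Setting all entries of `gammas` equal recovers the constrained MEC, where the probability depends only on the number of erased bits.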
Remark 1. We introduce the multilevel erasure channel for binary extension fields for practical applications, but note that it can be extended in the obvious way to fields of order $q = p^k$ for any prime $p > 2$, or more general alphabets in which symbols can be represented as words using a subalphabet.

Capacity. Let $C$ be a channel with input $X$, output $Y$, and let $p_x := \Pr(X = x)$ be the input distribution to the channel, where $x \in (\mathrm{GF}(2))^k$. By definition, the channel capacity $\mathrm{cap}(C)$ is given by
$$\mathrm{cap}(C) = \max_{\{p_x\}} I(X;Y) = \max_{\{p_x\}} \big( H(Y) - H(Y \mid X) \big),$$
where $I(X;Y)$ is the mutual information between $X$ and $Y$, $H(Y)$ is the entropy of $Y$, and $H(Y \mid X)$ is the conditional entropy of $Y$ given $X$.

A channel is said to be uniformly dispersive if the multiset $\{P_{Y|X}(y \mid x)\}_y$ of transition probabilities is identical for each input symbol $x$ [18]. We first show that the constrained MEC has this property, which can be expected since the bit-erasure probabilities are symbol-independent.
Proof. For any input $x$ on the constrained MEC, there are $\binom{k}{j}$ possible outputs of size $2^j$ for each $j = 0, \ldots, k$. For each output $y$ of size $2^j$, $P_{Y|X}(y \mid x) = \gamma^j(1-\gamma)^{k-j}$. Moreover, an output set of size $2^j$ is obtained by choosing which bits to vary in $\binom{k}{j}$ ways and choosing the remaining bits in $2^{k-j}$ ways. Thus, there are a total of $\sum_{j=0}^{k} \binom{k}{j} 2^{k-j} = 3^k$ possible outputs.

Since the constrained MEC is a uniformly dispersive channel, it is easy to show that for any input distribution,
$$H(Y \mid X) = -\sum_{j=0}^{k} \binom{k}{j} \gamma^j(1-\gamma)^{k-j} \log_2\!\big(\gamma^j(1-\gamma)^{k-j}\big).$$
Thus, to determine capacity, it is enough to maximize $H(Y)$. For the uniform input distribution $\{p_x\}$ where $\Pr(X = x) = 1/q$ for each $x \in$ GF($q$), we have
$$I(X;Y) = H(Y) - H(Y \mid X) = k(1-\gamma).$$
Given the fact that $1 - \varepsilon = (1-\gamma)^k$, it follows that given a uniform input distribution and $q = 2^k$, $I(X;Y) = k(1-\varepsilon)^{1/k}$ measured in bits per channel use, or $(1-\varepsilon)^{1/k}$ measured in $q$-ary symbols per channel use.
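The value of $I(X;Y)$ under the uniform input distribution can be checked numerically by brute force. The following sketch assumes symbols are coded as integers and an output is coded as the pair (erasure mask, surviving bits); these encodings and the function name are illustrative choices.

```python
import math


def mec_mutual_information(k, gamma):
    """I(X;Y) in bits for the constrained MEC over GF(2^k) with a uniform input."""
    q = 2**k
    joint = {}                                   # (x, y) -> joint probability
    for x in range(q):
        for mask in range(q):                    # set bits of mask mark erased positions
            j = bin(mask).count("1")
            p = gamma**j * (1 - gamma)**(k - j)
            y = (mask, x & ~mask)                # the output keeps only unerased bits
            joint[(x, y)] = joint.get((x, y), 0.0) + p / q
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    info = 0.0
    for (x, y), p in joint.items():
        if p > 0:
            info += p * math.log2(p / ((1 / q) * p_y[y]))
    return info


k, gamma = 3, 0.1
eps = 1 - (1 - gamma)**k
print(mec_mutual_information(k, gamma), k * (1 - eps)**(1 / k))
```

The two printed values should agree up to floating-point error, reflecting $1 - \gamma = (1-\varepsilon)^{1/k}$.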
Theorem 3.2. The capacity of the constrained multilevel erasure channel is
$$\mathrm{cap}(\mathrm{MEC}) = k(1-\varepsilon)^{1/k} \text{ bits per channel use},$$
or equivalently $(1-\varepsilon)^{1/k}$ $q$-ary symbols per channel use.

A proof of Theorem 3.2 is given in Section 5. Additionally, a proof using a numerical search for $k \le 2$ is included in the appendix. Observe that when $k = 1$, the channel is simply the Binary Erasure Channel (BEC) with $q = 2$ and capacity $1 - \varepsilon$, as expected.

4. Coding Techniques for the MEC. In this section, we discuss two approaches for LDPC code design on the MEC. The first is based on a $q$-ary error correcting code and $q$-ary iterative decoder, and the second uses multilevel coding with multistage decoding.
In the first approach, an $(N, K)$ $q$-ary LDPC code is used where the code symbols are transmitted across the MEC, and the receiver uses an iterative decoder on the received code symbols with partial erasures. Several iterative decoders have been proposed for $q$-ary LDPC codes, including an efficient FFT-based implementation, variations of the min-sum decoder, and others. In addition, Cohen and Cassuto [3] give an iterative decoding algorithm for LDPC codes over GF($q$) on the QPEC. Specifically, they incorporate the partial erasures into the standard sum-product algorithm.

Channel information on the MEC takes the form of subsets of GF($2^k$) of cardinality $2^j$, for some $j \in \{0, 1, \ldots, k\}$. If a variable node is assigned a subset of size one, no error has occurred. In the QPEC iterative decoding algorithm, check node calculations involve taking Minkowski sums of subsets of GF($q$), and variable node calculations are performed by taking the intersection of the incoming subsets [3]. Decoding succeeds when the intersection at each variable node is a set with cardinality one. Moreover, the codeword estimate that results when all variable node subsets have size one is guaranteed to be the original codeword. The decoder fails if the size of an erasure subset at one or more variable nodes never reduces to one. In the case of the QPEC, density evolution may be approximated by tracking the probability that message sets have cardinality $m = 1, 2, \ldots, q$ [3].
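The set-based node updates described above can be sketched in a few lines. This is not the decoder of [3]: as a simplifying assumption, every edge label is taken to be 1, elements of GF($2^k$) are coded as integers so that field addition is bitwise XOR, and the function names are hypothetical.

```python
from functools import reduce


def minkowski_sum(A, B):
    """{a + b : a in A, b in B}, with addition in GF(2^k) realised as XOR on integers."""
    return frozenset(a ^ b for a in A for b in B)


def check_to_variable(other_sets):
    """Values a check allows for one neighbour, given the sets of its other neighbours.

    Assumes all edge labels equal 1, so the parity constraint forces the target
    variable to equal the sum of the remaining variables (characteristic 2).
    """
    return reduce(minkowski_sum, other_sets, frozenset({0}))


def variable_update(channel_set, incoming):
    """Intersect the channel observation with every incoming check message."""
    return reduce(frozenset.intersection, incoming, frozenset(channel_set))


# Check x1 + x2 + x3 = 0 over GF(4): x1 is known, x2 and x3 are partially erased.
message = check_to_variable([frozenset({1}), frozenset({2, 3})])
print(sorted(message), sorted(variable_update({0, 3}, [message])))
```

In this toy instance the intersection has cardinality one, so the partially erased symbol is resolved in a single update.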
In the second approach, multilevel codes [14] may be used. In this paper, we focus on a detailed analysis of the multilevel coding with multistage decoding approach for the MEC, rather than analyzing iterative decoding of $q$-ary codes on the channel. Using this approach to obtain an overall $(N, K)$ $q$-ary code for the MEC, we first consider the case where $q = 2^k$. (The case for $q = p^k$, for prime $p$, is a natural generalization of the $p = 2$ case.) Observe that $N$ $q$-ary symbols can be represented using $Nk$ bits and $K$ $q$-ary symbols can be represented using $Kk$ bits. Thus, the $Kk$ information bits are subdivided into $k$ groups with $\ell_1, \ell_2, \ldots, \ell_k$ information bits in each group, respectively.

The information bits within each group are encoded to a length $N$ binary codeword using a component code that is specific for each group. In particular, group $i$ uses a component code $C_i$ that represents an $(N, \ell_i)$ binary code. Thus each group is encoded into $N$ bits. The first encoded bits from each of the $k$ groups are combined to form the first $q$-ary symbol for transmission on the MEC, where the bit from group $i$ forms the $i$-th coordinate of the binary representation of the $q$-ary symbol. Similarly, the second encoded bits from each of the groups are combined to form the second $q$-ary symbol, and so on. Thus, a length $N$ $q$-ary codeword is obtained via $k$ binary codewords of length $N$. From the above, we have $Kk = \ell_1 + \cdots + \ell_k$, and the overall code rate of the multilevel code is
$$R = \frac{K}{N} = \frac{1}{k}\sum_{i=1}^{k} R_i,$$
where $R_i = \ell_i / N$ is the code rate of the $i$-th binary component code $C_i$. Recall that for $x \in$ GF($2^k$), we will denote its binary representation by $x = (x_1, x_2, \ldots, x_k)$. Thus, in the above scheme, the code $C_i$ represents the code over the $i$-th coordinate of the binary representation of the input alphabet, for $i = 1, 2, \ldots, k$.
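The way $k$ binary codewords are combined into one $q$-ary codeword can be illustrated as follows. The sketch takes the component codewords as given (no particular encoder is assumed), and the names `combine_levels` and `overall_rate` are hypothetical.

```python
def combine_levels(codewords):
    """Interleave k binary codewords of length N into N symbols of GF(2^k).

    codewords[i - 1] is the codeword of component code C_i; the t-th output symbol
    is the tuple (x_1, ..., x_k) formed by the t-th bit of every level.
    """
    lengths = {len(c) for c in codewords}
    if len(lengths) != 1:
        raise ValueError("all component codewords must have the same length N")
    return [tuple(bits) for bits in zip(*codewords)]


def overall_rate(level_dims, N):
    """Overall rate K/N in q-ary symbols, given the dimensions l_1, ..., l_k."""
    k = len(level_dims)
    return sum(level_dims) / (k * N)


# Two levels (k = 2, so q = 4) with N = 4; the bit patterns are arbitrary examples.
symbols = combine_levels([[0, 1, 1, 0], [1, 1, 0, 0]])
print(symbols, overall_rate([1, 3], 4))
```

With dimensions $\ell_1 = 1$ and $\ell_2 = 3$ the component rates are $1/4$ and $3/4$, and their average gives the overall rate $1/2$.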
To obtain an information-theoretic perspective of the multilevel code and a corresponding multistage decoding scheme, let $X_i$ be a random variable representing the $i$-th coordinate, for $i = 1, \ldots, k$. We now discuss techniques for designing these codes so that the capacity of the MEC is achieved.

First, observe that the random variable $X$ representing the input to the channel has a one-to-one correspondence with the vector $(X_1, \ldots, X_k)$. Thus, the mutual information between the input $X$ and the output (or, received word) $Y$ is
$$I(X;Y) = I(X_1, \ldots, X_k; Y).$$
Using the chain rule of mutual information, we can write the following:
$$I(X_1, \ldots, X_k; Y) = I(X_1; Y) + I(X_2; Y \mid X_1) + \cdots + I(X_k; Y \mid X_1, \ldots, X_{k-1}).$$
This can be interpreted as follows. The decoder first decodes $Y$ using the code $C_1$, and obtains an estimate for $X_1$. Using this knowledge and the received word, the decoder then decodes using $C_2$ to obtain an estimate for $X_2$, etc. In this multistage decoding scheme, as the stages of decoding progress, the channels typically improve due to greater side information. Intuitively, it makes sense to assign the strongest code to the weakest channel (i.e., the channel with the largest erasure probability). The code $C_1$ will have the lowest code rate $R_1$, and in general, the codes will be organized so that $R_1 \le R_2 \le \cdots \le R_k$, where $R_i$ is the rate of code $C_i$. (However, note that it is not always the case that $I(X_i; Y \mid X_1, \ldots, X_{i-1}) \ge I(X_j; Y \mid X_1, \ldots, X_{j-1})$ for $i \le j$, as these conditional mutual informations depend on the choice of the signal set in the channel.) To achieve capacity via multistage decoding, the code rates should be chosen so that $R_i = I(X_i; Y \mid X_1, \ldots, X_{i-1})$ for each $i = 1, \ldots, k$ [14,15].

Remark 2. Density evolution may be performed on each component code of the multilevel scheme on the MEC, assuming multistage decoding.
The decoding complexity of the multilevel scheme is much smaller than that of the $q$-ary LDPC scheme. Typically, for a blocklength $N$ $q$-ary LDPC code with average column weight $d$, the decoding complexity of an efficient message-passing implementation is on the order of $O(Ndq^2)$ [9]. The decoding complexity of a binary LDPC code of blocklength $N$ and average column degree $d$ is on the order of $O(Nd)$. The multilevel scheme with $k$ binary LDPC component codes requires $k$ binary decoders. Thus, the overall decoding complexity for such a scheme is $O(kNd)$, as opposed to a single $q$-ary LDPC coding scheme that requires $O(Nd2^{2k})$ operations per decoding iteration. Furthermore, in terms of practical implementation, if the same type of error-correction code (e.g., LDPC or BCH) is used for each component code of the $k$ parallel channels, then the same hardware may be used for each level and the hardware may be programmed with parameters appropriate for each code/level.

5. Channel decomposition of the MEC and QPEC. In this section we consider multistage decoding of multilevel codes designed for the MEC and QPEC. We examine how the subchannels break down and determine cases when the subchannels are simple erasure channels.

5.1. Multistage decoding on the MEC. We now calculate the effective erasure probabilities $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_k$ for each subchannel assuming an unconstrained multilevel erasure channel $C$ with erasure probability $\varepsilon$ and bit erasure probabilities $\gamma_1, \gamma_2, \ldots, \gamma_k$.

Theorem 5.1. For the MEC over GF($2^k$) with erasure probability $\varepsilon$ and bit erasure probabilities $\gamma_1, \gamma_2, \ldots, \gamma_k$ for $X_1, \ldots, X_k$, respectively,
$$I(X_i; Y) = I(X_i; Y \mid X_1, \ldots, X_{i-1}) = 1 - \gamma_i$$
for $i = 1, 2, \ldots, k$ and where $X_0 = \{\}$ (so that the conditioning is empty when $i = 1$). That is, the MEC can be decomposed into $k$ independent binary erasure channels (BECs), each with erasure probability $\gamma_i$, and the overall capacity of the MEC is
$$\sum_{i=1}^{k} (1 - \gamma_i).$$

Proof. We show that the channels representing $I(X_i; Y)$ and $I(X_i; Y \mid X_1, \ldots, X_{i-1})$ each have transition probability $\gamma_i$. To see this, consider sending a $2^k$-ary symbol across the channel. The probability of receiving an output with uncertainty in the $i$-th bit is then
$$\gamma_i \sum_{B \subseteq [k] \setminus \{i\}} \ \prod_{j \in B} \gamma_j \prod_{j \in [k] \setminus (B \cup \{i\})} (1 - \gamma_j) = \gamma_i,$$
where the sum is over all subsets that do not include the $i$-th bit. Now suppose that a $2^k$-ary symbol is sent across the channel and bits $1, \ldots, i-1$ are known. The probability of receiving an output with uncertainty in the $i$-th bit is then
$$\gamma_i \sum_{B \subseteq \{i+1, \ldots, k\}} \ \prod_{j \in B} \gamma_j \prod_{j \in \{i+1, \ldots, k\} \setminus B} (1 - \gamma_j) = \gamma_i,$$
where the sum is over all subsets excluding the first $i$ positions. Therefore $I(X_i; Y) = I(X_i; Y \mid X_1, \ldots, X_{i-1}) = 1 - \gamma_i$.

Note that in the special case that $\gamma$ is the same for each of the $k$ bits, the same code may be applied to each bit or subchannel. We now use Theorem 5.1 to prove Theorem 3.2.
Proof. Theorem 5.1 shows that $I(X;Y) = \sum_{i=1}^{k} (1 - \gamma_i)$ bits/channel-use can be achieved using multilevel codes on the MEC. In the case of the constrained MEC, $\gamma_i = \gamma$ for all $i$, so $I(X;Y) = k(1-\gamma) = k - k\gamma$. We now show that for any input distribution $\{p_x\}$ of $X$ where $p_x = \Pr(X = x)$, $I(X;Y) \le k - k\gamma$ in bits/channel use.

Write the MEC output $Y$ as a vector $(b_1, \ldots, b_k)$, where $b_i$ takes the value of the $i$-th bit if it is known, and $b_i = {?}$ otherwise. For example, the output set $\{00, 01\}$ will be written as $(0, ?)$. Next, define a random vector $E = (a_1, \ldots, a_k)$ corresponding to $Y$ as follows. Let $a_i = 1$ (resp., $a_i = 0$) if the $i$-th position of the bit representation of $Y$ is erased (resp., not erased). Thus $E$ may be regarded as a random variable representing an outcome of the MEC. Moreover, $H(Y) = H(Y, E) = H(E) + H(Y \mid E)$ by the chain rule of entropy, where the first equality follows from the fact that $E$ is a function of $Y$.

Observe that $\Pr[E = (a_1, \ldots, a_k) \text{ and } a_j = 1 \text{ for exactly } i \text{ values of } j] = \gamma^i(1-\gamma)^{k-i}$. Hence
$$H(E) = -\sum_{i=0}^{k} \binom{k}{i} \gamma^i(1-\gamma)^{k-i} \log_2\!\big(\gamma^i(1-\gamma)^{k-i}\big),$$
which can be shown to equal $H(Y \mid X)$. Consequently, $I(X;Y) = H(Y) - H(Y \mid X) = H(Y \mid E)$. Write $y = (b_1, \ldots, b_k)$ where $b_i$ is 0, 1, or erased. Note that $\Pr(Y = y \mid E = a) = 0$ if the positions in $y$ that are erased are not identical to the set $\{i : a_i = 1\}$.

We have
$$H(Y \mid E) = \sum_{j=0}^{k} \sum_{a_j} \Pr(E = a_j)\, H(Y \mid E = a_j),$$
where $a_j$ denotes a vector in GF(2)$^k$ with weight $j$ for $j = 0, \ldots, k$, and the inner sum ranges over all $\binom{k}{j}$ such vectors.

Observe that each $H(Y \mid E = a_j)$ is at most $k - j$ since $j$ components of $Y$ are erased and there are $2^{k-j}$ possible values of $Y$ conditioned on $E = a_j$. So the maximum entropy for $H(Y \mid E)$ is
$$\sum_{j=0}^{k} \binom{k}{j} \gamma^j(1-\gamma)^{k-j}(k-j) = k - k\gamma,$$
where the last equality is from the binomial identity. This proves that $I(X;Y) \le k - k\gamma$. Since we have already shown that the mutual information $I(X;Y) = k - k\gamma$ is achievable using multilevel codes and multistage decoding, the capacity of the MEC is indeed $k - k\gamma$ bits/channel-use (or, $(1-\varepsilon)^{1/k}$ symbols/channel-use).
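The binomial identity used in the last step says that the expected number of unerased bits is $k(1-\gamma)$ when the number of erased bits is binomially distributed with parameters $k$ and $\gamma$. A quick numerical check, offered only as an illustrative sketch with a hypothetical function name:

```python
import math


def unerased_bits_expectation(k, gamma):
    """sum_j C(k, j) * gamma^j * (1 - gamma)^(k - j) * (k - j), the bound on H(Y|E)."""
    return sum(
        math.comb(k, j) * gamma**j * (1 - gamma)**(k - j) * (k - j)
        for j in range(k + 1)
    )


for k, gamma in [(1, 0.5), (2, 0.25), (4, 0.3), (8, 0.05)]:
    print(k, gamma, unerased_bits_expectation(k, gamma), k - k * gamma)
```

Each row should show the sum and $k - k\gamma$ agreeing up to floating-point error.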
Example 3. Recall the 4-ary MEC from Figure 2. By using the transition probabilities in Figure 2 and assuming equally likely inputs, the effective channel for $X_1$ is a binary erasure channel with erasure probability $\gamma$, shown in Figure 3. To see this, observe that if $X_1 = 0$, the outputs $\{00, 10\}$, $\{00, 01, 10, 11\}$, and $\{01, 11\}$ from input symbols $00$ and $01$ lead to uncertainty in the value of $X_1$ at the output. The probability of receiving these outputs is $2\gamma(1-\gamma) + 2\gamma^2 = 2\gamma$. Conditioning on $X_1 = 0$, the erasure probability is $\frac{1}{2}(2\gamma) = \gamma$. Without any side knowledge of $X_1$, the variable $X_2$ would see an effective channel that is a BEC with erasure probability $\gamma$.

In fact, the effective erasure probability for the BEC for $X_2$ conditioned on the value of $X_1$ is also $\gamma$, so $X_2$ can be decoded independently of $X_1$. The subchannel for $X_2$ assuming that $X_1 = 0$ is shown in Figure 5.
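The two sums over erasure patterns that arise in the decomposition of the MEC can be evaluated directly. The sketch below assumes independent bit erasures with probabilities $\gamma_1, \ldots, \gamma_k$; the function name and the `prefix_known` flag are hypothetical.

```python
from itertools import product


def bit_erasure_probability(gammas, i, prefix_known=False):
    """Probability that bit i (1-indexed) is erased on the unconstrained MEC.

    With prefix_known=True, bits 1..i-1 are treated as already decoded, so only
    erasure patterns on positions i..k are enumerated.
    """
    start = i - 1 if prefix_known else 0
    tail = gammas[start:]
    target = i - 1 - start                       # index of bit i inside `tail`
    total = 0.0
    for pattern in product((0, 1), repeat=len(tail)):   # 1 marks an erased bit
        if pattern[target]:
            p = 1.0
            for g, erased in zip(tail, pattern):
                p *= g if erased else 1 - g
            total += p
    return total


gammas = (0.1, 0.2, 0.3)
for i in (1, 2, 3):
    print(i, bit_erasure_probability(gammas, i),
          bit_erasure_probability(gammas, i, prefix_known=True))
```

Both columns should reproduce $\gamma_i$, which is why each bit level can be decoded with a code designed for a BEC with that erasure probability.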

5.2. Multistage decoding on the QPEC. While it is perhaps not surprising in hindsight that the MEC can be decomposed into independent binary erasure channels and decoded bitwise in parallel, we show in this subsection that the QPEC over GF($2^k$) may also be decomposed into subchannels and decoded via multistage decoding. When $M = 2$, these subchannels are binary erasure channels, and capacity may be achieved by optimizing the component codes with respect to each channel's capacity. While the subchannels are not independent, it is notable that codes over the $2^k$-ary QPEC with $M = 2$ may be decoded using standard efficient decoders over the BECs, as an alternative to $q$-ary decoding over the QPEC. For other values of $M$ and $q$, the QPEC decomposes into simpler channels but they are not necessarily $p$-ary erasure channels or BECs, for $q = p^k$ or $q = 2^k$, respectively.

We first calculate the effective erasure probabilities $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_k$ for each subchannel assuming a $2^k$-ary partial erasure channel $C$ with erasure probability $\varepsilon$ for different values of $k$ and $M = 2$ when multistage decoding is performed.
Example 4. Consider the QPEC with partial erasure probability $\varepsilon$, $q = 4$, and $M = 2$. Recall the 4-ary QPEC from Figure 1. The mutual information $I(X;Y)$ for this channel can be written as $I(X_1, X_2; Y) = I(X_1; Y) + I(X_2; Y \mid X_1)$. By using the transition probabilities in Figure 1 and assuming equally likely inputs, the effective channel for $X_1$ is a binary erasure channel with erasure probability $\frac{2}{3}\varepsilon$, shown in Figure 6. To see this, observe that if $X_1 = 0$, the outputs $\{00, 10\}$, $\{01, 10\}$, $\{01, 11\}$ and $\{00, 11\}$ from input symbols $00$ and $01$ lead to uncertainty in the value of $X_1$ at the output. The probability of receiving these outputs is $4\left(\frac{\varepsilon}{3}\right)$. Conditioning on $X_1 = 0$, the erasure probability is $\varepsilon_1 = \frac{1}{2}\left(4\left(\frac{\varepsilon}{3}\right)\right) = \frac{2}{3}\varepsilon$.

Now suppose without loss of generality that the bit $X_1$ is known to equal 0. Figure 7 shows the 4-ary subchannel seen by bit $X_2$. We can observe that $X_2$ conditioned on $X_1$ is a BEC with probability of erasure derived using the outputs in Figure 7 that create an uncertainty on the value of $X_2$. Assuming $X_2 = 0$ for example, only the output $\{00, 01\}$ leads to uncertainty. The probability this output occurs when $X_2 = 0$ is just $\frac{\varepsilon}{3}$. Thus, the effective erasure probability for the binary erasure channel seen by $X_2$ conditioned on $X_1$ is $\frac{\varepsilon}{3}$. Observe that the sum of the mutual information for the subchannels equals the capacity of the QPEC. In general, we have the following result.

Theorem 5.2. For the QPEC with erasure probability $\varepsilon$, $q = 2^k$ and $M = 2$, the mutual information for the subchannels with multistage decoding is given by
$$I(X_i; Y \mid X_1, \ldots, X_{i-1}) = 1 - \varepsilon_i, \qquad \varepsilon_i = \frac{2^{k-i}}{2^k - 1}\,\varepsilon, \qquad i = 1, \ldots, k.$$
Thus, each subchannel is a binary erasure channel with erasure probability $\varepsilon_i$. Moreover, multistage decoding of binary codes optimized for each subchannel achieves the capacity on the QPEC.
Proof. Note that for the QPEC with $q = 2^k$ and $M = 2$, the probability of any specified partial erasure set is $\frac{\varepsilon}{2^k - 1}$. For any given input symbol, there are a total of $2^{k-i}$ possible erasure sets with bits $1, \ldots, i-1$ known and uncertainty in the $i$-th bit. To see this, note that once $i - 1$ bits are fixed, there are $k - 1 - (i - 1) = k - i$ bits other than the (erased) $i$-th bit for which to choose values.

The probability of receiving an output with uncertainty in the $i$-th bit when bits $1, \ldots, i-1$ are known is thus
$$\varepsilon_i = 2^{k-i} \cdot \frac{\varepsilon}{2^k - 1}.$$
To see that capacity on the QPEC is achieved, note that
$$\sum_{i=1}^{k} (1 - \varepsilon_i) = k - \frac{\varepsilon}{2^k - 1} \sum_{i=1}^{k} 2^{k-i} = k - \varepsilon = k\left(1 - \varepsilon \log_{2^k} 2\right).$$

The decomposition of the QPEC into simple erasure subchannels does not happen for general $p$, $M > 2$. However, the next theorem shows that when $q = p^k$, the $k$-th subchannel does have a simple mutual information expression. When $p = 2$, this coincides with the capacity of a BEC with transition probability in terms of $\varepsilon$.
Theorem 5.3. For the QPEC with erasure probability $\varepsilon$, $q = p^k$ and any $M$, the last ($k$-th) subchannel has mutual information

A proof of Theorem 5.3 is given in the appendix.
Corollary 1. For the QPEC with erasure probability $\varepsilon$, $q = 2^k$ and $M > 2$, multistage decoding may not decompose entirely into simple BECs. However, the last ($k$-th) subchannel is a simple BEC with transition probability
$$\varepsilon_k = \frac{M - 1}{2^k - 1}\,\varepsilon.$$

Example 5. For the QPEC with erasure probability $\varepsilon$, $q = 2^2$ and $M = 3$,
$$I(X_1; Y) = 1 - \varepsilon\left(\log_2 3 - \tfrac{2}{3}\right), \qquad I(X_2; Y \mid X_1) = 1 - \tfrac{2}{3}\varepsilon.$$
Observe that the sum is $2 - \varepsilon \log_2 3$ bits per channel use. This is equal to $1 - \varepsilon \log_4 3 = 1 - \varepsilon \log_q M$ symbols per channel use, which is the capacity of the QPEC. Thus, if binary codes are used for $X_1$ and $X_2$ that have these rates, the capacity of the QPEC is achieved.
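The subchannel mutual informations of the QPEC can be checked by brute force for small $k$ and $M$. The sketch below assumes a uniform input and symbols written as bit tuples $(x_1, \ldots, x_k)$; the function names are hypothetical and the enumeration is exponential in $k$, so it is meant only for small cases.

```python
import math
from itertools import combinations


def qpec_joint(k, M, eps):
    """Joint distribution {(x, y): prob} for the 2^k-ary QPEC with a uniform input."""
    q = 2**k
    symbols = [tuple((s >> (k - 1 - i)) & 1 for i in range(k)) for s in range(q)]
    n_sets = math.comb(q - 1, M - 1)             # erasure sets available to each symbol
    joint = {}
    for x in symbols:
        joint[(x, frozenset({x}))] = (1 - eps) / q
        others = [s for s in symbols if s != x]
        for extra in combinations(others, M - 1):
            joint[(x, frozenset((x,) + extra))] = eps / (q * n_sets)
    return joint


def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, key):
    out = {}
    for (x, y), p in joint.items():
        out[key(x, y)] = out.get(key(x, y), 0.0) + p
    return out


def subchannel_information(joint, i):
    """I(X_i; Y | X_1, ..., X_{i-1}) in bits, with i counted from 1."""
    h = lambda key: entropy(marginal(joint, key))
    # I(A;Y|B) = H(A,B) - H(B) - H(A,B,Y) + H(B,Y), with A = X_i and B = (X_1..X_{i-1}).
    return (h(lambda x, y: x[:i]) - h(lambda x, y: x[:i - 1])
            - h(lambda x, y: (x[:i], y)) + h(lambda x, y: (x[:i - 1], y)))


eps = 0.3
for M in (2, 3):
    joint = qpec_joint(2, M, eps)
    print(M, [round(subchannel_information(joint, i), 6) for i in (1, 2)])
```

For $M = 2$ the printed values can be compared with $1 - \frac{2}{3}\varepsilon$ and $1 - \frac{1}{3}\varepsilon$ from Example 4, and for $M = 3$ their sum can be compared with $2 - \varepsilon\log_2 3$.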
When $q$ is a perfect square, the mutual information of the subchannels may also be calculated in terms of $\varepsilon$ and $p$, as Theorem 5.4 demonstrates. Moreover, one may observe that the subchannels are not simple erasure channels.

Proof. Observe that when $k = 1$, the channel is simply the Binary Erasure Channel (BEC) with $q = 2$ and capacity $1 - \varepsilon$, as expected.

Let $P(\cdot)$ be a capacity-achieving input distribution for the constrained MEC. We show the proof for $k = 2$.

For each input $x_i \in$ GF(4), let $Y_i$ be the set of possible outputs. Then
$$I(x_i; Y) = \sum_{y \in Y_i} P(y \mid x_i) \log \frac{P(y \mid x_i)}{P(y)},$$
and the sum is similar for each $x_i \in$ GF(4). By the Karush-Kuhn-Tucker (KKT) conditions [18], a capacity-achieving input distribution for a channel with capacity $C$ must satisfy $I(x; Y) = C$ for all $x$ with $\Pr(x) > 0$. If each $\sum_{y \in Y_i} P(y \mid x_i) \log P(y)$ is equal, then we have the following system of equations. Assuming each input symbol occurs with positive probability, we have verified through a numerical search that the uniform input distribution is the unique distribution satisfying the system of equations. Therefore for $k = 2$, $\mathrm{cap}(\mathrm{MEC}) = k \log_{2^k}(2)\,(1-\varepsilon)^{1/k}$ as claimed.

For $q = 2^k$, it can be shown that with the uniform input distribution, $I(x_i; Y) = I(x_j; Y)$ for each $x_i, x_j \in$ GF($2^k$).

Proof. Define the symbol $?_0$ to be the set of all $M$-sets containing the all-zeros vector.

Figure 2. An example of the MEC for $q = 2^2$.

Figure 3. $I(X;Y)$ as a function of $\varepsilon$, given a uniform input distribution.