IMPROVED ATTACKS ON KNAPSACK PROBLEM WITH THEIR VARIANTS AND A KNAPSACK TYPE ID-SCHEME

In the present study we consider two variants of the Schnorr-Shevchenko method (SS) for solving hard knapsack problems, which are on average faster than the SS method. Furthermore, we study the compact knapsack problem, i.e. the solution space is not $\{0, 1\}$ as in the knapsack problem but some larger set, and we present an algorithm to attack this problem. Finally, we provide a three-move sound id-scheme based on the compact knapsack problem.


Introduction-statement of results
In the present study we consider the subset sum or knapsack problem. Given a list of $n$ positive integers $\{a_1, \ldots, a_n\}$ and an integer $s$, find a binary vector $x$ with $\sum_{i=1}^{n} x_i a_i = s$, if such a binary vector exists. We define the density of the knapsack to be $d = n / \log_2 \max_i a_i$. The decision version of the problem is known to be NP-complete [13]. The hard knapsack problems, as we shall see, have density close to 1. In this case, with high probability, there is only one solution, e.g. [16,17]. For low density knapsack problems, $d < 0.94$, we can apply [5]. If $d > 1$, then the knapsack problem typically has several solutions. There are some cryptographic schemes with $d > 1$ [33], but there are attacks against them [18,34]. In our study we are interested in knapsacks with density close to one. We shall optimize the algorithm of Schnorr-Shevchenko (SS) [30] and apply it to solve knapsack problems efficiently. We further address the problem of finding integer solutions of a linear Diophantine equation, where the solutions lie in specific intervals (this problem is called the compact knapsack problem). The solution to this problem depends heavily on the choice of the coefficients and on the choice of the solution space.
Finally, we provide a three move id-scheme based on the compact knapsack problem. We shall prove that this system is sound, which is the minimal notion of security for id-schemes.
Roadmap. The paper is organized as follows. In the next section we present some preliminaries on lattices. In section 3 we present relevant previous work. Section 4 presents two variants of the SS method together with some experimental results. In section 5 we address the problem of finding constrained solutions of linear Diophantine equations. Section 6 is dedicated to the construction of an id-scheme based on the compact knapsack problem, and we use the results of section 5 to suggest a possibly secure selection of the parameters of our id-scheme. Finally, in the last section we provide some concluding remarks.

Background on lattices
See [10,22] for a recent account on lattices. All the bases have the same number of elements, and this common number is called the rank of the lattice. Also, we call the number $n$ the dimension of the lattice.
The most famous problem in lattices is the Shortest Vector Problem (SVP), which is the task of finding a shortest vector in $L(B)$. This problem is proved to be NP-hard under randomized reductions [3]. Also, $\mathrm{SVP}_\gamma$ is the approximation version of SVP with factor $\gamma$. That is, we are looking for lattice vectors $x \neq 0$ with $\|x\| < \gamma(n)\|y\|$ for every $y \in L(B) \setminus \{0\}$. A famous conjecture in lattices is the following (see [23, Section 4, p. 87] and [22, Conjecture 1]).
A similar problem is the Closest Vector Problem (CVP). In this case given a target vector t ∈ span(B) =

Previous work
Schroeppel-Shamir Algorithm [31]. This algorithm was the best for solving hard knapsacks until 2009, with time complexity $\tilde{O}(2^{n/2})$ and memory requirement $\tilde{O}(2^{n/4})$. For simplicity assume that $n$ is divisible by 4.
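First, the set $S = \{\sum_{i=1}^{n/2} a_i x_i : x_i \in \{0, 1\}\}$ with all possible values of this sum is considered. We decompose the sum $s = \sum_{i=1}^{n} a_i x_i$ as $s = \sigma_1 + \sigma_2 + \sigma_3 + \sigma_4$, where each $\sigma_i$ is a smaller knapsack of $n/4$ elements (the $i$-th quarter of the indices). We construct the following lists for a value $\sigma \in S$ which is chosen appropriately:
• $\Sigma_2$: all possible $2^{n/4}$ values of $\sigma_2$. We find these values and then we sort the list. For example, if $n = 40$ we must compute a list of $2^{10}$ elements, where each vector $x$ of the list contains the binary digits of a number in $[0, 2^{10} - 1]$.
• $\Sigma_1$: all possible values of $\sigma_1$ such that $\sigma_1 + \sigma_2 \equiv \sigma \pmod{2^{n/4}}$ for some $\sigma_2 \in \Sigma_2$.
• $\Sigma_4$: all possible $2^{n/4}$ values of $\sigma_4$ (we sort this list as before).
• $\Sigma_3$: all possible values of $\sigma_3$ such that $\sigma_3 + \sigma_4 \equiv s - \sigma \pmod{2^{n/4}}$ for some $\sigma_4 \in \Sigma_4$.
Thus we get $\sigma_{12} = \sigma_1 + \sigma_2 \equiv \sigma \pmod{2^{n/4}}$ and $\sigma_{34} = \sigma_3 + \sigma_4 \equiv s - \sigma \pmod{2^{n/4}}$. We search for a collision between the lists $\{\sigma_{12}\}$ and $\{s - \sigma_{34}\}$. For the right choice of $\sigma$ we find elements such that $\sigma_{12} + \sigma_{34} = s$, and thereby we have solved the initial knapsack problem. The two lists $\{\sigma_{12}\}$ and $\{\sigma_{34}\}$ require $\tilde{O}(2^{n/4})$ time, as does the collision search after sorting. Besides, we must find the right value of the $n/4$-bit number $\sigma$, for which there are $2^{n/4}$ candidates. The total running time is therefore $\tilde{O}(2^{n/2})$ and the memory requirement is $\tilde{O}(2^{n/4})$.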
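The four-list search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original implementation: it uses hash buckets instead of sorted lists, takes the four quarters in index order, and simply tries every residue $\sigma$ modulo $2^{n/4}$. All function names are hypothetical.

```python
from itertools import product


def quarter_sums(weights):
    """Return all (subset sum, bit tuple) pairs of one quarter of the knapsack."""
    return [(sum(a * b for a, b in zip(weights, bits)), bits)
            for bits in product((0, 1), repeat=len(weights))]


def join_mod(left, right, residue, modulus):
    """Map l + r -> bits for all pairs with l + r = residue (mod modulus)."""
    buckets = {}
    for value, bits in right:
        buckets.setdefault(value % modulus, []).append((value, bits))
    joined = {}
    for value, bits in left:
        need = (residue - value) % modulus
        for rvalue, rbits in buckets.get(need, ()):
            joined.setdefault(value + rvalue, bits + rbits)
    return joined


def schroeppel_shamir_sketch(weights, s):
    """Return a 0/1 list x with sum(a_i * x_i) == s, or None if none exists."""
    n = len(weights)
    assert n % 4 == 0, "the sketch assumes n divisible by 4"
    q = n // 4
    parts = [quarter_sums(weights[i * q:(i + 1) * q]) for i in range(4)]
    modulus = 2 ** q
    for sigma in range(modulus):  # guess sigma_12 mod 2^(n/4)
        s12 = join_mod(parts[0], parts[1], sigma, modulus)
        s34 = join_mod(parts[2], parts[3], (s - sigma) % modulus, modulus)
        for v12, bits12 in s12.items():  # collision: sigma_12 = s - sigma_34
            bits34 = s34.get(s - v12)
            if bits34 is not None:
                return list(bits12 + bits34)
    return None
```

For each guess of $\sigma$ the two joined lists have expected size about $2^{n/4}$, which mirrors the memory bound quoted in the text; the outer loop over $\sigma$ accounts for the remaining factor in the running time.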
The Howgrave-Graham-Joux Algorithm. In 2010, Howgrave-Graham and Joux [15] improved the Schroeppel-Shamir algorithm. They claimed that the new algorithm can solve hard knapsack problems with heuristic running time $\tilde{O}(2^{0.3113n})$ and memory $\tilde{O}(2^{0.256n})$. However, this running time was shown to be incorrect by May and Meurer, who estimated that the running time of this improved algorithm is $\tilde{O}(2^{0.337n})$ [4].
This improvement relies on additional degrees of freedom. Here, the two lists we construct can overlap, which was not allowed in the previous algorithm. We consider the knapsack $s = \sum_{i=1}^{n} a_i x_i$ and for simplicity assume that $n$ is divisible by 4 and $\sum_{i=1}^{n} x_i = n/2$. There exist pairs $(\sigma_1, \sigma_2)$, each coming from a vector of Hamming weight $n/4$, such that $\sigma_1 + \sigma_2 = s$. This decomposition is not unique and we define these pairs as follows: $\sigma_1 = \sum_{i=1}^{n} a_i y_i$ and $\sigma_2 = \sum_{i=1}^{n} a_i z_i$, where $y, z \in \{0,1\}^n$ have Hamming weight $n/4$. After choosing a value $M \approx 2^{n/2}$ and a random $R$ modulo $M$ we construct two lists such that $\sum_{i=1}^{n} a_i y_i \equiv R \pmod{M}$ and $\sum_{i=1}^{n} a_i z_i \equiv s - R \pmod{M}$, and then, using all possible vectors $y$ and $z$, we search for collisions in order to find a solution of the initial knapsack. Since the Hamming weight is $n/4$ for each sub-knapsack, the number of solutions we expect to find is $k = \binom{n}{n/4} / M$. The required time for this procedure is $O(k)$.
New improvement by Becker, Coron and Joux. Another improvement of the previous algorithm, presented at Eurocrypt 2011 [4], reduces the (heuristic) running time to $\tilde{O}(2^{0.291n})$. The basic idea of the Becker-Coron-Joux algorithm is to add a few more degrees of freedom: the solutions of the two sub-knapsacks have coefficients from $\{-1, 0, 1\}$. As a result, there are more representations of the solution of the original knapsack.
3.1. The method of Schnorr-Shevchenko. In this subsection we present the Schnorr-Shevchenko method for knapsack problems with density close to 1. In practice, this method is clearly faster than the one presented by Becker, Coron and Joux. There is some theoretical evidence for this method, which we provide after presenting it.
Here BKZ-reduction is applied to a specific basis in order to find the solution. We use the lattice $L(B)$ generated by the rows of the matrix $B$.
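The basis of the lattice $L(B)$ consists of these $n + 1$ row vectors, with $n + 3$ entries each. In the following examples we take $n = 80$ and the $(a_i)_i$ are random integers in the set $A = (1, 2^n + 1] \cap \mathbb{Z}$. This choice provides densities very close to 1, since the denominator of the fraction $d = n / \log_2 \max_i a_i$ will be close to $n$. To see this, we write the set $A$ as $A = \{2, 3, \ldots, 2^n + 1\} = \{2, \ldots, 2^{n-1} + 1\} \cup \{2^{n-1} + 2, \ldots, 2^n + 1\}$. The two subsets have the same size, so each $a_i$ lies in the second one with probability $1/2$; hence, except with probability $2^{-n}$, we have $\max_i a_i > 2^{n-1}$ and therefore $n - 1 < \log_2 \max_i a_i \le \log_2(2^n + 1)$.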
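The density claim can be checked numerically with a short Python sketch. The helper names below are hypothetical; the sampling follows the description above (weights uniform in $\{2, \ldots, 2^n + 1\}$, solution of Hamming weight $n/2$ by default), and the optional `rng` argument is only there to make runs reproducible.

```python
import math
import random


def knapsack_density(weights):
    """Density d = n / log2(max_i a_i) of a knapsack instance."""
    return len(weights) / math.log2(max(weights))


def random_hard_knapsack(n, hamming_weight=None, rng=None):
    """Sample weights in {2, ..., 2^n + 1} and a solution of given weight.

    Returns (weights, x, s) with s = sum(a_i * x_i).
    """
    rng = rng or random.Random()
    h = n // 2 if hamming_weight is None else hamming_weight
    weights = [rng.randint(2, 2 ** n + 1) for _ in range(n)]
    support = set(rng.sample(range(n), h))
    x = [1 if i in support else 0 for i in range(n)]
    s = sum(a * bit for a, bit in zip(weights, x))
    return weights, x, s
```

For $n = 80$ the computed density is expected to fall just above 1, because $\log_2 \max_i a_i$ lies between $n - 1$ and $n$ for all but a negligible fraction of instances.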
Then we can recover the solution $x = (x_1, \ldots, x_n)$ from a suitable vector of $L(B)$, with the property $\sum_{i=1}^{n} x_i = n/2$ (here we consider totally balanced solutions, i.e. with Hamming weight $H = n/2$). Conversely, every solution can be written in this way. The integer $n/2$ (assume that $n$ is even) can be replaced by any integer $H \in \{1, \ldots, n - 1\}$. More precisely, the algorithm works as follows:
• In the first 5 steps we apply BKZ-reduction to the basis $B$ without pruning and with blocksizes $2^k$ for $k = 1, 2, 3, 4, 5$. Before the reduction we permute the rows of the matrix so that the first rows are those with a nonzero element in column $n + 2$, and the other rows are sorted in ascending order of their Euclidean norm as row vectors. Every step takes as input the matrix $B$ from the previous iteration. If the solution is found, the algorithm stops.
• In case the algorithm did not find the solution in the first 5 steps, BKZ-reduce the basis independently with blocksizes $bs = 30, 31, 32, \ldots, 60$. The pruning parameter is 10 for $bs = 30$, 11 for $bs = 31$, 12 for $bs = 32$, then 10 again for $bs = 33$, and so on. Always terminate if the solution has been found.
We provide the following auxiliary Lemma. From now on, we replace $n/2$ by an integer $H \in \{1, 2, \ldots, n - 1\}$.
The converse is also true.
Proof. Every vector $b \in L(B)$ can be written as $b = \sum_{j=1}^{n+1} \lambda_j B_j$ for some integers $\lambda_j$ ($j = 1, 2, \ldots, n + 1$), where $B_j$ denotes the $j$-th row vector of the matrix $B$ in (2). The entries $b_i$ of $b$ satisfy the conditions of the Lemma; in particular, $\sum_{i=1}^{n} \lambda_i a_i = -\lambda_{n+1} s$. To simplify the exposition, assume that $\lambda_{n+1} = 1$; the other option leads to similar results. If $b_i = 1$, then $\lambda_i = 0$. Otherwise, $\lambda_i = -1$. Therefore $\lambda_i \le 0$, and the solution is revealed by $x_i = -\lambda_i$, since then $\sum_{i=1}^{n} a_i x_i = s$. As a result, the $\lambda_i$'s are 0 or $-1$. Conversely, a binary solution $x$ with $\sum_{i=1}^{n} x_i = H$ corresponds to some $b \in L(B)$ that satisfies the relations (3).
So the converse is also true.
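The two-stage reduction schedule of subsection 3.1 and the row permutation applied before each reduction can be written down as a small Python sketch. No lattice library is called here; the functions only produce the list of (blocksize, pruning) pairs and the row order. The function names are hypothetical, and the period-3 cycle of the pruning parameter is an assumption read off from the "and so on" in the description.

```python
def ss_round_schedule():
    """Return the (blocksize, pruning) pairs of the two stages, in order.

    A pruning value of None stands for "no pruning" (first stage).
    """
    first_stage = [(2 ** k, None) for k in range(1, 6)]  # 2, 4, 8, 16, 32
    # Assumed cycle 10, 11, 12, 10, 11, 12, ... for bs = 30, ..., 60.
    second_stage = [(bs, 10 + (bs - 30) % 3) for bs in range(30, 61)]
    return first_stage + second_stage


def permute_rows(rows, n):
    """Order the basis rows as described before each reduction.

    Rows with a nonzero entry in column n + 2 (1-based) come first; the
    remaining rows are sorted by increasing Euclidean norm.
    """
    col = n + 1  # 0-based index of column n + 2
    front = [r for r in rows if r[col] != 0]
    rest = sorted((r for r in rows if r[col] == 0),
                  key=lambda r: sum(c * c for c in r))
    return front + rest
```

In an actual run each pair of the schedule would be handed to a BKZ implementation, stopping as soon as a reduced basis vector reveals the solution.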

3.2. Theoretical evidence of the method. First we provide the following Theorem.
Theorem 3.2. [17] Assume that there is an SVP-oracle and the knapsack problem has a solution. Then with high probability we can solve all knapsack problems with density < 0.6463.
By an SVP-oracle we mean a probabilistic polynomial-time algorithm which, given a lattice $L$, returns a shortest vector of $L$ with high probability. Unfortunately, in practice we do not have SVP-oracles (at least for lattices of large rank). Experiments by Nguyen and Gama [12] suggest that LLL behaves as an SVP-oracle for dimensions $\le 35$, and the BKZ-20 algorithm [29] (i.e. BKZ with blocksize 20) for dimensions $\le 60$. Furthermore, two simpler proofs of the previous Theorem were given in [5,9]. Also, in [27] the algorithm was tested experimentally, providing some improvements.
The previous Theorem was improved by Coster et al. [5], who raised the density bound to 0.9408. Their approach is in the same spirit as [17], so the assumption of the existence of an SVP-oracle remains (this assumption is a serious drawback of these methods when we try to apply them in practice). They applied the LLL reduction algorithm to the lattice generated by the rows of a matrix that depends on a positive integer $N > \sqrt{n}/2$. The SS-method uses a similar matrix $B$. In fact, experimentally we find that the SS-method provides a very good approximation of an SVP-oracle for lattices of dimension $\approx 80$.

A new strategy for hard knapsack problems
We consider two variants of the SS method. In our experiments we used the Python interface of fpLLL [25,26] and in some cases Sagemath [32]. The code we used is available on GitHub (see [8]). The experiments were run on a PC with 8 GB of RAM and a 3.5 GHz i3 CPU.
Our first observation is that the SS method is influenced by the density of the knapsack problem; see Table 1.

Table 1. Relation of the density of random knapsack problems and the average number of rounds until SS-method terminates successfully. Average rounds reported: 15, 7.9, 7.7, 5.5.
Also, after some critical round, say $R$, the SS method is really slow for rounds $\ge R$. This is because the execution of the SS method is dominated by the running time of BKZ, which depends heavily on the blocksize and the pruning. For dimension 80, experimentally this value is $R \approx 18$; see for instance figure 1. Finally, if we reduce the original knapsack problem to some easier problems, i.e. problems of smaller dimension and density, then it would be faster to solve the reduced knapsack problems instead of the initial knapsack.
But there is also a theoretical reason for this. As we already wrote in subsection 3.2, there is strong evidence that lower density knapsack problems, i.e. $d < 0.94$, are easier than hard knapsack problems, i.e. those with density close to 1. So our first variant is supported by the previous theoretical evidence.
We summarize our first variant in two simple steps.
First Variant
1. Execute the SS algorithm until round 5 (the first stage of the SS method).
2. Brute force the $b$ initial bits of the solution and, for each choice of these $b$ bits, reduce the initial knapsack to one of smaller density and dimension. Then apply the SS method until round $R$ (overall rounds: $5 + R$).
This method depends on two parameters: $b$, the number of initial bits to which we apply brute force, and $R$, the maximum round for the second stage of the SS method.
After enough experiments, for dimension n = 80 and 84 we concluded that the pair (b, R) = (4, 11) provides the best results when we compare the variant with the original method.
If the first step fails, then we start a brute force on the four initial bits of the candidate solution. Say we start with $[0, 0, 0, 0]$, i.e. we assume now that the solution is of the form $[0, 0, 0, 0, \ldots]$. Then we consider the reduced knapsack problem $\sum_{j=5}^{n} a_j x_j = s$. This has dimension 76 (if $n = 80$) and density $\approx 0.95$. We run the SS method until the 11th round. If it fails, then we continue to the next quadruple of bits, say $[0, 0, 0, 1]$, for which the reduced knapsack is $\sum_{j=5}^{n} a_j x_j = s - a_4$. We execute SS again, and so on, until the solution is found. We summarize the results in figure 5. To collect the data for this diagram we executed the following experiment: we chose a random knapsack of density 1 with a random solution of Hamming weight $n/2$, and then we ran the SS method and the SS-variant. Finally, we measured the time for each experiment.
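The brute-force stage of the first variant can be illustrated with the following Python sketch. It only shows how the $2^b$ reduced instances are formed and tried in order; the function names are hypothetical, and `brute_force_solver` is a toy stand-in (usable only for tiny dimensions) for running the SS method up to round $R$ on each reduced instance.

```python
from itertools import product


def reduced_knapsacks(weights, s, b):
    """Yield (prefix_bits, reduced_weights, reduced_target) for all 2^b prefixes.

    Fixing x_1..x_b to the prefix leaves sum_{j>b} a_j x_j = s - sum_{j<=b} a_j x_j.
    Prefixes are produced in the order [0,...,0,0], [0,...,0,1], ...
    """
    head, tail = list(weights[:b]), list(weights[b:])
    for prefix in product((0, 1), repeat=b):
        fixed = sum(a * bit for a, bit in zip(head, prefix))
        yield prefix, tail, s - fixed


def brute_force_solver(weights, target):
    """Toy stand-in for the lattice solver: exhaustive search, tiny n only."""
    for bits in product((0, 1), repeat=len(weights)):
        if sum(a * bit for a, bit in zip(weights, bits)) == target:
            return list(bits)
    return None


def first_variant_stage_two(weights, s, b, solve_reduced=brute_force_solver):
    """Try every prefix of b bits and solve the reduced knapsack for each."""
    for prefix, tail, target in reduced_knapsacks(weights, s, b):
        if target < 0:
            continue  # this prefix already exceeds the target sum
        solution = solve_reduced(tail, target)
        if solution is not None:
            return list(prefix) + list(solution)
    return None
```

Because the reduced instances are independent of each other, the loop over prefixes is the natural place to distribute work over several cores or machines.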
Also, note that the distance from round 10 to round 16 is shorter than the distance from round 16 to round 20. This is because after some rounds BKZ is very slow; see figure 3. This method can be easily parallelized, since each reduced knapsack problem corresponding to some possible solution can be run on one CPU core. If we have a cluster with 16 PCs, we dedicate each one to solving one such reduced problem. In this way we can optimize the method.
Second Variant
The second variant is even simpler than the previous one. In the first variant we decreased the numerator of the density $d = n / \log_2 \max_i a_i$. Now we shall increase the denominator. The idea is seemingly very simple and works in practice. We fix a number, say $B_n = 2^{n+b}$, for some positive integer $b$. Also, assume that $x_1 = 1$. If we substitute $a_1 \leftarrow a_1 + B_n$ and $s \leftarrow s + B_n$, then the seemingly new knapsack will have density $d' = n / \log_2(a_1 + B_n) \approx n/(n + b)$. This is because all $a_i$'s for $i \ge 2$ are at most $2^n + 1$, so $a_1 + B_n$ is the maximum. One may suggest solving the new knapsack of dimension $n - 1$ instead, since we know the first bit. This would have negligible impact on the density: the new density would be $(n - 1) / \log_2 \max_{2 \le i \le n} a_i$, which is again very close to 1. In order to choose the most suitable value of $b$ we ran experiments in dimension 80. We found that for $b \in \{5, 6\}$ we get the best results. Note that for $b$ large enough the new density is close to 0.5, so we would expect the variant to be very fast. Unfortunately, for $b \ge 7$ this method failed for the majority of the instances we executed. Furthermore, for $b < 5$ the average times of success were almost the same.
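The substitution of the second variant and its effect on the density can be reproduced with a few lines of Python. This is a sketch with hypothetical helper names; it assumes, as in the text, that the first bit of the solution equals 1, so that adding $B_n = 2^{n+b}$ to both $a_1$ and $s$ preserves the solution.

```python
import math


def density(weights):
    """Density d = n / log2(max_i a_i)."""
    return len(weights) / math.log2(max(weights))


def shift_first_weight(weights, s, b):
    """Apply a_1 <- a_1 + 2^(n+b) and s <- s + 2^(n+b).

    Valid only under the assumption x_1 = 1; returns (new_weights, new_s).
    """
    n = len(weights)
    big = 2 ** (n + b)
    return [weights[0] + big] + list(weights[1:]), s + big
```

For $n = 80$ and $b = 6$ the new density is about $80/86$, in line with the value $\approx 0.93$ reported for this choice of parameters.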
If $n = 80$ and $b = 6$, then we get $d \approx 0.93$. One would expect BKZ to remove $B_n$ from the matrix quickly, so that the situation would again be the same as before the substitution. But it seems that this is not the case. For instance, in figure 4 we compare the two methods, the original and the previous variant (for $b = 6$). The vertical axis is the time. The only drawback of this variant is the assumption that we know the first bit. In other words, this variant succeeds with probability 1/2. There is a small number of instances ($\approx 10\%$) where the original method is faster.
4.1. Final remarks. We provide one more figure, which contains data for both variants. On average the second variant is slightly faster. The drawback of the second variant is that it is probabilistic, i.e. we need to know the first bit, and even then 10% of the random instances fail.
The asymptotic complexity of the SS attack (and its variants) is dominated by the time complexity of BKZ. Furthermore, for the specific lattices, the density of the knapsack contributes to the time complexity. That is, for low density the SS attack is faster than for density close to 1. If we restrict to density close to 1, then the complexity of the SS attack is dominated only by the complexity of BKZ. The asymptotic complexity of BKZ is exponential with respect to the blocksize, see [14, section 3]. So, for large dimensions and density close to 1, the time complexity of the original SS-attack is that of BKZ. On the other hand, in the variants we managed to decrease the density; therefore, besides the running time of BKZ, the density also has a positive impact, and we expect the overall algorithm to run faster. Since the complexity of BKZ is not well understood, we cannot provide a specific formula for the complexity.

4.2. Theoretical evidence of the variants. Knapsack problems with density close to 1 are considered hard, since there is no efficient lattice attack against them. Low density knapsacks, in contrast, were shown to be vulnerable to lattice attacks. In fact, in [16, Proposition 1.2] it was proved that density close to 1 is the hardest case for knapsack problems. So reducing a hard knapsack problem to a small number of knapsack problems of lower density gives an effective way to attack hard knapsack problems. This is the idea behind the two variants, and it is theoretically based on the result of [16] (see also [15, Introduction]). In the first variant this is achieved by solving, in the worst case, $2^4 = 16$ knapsack problems with density close to 0.95 and dimension 76. This has a major impact on the problem, since such knapsack problems are easier than the original one. So we expect to get a solution faster. Furthermore, this variant can be easily parallelized. In the second variant we artificially lower the density by replacing one coefficient with a larger one. The dimension remains the same but the new density is $\approx 0.93$.

Compact knapsack
In this section we study a more general problem, the compact knapsack problem, in which we allow solutions in a specific set.

5.1. A CVP attack to compact knapsack. An approach to attack the compact knapsack is to reduce it to a suitable Closest Vector Problem (CVP). The idea is the following. We denote by $I_\alpha$ the set of integers having $\alpha$ bits and by $t_\alpha$ the integer $2^{\alpha-1} + 2^{\alpha-2}$, which is the midpoint of the interval $[2^{\alpha-1}, 2^{\alpha})$. Let $\sum_{i=1}^{n} a_i x_i = a_0$, and restrict the solutions to a set $S \subset \mathbb{Z}^n$. Let also $y$ be a solution (for instance, obtained with the Euclidean algorithm). Let $L$ be the lattice generated by the integer solutions of $\sum_{i=1}^{n} a_i x_i = 0$ and choose a suitable target vector $t \in \mathrm{span}(L)$ (with $t \in S$). Then we solve the CVP instance $\mathrm{CVP}(L, t, \|\cdot\|_\infty)$; let $b$ be its output. We expect the solution $x = y + jb$ (for some small integer $j$) to be in $S$. In fact, using the previous attack, for dimension $n \le 200$, $R \le 200$, $a \in I_R^n$, and $x \in S = I_R^n$, we always get a solution in $I_R^n$ using the target vector $t = (t_R, \ldots, t_R)$. The situation becomes harder if we consider groups of $x_j$'s having different bit lengths. Assume that $R$ and $n$ are even integers. For instance, if the first $n/2$ entries of $x = (x_j)_j$ have $R$ bits and the other half have $R/2$ bits, we get a solution with (on average) half of its entries in $I_R \cup I_{R/2}$. We used the target vector $t = (t_R, \ldots, t_R, t_{R/2}, \ldots, t_{R/2})$, where the first $n/2$ entries are equal to $t_R$ and the rest to $t_{R/2}$. We provide the pseudocode of this attack.

CVP-attack:
INPUT: $S \subset \mathbb{Z}^n$, $a \in \mathbb{Z}^n_{>0}$, $a_0$ such that the equation $\sum_{i=1}^{n} a_i x_i = a_0$ has a solution in $S$, and a target vector $t \in \mathbb{R}^n \cap S$.
OUTPUT: A solution $x \in S$ or, in the worst case, a solution that satisfies some of the constraints.
compute a solution $y$ of $\sum_{i=1}^{n} a_i x_i = a_0$

For the CVP step we used Babai's algorithm (polynomial running time) with the usual Euclidean norm $\|\cdot\|$. Since $\|\cdot\|_\infty \le \|\cdot\| \le \sqrt{n}\,\|\cdot\|_\infty$ in $\mathbb{R}^n$, we get a very good approximation of $\mathrm{CVP}(L, t, \|\cdot\|_\infty)$. We also tried the exact CVP algorithm (using enumeration) of FpyLLL [26]. For $R > 80$ the exact CVP algorithm was very slow and we did not manage to get a solution. But for $R \le 80$ we got exactly the same results as those provided by Babai.
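Babai's nearest-plane algorithm, which serves as the approximate CVP solver in this attack, can be sketched in pure Python with exact rational arithmetic. This is an illustrative sketch, not the FpyLLL routine used in the experiments: the function names are hypothetical, the basis rows are assumed to be linearly independent, and in practice the basis would be LLL- or BKZ-reduced first, which is not done here.

```python
from fractions import Fraction


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Exact Gram-Schmidt orthogonalization of the basis rows."""
    ortho = []
    for b in basis:
        v = [Fraction(c) for c in b]
        for u in ortho:
            mu = dot(b, u) / dot(u, u)
            v = [vi - mu * ui for vi, ui in zip(v, u)]
        ortho.append(v)
    return ortho


def babai_nearest_plane(basis, target):
    """Return a lattice vector close to target (Babai's nearest plane)."""
    ortho = gram_schmidt(basis)
    residual = [Fraction(t) for t in target]
    closest = [0] * len(target)
    # Process the basis vectors from last to first.
    for b, b_star in zip(reversed(basis), reversed(ortho)):
        c = round(dot(residual, b_star) / dot(b_star, b_star))
        residual = [r - c * bi for r, bi in zip(residual, b)]
        closest = [v + c * bi for v, bi in zip(closest, b)]
    return closest
```

In the setting of the attack, `basis` would be a reduced basis of the lattice of integer solutions of $\sum a_i x_i = 0$ and `target` the vector built from the values $t_\alpha$.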
In tables 2 and 3 we provide some results using the previous algorithm.

Table 3. Here $a_j \xleftarrow{\$} I_{R/8}$. We executed 80 random instances for each row.
We can improve on this attack by applying the following strategy, which we explain for the case $x \in S_2$. We choose $K$ randomly from the set $I_{R/2}$. Then we apply the CVP-attack to $\sum_{j=1}^{n/2} a_j x_j = a_0 - K$. Almost always we get a solution $\mathbf{x}_1 = (x_1, \ldots, x_{n/2}) \in I_R^{n/2}$. Then we check whether the equation $\sum_{j=n/2+1}^{n} a_j x_j = K$ has a solution in $I_{R/2}^{n/2}$. In this way, we improved the 50% in the second row of tables 2 and 3 to 64%. The same applies if the solution set is $S_3$.

A branch and bound algorithm.
Since the previous method does not find all the right entries, but it finds a vector x that meets enough constraints, we can search in the neighbourhood of x for a better vector.
Let $\{b_1, \ldots, b_{n-1}\}$ be a basis of the lattice $\{y \in \mathbb{Z}^n : \sum_{j=1}^{n} a_j y_j = 0\}$. We apply the CVP-attack of the previous subsection and let $x$ be its output. The method we provide uses this basis to get a new, better solution $x'$ that satisfies all or some of our bound constraints. Starting from the solution $x$ that does not meet the constraints, we update it using the relation $x' \leftarrow x + k b_j$ for $1 \le j \le n - 1$ and for some $k \in \mathbb{Z} \setminus \{0\}$ with $|k| \le rad$, for some positive integer $rad$. In every step we calculate the number of right entries $V' = V_S(x')$ of the updated vector $x'$. If the new $V'$ satisfies $V' \ge V$, then we update the solution $x$ and increase (or decrease) $k$ for a better solution; otherwise we use another vector $b_j$ from the basis. We provide the pseudocode of this algorithm.

The function UPDATE:
OUTPUT: A solution $x \in S$ that satisfies more constraints than the initial solution $x$ or, in the worst case, a solution that satisfies the same constraints as the initial solution.
find the new $V' = V_S(x')$
$x \leftarrow x'$
end if
end while
return $x, V$

Once we have the update function, we apply the pseudocode below.

Branch and bound algorithm for compact knapsack:
INPUT: $L = \{b_j\}_j$, $x$, $n$, $S$, $rad$, $K$
OUTPUT: A solution $x \in S$, or a better solution which satisfies more constraints or, in the worst case, a solution that satisfies the same constraints as the initial solution.
end if
end while
return $x$

The efficiency and the success rate of the algorithm depend on how many nodes we consider, i.e. the value of $K$, and on how we choose the coefficients $\{a_j\}_j$ and the set $A$.
This algorithm can be easily parallelized. For each value of k ∈ {−rad, ..., rad} − {0} (in the UPDATE algorithm) we consider a different node.
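Since only fragments of the pseudocode are available here, the following Python sketch reconstructs the UPDATE step and a simple driver from the prose description alone; it should be read as one possible interpretation, not as the original algorithm. The names are hypothetical, the constraint set $S$ is assumed to be given by a required bit length per entry, and the driver simply repeats UPDATE at most `K` times.

```python
def count_right_entries(x, bit_sizes):
    """V_S(x): number of entries x_j > 0 having exactly bit_sizes[j] bits."""
    return sum(1 for v, bits in zip(x, bit_sizes)
               if v > 0 and v.bit_length() == bits)


def update(x, basis, bit_sizes, rad):
    """One pass over the kernel basis, moving x by multiples k * b_j.

    A candidate x' = x + k * b_j replaces x whenever V_S(x') >= V_S(x).
    Returns (x, V_S(x)).
    """
    best, best_v = list(x), count_right_entries(x, bit_sizes)
    steps = [c for m in range(1, rad + 1) for c in (m, -m)]
    for b in basis:
        for k in steps:
            cand = [xi + k * bi for xi, bi in zip(best, b)]
            v = count_right_entries(cand, bit_sizes)
            if v >= best_v:
                best, best_v = cand, v
    return best, best_v


def branch_and_bound(x, basis, bit_sizes, rad, K):
    """Repeat UPDATE up to K times, stopping once all entries are right."""
    value = count_right_entries(x, bit_sizes)
    for _ in range(K):
        if value == len(x):
            break
        x, value = update(x, basis, bit_sizes, rad)
    return x, value
```

Every candidate differs from $x$ by a lattice vector of the kernel of $a$, so the equation $a \cdot x = a_0$ is preserved throughout; only the number of entries inside the prescribed ranges changes.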
Using the previous algorithm we did not get any solution for the case $(n, R) = (60, 100)$ and $x \in S_2$ with $(K, rad) = (20, 2^{15})$. We have an advantage of 6 bits on average in the case $(n, R) = (30, 40)$ for $x \in S_2$. Although we did not find any solution, we do not consider such values safe for the hardness of the problem.
An Id system based on compact knapsack
In public key Id (Identification) protocols an entity (the prover) holding a secret key wants to prove its identity to an entity (the verifier) holding only the public key. The minimal notion of security for such a scheme is security against a passive attacker, who knows only the public key. This notion, as we shall see below, coincides with a property called soundness of the Id-scheme. In this case the adversary is not able to impersonate the prover knowing only the public key. In a real world situation, the adversary is allowed to communicate with the prover, hoping to extract some information and then impersonate the prover. The usual model of security is this second one. In this case we say that the Id-scheme is secure against active attacks. See [6] for a recent account of Id-schemes. Most public key Id-schemes are based on specific problems from number theory, such as factorization and the discrete logarithm problem. Thus they are quite expensive because of the exponentiations they use; that is, they are not lightweight. A second possible problem is that they are not quantum resistant, since Shor's algorithm can be applied to the factorization and discrete logarithm problems, assuming the existence of a (large) quantum computer.
We are interested in three-move Id-schemes. Here Alice (the prover) holds a secret key and sends to Bob (the verifier) a message which we call the commitment. Bob responds with a random string which we call the challenge (or exam). Alice provides a response. Finally, Bob applies a verification algorithm, which takes as input the public key of Alice and the previous conversation, in order to decide whether to accept or reject the identity of Alice. The length of the challenge is the security parameter. We shall provide an Id-scheme which is not based on the discrete logarithm or factorization problem, but on the compact knapsack problem, i.e. we provide a proof of knowledge for the compact knapsack problem. The admissible vectors form a set $S_{n,R,\ell}$, built from the sets $I_\alpha$ of integers having $\alpha$ bits ($R, \ell$ are positive integers and assume for simplicity that $R, n$ are even; for odd $n$ the set is defined separately), and $(a_j)_j$ are positive integers with at most $R$ bits. We describe our Id system. Alice picks a vector $a = (a_j)_j$ whose entries are positive integers of at most $R$ bits and a vector $x \in S_{n,R,\ell}$, and sets $b = a \cdot x$. She publishes $(a, b, R, \ell)$. The private key is $x$.
The following scheme is repeated $t$ times.
• Alice picks a random vector $k \in S_{n,R,0}$. Then she computes $a \cdot k = r$ and sends it to Bob (commitment).
♦ Bob picks a random integer $e \in \{0, 1\}$ and sends it to Alice (challenge).
• Alice computes $s = k + ex$ and sends $s$ to Bob (response).
♦ Bob verifies the equality $a \cdot s = r + eb$ and, if $e = 0$, that $s \in S_{n,R,0}$.
If $e = 1$, Alice can choose $R, \ell$ from the beginning such that $s \in S_{n,R,\ell}$ with large probability ($\approx 1$); see corollary 6.4.

Proof of correctness.
a · s = a · k + ea · x = r + eb.
Also, s satisfies the constraints of the scheme.
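One round of the protocol can be simulated with the following Python sketch. It is only an illustration under a loudly stated assumption: since the exact definition of $S_{n,R,\ell}$ is not reproduced here, the sketch takes $S_{n,R,\ell}$ to be the vectors whose $n$ entries all have exactly $R + \ell$ bits (so $S_{n,R,0} = I_R^n$). All function names are hypothetical.

```python
import secrets


def dot(u, v):
    """Inner product of two integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def random_with_bits(bits):
    """Uniform integer with exactly `bits` bits, i.e. in [2^(bits-1), 2^bits)."""
    return (1 << (bits - 1)) | secrets.randbits(bits - 1)


def keygen(n, R, ell):
    """Return (public key (a, b, R, ell), secret key x)."""
    a = [secrets.randbelow(2 ** R - 1) + 1 for _ in range(n)]  # 1 .. 2^R - 1
    x = [random_with_bits(R + ell) for _ in range(n)]  # ASSUMED shape of S
    return (a, dot(a, x), R, ell), x


def commit(pk):
    """Alice's first move: k in S_{n,R,0} and r = a . k."""
    a, _, R, _ = pk
    k = [random_with_bits(R) for _ in a]
    return k, dot(a, k)


def respond(k, x, e):
    """Alice's answer s = k + e * x to the challenge bit e."""
    return [ki + e * xi for ki, xi in zip(k, x)]


def verify(pk, r, e, s):
    """Bob's check: a . s = r + e * b and the range constraint on s.

    For e = 1 the range check can fail for an honest prover with small
    probability, which is why the scheme is probabilistic.
    """
    a, b, R, ell = pk
    bits = R if e == 0 else R + ell
    in_range = all(v > 0 and v.bit_length() == bits for v in s)
    return dot(a, s) == r + e * b and in_range
```

A full identification would repeat `commit`, a fresh random challenge bit, `respond` and `verify` $t$ times and accept only if every round verifies.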

Remark 1.
To be precise the previous scheme is a probabilistic Id-scheme, since Bob is convinced with high probability.
The first basic requirement for an Id-scheme is soundness. Let Eve be an adversary. The scheme is sound if Eve, knowing only the public key, can pass the verification test with only negligible probability. The soundness of the scheme depends on the number of iterations $t$. Assume, for simplicity, that $t = 1$. Say that Eve, by tossing a fair coin, guesses the right $e \in \{0, 1\}$. Then she picks a random vector $s \in S_{n,R,0}$ if $e = 0$, else she chooses $s \in S_{n,R,\ell}$. Then she sends to Bob the pair $(r = a \cdot s, s)$ if $e = 0$ and $(r = a \cdot s - b, s)$ if $e = 1$. The pair passes the verification test since $a \cdot s = r + eb$. The success rate is 1/2; in general it is $2^{-t}$. So for $t = 80$ the success rate is negligible. To prove that the scheme is sound we have to show that this success rate cannot be improved unless the compact knapsack problem is easy. So assume that Eve has a Monte Carlo algorithm $A$ with inputs the public key and a random $e \in \{0, 1\}^t$, that provides $t$ passing pairs $(r_i, s_i)$ for the verification test. In fact, as we shall see, $A$ is a parallel Monte Carlo algorithm, e.g. [24, chapter 12]. We denote the time complexity of algorithm $A$ by $|A|$. Furthermore, suppose that the probability of success is $\varepsilon$. We shall show that using the probabilistic algorithm $A$ we can find, with constant probability, a solution $x \in S_{n,R,\ell}$ such that $a \cdot x = b$. This is enough for the adversary to impersonate Alice. Thus, we regard the whole class of solutions $\Sigma_b = \{x \in S_{n,R,\ell} : a \cdot x = b\}$ as one solution. Also, assume that it is difficult to find elements of $\Sigma_b$ knowing only the vector $a$ and the integer $b$. We now state our Theorem.
Theorem 6.1 (soundness). Let $A$ be a probabilistic algorithm having as inputs the public key of the Id-scheme and a random vector $e \in \{0, 1\}^t$, and as output $t$ passing pairs $(r, (s_i)_i) = ((r_i)_i, (s_i)_i)$, with probability $\varepsilon > 2^{-t+2}$. Suppose that $\ell = 0.58 - \log_2(1 - 0.99^{1/n})$.
Then, with constant probability and running time O(|A|/ε) we can find an element of Σ b (which is equivalent to knowing the secret key x).
We need some auxiliary Lemmas.
Let also $x_1, x_2$ be chosen uniformly from $N_1, N_2$, respectively. Then,
Proof. We count the pairs $(x_1, x_2) \in N_1 \times N_2$ such that $x_2 - x_1 < c$. We fix $x_1 = b$. Then the maximum $x_2$ such that $x_2 - x_1 < c$ is $x_2 = c + b - 1$. So we count from $c + b - 1$ down to $x_2 = c$. Thus for $x_1 = b$ we get $(c + b - 1) - c + 1 = b$ possible $x_2$'s. Similarly, for $x_1 = b - 1$ we get $b - 1$ possible $x_2$'s, and so on. Finally, we get $a + (a + 1) + \cdots + b = \frac{(a+b)(b-a+1)}{2}$ pairs.
The number of pairs $(x_1, x_2) \in N_1 \times N_2$ such that $x_2 + x_1 > d$ is equal to the number of pairs $(x_1, x_2) \in N_1 \times N_2$ such that $x_2 - x_1 < c$. To see this we repeat the previous arguments, but starting from $x_1 = a$ and $x_2 = d - a + 1 \in N_2$, and continue as before. The Lemma follows.
Corollary 6.3. Let $N_1 = I_R$, $N_2 = I_{R+\ell}$ ($\ell > 1$), and let $x_1, x_2$ be chosen uniformly from $N_1, N_2$, respectively. Then $\Pr(x_2 - x_1 \in N_2) \approx 1 - \frac{3}{2^{\ell+1}}$, and similarly for $x_2 + x_1$. The corollary follows from Lemma 6.2.
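Proof. We set $N_1 = I_R$, $N_2 = I_{R+\ell}$. Then, from corollary 6.3, $\Pr(x_j - k_j \in N_2) \approx 1 - \frac{3}{2^{\ell+1}}$. Let $x = (\mathbf{x}_1, \mathbf{x}_2)$, where $\mathbf{x}_1$ contains the first half of the entries of $x$ and $\mathbf{x}_2$ contains the remaining entries, and similarly $k = (\mathbf{k}_1, \mathbf{k}_2)$. Let also $A_1, A_2$ be the corresponding events for the two halves. Since $A_1, A_2$ are independent, corollary 6.3 provides $\Pr(x - k \in S_{n,R,\ell}) = \Pr((\mathbf{k}_1, \mathbf{x}_1) \in A_1 \text{ and } (\mathbf{k}_2, \mathbf{x}_2) \in A_2)$. The previous equality is no longer true in general. But for $n \ge 2$ and $p \to 1^-$, we have $(1 - \frac{3}{2^{\ell+1}})^n \to 1$. So the result follows. The same argument applies to $\Pr(x + k \in S_{n,R,\ell})$. This completes the proof.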
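The choice of $\ell$ used in Theorem 6.1 can be related to the estimate $(1 - \frac{3}{2^{\ell+1}})^n$ by a short computation. Requiring $(1 - \frac{3}{2^{\ell+1}})^n = p$ and solving for $\ell$ gives $\ell = \log_2\frac{3}{2} - \log_2(1 - p^{1/n})$, where $\log_2\frac{3}{2} \approx 0.58$. The Python sketch below evaluates this under the assumption that the estimate of corollary 6.3 is taken as exact; the function names are hypothetical.

```python
import math


def ell_for(n, p=0.99):
    """Solve (1 - 3 / 2^(ell + 1))^n = p for ell."""
    return math.log2(1.5) - math.log2(1.0 - p ** (1.0 / n))


def success_estimate(n, ell):
    """Estimate (1 - 3 / 2^(ell + 1))^n of the probability that s is in range."""
    return (1.0 - 3.0 / 2 ** (ell + 1)) ** n
```

Rounding the returned value up to the next integer gives a usable number of extra bits, and the estimate then stays at or above the chosen $p$.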
Proof. Eve has two queries $e = (e_i)_i$, $e' = (e'_i)_i \in \{0, 1\}^t$ ($e \neq e'$) such that the corresponding values $(r_i, s_i)$, $(r'_i, s'_i)$ ($i = 1, 2, \ldots, t$) give a positive result and $r_i = r'_i$ for every $i$. Thus, without loss of generality, there is some $j \in \{1, 2, \ldots, t\}$ with $e_j = 1$, $e'_j = 0$, such that $a \cdot s_j = r_j + b$, $a \cdot s'_j = r_j$.

Then,
$a \cdot (s_j - s'_j) = b$.
We set $S = s_j - s'_j$. By corollary 6.4, for $\ell = 0.58 - \log_2(1 - 0.99^{1/n})$ with $p = 0.99$, and $s_j \in S_{n,R,\ell}$, $s'_j \in S_{n,R,0}$, we get $S \in S_{n,R,\ell}$ with high probability. So $S \in \Sigma_b$ with high probability. For the proof of Theorem 6.1 we follow [6].
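The extraction step of this proof is simple enough to state as code. The Python sketch below, with a hypothetical function name, takes two accepted responses to the same commitment $r_j$, one for challenge bit 1 and one for challenge bit 0, and returns their difference if it satisfies $a \cdot x = b$. It does not check membership in $S_{n,R,\ell}$, which by corollary 6.4 holds only with high probability.

```python
def extract_witness(a, b, s_one, s_zero):
    """Return s_one - s_zero if it solves a . x = b, otherwise None.

    s_one answers challenge bit 1 and s_zero answers challenge bit 0,
    both for the same commitment.
    """
    candidate = [u - v for u, v in zip(s_one, s_zero)]
    if sum(ai * ci for ai, ci in zip(a, candidate)) != b:
        return None
    return candidate
```

With an honest prover the two responses are $k + x$ and $k$, so the difference is exactly the secret key, which is why the commitment vector $k$ must never be reused across rounds.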
Proof of Theorem 6.1. Assume that $A$ is a probabilistic algorithm with input the public key of the prover, $pk = (a, b, R, \ell)$, and a random $e = (e_1, \ldots, e_t) \in \{0, 1\}^t$. It works like a black box and depends on an internal state, which is a random string. With probability $\varepsilon$ it outputs a pair $out = (r, (s_i)_i)$ such that $(out, e)$ is a query that passes the verification test, that is, $a \cdot s_i = r_i + e_i b$ for $i = 1, 2, \ldots, t$.
We model $A$ as follows. We fix a matrix $H$ with entries 0 and 1. $H$ has a column for each different vector $e$, which is the value of the challenge in every round of our system. Therefore, the column index ranges from 1 to $2^t$, which represents binary vectors between $(0, \ldots, 0, 1)$ and $(1, \ldots, 1, 1)$. For each internal state we get a row of the matrix. The vector $r$ that the algorithm produces depends on $pk$ and the choice of the row, i.e. $r = r(pk, \text{Internal State})$, and $s$ depends on the row and the column, that is, $(s_i)_i = (s_i)_i(pk, \text{Internal State}, e)$.
If the corresponding entry of $H$ is 1, then with input $(pk, \text{Internal State}, e)$ the output of algorithm $A$ is a passing query $((r, (s_i)_i); e)$. If we hit 0, then we do not get a passing pair for the specific input $(pk, \text{Internal State}, e)$. Our goal is to find a row with at least two 1's. If we have such a row, we show using Lemma 6.5 how to reveal an element of the set $\Sigma_b$. Furthermore, we shall show that this can be done with constant probability. First, we examine the distribution of 1's in the rows. A row is called heavy if the fraction of 1's it contains is at least $\varepsilon/2$. We also define:
$h$: the number of entries of $H$, so $\varepsilon \cdot h$ is the number of 1's in $H$;
$h_1$: the number of entries of $H$ in non-heavy rows, so the number of 1's in these rows is less than $h_1 \varepsilon/2$.
Therefore, heavy rows contain at least $h_2$ 1's, where $h_2 \ge \varepsilon h - h_1 \varepsilon/2 \ge \varepsilon h - h \varepsilon/2 = \varepsilon h/2$. That is, at least half of the 1's lie in heavy rows. Furthermore, from the assumption that $\varepsilon \ge 2^{-t+2}$ and the fact that every row has $2^t$ entries, it follows that a heavy row contains at least $\frac{\varepsilon}{2} 2^t \ge 2$ 1's.
If we probe $H$ randomly (i.e. if we choose a random internal state and a random $e$), we can find the first 1 after $1/\varepsilon$ tries, with probability more than 1/2. If this 1 lies in a heavy row, then we can find a second 1 in the same row with probability $\frac{(\varepsilon/2) 2^t - 1}{2^t}$. As a result, we need $\frac{2^t}{(\varepsilon/2) 2^t - 1}$ tries to succeed. Since $\frac{2^t}{(\varepsilon/2) 2^t - 1} \le \frac{2^t}{(\varepsilon/2) 2^{t-1}} = 4/\varepsilon$ (by the assumption $\varepsilon \ge 2^{-t+2}$), with at most $4/\varepsilon$ tries we get the second 1 with probability 1/2. Otherwise, if the first hit is in a non-heavy row, we could spend too much time searching for a second 1. To avoid this, we apply an algorithm, say $A_1$, which stops the procedure after a specific number of tries. The algorithm consists of two steps that run in parallel:
St1: Probe random entries in the same row until a second 1 is found.
St2: Repeatedly probe a random entry in $H$ and choose a random number among 1, 2, . . . , $d$ ($d$ will be chosen later). This step stops if the entry is 1 and the chosen number is 1.
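The two steps St1 and St2 can be sketched as a small simulation on an explicit 0/1 matrix. This is only an illustration under stated assumptions: in the proof, each probe is a call to the black-box algorithm $A$, whereas here $H$ is given as a list of lists; the function name `find_two_ones`, the interleaving of one St1 probe with one St2 probe, and the `max_steps` cap are choices made for this sketch.

```python
import random

def find_two_ones(H, d=16, max_steps=100_000, rng=random):
    """Return (row, col1, col2) with H[row][col1] == H[row][col2] == 1
    and col1 != col2, or None if the search is stopped first."""
    rows, cols = len(H), len(H[0])

    # Probe random entries until the first 1 is hit.
    for _ in range(max_steps):
        i, j = rng.randrange(rows), rng.randrange(cols)
        if H[i][j] == 1:
            break
    else:
        return None

    # Run St1 and St2 "in parallel": one probe of each per iteration.
    for _ in range(max_steps):
        # St1: probe a random entry in the same row.
        j2 = rng.randrange(cols)
        if j2 != j and H[i][j2] == 1:
            return i, j, j2
        # St2: probe a random entry of H and draw a number in 1..d;
        # stop the whole procedure if the entry is 1 and the number is 1.
        a, b = rng.randrange(rows), rng.randrange(cols)
        if H[a][b] == 1 and rng.randint(1, d) == 1:
            return None
    return None

# Toy matrix (assumption): rows 0 and 2 contain two or more 1's.
H_demo = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 1, 0]]
result = find_two_ones(H_demo, rng=random.Random(1))
```

A `None` result corresponds to St2 finishing before St1; any other result is a row with two 1's, i.e. two passing queries for the same internal state.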
Algorithm $A_1$ runs in expected time $O(|A|/\varepsilon)$. However, we want St1 to stop first (with high probability), in order to have two 1's in one row. The probability that St2 finishes after exactly $k$ attempts is $(\varepsilon/d)(1 - \varepsilon/d)^{k-1}$. Using the assumption for $\varepsilon$ as before, we get that $(1 - \varepsilon/d)^{k-1} \le 1$, which means that the probability of finishing after $k$ or fewer attempts is at most $k\varepsilon/d$. We take $k = d/(2\varepsilon)$, so that the probability that St2 finishes within $k$ attempts is at most 1/2. Furthermore, if we choose $d = 16$, then St2 finishes after more than $8/\varepsilon = k$ tries with probability at least 1/2. As a result, St1 finishes before St2 with probability greater than $\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$. So, if this occurs, then with probability at least 1/8 we shall get the second 1.
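For reference, the bound on the stopping time of St2 used above can be written out as follows. This is only a restatement of the estimates in the text, with the sum over attempts made explicit.

```latex
\begin{align*}
\Pr[\text{St2 stops at attempt } k]
  &= \frac{\varepsilon}{d}\Bigl(1-\frac{\varepsilon}{d}\Bigr)^{k-1}
   \le \frac{\varepsilon}{d},\\
\Pr[\text{St2 stops within } k \text{ attempts}]
  &= \sum_{i=1}^{k}\frac{\varepsilon}{d}\Bigl(1-\frac{\varepsilon}{d}\Bigr)^{i-1}
   \le \frac{k\varepsilon}{d},\\
k = \frac{d}{2\varepsilon}
  &\;\Longrightarrow\; \frac{k\varepsilon}{d} = \frac{1}{2},
  \qquad d = 16 \;\Longrightarrow\; k = \frac{8}{\varepsilon}.
\end{align*}
```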
So we shall find two 1's in a heavy row after $12/\varepsilon$ tries, with constant probability at least $\frac{1}{2} \cdot \frac{1}{8} = \frac{1}{16}$. That is, the algorithm runs in $O(|A|/\varepsilon)$ and succeeds in finding two 1's in the same row with constant probability > 1/16. Now that we have two 1's in the same row, we get two queries $(r, (s_i)_i, e)$, $(r, (s'_i)_i, e')$ with $e \ne e'$. Applying Lemma 6.5, the Theorem follows.
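To illustrate why two passing queries with the same $r$ and different challenges are useful, the sketch below subtracts the two verification equations $a \cdot s_j = r_j + e_j b$ and $a \cdot s'_j = r_j + e'_j b$ in a round $j$ where $e_j \ne e'_j$, which gives $a \cdot (s_j - s'_j) = \pm b$. It relies only on the verification test stated in the proof. The toy transcripts are generated by a hypothetical honest prover with $r_i = a \cdot k_i$ and $s_i = k_i + e_i x$, which is an assumption made for this example and not necessarily the response rule of the scheme. The size condition on the difference (Corollary 6.4) is not checked here.

```python
def dot(a, v):
    return sum(ai * vi for ai, vi in zip(a, v))

def passes(a, b, r, s, e):
    """Verification test: a . s_i == r_i + e_i * b for every round i."""
    return all(dot(a, s_i) == r_i + e_i * b for r_i, s_i, e_i in zip(r, s, e))

def difference_solution(s, e, s2, e2):
    """From two passing transcripts with the same r and e != e2, return a
    vector S with a . S == b, taken from the first round where they differ."""
    for s_i, e_i, t_i, f_i in zip(s, e, s2, e2):
        if e_i != f_i:
            sign = 1 if e_i == 1 else -1
            return [sign * (u - v) for u, v in zip(s_i, t_i)]
    raise ValueError("the two challenges are identical")

# Toy instance (all values are illustrative assumptions).
a = [3, 5, 7]
x = [2, 1, 4]                      # secret, with a . x = b
b = dot(a, x)                      # 39
k = [[1, 0, 2], [4, 1, 1]]         # ephemeral keys, one per round
r = [dot(a, k_i) for k_i in k]     # commitments: [17, 24]

def respond(e):
    # Hypothetical response rule s_i = k_i + e_i * x.
    return [[kj + e_i * xj for kj, xj in zip(k_i, x)] for k_i, e_i in zip(k, e)]

e, e2 = (1, 0), (0, 1)
s, s2 = respond(e), respond(e2)
S = difference_solution(s, e, s2, e2)
print(passes(a, b, r, s, e), passes(a, b, r, s2, e2), dot(a, S) == b)  # prints True True True
```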
Since we assumed that finding elements of $\Sigma_b$ is difficult, the success probability cannot be improved beyond $2^{-t+2}$.
6.1. The parameters.
We have to choose $R$ so that it is difficult to find even one $n$-tuple $(x_1, ..., x_n)$ satisfying equation (7) when Eve picks a candidate at random from $S_{n,R,\ell}$. To select secure parameters we also have to take into account the attack given in subsection 5.1. Let $P_r$ be the probability of finding a solution $x \in S_{n,R,\ell}$ of equation (7) when $x$ is picked at random from the set $S_{n,R,\ell}$, and let $P'_r$ be the same probability when $x$ is picked from the set $S_{n,R,0}$. Let $N_b$ be the number of solutions in $B_n = S_{n,R,\ell}$ of the Diophantine equation $\sum_{i=1}^{n} a_i x_i = b$. We remark that there is a hyperplane that meets $n$ vertices of the box $B_n$ (e.g. a face of $B_n$) and contains the maximum number of integer points (in $B_n$) among all hyperplanes that pass through at least one point of $B_n$. Let $F_{n-1}$ be that hyperplane. So $N_b \le |F_{n-1}|$, thus $P_r \le |F_{n-1}| / |S_{n,R,\ell}|$, and similarly for $P'_r$.
We want both probabilities to be negligible. It is enough to choose $R/2 \ge 80$; then the probabilities $P_r$, $P'_r$ are $\le 2^{-80}$. So, if we pick $R \ge 160$, the probabilities $P_r$, $P'_r$ are negligible. Furthermore, $R$ must be large enough because choosing the same ephemeral key twice reveals the secret key with probability 1/2. For instance, as a minimal choice we suggest $n = 70$, $R = 192$, $p = 0.99$, so $\ell = 12$, and $a \xleftarrow{\$} I^n_{R/8}$, which gives a public key $(a, b, R, \ell)$. Note that the length $|b| \approx 9R/8 + \ell + |n| = 235$ bits (here $|\cdot|$ denotes the binary length). The system has relatively small public keys and is very fast since it uses only additions and multiplications. Finally, the previous selection of parameters should resist the attack given in subsection 5.1. Since better attacks may exist (or be found), the parameters have to be adjusted to take any new attack into account. In general, the hardness of the compact knapsack depends, among other things, on the topology of the set of solutions.
So, instead of $S_{n,R,\ell}$, one can use another set $S$ that minimizes the length of the public key and achieves the same security. The only thing that needs some care is to check that Lemma 6.1 is still valid for the new set $S$.
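The arithmetic behind the suggested minimal parameters can be reproduced in a few lines. This is a small sketch in which the value of the third parameter (written `ell` here, equal to 12) is taken as given from the text rather than recomputed. The variable names are illustrative.

```python
# Suggested minimal parameters from the text; ell = 12 is taken as given.
n, R, ell = 70, 192, 12

n_bits = n.bit_length()              # |n|: binary length of n
b_bits = 9 * R // 8 + ell + n_bits   # |b| is approximately 9R/8 + ell + |n|
bound_exponent = R // 2              # the text asks for R/2 >= 80

print(n_bits, b_bits, bound_exponent)  # prints 7 235 96
```

With $R = 192$ the exponent $R/2 = 96$ leaves a margin above the required 80.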
On the whole, we need three sets: the parameter set $P$ from which we choose $a$, and two sets $S_1$, $S_2$ with $S_2 \subset S_1$, where we randomly choose the private key $x$ from $S_1$ and the ephemeral key $k$ from $S_2$. The sets $S_1$, $S_2$ are chosen such that $x - k \in S_1$ with large probability. Then we can prove the soundness of the id-scheme. Furthermore, we want the compact knapsack problem with solution sets $S_1$, $S_2$ and parameter set $P$ to resist the CVP-attack of subsection 5.1.
Finally, in [20] an id-scheme was presented based on the Short Integer Solution problem (SIS). Some basic differences between our scheme and [20] are the following:
• In [20] the system is over $Z_p$ and ours is over the integers.
• The system of [20] is provably secure under active attacks; the author applied Theorem 2 of [20]. In that Theorem we cannot consider a single equation (as in our system), because in this case the reduction of the Theorem does not work (i.e. SIS cannot be reduced to a hard lattice problem).
• The security of our id-scheme is based on compact knapsack (the solutions here may be large and not necessarily small as in SIS).
Finally, the two problems SIS and compact knapsack are different in the sense that the latter is NP-complete.

Conclusions and future work
In this work we addressed the knapsack problem and some of its variants. We optimized the algorithm presented in [30] using some heuristics that work in practice. We provided two variants, which on average are better than the original method. One of them can be easily parallelized. Furthermore, we considered the compact knapsack problem, applied a CVP-reduction and combined it with a branch and bound algorithm. We used the results of our algorithm to provide some minimal security conditions on the parameters of our id-scheme, which bases its security on this problem. Unfortunately, we managed to prove its security only under passive attacks. Another disadvantage is the relatively large information complexity. For instance, if $(n, R, \ell) = (70, 192, 12)$, then the prover sends ≈ 145 KBytes to the verifier using 80 rounds. On the other hand, this scheme is lightweight in the sense that it does not use exponentiations. Also, it is potentially quantum resistant, since it does not base its security on factorization or the discrete logarithm problem. We did not address here the problem of choosing the right parameters, but we provided some minimal requirements for them. Furthermore, other choices of the parameters will probably lead to smaller public keys and information complexity. This last remark can be used to implement the system in constrained devices, e.g. smart cards.
The next step of this work is to consider the transformation of the three-move id-scheme into a digital signature using the Fiat-Shamir transformation. This can be done if we replace the role of Bob with a random hash function. Then many issues need to be carefully studied, for instance the security under the random oracle model and the selection of the parameters.
Another possible extension, as far as the experimental part of this work is concerned, is to consider more advanced types of reduction instead of classical BKZ with pruning. For instance, in [11] the authors suggested a new type of pruning, called extreme pruning, which behaves better than the classical linear pruning. So one may try this improved method to solve hard knapsack problems using the method of Schnorr-Shevchenko.