Learning a Parallelepiped:
Cryptanalysis of GGH and NTRU Signatures
1 Introduction
Inspired by the seminal work of Ajtai [1], Goldreich, Goldwasser and Halevi
(GGH) proposed at Crypto ’97 [10] a lattice analogue of the coding-theory-
based public-key cryptosystem of McEliece [22]. The security of GGH is related to the hardness of approximating the closest vector problem (CVP) in a lattice.
∗ Part of this work is supported by the Commission of the European Communities through the IST program under contract IST-2002-507932 ECRYPT, and by the French government through the X-Crypt RNRT project.
∗∗ Supported by the Binational Science Foundation, by the Israel Science Foundation, by the European Commission under the Integrated Project QAP funded by the IST directorate as Contract Number 015848, and by a European Research Council (ERC) Starting Grant.
Our attack models the information leaked by signatures as a hidden parallelepiped problem (HPP), or a natural variant thereof (see Fig. 1). We transform the HPP into a multivariate optimization
problem based on the fourth moment (also known as kurtosis) of one-dimensional
projections. This problem can be solved by a gradient descent. Our approach is
very effective in practice: we present the first successful key-recovery experiments
on NTRUSign-251 without perturbation, as proposed in half of the parameter
choices in the NTRU standards [4] being considered by IEEE P1363.1 [19]; ex-
perimentally, 400 signatures are enough to disclose the NTRUSign-251 secret
key. We have also been able to recover the secret key in the signature analogue
of all five GGH encryption challenges; the GGH case requires significantly more
signatures because NTRU lattices have special properties which can be exploited
by the attack. When the number of signatures is sufficiently high, the running
time of the attack is only a fraction of the time required to generate all the
signatures.
From the theoretical side, we are able to show that under a natural assump-
tion on the distribution of signatures, an attacker can recover a good approxi-
mation of the secret key of NTRUSign and the GGH challenges in polynomial
time, given a polynomial number of signatures of random messages. Since the
secret key in both NTRUSign and the GGH challenges has very small entries,
this approximation leads to the exact secret key by simple rounding.
Related Work. Interestingly, it turns out that the HPP (as well as related problems) has already been studied by people dealing with what is known as Independent Component Analysis (ICA) (see, e.g., the book by Hyvärinen et al. [17]). ICA is a statistical method whose goal is to find directions of independent components, which in our case translates to the n vectors that define
the parallelepiped. It has many applications in statistics, signal processing, and
neural network research. To the best of our knowledge, this is the first time ICA
is used in cryptanalysis.
There are several known algorithms for ICA, and most are based on a gra-
dient method such as the one we use in our algorithm. Our algorithm is closest
in nature to the FastICA algorithm proposed in [18], whose authors also considered the
fourth moment as a goal function. We are not aware of any rigorous analysis
of these algorithms; the proofs we have seen often ignore the effect of errors in
approximations. Finally, we remark that the ICA literature offers other, more
general goal functions that are supposed to offer better robustness against noise, etc. We have not tried to experiment with these other functions, since the fourth
moment seems sufficient for our purposes.
Another closely related result is that of Frieze et al. [5], who proposed a
polynomial-time algorithm to solve the HPP (and generalizations thereof). Tech-
nically, their algorithm is slightly different from those present in the ICA litera-
ture as it involves the Hessian, in addition to the usual gradient method. They
also claim to have a fully rigorous analysis of their algorithm, taking into ac-
count the effect of errors in approximations. Unfortunately, most of the analysis
is missing from the preliminary version, and to the best of our knowledge, a full
version of the paper has never appeared.
Open Problem. Our attack does not work against the perturbation techniques
proposed in [12, 4, 14] as efficient countermeasures: these modify the signature
generation process in such a way that the hidden parallelepiped is replaced by
a more complicated set. For instance, the second half of parameter choices in
NTRU standards [4] involves exactly a single perturbation. In this case, the at-
tacker has to solve an extension of the hidden parallelepiped problem in which the
parallelepiped is replaced by the Minkowski sum of two hidden parallelepipeds:
the lattice spanned by one of the parallelepipeds is public, but not the other
one. The existence of efficient attacks against perturbation techniques is an open
problem. The drawback of perturbations is that they slow down signature generation and increase both the size of the secret key and the distance between the signature and the message.
Other Schemes. We now mention some other lattice-based signature schemes,
all of which come with an associated security proof, showing that any (asymp-
totic) attack on the scheme must necessarily lead to an efficient algorithm for a
certain lattice problem that is believed to be hard. Moreover, their security is
established based on worst-case hardness, i.e., any asymptotic attack (even with
a small probability of success) implies an efficient solution to any instance of the
underlying lattice problem. For more details on provably secure lattice-based
cryptography and on the signature schemes mentioned below, see, e.g., [32, 24,
26].
From a theoretical point of view, signature schemes can be constructed from
one-way functions in a black-box way without any further assumptions [28].
Therefore, one can obtain signature schemes that are provably secure based on
the worst-case hardness of lattice problems by using known constructions of
lattice-based one-way functions, such as those in Ajtai’s seminal work [1] and
followup work. These black-box constructions, however, incur a large overhead
and are impractical.
The first construction of efficient lattice-based signature schemes with a sup-
porting proof of security (in the random oracle model) was suggested by Miccian-
cio and Vadhan [27]. More efficient schemes were recently proposed by Gentry,
Peikert and Vaikuntanathan [7], and by Lyubashevsky and Micciancio [21].
The former scheme can be seen as a theoretically justified variant of the
GGH and NTRUSign signature schemes, with worst-case security guarantees
based on general lattices in the random oracle model. Compared to the GGH
scheme, their construction differs in two main aspects. First, it is based on lattices
chosen from a distribution that enjoys a worst-case connection (the lattices in
GGH and NTRU are believed to be hard, but not known to have a worst-
case connection). A second and crucial difference is that their signing algorithm
is designed so that it does not reveal any information about the secret basis.
This is achieved by replacing Babai’s round-off procedure with a “Gaussian
sampling procedure”, originally due to Klein [20], whose distinctive feature is
that its output distribution, for the range of parameters considered in [7], is
essentially independent of the secret basis used. The effect of this on our attack
is that instead of observing points chosen uniformly from the parallelepiped
generated by the secret basis, the attack observes points chosen from a spherically
symmetric Gaussian distribution, and therefore learns nothing about the secret
basis.
The scheme of Lyubashevsky and Micciancio [21] has worst-case security
guarantees based on a type of lattices known as ideal lattices, and it is the
most (asymptotically) efficient construction known to date, yielding signature
generation and verification algorithms that run in almost linear time. Moreover,
the security of [21] does not rely on the random oracle model.
Despite these significant advances, no concrete choice of parameters has been
proposed yet, and it is probably fair to say that provably-secure lattice-based
signature schemes are not yet at the level of efficiency and maturity that would
allow them to be used extensively in real-life applications.
2.1 Lattices
Let ‖·‖ and ⟨·, ·⟩ be the Euclidean norm and inner product of R^n. We refer to the survey [31] for a bibliography on lattices. In this paper, by the term lattice, we mean a full-rank discrete subgroup of R^n. The simplest lattice is Z^n. It turns out that in any lattice L, not just Z^n, there must exist linearly independent vectors b_1, . . . , b_n ∈ L such that:

$$L = \left\{ \sum_{i=1}^{n} n_i b_i \;\middle|\; n_i \in \mathbb{Z} \right\}.$$
The GGH scheme [10] works with a lattice L in Zn . The secret key is a non-
singular matrix R ∈ Mn (Z), with very short row vectors (their entries are
polynomial in n). In the GGH challenges [9], R was chosen as a perturbation
of a multiple of the identity matrix, so that its vectors were almost orthogonal: more precisely, R = kI_n + E, where k = 4⌊√(n+1)⌉ + 1 (with ⌊·⌉ denoting rounding to the nearest integer) and each entry of the n × n matrix E is chosen uniformly at random in {−4, . . . , +3}. Micciancio [23] noticed
that this distribution has the weakness that it discloses the rough directions of
the secret vectors. The lattice L is the lattice in Zn spanned by the rows of R:
the knowledge of R enables the signer to approximate CVP rather well in L.
The basis R is then transformed to a non-reduced basis B, which will be public.
In the original scheme [10], B is obtained by multiplying R by sufficiently many small unimodular matrices. Micciancio [23] suggested using the Hermite normal
form (HNF) of L instead. As shown in [23], the HNF gives an attacker the least
advantage (in a certain precise sense) and it is therefore a good choice for the
public basis. The messages are hashed onto a “large enough” subset of Zn , for
instance a large hypercube. Let m ∈ Zn be the hash of the message to be signed.
The signer applies Babai’s round-off CVP approximation algorithm [3] to get a
lattice vector close to m:
$$s = \lfloor m R^{-1} \rceil R,$$

so that s − m ∈ P_{1/2}(R) = { xR : x ∈ [−1/2, 1/2]^n }. Of course, any other CVP
approximation algorithm could alternatively be applied, for instance Babai’s
nearest plane algorithm [3]. To verify the signature s of m, one would first check
that s ∈ L using the public basis B, and compute the distance ‖s − m‖ to check
that it is sufficiently small.
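For concreteness, here is a minimal numpy sketch of this signing and verification process. The function names and the distance bound `dist_bound` are our own illustrative choices, not part of the scheme's specification:

```python
# Sketch of GGH signing via Babai's round-off, and of verification.
import numpy as np

def sign(m, R):
    """Babai round-off: s = round(m R^{-1}) R, a lattice point such that
    s - m lies in the parallelepiped P_{1/2}(R)."""
    return np.rint(m @ np.linalg.inv(R)) @ R

def verify(s, m, B, dist_bound):
    """Check that s is a lattice point (integer coordinates with respect to
    the public basis B) and that it is close enough to the hashed message m."""
    coords = s @ np.linalg.inv(B)
    is_lattice_point = np.allclose(coords, np.rint(coords))
    return is_lattice_point and np.linalg.norm(s - m) <= dist_bound
```

Each such signature reveals a point s − m uniformly distributed (heuristically) in P_{1/2}(R), which is exactly the leakage the attack exploits.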
2.3 NTRUSign
NTRUSign [13] is a special instantiation of GGH with the compact lattices from
the NTRU encryption scheme [15], which we briefly recall: we refer to [13, 4] for
more details. In the NTRU standards [4] being considered by IEEE P1363.1 [19],
one selects N = 251, q = 128. Let R be the ring Z[X]/(X^N − 1), whose multiplication is denoted by ∗. Using resultants, one computes a quadruplet (f, g, F, G) ∈ R^4 such that f ∗ G − g ∗ F = q in R and f is invertible mod q, where f and g have 0–1 coefficients (with a prescribed number of 1's), while F and G have slightly larger coefficients, yet much smaller than q. This quadruplet is
the NTRU secret key. Then the secret basis is the following (2N) × (2N) matrix:

$$R = \begin{pmatrix}
f_0 & f_1 & \cdots & f_{N-1} & g_0 & g_1 & \cdots & g_{N-1} \\
f_{N-1} & f_0 & \cdots & f_{N-2} & g_{N-1} & g_0 & \cdots & g_{N-2} \\
\vdots & \ddots & \ddots & \vdots & \vdots & \ddots & \ddots & \vdots \\
f_1 & \cdots & f_{N-1} & f_0 & g_1 & \cdots & g_{N-1} & g_0 \\
F_0 & F_1 & \cdots & F_{N-1} & G_0 & G_1 & \cdots & G_{N-1} \\
F_{N-1} & F_0 & \cdots & F_{N-2} & G_{N-1} & G_0 & \cdots & G_{N-2} \\
\vdots & \ddots & \ddots & \vdots & \vdots & \ddots & \ddots & \vdots \\
F_1 & \cdots & F_{N-1} & F_0 & G_1 & \cdots & G_{N-1} & G_0
\end{pmatrix}.$$

The public basis is the following (2N) × (2N) matrix, where h = f^{-1} ∗ g mod q:

$$B = \begin{pmatrix}
1 & 0 & \cdots & 0 & h_0 & h_1 & \cdots & h_{N-1} \\
0 & 1 & \cdots & 0 & h_{N-1} & h_0 & \cdots & h_{N-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 1 & h_1 & \cdots & h_{N-1} & h_0 \\
0 & 0 & \cdots & 0 & q & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 & 0 & q & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & q
\end{pmatrix}.$$
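As an illustration, the two bases could be assembled as follows (a numpy sketch; the helper names are ours, and the polynomials are given as length-N coefficient arrays):

```python
# Sketch: building the block-circulant NTRUSign bases.
import numpy as np

def circulant(a):
    """N x N matrix whose i-th row is a cyclically shifted right by i, i.e.
    the matrix of multiplication by a in Z[X]/(X^N - 1)."""
    N = len(a)
    return np.array([np.roll(a, i) for i in range(N)])

def secret_basis(f, g, F, G):
    return np.block([[circulant(f), circulant(g)],
                     [circulant(F), circulant(G)]])

def public_basis(h, q):
    N = len(h)
    return np.block([[np.eye(N, dtype=int), circulant(h)],
                     [np.zeros((N, N), dtype=int), q * np.eye(N, dtype=int)]])
```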
The symmetries of NTRU lattices lead to attacks that require far fewer signatures. Namely, Whyte noticed that
in the particular case of NTRUSign, the hidden parallelepiped P(R) has the
following property: for each x ∈ P(R) the block-rotation σ(x) also belongs to
P(R), where σ is the function that maps any (x_1, . . . , x_N, y_1, . . . , y_N) ∈ R^{2N} to (x_N, x_1, . . . , x_{N−1}, y_N, y_1, . . . , y_{N−1}). This is because σ is a linear operation
that permutes the rows of R and hence leaves P(R) invariant. As a result, by
using the N possible rotations, each signature actually gives rise to N samples
in the parallelepiped P(R) (as opposed to just one in the general case of GGH).
For instance, 400 NTRUSign-251 signatures give rise to 100,400 samples in the
NTRU parallelepiped. Notice that these samples are no longer independent and
hence Assumption 1 does not hold. Nevertheless, as we will describe later, this
technique leads in practice to attacks using a significantly smaller number of
signatures.
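A sketch of this expansion (our own helper, with numpy; each half of the 2N-dimensional sample is rotated simultaneously):

```python
# Sketch: expanding one NTRUSign transcript sample s - m into N samples
# using the block rotation sigma described above.
import numpy as np

def rotations(sample, N):
    """Given one 2N-dimensional parallelepiped sample, return the N samples
    obtained by applying sigma repeatedly."""
    x, y = sample[:N], sample[N:]
    return [np.concatenate([np.roll(x, i), np.roll(y, i)]) for i in range(N)]
```

With N = 251, the 400 signatures mentioned above indeed expand to 400 · 251 = 100,400 samples.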
4 Learning a Parallelepiped
In this section, we describe our solution to the Hidden Parallelepiped Problem
(HPP), based on the following steps. First, we approximate the covariance matrix
of the given distribution. This covariance matrix is essentially V^t V (where V defines the given parallelepiped). We then exploit this approximation in order
to transform our hidden parallelepiped P(V ) into a unit hypercube: in other
words, we reduce the HPP to the case where the hidden parallelepiped is a
hypercube. Finally, we show how hypercubic instances of the HPP are related
to a multivariate optimization problem based on the fourth moment, which we
solve by a gradient descent. The algorithm is summarized in Algorithms 1 and 2,
and is described in more detail in the following.
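For concreteness, the steps below are accompanied by minimal Python/numpy sketches (with function names of our own choosing). They all use the sample model of the HPP: a sample from U(P(V)) is u = xV with x uniform over [−1, 1]^n.

```python
# Minimal sketch of the HPP sample model: u = xV with x uniform over [-1,1]^n.
import numpy as np

def hpp_samples(V, num, rng=np.random.default_rng(0)):
    """Return `num` samples, as rows, uniformly distributed over P(V)."""
    n = V.shape[0]
    X = rng.uniform(-1.0, 1.0, size=(num, n))  # x uniform over [-1,1]^n
    return X @ V                                # u = xV (rows of V are the v_i)
```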
Lemma 1. Let v be a random vector uniformly distributed over P(V). Then

$$\mathrm{Exp}[v^t v] = V^t V / 3.$$

Proof. We can write v = xV where x has uniform distribution over [−1, 1]^n. Hence,

$$v^t v = V^t x^t x V.$$

An elementary computation shows that Exp[x^t x] = I_n/3 where I_n is the n × n identity matrix, and the lemma follows. ⊓⊔
Hence, by taking the average of v^t v over all our samples v from U(P(V)) and multiplying the result by 3, we can obtain an approximation of V^t V.
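In code, this covariance step could look as follows (a sketch under the same sample model as above; `approximate_gram` is our own name):

```python
# Sketch of the covariance step (Lemma 1): since Exp[v^t v] = V^t V / 3,
# averaging the outer products v^t v over the samples and multiplying by 3
# yields an approximation G of V^t V.
import numpy as np

def approximate_gram(samples):
    """samples: (num, n) array whose rows are drawn from U(P(V))."""
    return 3.0 * (samples.T @ samples) / samples.shape[0]
```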
Lemma 2. Let G = V^t V and let L be a matrix such that G^{-1} = LL^t (e.g., the Cholesky factor of G^{-1}). Then C = VL ∈ O_n(R); moreover, if v is uniformly distributed over P(V), then vL is uniformly distributed over the hypercube P(C).

Proof. For the first claim, note that

$$CC^t = V LL^t V^t = V V^{-1} V^{-t} V^t = I_n.$$

For the second claim, let v be uniformly distributed over P(V). Then we can write v = xV where x is uniformly distributed over [−1, 1]^n. It follows that vL = xV L = xC has the uniform distribution over P(C). ⊓⊔
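A sketch of the corresponding morphing step (our own helper, using numpy's Cholesky routine on the approximated Gram matrix):

```python
# Sketch of the morphing step (Lemma 2): compute L with G^{-1} = L L^t and
# map each sample v to vL, turning P(V)-samples into (approximate) samples
# from the hypercube P(C), where C = VL.
import numpy as np

def to_hypercube(samples):
    """samples: (num, n) array of rows from U(P(V))."""
    G = 3.0 * (samples.T @ samples) / samples.shape[0]  # approximation of V^t V
    L = np.linalg.cholesky(np.linalg.inv(G))            # lower-triangular, G^{-1} = L L^t
    return samples @ L, L                               # hypercube samples and L
```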
Lemma 2 says that by applying the transformation L, we can map our samples from the parallelepiped P(V) into samples from the hypercube P(C).¹

¹ Instead of the Cholesky factor, one can take any matrix L such that G^{-1} = LL^t. We work with the Cholesky factorization as this turns out to be more convenient in our experiments.

For any integer k ≥ 1 and any w ∈ R^n, define the k-th moment of P(V) in the direction w as

$$\mathrm{mom}_{V,k}(w) = \mathrm{Exp}\left[\langle u, w \rangle^k\right],$$

where u is uniformly distributed over the parallelepiped P(V).² Clearly, mom_{V,k}(w)
can be approximated by using the given samples from U (P(V )). Since all the
odd moments are zero, we are interested in the first even moments, namely the
second and fourth moments. A straightforward calculation shows that for any
w ∈ R^n, they are given by

$$\mathrm{mom}_{V,2}(w) = \frac{1}{3} \sum_{i=1}^{n} \langle v_i, w \rangle^2 = \frac{1}{3}\, w V^t V w^t,$$

$$\mathrm{mom}_{V,4}(w) = \frac{1}{5} \sum_{i=1}^{n} \langle v_i, w \rangle^4 + \frac{1}{3} \sum_{i \neq j} \langle v_i, w \rangle^2 \langle v_j, w \rangle^2.$$
² This should not be confused with an unrelated notion of moment considered in [13, 14, 8].
Note that the second moment is given by the covariance matrix mentioned in
Section 4.1. When V ∈ O_n(R) (i.e., the vectors v_i are orthonormal), the second moment becomes ‖w‖²/3, while the fourth moment becomes

$$\mathrm{mom}_{V,4}(w) = \frac{1}{3}\|w\|^4 - \frac{2}{15} \sum_{i=1}^{n} \langle v_i, w \rangle^4.$$
For w on the unit sphere, the second moment is constantly 1/3, and

$$\mathrm{mom}_{V,4}(w) = \frac{1}{3} - \frac{2}{15} \sum_{i=1}^{n} \langle v_i, w \rangle^4,$$

$$\nabla \mathrm{mom}_{V,4}(w) = \frac{4}{3}\, w - \frac{8}{15} \sum_{i=1}^{n} \langle v_i, w \rangle^3 v_i. \tag{1}$$
See Figure 3.
Fig. 3. The fourth moment for n = 2. On the left: the dotted line shows the restriction to the unit circle. On the right: a polar plot restricted to the unit circle.
Lemma 3. Let V = [v_1, . . . , v_n] ∈ O_n(R). Then the global minimum of mom_{V,4}(w) over the unit sphere of R^n is 1/5, and this minimum is attained at ±v_1, . . . , ±v_n. There are no other local minima.
Proof. The method of Lagrange multipliers shows that for w to be an extremum
point of momV,4 on the unit sphere, it must be proportional to ∇momV,4 (w).
By writing w = Σ_{i=1}^{n} ⟨v_i, w⟩ v_i and using (1), we see that there must exist some α such that ⟨v_i, w⟩³ = α⟨v_i, w⟩ for i = 1, . . . , n. In other words, each ⟨v_i, w⟩ is either zero or ±√α. It is easy to check that among all such points, only ±v_1, . . . , ±v_n form local minima. ⊓⊔
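This is easy to confirm numerically. The following self-contained check (our own sanity test, not one of the paper's experiments) estimates mom_{V,4} from samples for a random orthonormal V and evaluates it at w = v_1, where it should be close to 1/3 − 2/15 = 1/5:

```python
# Numerical check of Lemma 3: for orthonormal V, mom_{V,4} on the unit
# sphere is minimized at +-v_i, where its value is 1/5.
import numpy as np

rng = np.random.default_rng(1)
n = 8
V, _ = np.linalg.qr(rng.standard_normal((n, n)))    # random orthogonal matrix
U = rng.uniform(-1.0, 1.0, size=(200_000, n)) @ V   # samples from U(P(V))

def mom4(w):
    return np.mean((U @ w) ** 4)

w_rand = rng.standard_normal(n)
w_rand /= np.linalg.norm(w_rand)
print(mom4(V[0]))    # at w = v_1: close to 1/5 = 0.2
print(mom4(w_rand))  # at a random direction: noticeably larger
```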
In other words, the hidden hypercube problem can be reduced to a minimiza-
tion problem of the fourth moment over the unit sphere. A classical technique
to solve such minimization problems is the gradient descent described in Algo-
rithm 2. The gradient descent typically depends on a parameter δ, which has
to be carefully chosen. Since we want to minimize the function here, we go in
the opposite direction of the gradient. To approximate the gradient in Step 2 of
Algorithm 2, we notice that

$$\nabla \mathrm{mom}_{V,4}(w) = \mathrm{Exp}\left[\nabla \langle u, w \rangle^4\right] = 4\,\mathrm{Exp}\left[\langle u, w \rangle^3 u\right].$$

This allows us to approximate the gradient ∇mom_{V,4}(w) using averages over samples, just as for the fourth moment itself.
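The descent itself can then be sketched as follows: a simplified rendition of Algorithm 2 (our own code, with a fixed iteration budget and a naive stopping rule rather than the exact termination condition), assuming hypercube samples as rows of U:

```python
# Simplified sketch of Algorithm 2: gradient descent of the fourth moment
# over the unit sphere, with the gradient approximated by the sample
# average 4 * avg(<u,w>^3 u).
import numpy as np

def descent(U, delta=0.75, max_iters=1000, rng=np.random.default_rng()):
    """U: (num, n) array of hypercube samples (rows). Returns an
    approximation of +-v_i in the hypercube case."""
    n = U.shape[1]
    w = rng.standard_normal(n)
    w /= np.linalg.norm(w)                # random start on the unit sphere
    for _ in range(max_iters):
        grad = 4.0 * np.mean(((U @ w) ** 3)[:, None] * U, axis=0)
        w_new = w - delta * grad          # move against the gradient
        w_new /= np.linalg.norm(w_new)    # project back onto the sphere
        if np.linalg.norm(w_new - w) < 1e-12:  # no further progress
            break
        w = w_new
    return w
```

Per the morphing step above, the full attack then multiplies the returned vector by L^{-1} to undo the transformation of Lemma 2.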
5 Experimental Results
5.1 NTRUSign
Fig. 4. Average number of random descents needed for a successful descent (y-axis: 0–350), as a function of the number of signatures (x-axis: 100,000–300,000).
We did not notice any improvement using Babai's nearest plane algorithm [3] (with a BKZ-20 reduced basis [33] computed from the public basis) as the CVP approximation. The curve shows the average number of
random descents needed for a successful descent as a function of the number of
signatures.
Typically, a single random descent does not take much time: for instance, a descent for 150,000 signatures takes roughly ten minutes. When successful,
a descent may take as little as a few seconds. The minimal number of signatures
to make the attack successful in our experiments was 90,000, in which case the
required number of random descents was about 400. With 80,000 signatures, we
tried 5,000 descents without any success. The curve given in Fig. 4 may vary
a little bit, depending on the secret basis: for instance, for the basis used in
the experiments of Fig. 4, the average number of random descents was 15 with
140,000 signatures, but it was 23 for another basis generated with the same
NTRU parameters. It seems that the exact geometry of the secret basis has an
influence, as will be seen in the analysis of Section 6.
This is indeed the case in practice (see Table 1): as few as 400 signatures are enough
in practice to recover the secret key, though the corresponding 100,400 paral-
lelepiped samples are not independent. This means that the previous number of
90,000 signatures required by the attack can be roughly divided by N = 251.
Hence, NTRUSign without perturbation should be considered totally insecure.
Fig. 5. Average number of GGH signatures required so that ten random descents
coupled with Babai’s nearest plane algorithm disclose with high probability a secret
vector, depending on the dimension of the GGH challenge.
The numbers in Fig. 5 should not be interpreted as the minimal number of signatures required for the success of the attack: they only give an upper bound on that number. Indeed, there are
several ways to decrease the number of signatures:
– One can run many more than ten random descents.
– One can take advantage of the structure of the GGH challenges: when starting a descent, rather than picking a random point on the unit sphere, we may exploit the fact that we know the rough directions of the secret vectors.
– One can use better CVP approximation algorithms, or use better reduction
algorithms in conjunction with Babai’s nearest plane algorithm.
6 Theoretical Analysis
Our goal in this section is to give a rigorous theoretical justification to the success
of the attack. Namely, we will show that given a large enough polynomial number
of samples, Algorithm 1 succeeds in finding a good approximation to a row of
V with some constant probability. For the sake of clarity and simplicity, we will not
make any attempt to optimize this polynomial bound on the number of samples.
We will also assume we can perform operations on real numbers; modifying
the analysis to work with finite precision numbers should be straightforward.
Let us remark that it is possible that a rigorous analysis already exists in the
ICA literature, although we were unable to find any (an analysis under some
simplifying assumptions can be found in [18]). Also, Frieze et al. [5] sketch a
rigorous analysis of a similar algorithm.
In order to approximate the covariance matrix, the fourth moment, and its
gradient, our attack computes averages over samples. Because the samples are
independent and identically distributed, we can use known bounds on large de-
viations such as the Chernoff bound (see, e.g., [2]) to obtain that with extremely
high probability the approximations are very close to the true values. In our
analysis below we omit the explicit calculations, as these are relatively standard.
Theorem 3. For any c₀ > 0 there exists a c₁ > 0 such that given n^{c₁} samples uniformly distributed over some unit hypercube P(V), V = [v_1, . . . , v_n] ∈ O_n(R), Algorithm 2 with δ = 3/4 and r = O(log log n) descent steps outputs with constant probability a vector that is within ℓ₂ distance n^{−c₀} of ±v_i for some i.
Proof. We first analyze the behavior of Algorithm 2 under the assumption that all
gradients are computed exactly, without any error. We write any vector w ∈ R^n as w = Σ_{i=1}^{n} w_i v_i. Then, using (1), we see that for w on the unit sphere,

$$\nabla \mathrm{mom}_{V,4}(w) = \frac{4}{3}\, w - \frac{8}{15} \sum_{i=1}^{n} w_i^3 v_i.$$
The vector is then normalized in Step 4. So we see that each step of the gradient descent takes a vector (w_1, . . . , w_n) to the vector α · (w_1³, . . . , w_n³) for some normalization factor α (where both vectors are written in the v_i basis). Hence, after r iterations, a vector (w_1, . . . , w_n) is transformed to the vector

$$\alpha \cdot \left(w_1^{3^r}, \ldots, w_n^{3^r}\right)$$

for some normalization factor α.
Recall now that the original vector (w1 , . . . , wn ) is chosen uniformly from
the unit sphere. It can be shown that with some constant probability, one of its
coordinates is greater in absolute value than all other coordinates by a factor of
at least 1 + Ω(1/ log n) (first prove this for a vector distributed according to the
standard multivariate Gaussian distribution, and then note that by normalizing
we obtain a uniform vector from the unit sphere). For such a vector, after only
r = O(log log n) iterations, this gap is amplified to more than, say, n^{log n}, which means that we have one coordinate very close to ±1 and all others at most n^{−log n} in absolute value. This establishes that if all gradients are known exactly,
Algorithm 2 succeeds with some constant probability.
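The doubly-exponential amplification is easy to see numerically; the toy computation below (our own illustration, not part of the proof) starts from a gap of about 4%:

```python
# Toy illustration: repeated coordinate-wise cubing with renormalization
# amplifies an initial small gap doubly exponentially, so O(log log n)
# iterations single out one coordinate.
import numpy as np

w = np.array([0.52, 0.50, 0.49, 0.49])  # leading coordinate ahead by ~4%
w /= np.linalg.norm(w)
for r in range(1, 6):
    w = w ** 3
    w /= np.linalg.norm(w)
    print(r, np.round(w, 6))
# within five iterations, w is essentially the first standard basis vector
```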
To complete the analysis of Algorithm 2, we now argue that it succeeds
with good probability even in the presence of noise in the approximation of
the gradients. First, it can be shown that for any c > 0, given a large enough
polynomial number of samples, with very high probability all our gradient ap-
proximations are accurate to within an additive error of n−c in the `2 norm (we
have r such approximations during the course of the algorithm). This follows
by a standard application of the Chernoff bound followed by a union bound.
Now let w = (w_1, . . . , w_n) be a unit vector in which one coordinate, say the j-th, is greater in absolute value than all other coordinates by at least a factor of 1 + Ω(1/ log n). Since w is a unit vector, this in particular means that w_j > 1/√n. Let w̃_new = w − δ∇mom₄(w). Recall that for each i, w̃_new,i = (2/5)w_i³, which in particular implies that w̃_new,j > (2/5)n^{−1.5} > n^{−2}. By our assumption on the gradient approximation, we have that for each i, |w̃_new,i − w_new,i| ≤ n^{−c}. So for any k ≠ j,

$$\frac{|w_{\mathrm{new},j}|}{|w_{\mathrm{new},k}|} \;\geq\; \frac{|\tilde{w}_{\mathrm{new},j}| - n^{-c}}{|\tilde{w}_{\mathrm{new},k}| + n^{-c}} \;\geq\; \frac{|\tilde{w}_{\mathrm{new},j}|\left(1 - n^{-(c-2)}\right)}{|\tilde{w}_{\mathrm{new},k}| + n^{-c}}.$$
If |w̃_new,k| > n^{−(c−1)}, then the above is at least (1 − O(1/n))(w_j/w_k)³. Otherwise, the above is at least Ω(n^{c−3}). Hence, after O(log log n) steps, the gap w_j/w_k becomes Ω(n^{c−3}). Therefore, for any c₀ > 0 we can make the distance between the output vector and one of the ±v_i's less than n^{−c₀} by choosing a
large enough c. ⊓⊔
The following theorem completes the analysis of the attack. In particular, it im-
plies that if V is an integer matrix all of whose entries are bounded in absolute
value by some polynomial, then running Algorithm 1 with a large enough poly-
nomial number of samples from the uniform distribution on P(V ) gives (with
constant probability) an approximation to a row of ±V whose error is less than
1/2 in each coordinate, and therefore leads to an exact row of ±V simply by
rounding each coordinate to the nearest integer. Hence we have a rigorous proof
that our attack can efficiently recover the secret key in both NTRUSign and
the GGH challenges.
Theorem 4. For any c₀ > 0 there exists a c₁ > 0 such that given n^{c₁} samples uniformly distributed over some parallelepiped P(V), V = [v_1, . . . , v_n] ∈ GL_n(R), Algorithm 1 outputs with constant probability a vector ẽV where ẽ is within ℓ₂ distance n^{−c₀} of some standard basis vector e_i.
It follows that the statistical distance³ between a set of n^{c−4} samples from P(C) and a set of n^{c−4} samples from P(C̃) is at most O(n^{−1}). By Theorem 3, we know that when given samples from P(C̃), Algorithm 2 outputs an approximation of a row of ±C̃ with some constant probability. Hence, when given samples from P(C), it must still output an equally good approximation of a row of ±C̃ with a probability that is smaller by at most O(n^{−1}), and in particular constant.
To complete the proof, let c̃ be the vector obtained in Step 4. The output of Algorithm 1 is then c̃L^{−1}. As we have seen before, all eigenvalues of U₁D^{−1}U₁^t are close to 1. It therefore follows that the above is a good approximation to a row of ±V, and it is not hard to verify that the quality of this approximation satisfies the requirements stated in the theorem. ⊓⊔
Proof. We first show that the parallelepiped P(C) almost contains and is almost contained in the cube P(C̃):

$$(1 - n^{-c+2})\, \mathcal{P}(\tilde{C}) \subseteq \mathcal{P}(C) \subseteq (1 + n^{-c+2})\, \mathcal{P}(\tilde{C}).$$

To show this, take any vector y ∈ [−1, 1]^n. The second containment is equivalent to showing that all the coordinates of yU₁DU₁^t are at most 1 + n^{−c+2} in absolute value, which follows from the triangle inequality. The first containment is proved similarly. On the other hand, the ratio of volumes between the two cubes is ((1 + n^{−c+2})/(1 − n^{−c+2}))^n = 1 + O(n^{−c+3}). From this it follows that the statistical distance between the uniform distribution on P(C) and that on P(C̃) is at most O(n^{−c+3}). ⊓⊔
Acknowledgements. We thank William Whyte for helpful discussions and the
anonymous referees for useful comments.
References
1. M. Ajtai. Generating hard instances of lattice problems. In Complexity of com-
putations and proofs, volume 13 of Quad. Mat., pages 1–32. Dept. Math., Seconda
Univ. Napoli, Caserta, 2004.
³ The statistical distance (or total variation distance) between two distributions is the maximum probability with which one can distinguish between an input sampled from the first distribution and an input sampled from the second distribution.