Skip to content

Experiments using PSLQ to break Learning With Errors (LWE)

License

Notifications You must be signed in to change notification settings

predrag3141/pslq_vs_lwe

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

61 Commits
 
 
 
 
 
 

Repository files navigation

PSLQ vs. Learning With Errors

This repository contains an experiment using PSLQ to break a particular Learning With Errors (LWE) encryption scheme in a toy-sized example.

The LWE problem challenges an attacker to find a short solution to a linear equation with many constraints over a finite field. PSLQ is suited to solve the same problem, but with just one constraint. So the code in this repository transforms the original multi-constraint equation into a single-constraint equation and uses PSLQ to solve it in some toy examples.

How it Works

PSLQ is an algorithm that can find a small but not-all-zero integer-only solution z1,z2,...,zn of the equation

a1z1+a2z2+...+anzn=0 (equation 1)

where the ai are real numbers.

The LWE problem, whose hardness is the security guarantee for a variety of cryptographic primitives, is similar to equation (1) in one way: It, too, is a linear equation with a small, hard-to-find, not-all-zero, discrete solution.

An LWE Public Key Scheme

LWE is the security guarantee for at least two public key encryption schemes. One of these schemes is covered in this video. The description of this scheme starts at 28:30 into the video.

Public and Private Key

In this scheme, all computations are in a finite field, GF(q), where q is a moderately large prime number. To give an idea of the size of q, the toy-sized code in this repository iterates through values of q starting with 11.

A is a public n x m matrix with m >> n log(n). The toy-sized n and m in the code here are 3 and 9, respectively.

One of the parties, whom we will call Alice, chooses short m-long vector x, a private key. Alice's public key is

u = Ax mod q (equation 2)

Here,

  • x is short enough to keep estimate 5 (below) within q/4, which turns out to be close enough.
  • u is an n-long column vector.

Encrypting

Another party, Bob, wishes to encrypt a one-bit message, denoted bit below, to Alice. First, Bob sends the ciphertext preamble,

bt = sA + et (equation 3)

Here,

  • The "t"s mean transpose
  • e is a temporary short vector private to Bob.
  • e, like x, is short enough to keep estimate 5 (below) within q/4. That doesn't mean that e and x have similar sizes, but they work together to keep estimate 5 that close.
  • Bob uses his own n-long secret vector, s

Next, using Alice's public key, u, Bob sends the payload -- a rational number:

b' = st u + e' + (q /2) bit

Here,

  • e', like e and x, is short enough to keep the estimate 5 (below) within q/4.

Decrypting

In order to calculate bit from b', Alice combines Bob's preamble and her private key, x, to compute

bt x

    = st Ax + et x

    = st u + et x

    ~ st u (estimate 4)

Alice then computes

b' - bt x

    = st u + e' + bit * q/2 - bt x

    ~ st u + e' + bit * q/2 - st u (using estimate 4)

    = e' + bit * q/2

    ~ bit * q/2 (estimate 5)

Since x, e and e' were chosen to keep estimate 5 within q/4, Alice distinguishes bit = 0 from bit = 1 by concluding that

  • bit = 0 if -q/4 < b' - bt x < q/4
  • bit = 1 otherwise

Breaking this Scheme with PSLQ

"Breaking LWE" sounds dramatic, but this is a toy-sized example. Lots already goes wrong as we'll see; yet there was one success out of many failures, and the success ratio should improve with tweaks to PSLQ. The interesting part would be trying to scale the size of the example, and of course more can go wrong when that happens. But PSLQ is a polynomial time algorithm in precision and dimension (m in this case). So there is reason to investigate, beginning with this baby step.

Any Short Solution Works

What keeps estimate 5 close enough to distinguish bit = 0 from bit = 1 is not the specific value of Alice's private key, x. It is how short x is. To break this public key encryption scheme, it is enough to find a vector, y, such that

  • |y| <= |x| (in Euclidean norm)
  • Ay = u mod q

The fact that x is short and that Ax = u mod q are what make estimate 4 work. Any y with those properties would work too. PSLQ can be adapted to find y as follows.

Constructing Input to PSLQ

Notation:

  • A has row vectors a1,a2,...,an
  • u has corresponding entries, u1,u2,...,un
  • Select a suitably-sized integer, denoted base in what follows
  • y will denote the short solution mentioned in the previous section.
  • y' = (y'1, y'2, ..., y'n) will denote a vector of coefficients that make each constraint in Ay = u work mod q. In other words, <ai, y> = ui - y'i q for i=1,...,n.
  • v will denote the input to give PSLQ, to get PSLQ to return a short solution of Ay = u mod q

Ay = u mod q

    =><a1, y> + <base a2, y> + <base2 a3, y> + ... + <basen-1 an, y>

      = u1 + base u2 + base2 u3 + ... + basen-1 un

      -(qy'1 + base qy'2 + base2 qy'3 + ... + basen-1 qy'n)

    => <(q, base q, base2 q, ..., basen-1 q), y'>

      + <a1 + base a2 + base2 a3 + ... + basen-1 an, y>

      + (-u1) + base (-u2) + base2 (-u3) + ... + basen-1 (-un) = 0 (equation 6)

Based on equation 6, PSLQ is given input

v = (q, base q, base2 q, ..., basen-1 q) || a1 + base a2 + base2 a3 + ... + basen-1 an || ( -u')

Here,

  • "||" means concatenation
  • The scalar u' = u1 + base u2 + base2 u3 + ... + basen-1 un

So v is an n+m+1-long concatenation of the n-long vector of the baseiq; an m-long combination of the ai; and u'.

Calling PSLQ

PSLQ will find a short solution yext of <v, yext> = 0. (Subscript "ext" highlights the fact that yext is an extension of the desired solution, y, of Ay = u). If all goes well, y = (yext n+1, yext n+2, ..., yext n+m) will be a solution of Ay = u mod q.

What can go wrong is

  • yext being a short non-causal solution; i.e., y does not satisfy each equation <ai, y> = ui, i=1,...,n. To mitigate this risk, base must be chosen large enough that y is likely to be causal. In the code, for reasons outside the scope of this README, base = 20(m + n)/n.
  • yext not even being short! Though PSLQ is designed to produce short solutions, its intended use case is for non-integer inputs. When attacking LWE, inputs to PSLQ are integers, so there are many integer relations among them. PSLQ considers any of the plentiful integer solutions -- even a large one -- to be a win, and it terminates. There is a potential remedy to this problem; see below.
  • yext m+n+1 != 1. If this last coefficient of yext is 0, y is a solution of Ay = 0 mod q -- no use in this situation. If the last coefficient of yext is other than 0 or 1, yext needs to be divided by yext n+m+1 mod q. Only if yext n+m+1 is a unit (1 or -1) does y/yext n+m+1 mod q remain a short solution of Ay = u.

Notes on large PSLQ output

  • The cause of the early termination problem: PSLQ makes a decision to swap a pair of rows of an internal matrix, H, in each iteration. The choice of a pair to swap determines whether PSLQ terminates. But this choice is not optimized to prevent early termination with a large solution. Instead, the rows are swapped to minimize a function of the diagonal elements of H. Early termination is evidently not considered a problem, because input is typically non-integer. But it's a problem for attacking LWE.
  • A fix for early termination: If PSLQ chooses a row swap in H that causes termination, revert to the state before the swap and try a different one.
  • A potential improvement in the solution size: The choice of rows to swap optimizes the diagonal of H as a proxy for a more difficult, but more revealing quantity to compute -- a certain minimum radius, r. r is the radius of the smallest intersection with the n+m-dimensional solution hyperplane with the convex hull of the 2n+m+1 columns, with signs of +1 or -1, of another matrix, B, that PSLQ tracks. This intersection cannot contain a solution of <v, yext> = 0 -- the key to how PSLQ works. The larger this intersection is in every direction, the less likely it will be that PSLQ misses good solutions. So we want to make r as large as possible with each row swap. Choosing a swap that maximizes r, rather than the proxy for it in the diagonal of H, might improve the output of PSLQ. There would be a cost to pay in runtime.
  • Order matters. You may have noticed a strange reordering of terms in the previous section, "Constructing Input to PSLQ". The q-related coefficients -- basei q for i=1,2,...,n -- were moved from the end to the beginning of v. This is because, in the ordering with the basei q at the end, the particular implementation of PSLQ used caused the second bullet above, about yext not always being short, to come true. PSLQ returned a vector yext that is all-zero, except for two coefficients 1 and -base. These two coefficients corresponded to some basei+1 and basei in v. In this unuseable result, |yext| = sqrt(1 + base2) ~ base -- a disappointingly long output for PSLQ, and one with 0 in the last coordinate (the third problem listed above).

Running the Experiment

The code in this repository is a single file, PSLQ_vs_LWE.py. It defines small n and m, and generates v as described above. To run this, install Python and mpmath if you haven't already.

Below is the output of a successful run. It was one output amongst those of many unsuccessful runs. The unsuccesful runs went awry for the reasons in the bullet list under "What can go wrong" in the section, "Calling PSLQ".

Output of a successful run:

q = 11, n = 3, m = 9; Quit (y/n)?

n
A: [[3, 10, 4, 8, 10, 9, 8, 5, 10], [9, 10, 9, 8, 5, 1, 7, 1, 3], [1, 4, 4, 10, 4, 4, 7, 5, 4]]
Private key x: [1, -1, 0, 1, 0, -1, -2, 0, -1]
rawU: [-34, -11, -15]
u: [10, 0, 7]
rawUOverQ: [4, 1, 2]
base: 160000

Input v to PSLQ: [11, 1760000, 281600000000, 25601440003, 102401600010, 102401440004, 256001280008, 102400800010, 102400160009, 179201120008, 128000160005, 102400480010, -179200000010]

Expected output w of PSLQ: [4, 1, 2, 1, -1, 0, 1, 0, -1, -2, 0, -1, 1]

<v,x> - v[9] = <[11, 1760000, 281600000000, 25601440003, 102401600010, 102401440004, 256001280008, 102400800010, 102400160009, 179201120008, 128000160005, 102400480010, -179200000010], [1, -1, 0, 1, 0, -1, -2, 0, -1]> - 179201120008 = -204801760024

<v,w> = <[11, 1760000, 281600000000, 25601440003, 102401600010, 102401440004, 256001280008, 102400800010, 102400160009, 179201120008, 128000160005, 102400480010, -179200000010], [4, 1, 2, 1, -1, 0, 1, 0, -1, -2, 0, -1, 1]> = 0

PSLQ Found a causal solution [0, -1, 0, 0, 0, 0, 0, 0, -2, 1, 0, 2, 1]  with norm 2.23606797749979 : [0, -1, 0, 0, 0, 0, 0, 0, -2, 1, 0, 2, 1] != [1, -1, 0, 1, 0, -1, -2, 0, -1] = x

Here is the meaning of the solution, yext = [0, -1, 0, 0, 0, 0, 0, 0, -2, 1, 0, 2, 1], above. The first three coefficients (0, -1, 0) of yext make the rest of the solution work mod q. The next m + 1 = 10 coefficients annihilate (ai,1, ai,2, ..., ai,9, -ui) mod q for i=1,2,3.

Conclusion

The experiment in this repository shows that attacking LWE with PSLQ may hold some promise. The next step is to implement PSLQ with adaptations to perform better with integer inputs. This implementation should be capable of arbitrary precision, in order to attack LWE with more realistic m, n and q.

Lastly, it's worth mentioning that there is a way to defend against this attack, and perhaps other lattice reduction attacks. Notice that PSLQ will tend to find small coefficients of the q-related coefficients, basei q. A randomly chosen private key, x, makes these coefficients small (~sqrt(nq)). But a carefully chosen x could make the causal coefficients of basei large. To defend on lattice attacks like this one, it will help if

  • x be the only solution of Ax = u mod q small enough for Alice to calculate bit and
  • x be chosen so that some of the <ai, x> - ui are a large multiple of q that a lattice reduction algorithm will reject.

The first constraint above may be difficult to pull off, and the two constraints are not a complete answer to lattice attacks. The attacker can still launch a statistical attack on biased plaintext, even without calculating bit correctly every time, by finding a small enough yext -- albeit not as small as the private key, x -- such that Ay = u mod q. With all that said, the easier constraint -- choosing x so that some of the <ai, x> - ui are large multiples of q -- is probably a good practice.

About

Experiments using PSLQ to break Learning With Errors (LWE)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages