Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indistinguishability flaw #55

Closed
LoupVaillant opened this issue Dec 8, 2021 · 4 comments · Fixed by #61
Closed

Indistinguishability flaw #55

LoupVaillant opened this issue Dec 8, 2021 · 4 comments · Fixed by #61
Labels
bug Something isn't working needs investigation

Comments

@LoupVaillant
Copy link
Collaborator

Hi,

I’ve been pointed to your work, and I see you’re using my work for exactly what I wanted it to be used, that’s nice!

I think I found a bug, that makes the encrypted archives easily distinguishable from random. It’s a subtle, easy to miss issue about Elligator2. Here’s the culprit: you’re taking a freshly generated public key from vanilla X25519. The error is using vanilla X25519.

See, Curve25519 is not quite a prime order group. It has a non-trivial cofactor of 8. DJB got around the issue by inserting special precautions in the design of X25519. One of them was making sure all the generated points are in the prime order subgroup, which comprises only one eight of the curve. (He does this by clamping the scalar down to a multiple of 8).

Elligator2 however operates over the whole curve, not just the prime order subgroup. So when an attacker sees the file and tries to unhash the key, they’ll notice that all the resulting points always happen to be on the prime order subgroup. (To verify which subgroup a key belongs to, you can just multiply it by its order L.) With truly random files this is supposed to happen only one time in 8.

In other words, for each file you encrypt, you expose 3 bits of evidence to the attacker that this is an encrypted file, not a truly random one. (Now one might suspect things about random files, but those 3 bits can also leak part of the file format, and my guess is that you want to hide as much meta data as possible.)


To correct the problem, you need to generate points on the whole curve, not just the prime order subgroup. You need a "dirty" X25519 key somehow, yet still retain full compatibility with vanilla X25519. You need Monocypher’s crypto_x25519_dirty_fast().

Now I see you’re mainly using Libsodium, and only pulled my Python Elligator2 code (which by the way is not constant time, though it probably does not matter for offline uses). The problem is that Libsodium has no way that I know of to generate the dirty keys you need. (Last time I checked Frank Denis was explicitly not interested in supporting indistinguishability from random.)

My recommended course of action is to use Monocypher’s high-level crypto_hidden_key_pair() function. The third party Pymonocypher bindings may help you there. It will fix both timings & indistinguishability issues.

@covert-encryption
Copy link
Owner

covert-encryption commented Dec 8, 2021

Thanks for reporting this... Certainly a very evasive issue.

Let me clarify, since we already add three random bits on the Elligator encoding (sign and two high bits)... That the keys created also lack another three bits of entropy?

To correct the problem, you need to generate points on the whole curve, not just the prime order subgroup. You need a "dirty" X25519 key somehow, yet still retain full compatibility with vanilla X25519.

There is crypto_scalarmult_ed25519_base_noclamp, shouldn't that when supplied with random bytes do the trick, or is also retaining compatibility with vanilla X25519 then a problem? Working around sodium's restrictions is somewhat annoying but given its popularity we try to keep as much as possible fully compatible with it, and obviously also with standard algorithms like the key exchange.

I will look at Monocypher, looks like a very useful library in other ways too. Need to also talk with our crypto guy because I only have a cursory understanding of elliptic curves.

@covert-encryption
Copy link
Owner

I gather there are also other encodings than Elligator2 to address this problem. Can you give any recommendation on them?

@covert-encryption covert-encryption added bug Something isn't working needs investigation labels Dec 8, 2021
@LoupVaillant
Copy link
Collaborator Author

Let me clarify, since we already add three random bits on the Elligator encoding (sign and two high bits)... That the keys created also lack another three bits of entropy?

Yes, this is exactly what I’m saying. Those bits of entropy are not as obvious as the others, but they’re just as easy to find: just unhash the key, multiply it by (with an unclamped scalar multiplication), if the result is zero the point was on the prime order subgroup.

There is crypto_scalarmult_ed25519_base_noclamp, shouldn't that when supplied with random bytes do the trick, or is also retaining compatibility with vanilla X25519 then a problem?

Not only compatibility with vanilla X25519 will be a problem, you will still fail to generate the whole curve!! (Very counter intuitive, it’s a nightmare.) The whole theory is explained in my cofactor tutorial, but it’s a fairly long read. I’ll try to summarise the highlights here.

Public key cryptography loves prime order groups, because they have almost no structure. There’s a zero element, and all the others are basically the same: for any element X in such a group, adding X to itself enough time will eventually generate the whole group (including zero). That lack of structure makes cracking more difficult, facilitates protocol design, streamlines security proofs, the works.

That’s why early curves were short Weierstraß curves: they have a prime order. Problem is, they’re not the fastest thing around, and implementing them without letting a vulnerability slip is like sitting on the Iron Throne without cutting yourself. That’s why DJB popularised Curve25519.

Curve25519 however does not have a prime order. It has a prime order times 8. I personally visualise it as a very tall rectangle, that is 8 items wide, and L items tall (L = 2²⁵² + something, it’s an extremely tall rectangle). The structure of the group is that of the rectangle: on the top left you have zero. The top row is your low order points: when added to themselves, they’ll only generate low order points (no leaving the top row). The leftmost column (except zero) is your prime order points: when added to themselves enough times they’ll generate the entire leftmost column, and only the leftmost column. Just like the top row, there’s no leaving it.

So, even though Curve25519 is not quite a prime order group, it does have a prime order sub group: the leftmost column. The rectangle as a whole does have structure, but the leftmost column really does not. And that’s exactly what we want. To achieve that, DJB made sure every point we’ll work with in X25519 will be in that leftmost column. He took two specific precautions:

  • Clamping the scalar before the scalarmult. Specifically, clearing the 3 LSB, so that the scalar becomes a multiple of 8. (He also does stuff to the MSB, but that’s a timing attack thing, nothing to do with the cofactor).
  • Selecting the base point in the prime order subgroup. That is, the leftmost column. So it does not matter if the scalar you multiply it with is clamped or not: there is no leaving the leftmost column, and all you’ll get is another point in the prime order subgroup.

Now I need to talk a bit more about the structure of the rectangle: it actually work like Cartesian coordinates.

Take a random point in the rectangle. Let’s call it P.
Now project P to the left. Let’s call that second point Q. Note that Q is in the prime order subgroup.
Now project P to the top. Let’s call that third point R. Note that R is in the low order subgroup.
P = Q + R. Simple as that.

In more pompous term, every single point on the curve is the sum of a prime order point, and a low order point. And we’re going to exploit that:

P = Q + R
s.P = s.(Q + R)
s.P = s.Q + s.R

Remember that thing about not leaving the leftmost column or the topmost row? Well that applies to Q and R too: s.Q is still a prime order point, and s.R is still a low order point. So, while X25519 only wants you to work with Q and s.Q, you can add a random R, use the resulting P, and… you’ll still get s.Q out. Hmm…

DJB took two precautions. One of them is clamping, which makes sure s is a multiple of 8. And when you multiply a low order point by a multiple of 8, the result is a big fat zero. X25519 effectively ignores the low order component of the points it process. It might as well not be there. That’s what we’re going to exploit when generating our dirty points.

To be compatible with X25519, yet still cover the whole rectangle, we just need to do two things:

  • Generate an X25519 point the normal way.
  • Add a random low order point.

And that’s it. The low order component will not influence the result of the key exchange, the end result will be the same as if you just used a normal X25519 point instead. But adding that random low order point allows you to cover the whole curve, and that’s exactly what Monocypher does.

There are other subtleties, but I have confirmed with the help of Elligator2’s inventor Mike Hamburg himself that it does not have any further security implications.


Working around sodium's restrictions is somewhat annoying but given its popularity we try to keep as much as possible fully compatible with it, and obviously also with standard algorithms like the key exchange.

I designed my dirty key generation process with exactly this in mind. Not only being compatible with regular X25519 makes for very practical APIs (Monocypher neither needs nor have a dirty key exchange function), producing the same results as regular X25519 makes it very easy to prove that dirty points are just as secure as regular points.


I will look at Monocypher, looks like a very useful library in other ways too.

Thanks. If you care about speed though, Libsodium’s symmetric encryption is much faster (I use portable C, so I can’t take advantage of vector instructions). Public key cryptography however is a one time cost, the speed differences will not matter there.


I gather there are also other encodings than Elligator2 to address this problem. Can you give any recommendation on them?

There are other indeed, but many involve using two curve points instead of just one, and I haven’t studied them. My recommendation is to stick with Elligator2: it’s the simplest, fastest, most secure way to hide Curve25519 points I know of. And I say that in full awareness of the huge wall of text I just wrote.

samuel-lucas6 added a commit to samuel-lucas6/Cryptography-Guidelines that referenced this issue Dec 9, 2021
@covert-encryption
Copy link
Owner

@LoupVaillant A brief update on this. We have indeed confirmed the 3 bit information leakage that you reported and built a test for it, and have an initial implementation to work around this problem, written in Python at first to provide an alternative implementation for comparison, although we do consider including Monocypher, and have already done so in Covert Web.

It needs to be noted that not even the lowest level of the Sodium functions are useful for this because they error out when low order points are seen (but at least accept the dirty points). Fortunately we already had some fe/ge and other helper functions written in Python, some of your making of course.

Thanks for your thorough explanation on the sub groups, now I also understand them much better, although I don't plan to dive into the EC mathematics myself. The rectangle analogy is most useful. I gather this is of your invention along with the projections up and left?

I had a look and could not find any up to date Python bindings to your library. Both that are available are from two years back and don't seem to be maintained anymore. I could have a look at bringing that part up to date. The Sodium bindings that are available are also quite outdated, and don't allow e.g. zeroing of any buffers after use.

covert-encryption added a commit that referenced this issue Dec 22, 2021
Three bits of information per file were still leaked in ephemeral keys, weakening the deniability property. This is fixed by implementing *dirty keys* that use all eight of Curve25519 subgroups rather than only the prime group.

Loup Vaillant has been most helpful in first reporting the issue and then explaining the mathematics in depth and also reviewing this patch, and definitely receives a huge thank you from us for all his dedicated work both on Covert and on his own library Monocypher which we plan to use as it provides the much needed functionality in a safe and fast C implementation.

Also implements and uses Signal's XEdDSA (XEd25519) for signatures as is written in Covert specification. Previous versions used EdDSA instead, making it impossible to sign with Age and Wireguard keys. Now all key types are good for both signatures and encryption.

* Elliptic curve implementation extended and refactored.
* Implement Dirty points on Elligator 2, with the following notes on sign handling:
   - Using the Ed25519 `x & 1` sign as the v sign on Elligator 2, meaning that both Ed and Curve25519 points can be properly recovered.
   - The drawback of not storing a random sign instead is only including 255 bits of entropy in the 256 bit hidden field.
   - Elligator-compatible prime group points (standard edpk including sign) only contain 250 bits of entropy, and we add three bits random subgroup and another two bits as padding after Elligator2.
   - The result is 256 bits that look entirely random but only have 2**255 possible values.
   - Unhiding a point on a random 256 bit value leads to a valid prime group point, and it is impossible to tell whether that point belongs to the half that could be generated by standard means or not.
   - Distanced SHA-512 is used for creating the padding bits, making the same edsk deterministically always produce the same hidden hash.
* XEdDSA signatures
   - Encodes the Ed25519 public key sign in the high bit of the scalar stored in signature, following Signal's implementation in current code, rather than their specification which fiddles with the secret key in key generation to make sure to have non-negative public keys.
   - Allows any standard Ed25519 or Curve25519 keys to be used with minimal difficulty.
* Low order point Curve25519 encoding according to Sodium, using the y value directly for 0, -1 and 1, which cannot be directly and uniquely represented as u coordinates.
* A custom Curve25519 32-byte point encoding where the high bit contains the Ed sign.
* Added testing for all functions of the new code and for hidden points' random subgroups (previously non-random).

For more information, see the discussion in the PR and in related issues:
* #61
* #55
* #2

Co-authored-by: foonoxous <RWQe4qDs7LIsmI7Q94ctNMSfwfDW8eld24YGevEzQEKZyfJz1NPlH6uf>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs investigation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants