This blog post is dedicated to those to brave souls that dare to roll their own crypto
This story begins with an old project of ours, where we were tasked to verify (among other things) how a business application handles digital signatures of transactions, to comply with four-eyes principles and other security rules.
The application used RSA signatures, and after a bit of head scratching about why our breakpoints on the usual OpenSSL API’s don’t trigger, but those placed at the depths of the library do, we realized that developers implemented what people in security like to call “Textbook RSA” in its truest sense. This of course led to red markings in the report and massive delays in development, but also presented us with some unusual problems to solve.
One of these problems stemmed from the fact that although we could present multiple theoretical attacks on the scheme, the public keys used in this application weren’t published anywhere, and without that we had no starting point for a practical attack.
At this point it’s important to remember that although public key cryptosystems guarantee that the private key can’t be derived from the public key, signatures, ciphertexts, etc., there are usually no such guarantees for the public key! In fact, the good people at the Cryptography Stack Exchange presented a really simple solution: just find the greatest common divisor (GCD) of the difference of all available message-signature pairs. Without going into the details of why this works (a more complete explanation is here), there are a few things that worth noting:
In our case, we could always recover public keys with just two signatures. At this time we had a quick and dirty implementation based on the gmpy2 library that allowed us to work with large integers and modern, efficient algorithms from Python.
It took a couple of weeks of management meetings and some sleep deprivation to strike me: the dirty little code we wrote for that custom RSA application can be useful against a more widespread technology: JSON Web Signatures, and JSON Web Tokens in particular.
Design problems of the above standards are well-known in security circles (unfortunately these concerns can’t seem to find their ways to users), and alg=”none” fiascos regularly deliver facepalms. Now we are targeting a trickier weakness of user-defined authentication schemes: confusing symmetric and asymmetric keys.
If you are a developer considering/using JWT (or anything JOSE), please take the time to at least read this post! Here are some alternatives too.
In theory, when a JWT is signed using an RSA private key, an attacker may change the signature algorithm to HMAC-SHA256. During verification the JWT implementation sees this algorithm, but uses the configured RSA public key for verification. The problem is the symmetric verification process assumes that the same public key was used to generate the MAC, so if the attacker has the RSA public key, she can forge the signature too.
In practice however, the public key is rarely available (at least in a black-box setting). But as we saw earlier, we may be able to solve this problem with some algebra. The question is: are there any practical factors that would prevent such an exploit?
To demonstrate the viability of this method we targeted a vulnerability of PyJWT version 1.5.0 that allowed key confusion attacks as described in the previous section. The library uses a blacklist to avoid key parameters that “look like” asymmetric keys in symmetric methods, but in the affected version it missed the “BEGIN RSA PUBLIC KEY” header, allowing PEM encoded public keys in the PKCS #1 format to be abused. (I haven’t checked how robust key filtering is, deprecating the verification API without algorithm specification is certainly the way to go)
Based on the documentation, RSA keys are provided to the encode/decode API’s (that also do signing and verification) as PEM encoded byte arrays. For our exploit to work, we need to create a perfect copy of this array, based on message and signature pairs. Let’s start with the factors that influence the signature value:
No problem here: with some modifications of our original code, we could successfully recreate the Base64URL encoded signature representations of JWT tokens. Let’s take a look at the container format (this guide is a great help):
As we can see, PKCS #1 and PEM allows little room for changes, so there is a high chance that if we generate a standards compliant PEM file it will match the one at the target. In case of other input formats, such as JWK, flexibility can result in a high number of possible encodings of the same key that can block exploitation.
After a lot of cursing because of the bugs and insufficient documentation of pyasn1 and asn1 packages, asn1tools finally proved to be usable to create custom DER (and thus PEM) structures. The generated output matched perfectly with the original public key, so I could successfully demonstrate token forgery without preliminary information about the asymmetric keys:
We tested with the 2048-bit keys from the JKS standard: it took less than a minute on a laptop to run the GCD algorithm on two signatures, and the algorithm produced two candidate keys for PKCS #1 which could be easily tested.
As usual, all code is available on GitHub. If you need help to integrate this technique to your Super Duper JWT Haxor Tool, use the Issue tracker!
The main lesson is: one should not rely on the secrecy of public keys, as these parameters are not protected by mathematical trapdoors.
This exercise also showed the engineering side of offensive security, where theory and practice can be far apart: although the main math trick here may seem unintuitive, it’s actually pretty easy to understand and implement. What makes exploitation hard, is to figure out all those implementation details that make pen and paper formulas work on actual computers. It won’t be a huge surprise to anyone who worked with digital certificates and keys that at least 2/3 of the work involved here was about reading standards, making ASN.1 work, etc. (Not to mention constantly converting byte arrays and strings in Python3 :P) Interestingly, it seems that the stiffness of these standards makes the format of the desired keys more predictable, and exploitation more reliable!
On the other hand, introducing unpredictable elements in the public key representations can definitely break the process. But no one would base security on their favorite indentation style, would they?