I don’t consider myself exceptional in any regard, but I stumbled upon a few cryptography vulnerabilities in Matrix’s Olm library with so little effort that it was nearly accidental.
It should not be this easy to find these kind of issues in any product people purportedly rely on for private messaging, which many people evangelize incorrectly as a Signal alternative.
Later, I thought I identified an additional vulnerability that would have been much worse, but I was wrong about that one. For the sake of transparency and humility, I’ll also describe that in detail.
This post is organized as follows:
I’ve opted to front-load the timeline and vulnerability details to respect the time of busy security professionals.
Please keep in mind that this website is a furry blog, first and foremost, that sometimes happens to cover security and cryptography topics.
Many people have, over the years, assumed the opposite and commented accordingly. The ensuing message board threads are usually is a waste of time and energy for everyone involved. So please adjust your expectations.
If you’re curious, you can learn more here.
security@
email address.In my email, I specify that I plan to disclose my findings publicly in 90 days (i.e. on August 14), in adherence with industry best practices for coordinated disclosure, unless they request an extension in writing.
security@
email address.They instructed the Matrix developers to consult with me if they needed cryptography guidance. I never heard from them on this externally reported issue.
I identified the following issues with Olm through a quick skim of their source code on Gitlab:
This is sorted by the order in which they were discovered, rather than severity.
Olm ships a pure-software implementation of AES, rather than leveraging hardware acceleration.
// Substitutes a word using the AES S-Box. WORD SubWord(WORD word) { unsigned int result; result = (int)aes_sbox[(word >> 4) & 0x0000000F][word & 0x0000000F]; result += (int)aes_sbox[(word >> 12) & 0x0000000F][(word >> 8) & 0x0000000F] << 8; result += (int)aes_sbox[(word >> 20) & 0x0000000F][(word >> 16) & 0x0000000F] << 16; result += (int)aes_sbox[(word >> 28) & 0x0000000F][(word >> 24) & 0x0000000F] << 24; return(result); }
The code in question is called from this code, which is in turn used to actually encrypt messages.
Software implementations of AES that use a look-up table for the SubWord step of the algorithm are famously susceptible to cache-timing attacks.
This kind of vulnerability in software AES was previously used to extract a secret key from OpenSSL and dm-crypt in about 65 milliseconds. Both papers were published in 2005.
A general rule in cryptography is, “attacks only get better; they never get worse“.
As of 2009, you could remotely detect a timing difference of about 15 nanosecond over the Internet with under 50,000 samples. Side-channel exploits are generally statistical in nature, so such a sample size is generally not a significant mitigation.
In the above code snippet, the vulnerability occurs inaes_sbox[/* ... */][/* ... */]
.
Due to the details of how the AES block cipher works, the input variable (word
) is a sensitive value.
Software written this way allows attackers to detect whether or not a specific value was present in one of the processor’s caches.
To state the obvious: Cache hits are faster than cache misses. This creates an observable timing difference.
Such a timing leak allows the attacker to learn the value that was actually stored in said cache. You can directly learn this from other processes on the same hardware, but it’s also observable over the Internet (with some jitter) through the normal operation of vulnerable software.
See also: cryptocoding’s description for table look-ups indexed by secret data.
The correct way to solve this problem is to use hardware accelerated AES, which uses distinct processor features to implement the AES round function and side-steps any cache-timing shenanigans with the S-box.
Not only is this more secure, but it’s faster and uses less energy too!
If you’re also targeting devices that don’t have hardware acceleration available, you should first use hardware acceleration where possible, but then fallback to a bitsliced implementation such as the one in Thomas Pornin’s BearSSL.
See also: the BearSSL documentation for constant-time AES.
Ed25519 libraries come in various levels of quality regarding signature validation criteria; much to the chagrin of cryptography engineers everywhere. One of those validation criteria involves signature malleability.
Signature malleability usually isn’t a big deal for most protocols, until suddenly you discover a use case where it is. If it matters, that usually that means you’re doing something with cryptocurrency.
Briefly, if your signatures are malleable, that means you can take an existing valid signature for a given message and public key, and generate a second valid signature for the same message. The utility of this flexibility is limited, and the impact depends a lot on how you’re using signatures and what properties you hope to get out of them.
For ECDSA, this means that for a given signature , a second signature is also possible (where is the order of the elliptic curve group you’re working with).
Matrix uses Ed25519, whose malleability is demonstrated between and .
This is trivially possible because S is implicitly reduced modulo the order of the curve, , which is a 253-bit number (0x1000000000000000000000000000000014def9dea2f79cd65812631a5cf5d3ed
) and S is encoded as a 256-bit number.
The Ed25519 library used within Olm does not ensure that , thus signatures are malleable. You can verify this yourself by looking at the Ed25519 verification code.
int ed25519_verify(const unsigned char *signature, const unsigned char *message, size_t message_len, const unsigned char *public_key) { unsigned char h[64]; unsigned char checker[32]; sha512_context hash; ge_p3 A; ge_p2 R; if (signature[63] & 224) { return 0; } if (ge_frombytes_negate_vartime(&A, public_key) != 0) { return 0; } sha512_init(&hash); sha512_update(&hash, signature, 32); sha512_update(&hash, public_key, 32); sha512_update(&hash, message, message_len); sha512_final(&hash, h); sc_reduce(h); ge_double_scalarmult_vartime(&R, h, &A, signature + 32); ge_tobytes(checker, &R); if (!consttime_equal(checker, signature)) { return 0; } return 1; }
This is almost certainly a no-impact finding (or low-impact at worst), but still an annoying one to see in 2024.
If you’d like to learn more, this page is a fun demo of Ed25519 malleability.
To mitigate this, I recommend implementing these checks from libsodium.
If you weren’t already tired of cache-timing attacks based on table look-ups from AES, the Matrix base64 code is also susceptible to the same implementation flaw.
while (pos != end) { unsigned value = DECODE_BASE64[pos[0] & 0x7F]; value <<= 6; value |= DECODE_BASE64[pos[1] & 0x7F]; value <<= 6; value |= DECODE_BASE64[pos[2] & 0x7F]; value <<= 6; value |= DECODE_BASE64[pos[3] & 0x7F]; pos += 4; output[2] = value; value >>= 8; output[1] = value; value >>= 8; output[0] = value; output += 3; }
The base64 decoding function in question is used to load the group session key, which means the attack published in this paper almost certainly applies.
Steve Thomas (one of the judges of the Password Hashing Competition, among other noteworthy contributions) wrote some open source code a while back that implements base64 encoding routines in constant-time.
The real interesting part is how it avoids a table look-up by using arithmetic (from this file):
// Base64 character set: // [A-Z] [a-z] [0-9] + / // 0x41-0x5a, 0x61-0x7a, 0x30-0x39, 0x2b, 0x2f inline int base64Decode6Bits(char src) { int ch = (unsigned char) src; int ret = -1; // if (ch > 0x40 && ch < 0x5b) ret += ch - 0x41 + 1; // -64 ret += (((0x40 - ch) & (ch - 0x5b)) >> 8) & (ch - 64); // if (ch > 0x60 && ch < 0x7b) ret += ch - 0x61 + 26 + 1; // -70 ret += (((0x60 - ch) & (ch - 0x7b)) >> 8) & (ch - 70); // if (ch > 0x2f && ch < 0x3a) ret += ch - 0x30 + 52 + 1; // 5 ret += (((0x2f - ch) & (ch - 0x3a)) >> 8) & (ch + 5); // if (ch == 0x2b) ret += 62 + 1; ret += (((0x2a - ch) & (ch - 0x2c)) >> 8) & 63; // if (ch == 0x2f) ret += 63 + 1; ret += (((0x2e - ch) & (ch - 0x30)) >> 8) & 64; return ret; }
Any C library that handles base64 codecs for private key material should use a similar implementation. It’s fine to have a faster base64 implementation for non-secret data.
Worth noting: Libsodium also provides a reasonable Base64 codec.
These issues are not fixed in libolm.
Instead of fixing libolm, the Matrix team recommends all Matrix clients adopt vodozemac.
I can’t speak to the security of vodozemac.
But I can speak against the security of libolm, so moving to vodozemac is probably a good idea. It was audited by Least Authority at one point, so it’s probably fine.
Most Matrix clients that still depended on libolm should treat this blog as public 0day, unless the Matrix security team already notified you about these issues.
If you’re curious about the backstory and context of these findings, read on.
Otherwise, feel free to skip this section. It’s not pertinent to most audiences. The people that need to read it already know who they are.
End-to-end encryption is one of the topics within cryptography that I find myself often writing about.
In 2020, I wrote a blog post covering end-to-end encryption for application developers. This was published several months after another blog I wrote covering gripes with AES-GCM, which included a shallow analysis of how Signal uses the algorithm for local storage.
In 2021, I published weaknesses in another so-called private messaging app called Threema.
In 2022, after Elon Musk took over Twitter, I joined the Fediverse and sought to build end-to-end encryption support for direct messages into ActivityPub, starting with a specification. Work on this effort was stalled while trying to solve Public Key distribution in a federated environment (which I hope to pick up soon, but I digress).
Earlier this year, the Telegram CEO started fearmongering about Signal with assistance from Elon Musk, so I wrote a blog post urging the furry fandom to move away from Telegram and start using Signal more. As I had demonstrated years prior, I was familiar with Signal’s code and felt it was a good recommendation for security purposes (even if its user experience needs significant work).
I thought that would be a nice, self-contained blog post. Some might listen, most would ignore it, but I could move on with my life.
An overwhelming number of people took it upon themselves to recommend or inquire about Matrix, which prompted me to hastily scribble down my opinion on Matrix so that I might copy/paste a link around and save myself a lot of headache.
Just when I thought the firehose was manageable and I could move onto other topics, one of the Matrix developers responded to my opinion post.
Thus, I decided to briefly look at their source code and see if any major or obvious cryptography issues would fall out of a shallow visual scan.
Since you’re reading this post, you already know how that ended.
Since the first draft of this blog post was penned, I also outlined what I mean when I say an encrypted messaging app is a Signal competitor or not, and published my opinion on XMPP+OMEMO (which people also recommend for private messaging).
Because it’s important to know that I have not audited the Olm or Megolm codebases, nor even glanced at their new Rust codebase.
The fact is, I never intended to study Matrix. I was annoyed into looking at it in the first place.
My opinion of their project was already calcified by the previously discovered practically-exploitable cryptographic vulnerabilities in Matrix in 2022.
The bugs described above are the sort of thing I mentally scan for when I first look at a project just to get a feel for the maturity of the codebase. I do this with the expectation (hope, really) of not finding anything at all.
(If you want two specific projects that I’ve subjected to a similar treatment, and failed to discover anything interesting in: Signal and WireGuard. These two set the bar for cryptographic designs.)
It’s absolutely bonkers that an AES cache timing vulnerability was present in their code in 2024.
It’s even worse when you remember that I was inundated with Matrix evangelism in response to recommending furries use Signal.
I’m a little outraged because of how irresponsible this is, in context.
It’s so bad that I didn’t even need to clone their git repository, let alone run basic static analysis tools locally.
So if you take nothing else away from this blog post, let it be this:
There is roughly a 0% chance that I got extremely lucky in my mental grep
and found the only cryptography implementation flaws in their source code. I barely tried at all and found these issues.
I would bet money on there being more bugs or design flaws that I didn’t find, because this discovery was the result of an extremely half-assed effort to blow off steam.
The Matrix developers like to insist that their new Rust hotness “vodozemac” is what people should be using today.
I haven’t looked at vodozemac at all, but let’s pretend, for the sake of argument, that its cryptography is actually secure.
(This is very likely if they turn out to be using RustCrypto for their primitives, but I don’t have the time or energy for that nerd snipe, so I’m not going to look. Least Authority did audit their Rust library, for what it’s worth, and Least Authority isn’t clownshoes.)
It’s been more than 2 years since they released vodozemac. What does the ecosystem penetration for this new library look like, in practice?
A quick survey of the various Matrix clients on GitHub says that libolm is still the most widely used cryptography implementation in the Matrix ecosystem (as of this writing):
Matrix Client | Cryptography Backend |
---|---|
https://github.com/tulir/gomuks | libolm (1, 2) |
https://github.com/niochat/nio | libolm (1, 2) |
https://github.com/ulyssa/iamb | vodozemac (1, 2) |
https://github.com/mirukana/mirage | libolm (1) |
https://github.com/Pony-House/Client | libolm (1) |
https://github.com/MTRNord/cetirizine | libolm (1) |
https://github.com/nadams/go-matrixcli | none |
https://github.com/mustang-im/mustang | libolm (1) |
https://github.com/marekvospel/libretrix | libolm (1) |
https://github.com/yusdacra/icy_matrix | none |
https://github.com/ierho/element | libolm (through the python SDK) |
https://github.com/mtorials/cordless | none |
https://github.com/hwipl/nuqql-matrixd | libolm (through the python SDK) |
https://github.com/maxkratz/element-web | vodozemac (1, 2, 3, 4) |
https://github.com/asozialesnetzwerk/riot | libolm (wasm file) |
https://github.com/NotAlexNoyle/Versi | libolm (1, 2) |
2 of the 16 clients surveyed use the new vodozemac library. 11 still use libolm, and 3 don’t appear to implement end-to-end encryption at all.
If we only focus on clients that support E2EE, vodozemac has successfully been adopted by 15% of the open source Matrix clients on GitHub.
I deliberately excluded any repositories that were archived or clearly marked as “old” or “legacy” software, because including those would artificially inflate the representation of libolm. It would make for a more compelling narrative to do so, but I’m not trying to be persuasive here.
Deprecation policies are a beautiful lie. The impact of a vulnerability in Olm or Megolm is still far-reaching, and should be taken seriously by the Matrix community.
Worth calling out: this quick survey, which is based on a GitHub Topic, certainly misses other implementations. Both FluffyChat and Cinny, which were not tagged with this GitHub Topic, depend a language-specific Olm binding.
These bindings in turn wrap libolm rather than the Rust replacement, vodozemac.
I thought the whole point of choosing Matrix over something like Signal is to be federated, and run your own third-party clients?
If we’re going to insist that everyone should be using Element if they want to be secure, that defeats the entire marketing point about third-party clients that Matrix evangelists cite when they decry Signal’s centralization.
So I really don’t want to hear it.
As I mentioned in the timeline at the top, I thought I found a fourth issue with Matrix’s codebase. Had I been correct, this would have been a critical severity finding that the entire Matrix ecosystem would need to melt down to remediate.
Fortunately for everyone, I made a mistake, and there is no fourth vulnerability after all.
However, I thought it would be interesting to write about what I thought I found, the impact it would have had if it were real, and why I believed it to be an issue.
Let’s start with the code in question:
void ed25519_sign(unsigned char *signature, const unsigned char *message, size_t message_len, const unsigned char *public_key, const unsigned char *private_key) { sha512_context hash; unsigned char hram[64]; unsigned char r[64]; ge_p3 R; sha512_init(&hash); sha512_update(&hash, private_key + 32, 32); sha512_update(&hash, message, message_len); sha512_final(&hash, r); sc_reduce(r); ge_scalarmult_base(&R, r); ge_p3_tobytes(signature, &R); sha512_init(&hash); sha512_update(&hash, signature, 32); sha512_update(&hash, public_key, 32); sha512_update(&hash, message, message_len); sha512_final(&hash, hram); sc_reduce(hram); sc_muladd(signature + 32, hram, private_key, r); }
The highlighted segment is doing pointer arithmetic. This means it’s reading 32 bytes, starting from the 32nd byte in private_key
.
What’s actually happening here is: private_key
is the SHA512 hash of a 256-bit seed. If you look at the function prototype, you’ll notice that public_key
is a separate input.
Virtually every other Ed25519 implementation I’ve ever looked at before expected users to provide a 32 byte seed followed by the public key as a single input.
This led me to believe that this private_key + 32
pointer arithmetic was actually using the public key for calculating r
.
The variable r
(not to be confused with big R) generated via the first SHA512 is the nonce for a given signature, it must remain secret for Ed25519 to remain secure.
If r
is known to an attacker, you can do some arithmetic to recover the secret key from a single signature.
Because I had mistakenly believed that r
was calculated from the SHA512 of only public inputs (the public key and message), which I must emphasize isn’t correct, I had falsely concluded that any previously intercepted signature could be used to steal user’s private keys.
But because private_key
was actually the full SHA512 hash of the seed, rather than the seed concatenated with the public key, this pointer arithmetic did NOT use the public key for the calculation of r
, so this vulnerability does not exist.
If the code did what I thought it did, however, this would have been a complete fucking disaster for the Matrix ecosystem. Any previously intercepted message would have allowed an attacker to recover a user’s secret key and impersonate them. It wouldn’t be enough to fix the code; every key in the ecosystem would need to be revoked and rotated.
Whew!
I’m happy to be wrong about this one, because that outcome is a headache nobody wants.
Well, maybe.
Matrix’s library was not vulnerable, but I honestly wouldn’t put it past software developers at large to somehow, somewhere, use the public key (rather than a secret value) to calculate the EdDSA signature nonces as described in the previous section.
To that end, I would like to propose a test vector be added to the Wycheproof test suite to catch any EdDSA implementation that misuses the public key in this way.
Then, if someone else screws up their Ed25519 implementation in the exact way I thought Matrix was, the Wycheproof tests will catch it.
For example, here’s a vulnerable test input for Ed25519:
{ "should-fail": true, "secret-key": "d1d0ef849f9ec88b4713878442aeebca5c7a43e18883265f7f864a8eaaa56c1ef3dbb3b71132206b81f0f3782c8df417524463d2daa8a7c458775c9af725b3fd", "public-key": "f3dbb3b71132206b81f0f3782c8df417524463d2daa8a7c458775c9af725b3fd", "message": "Test message", "signature": "ffc39da0ce356efb49eb0c08ed0d48a1cadddf17e34f921a8d2732a33b980f4ae32d6f5937a5ed25e03a998e4c4f5910c931b31416e143965e6ce85b0ea93c09" }
A similar test vector would also be worth creating for Ed448, but the only real users of Ed448 were the authors of the xz backdoor, so I didn’t bother with that.
(None of the Project Wycheproof maintainers knew this suggestion is coming, by the way, because I was respecting the terms of the coordinated disclosure.)
Despite finding cryptography implementation flaws in Matric’s Olm library, my personal opinion on Matrix remains largely unchanged from 2022. I had already assumed it would not meet my bar for security.
Cryptography engineering is difficult because the vulnerabilities you’re usually dealing with are extremely subtle. (Here’s an unrelated example if you’re not convinced of this general observation.) As SwiftOnSecurity once wrote:
The people that developed Olm and Megolm has not proven themselves ready to build a Signal competitor. In balance, most teams are not qualified to do so.
I really wish the Matrix evangelists would accept this and stop trying to cram Matrix down other people’s throats when they’re talking about problems with other platforms entirely.
More important for the communities of messaging apps:
You don’t need to be a Signal competitor. Having E2EE is a good thing on its own merits, and really should be table stakes for any social application in 2024.
It’s only when people try to advertise their apps as a Signal alternative (or try to recommend it instead of Signal), and offer less security, that I take offense.
Just be your own thing.
My work-in-progress proposal to bring end-to-end encryption to the Fediverse doesn’t aim to compete with Signal. It’s just meant to improve privacy, which is a good thing to do on its own merits.
If I never hear Matrix evangelism again after today, it would be far too soon.
If anyone feels like I’m picking on Matrix, don’t worry: I have far worse things to say about Telegram, Threema, XMPP+OMEMO, Tox, and a myriad other projects that are hungry for Signal’s market share but don’t measure up from a cryptographic security perspective.
If Signal fucked up as bad as these projects, my criticism of Signal would be equally harsh. (And remember, I have looked at Signal before.)