In 2022, I wrote about my plan to build end-to-end encryption for the Fediverse. The goals were simple:
The primary concern at the time was “honest but curious” Fediverse instance admins who might snoop on another user’s private conversations.
After I finally was happy with the client-side secret key management piece, I had moved on to figure out how to exchange public keys. And that’s where things got complicated, and work stalled for 2 years.
I wrote a series of blog posts on this complication, what I’m doing about it, and some other cool stuff in the draft specification.
Recently, NIST published the new Federal Information Protection Standards documents for three post-quantum cryptography algorithms:
The race is now on to implement and begin migrating the Internet to use post-quantum KEMs. (Post-quantum signatures are less urgent.) If you’re curious why, this CloudFlare blog post explains the situation quite well.
Since I’m proposing a new protocol and implementation at the dawn of the era of post-quantum cryptography, I’ve decided to migrate the asymmetric primitives used in my proposals towards post-quantum algorithms where it makes sense to do so.
The rest of this blog post is going to talk about technical specifics and the decisions I intend to make in both projects, as well as some other topics I’ve been thinking about related to this work.
I’ll discuss these choices in detail, but for the impatient:
Virtually all other uses of cryptography is symmetric-key or keyless (i.e., hash functions), so this isn’t a significant change to the design I have in mind.
While I am personally skeptical if we will see a practical cryptography-relevant quantum computer in the next 30 years, due to various engineering challenges and a glacial pace of progress on solving them, post-quantum cryptography is still a damn good idea even if a quantum computer doesn’t emerge.
Post-Quantum Cryptography comes in two flavors:
Originally, my proposals were going to use Elliptic Curve Diffie-Hellman (ECDH) in order to establish a symmetric key over an untrusted channel. Unfortunately, ECDH falls apart in the wake of a crypto-relevant quantum computer. ECDH is the component that will be replaced by post-quantum KEMs.
Additionally, my proposals make heavy use of Edwards Curve Digital Signatures (EdDSA) over the edwards25519 elliptic curve group (thus, Ed25519). This could be replaced with a post-quantum DSA (e.g., ML-DSA) and function just the same, albeit with bandwidth and/or performance trade-offs.
Lattice-based cryptography has been around almost as long as elliptic curve cryptography. One of the first designs, NTRU, was developed in 1996.
Meanwhile, ECDSA was published in 1992 by Dr. Scott Vanstone (although it was not made a standard until 1999). Lattice cryptography is pretty well-understood by experts.
However, before the post-quantum cryptography project, there hasn’t been a lot of incentive for attackers to study lattices (unless they wanted to muck with homomorphic encryption).
So, naturally, there is some risk of a cryptanalysis renaissance after the first post-quantum cryptography algorithms are widely deployed to the Internet.
However, this risk is mostly a concern for KEMs, due to the output of a KEM being the key used to encrypt sensitive data. Thus, when selecting KEMs for post-quantum security, I will choose a Hybrid construction.
We’re not talking folfs, sonny!
Hybrid isn’t just a thing that furries do with their fursonas. It’s also a term that comes up a lot in cryptography.
Unfortunately, it comes up a little too much.
When I say we use Hybrid constructions, what I really mean is we use a post-quantum KEM and a classical KEM (such as HPKE‘s DHKEM), then combine them securely using a KDF.
For the post-quantum KEM, we only really have one choice: ML-KEM. But this choice is actually three choices: ML-KEM-512, ML-KEM-768, or ML-KEM-1024.
The security margin on ML-KEM-512 is a little tight, so most cryptographers I’ve talked with recommend ML-KEM-768 instead.
Meanwhile, the NSA wants the US government to use ML-KEM-1024 for everything.
Originally, I was looking to use DHKEM with X25519, as part of the HPKE specification. After switching to post-quantum cryptography, I would need to combine it with ML-KEM-768 in such a way that the whole shebang is secure if either component is secure.
But then, why reinvent the wheel here? X-Wing already does that, and has some nice binding properties that a naive combination might not.
So let’s use X-Wing for our KEM.
Notably, OpenMLS is already doing this in their next release.
So our KEM choice seems pretty straightforward. What about post-quantum signatures?
Do we even need post-quantum signatures?
Well, the situation here is not nearly as straightforward as KEMs.
For starters, NIST chose to standardize two post-quantum digital signature algorithms (with a third coming later this year). They are as follows:
Since we’re working at the application layer, we’re less worried about a few kilobytes of bandwidth than the networking or X.509 folks are. Relatively speaking, we care about security first, performance second, and message size last.
After all, people ship Electron, React Native, and NextJS apps that load megabytes of JavaScript code to print, “hello world,” and no one bats an eye. A few kilobytes in this context is easily digestible for us.
(As I said, this isn’t true for all layers of the stack. WebPKI in particular feels a lot of pain with large public keys and/or signatures.)
Performance considerations would eliminate SLH-DSA, which is the most conservative choice. Even with the fastest parameter set (SLH-DSA-128f), this family of algorithms is about 550x slower than Ed25519. (If we prioritize bandwidth, it becomes 8000x slower.)
Between the other two, FN-DSA is a tempting option. Although it’s difficult to implement in constant-time, it offers smaller public key and signature sizes.
However, FN-DSA is not standardized yet, and it’s only known to be safe on specific hardware architectures. (It might be safe on others, but that’s not proven yet.)
In order to allow Fediverse users be secure on a wider range of hardware, this uncertainty would limit our choice of post-quantum signature algorithms to some flavor of ML-DSA–whether stand-alone or in a hybrid construction.
Unlike KEMs, hybrid signature constructions may be problematic in subtle ways that I don’t want to deal with. So if we were to do anything, we would probably choose a pure post-quantum signature algorithm.
There isn’t an immediate benefit to adopting a post-quantum signature algorithm, as David Adrian explains.
The migration to post-quantum cryptography will be a long and difficult road, which is all the more reason to make sure we learn from past efforts, and take advantage of the fact the risk is not imminent. Specifically, we should avoid:
- Standardizing without real-world experimentation
- Standardizing solutions that match how things work currently, but have significant negative externalities (increased bandwidth usage and latency), instead of designing new things to mitigate the externalities
- Deploying algorithms pre-standardization in ways that can’t be easily rolled back
- Adding algorithms that are pre-standardization or have severe shortcomings to compliance frameworks
We are not in the middle of a post-quantum emergency, and nothing points to a surprise “Q-Day” within the next decade. We have time to do this right, and we have time for an iterative feedback loop between implementors, cryptographers, standards bodies, and policymakers.
The situation may change. It may become clear that quantum computers are coming in the next few years. If that happens, the risk calculus changes and we can try to shove post-quantum cryptography into our existing protocols as quickly as possible. Thankfully, that’s not where we are.
David Adrian, Lack of post-quantum security is not plaintext.
Furthermore, there isn’t currently any commitment from the Sigsum developers to adopt a post-quantum signature scheme in the immediate future. They hard-code Ed25519 for the current iteration of the specification.
Given all of the above, I’m going to opt to simply not adopt post-quantum signatures until a later date.
Version 1 of our design will continue to use Ed25519 despite it not being secure after quantum computers emerge (“Q-Day”).
When the security industry begins to see warning signs of Q-Day being realistically within a decade, we will prioritize migrating to use post-quantum signature algorithms in a new version of our design.
Should something drastic happen that would force us to decide on a post-quantum algorithm today, we would choose ML-DSA-44. However, that’s unlikely for at least several years.
Remember, Store Now, Decrypt Later doesn’t really break signatures the way it would break public-key encryption.
Okay, that’s enough about post-quantum for now. I worry that if I keep talking about key encapsulation, some of my regular readers will start a shitty garage band called My KEMical Romance before the end of the year.
Let’s talk about some other technical topics related to end-to-end encryption for the Fediverse!
MLS was implicitly designed with the idea of having one central service for passing messages around. This makes sense if you’re building a product like Signal, WhatsApp, or Facebook Messenger.
It’s not so great for federated environments where your Delivery Service may be, in fact, more than one service (i.e., the Fediverse). An expired Internet Draft for Federated MLS talks about these challenges.
If we wanted to build atop MLS for group key agreement (like has been suggested before), we’d need to tackle this in a way that doesn’t cede control of MLS epochs to any server that gets compromised.
First, the Authentication Service component can be replaced by client-side protocols, where public keys are sourced from the Public Key Directory (PKD) services.
That is to say, from the PKD, you can fetch a valid list of Ed25519 public keys for each participant in the group.
When a group is created, the creator’s Ed25519 public key is known. Everyone they invite, their software necessarily has to know their Ed25519 public key in order to invite them.
In order for a group action to be performed, it must be signed by one of the public keys enrolled into the group list. Additionally, some actions may be limited by permissions attached at the time of the invite (or elevated by a more privileged user; which necessitates another group action).
By requiring a valid signature from an existing group member, we remove the capability of the Fediverse instance that’s hosting the discussion group to meddle with it in any way (unless, for some reason, the server is somehow also a participant that was invited).
But therein lies the other change we need to make: In many cases, groups will span multiple Fediverse servers, so groups shouldn’t be dependent on a single instance.
Put simply, we need a consensus algorithm to determine which instance hosts messages. We could look to Raft as a starting point, but whatever we land on should be fair, fault-tolerant, and deterministic to all participants who can agree on the same symmetric keying material at some point in time.
To that end, I propose using an additional HKDF output from the Group Key Agreement protocol to select a “leader” for all instances involved in the group, weighted by the number of participants on each instance.
Then, every N messages (where N >= 1), a new leader is elected by the same deterministic protocol. This will be performed entirely client-side, and clients will choose N. I will refer to this as a sub-epoch, since it doesn’t coincide with a new MLS epoch.
Since the agreed-upon group key always ratchets forward when a group action occurs (i.e., whenever there’s a new epoch), getting another KDF output to elect the next leader is straightforward.
This isn’t a fully fleshed out idea. Building consensus protocols that can handle real-world operational issues is heavily specialized work and there’s a high risk of falling to the illusion of safety until it’s too late. I will probably need help with this component.
That said, we aren’t building an anonymity network, so the cost of getting a detail wrong isn’t measurable in blood.
We aren’t really concerned with Sybil attacks. Winning the election just means you’re responsible for being a dumb pipe for ciphertext. Client software should trust the instance software as little as possible.
We also probably don’t need to worry about availability too much. Since we’re building atop ActivityPub, when a server goes down, the other instances can hold encrypted messages in the outbox for the host instance to pick up when it’s back online.
If that’s not satisfactory, we could also select both a primary and secondary leader for each epoch (and sub-epoch), to have built-in fail-over when more than one instance is involved in a group conversation.
If messages aren’t being delivered for an unacceptable period of time, client software can forcefully initiate a new leader election by expiring the current MLS epoch (i.e. by rotating their own public key and sending the relevant bundle to all other participants).
Those are just some thoughts. I plan to talk it over with people who have more expertise in the relevant systems.
And, as with the rest of this project, I will write a formal specification for this feature before I write a single line of production code.
I could’ve swore I talked about this already, but I can’t find it in any of my previous ramblings, so here’s a good place as any.
The intent for end-to-end encryption is privacy, not secrecy.
What does this mean exactly? From the opening of Eric Hughes’ A Cypherpunk’s Manifesto:
Privacy is necessary for an open society in the electronic age. Privacy is not secrecy.
A private matter is something one doesn’t want the whole world to know, but a secret matter is something one doesn’t want anybody to know.
Privacy is the power to selectively reveal oneself to the world.
Eric Hughes (with whitespace and emphasis added)
Unrelated: This is one reason why I use “secret key” when discussing asymmetric cryptography, rather than “private key”. It also lends towards sk
and pk
as abbreviations, whereas “private” and “public” both start with the letter P, which is annoying.
With this distinction in mind, abuse reporting is not inherently incompatible with end-to-end encryption or any other privacy technology.
In fact, it’s impossible to create useful social technology without the ability for people to mitigate abuse.
So, content warning: This is going to necessarily discuss some gross topics, albeit not in any significant detail. If you’d rather not read about them at all, feel free to skip this section.
When thinking about the sorts of problems that call for an abuse reporting mechanism, you really need to consider the most extreme cases, such as someone joining group chats to spam unsuspecting users with unsolicited child sexual abuse material (CSAM), flashing imagery designed to trigger seizures, or graphic depictions of violence.
That’s gross and unfortunate, but the reality of the Internet.
However, end-to-end encryption also needs to prioritize privacy over appeasing lazy cops who would rather everyone’s devices include a mandatory little cop that watches all your conversations and snitches on you if you do anything that might be illegal, or against the interest of your government and/or corporate masters. You know the type of cop. They find privacy and encryption to be rather inconvenient. After all, why bother doing their jobs (i.e., actual detective work) when you can just criminalize end-to-end encryption and use dragnet surveillance instead?
Whatever we do, we will need to strike a balance that protects users’ privacy, without any backdoors or privileged access for lazy cops, with community safety.
Thus, the following mechanisms must be in place:
I’m going to focus on item 3, because that’s where the technically and legally thorny issues arise.
Keep in mind, this is just a core-dump of thoughts about this topic, and I’m not committing to anything right now.
First, the end-to-end encryption must be immune to Invisible Salamanders attacks. If it’s not, go back to the drawing board.
Every instance will need to have a moderator account, who can receive abuse reports from users. This can be a shared account for moderators or a list of moderators maintained by the server.
When an abuse report is sent to the moderation team, what needs to happen is that the encryption keys for those specific messages are re-wrapped and sent to the moderators.
So long as you’re using a forward-secure ratcheting protocol, this doesn’t imply access to the encryption keys for other messages, so the information disclosed is limited to the messages that a participant in the group consents to disclosing. This preserves privacy for the rest of the group chat.
When receiving a message, moderators should not only be able to see the reported message’s contents (in the order that they were sent), but also how many messages were omitted in the transcript, to prevent a type of attack I colloquially refer to as “trolling through omission”. This old meme illustrates the concept nicely:
And this all seems pretty straightforward, right? Let users protect themselves and report abuse in such a way that doesn’t invalidate the privacy of unrelated messages or give unfettered access to the group chats. “Did Captain Obvious write this section?”
But things aren’t so clean when you consider the legal ramifications.
Suppose Alice, Bob, and Troy start an encrypted group conversation. Alice is the group admin and delete messages or boot people from the chat.
One day, Troy decides to send illegal imagery (e.g., CSAM) to the group chat.
Bob immediately, disgusted, reports it to his instance moderator (Dave) as well as Troy’s instance moderator (Evelyn). Alice then deletes the messages for her and Bob and kicks Troy from the chat.
Here’s where the legal questions come in.
If Dave and Evelyn are able to confirm that Troy did send CSAM to Alice and Bob, did Bob’s act of reporting the material to them count as an act of distribution (i.e., to Dave and/or Evelyn, who would not be able to decrypt the media otherwise)?
If they aren’t able to confirm the reports, does Alice’s erasure count as destruction of evidence (i.e., because they cannot be forwarded to law enforcement)?
Are Bob and Alice legally culpable for possession? What about Dave and Evelyn, whose servers are hosting the (albeit encrypted) material?
It’s not abundantly clear how the law will intersect with technology here, nor what specific technical mechanisms would need to be in place to protect Alice, Bob, Dave, and Evelyn from a particularly malicious user like Troy.
Obviously, I am not a lawyer. I have an understanding with my lawyer friends that I will not try to interpret law or write my own contracts if they don’t roll their own crypto.
That said, I do have some vague ideas for mitigating the risk.
To contend with this issue, one thing we could do is separate the abuse reporting feature from the “fetch and decrypt the attached media” feature, so that while instance moderators will be capable of fetching the reported abuse material, it doesn’t happen automatically.
When the “reason” attached to an abuse report signals CSAM in any capacity, the client software used by moderators could also wholesale block the download of said media.
Whether that would be sufficient mitigate the legal matters raised previously, I can’t say.
And there’s still a lot of other legal uncertainty to figure out here.
We may not know the answers to these questions until the courts make specific decisions that establish relevant case law, or our governments pass legislation that clarifies everyone’s rights and responsibilities for such cases.
Until then, the best answer may simply to do nothing.
That is to say, let admins delete messages for the whole group, let users delete messages they don’t want on their own hardware, and let admins receive abuse reports from their users… but don’t do anything further.
Okay, we should definitely require an explicit separate action to download and decrypt the media attached to a reported message, rather than have it be automatic, but that’s it.
For the immediate future, I plan on continuing to develop the Federated Public Key Directory component until I’m happy with its design. Then, I will begin developing the reference implementations for both client and server software.
Once that’s in a good state, I will move onto finishing the E2EE specification. Then, I will begin building the client software and relevant server patches for Mastodon, and spinning up a testing instance for folks to play with.
Timeline-wise, I would expect most of this to happen in 2025.
I wish I could promise something sooner, but I’m not fond of moving fast and breaking things, and I do have a full time job unrelated to this project.
Hopefully, by the next time I pen an update for this project, we’ll be closer to launching. (And maybe I’ll have answers to some of the legal concerns surrounding abuse reporting, if we’re lucky.)