There are many strategies for trying to protect a secret on a client device, but most of them are doomed. Still, this is a long-standing problem, and many security experts have chipped away at its edges over the years. Over the last decade there’s been growing interest in using enclaves as a means to protect secrets from attackers.
The basic idea of an enclave is simple: you can put something into an enclave, but never take it out. For example, this means you can put the private key of a public/private key pair into the enclave so that it cannot be stolen. Whenever you need to decrypt a message, you simply pass the ciphertext into the enclave and the private key is used to decrypt the message and return the plaintext, crucially without the private key ever leaving the enclave.
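To make that concrete, here’s a minimal sketch of what such an enclave’s interface might look like; the type and method names here are hypothetical, purely for illustration:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical interface to key operations hosted inside an enclave.
// The private key is generated inside and can never be read back out;
// only ciphertext crosses the boundary in, and only plaintext comes out.
struct EnclaveKeyService {
    // Creates a keypair inside the enclave and returns only the public half.
    virtual std::vector<uint8_t> GeneratePublicKey() = 0;

    // Decrypts using the enclave-held private key; the key itself never leaves.
    virtual std::vector<uint8_t> Decrypt(const std::vector<uint8_t>& ciphertext) = 0;

    virtual ~EnclaveKeyService() = default;
};
```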
There are several types of enclaves, but on modern systems, most are backed by hardware — either a TPM chip, a custom security processor, or features in mainstream CPUs that enable enclave-style isolation within a general-purpose CPU. Windows exposes a CreateEnclave API to allow running code inside an enclave, backed by virtualization-based security (VBS), which relies on virtualization features in modern processors. The general concept behind Virtual Secure Mode is simple: code running at the normal Virtual Trust Level (VTL0) cannot read or write memory “inside” the enclave, which runs its code at VTL1. Even highly-privileged OS kernel code running at VTL0 cannot spy on the contents of a VTL1 enclave.
DLLs loaded into an enclave must be signed by a particular type of certificate (currently, as I understand it, only available for Microsoft code) and the code’s signature and integrity are validated before it is loaded into the enclave. After the privileged code is loaded into the enclave, it has access to all of the memory of the current process (both untrusted VTL0 and privileged VTL1 memory). In-enclave code cannot load most libraries and thus can only call a tiny set of external library functions, mostly related to cryptography.
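On the host side, the create/load/initialize dance looks something like the following sketch, using the APIs from enclaveapi.h. The DLL name is made up, error handling is omitted, and the sizes are arbitrary; treat this as an illustration rather than production code:

```cpp
#include <windows.h>
#include <enclaveapi.h>

// Minimal sketch of creating and initializing a VBS enclave; assumes
// "MyEnclave.dll" is a properly-signed enclave DLL (a hypothetical name).
LPVOID CreateAndLoadEnclave() {
    if (!IsEnclaveTypeSupported(ENCLAVE_TYPE_VBS))
        return nullptr;

    ENCLAVE_CREATE_INFO_VBS createInfo = {};  // OwnerID left zeroed in this sketch

    LPVOID enclave = CreateEnclave(GetCurrentProcess(),
                                   nullptr,      // let the system pick the address
                                   0x10000000,   // size of the enclave's address range
                                   0,            // initial commit
                                   ENCLAVE_TYPE_VBS,
                                   &createInfo, sizeof(createInfo),
                                   nullptr);
    if (!enclave) return nullptr;

    // The image's signature and integrity are validated as part of the load;
    // an improperly-signed DLL fails here.
    if (!LoadEnclaveImageW(enclave, L"MyEnclave.dll"))
        return nullptr;

    ENCLAVE_INIT_INFO_VBS initInfo = {};
    initInfo.Length = sizeof(initInfo);
    initInfo.ThreadCount = 1;
    if (!InitializeEnclave(GetCurrentProcess(), enclave,
                           &initInfo, sizeof(initInfo), nullptr))
        return nullptr;

    return enclave;
}
```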
Security researchers spend a lot of time trying to attack enclaves for the same reason that robbers try to rob banks: because that’s where the valuables are. At this point, most enclaves offer pretty solid security guarantees: attacking hardware is usually quite difficult, which makes many attacks impractically expensive or unreliable.
However, it’s important to recognize that enclaves are far from a panacea, and the limits of the protection provided by an enclave are quite subtle.
Imagine a real-world protection problem: You don’t want anyone to get into your apartment, so you lock the door when you leave. However, you’re in the habit of leaving your keys on the bar when you’re out for drinks and bad guys keep swiping them and entering your apartment. Some especially annoying bad guys don’t just enter themselves, they also make copies of your key and share it with their bad-guy brethren to use at their leisure.
You hit on the following solution: you change your apartment’s lock, making only one key. You hire a doorkeeper to hold the key for you, and he wears it on a chain around his neck, never letting it leave his person. Every time you need to get in your apartment, you ask the doorkeeper to let you in and he unlocks the door for you.
No one other than the doorkeeper ever touches the key, so there’s no way for a bad guy to steal or copy the key.
Is this solution secure?
Well, no. The problem is that you never gave your doorkeeper instructions on who is allowed to tell him to unlock the door, so he’ll open it for anyone who asks. Your carefully-designed system is perfectly effective at protecting the key but utterly fails to achieve the actual protection goal: protecting the contents of your apartment.
What does this have to do with enclaves?
Sometimes, security engineers get confused about their goals, and believe that their objective is to keep the private key secret. Keeping the private key secret is simply an annoying requirement in service of the real goal: ensuring that messages can be decrypted/encrypted only by the one legitimate owner of the key. The enclave serves to prevent that key from being stolen, but preventing the key from being abused is a different thing altogether.
Consider, for example, the case of locally-running malware. The malware can’t steal the enclaved key, but it doesn’t need to! It can just hand a message to the code running inside the enclave and say “Decrypt this, please and thank you.” The code inside the enclave dutifully does as it’s asked and returns the plaintext out to the malware. Similarly, the attacker can tell the enclave “Encrypt this message with the key” and the code inside the enclave does as directed. The key remains a secret from the malware, but the crypto system has been completely compromised, with the attacker able to decrypt and encrypt messages of his choice.
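In code terms, the “attack” is indistinguishable from legitimate use; no exploit is required. A sketch, assuming the enclave base pointer from earlier and a hypothetical “DecryptMessage” export looked up the way the SDK samples do:

```cpp
#include <windows.h>
#include <enclaveapi.h>

// Any code running in the host process -- including malware -- can call the
// enclave's entry points exactly the way the legitimate app does.
// "enclave" is the base address returned by CreateEnclave, and
// "DecryptMessage" is a hypothetical export of the enclave DLL.
LPVOID AbuseEnclave(LPVOID enclave, LPVOID attackerChosenCiphertext) {
    LPENCLAVE_ROUTINE decrypt = reinterpret_cast<LPENCLAVE_ROUTINE>(
        GetProcAddress(reinterpret_cast<HMODULE>(enclave), "DecryptMessage"));

    LPVOID plaintext = nullptr;
    // The enclave dutifully decrypts attacker-chosen ciphertext with the
    // protected key and hands the plaintext right back out to VTL0.
    CallEnclave(decrypt, attackerChosenCiphertext, TRUE, &plaintext);
    return plaintext;
}
```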
So, what can we do about this? It’s natural to think: “Ah, we’ll just sign/encrypt messages from the app into the enclave, and the code inside the enclave will validate that the calls are legitimate!” but a moment later you’ll remember: “Ah, but how do we protect that app’s signing key?” and we’re back where we started. Oops.
Another idea is that the code inside the enclave will examine the running process and determine whether the app/caller is the expected legitimate app. Unfortunately, this is extremely difficult. While the VTL1 code can read all of the app’s VTL0 memory, confidently determining that the host app is legitimate would require something like groveling over all of the executable pages in the process’ memory, hashing them, and comparing them to a “known good” value. If the process contains any unexpected code, it may be compromised. Even if you could successfully implement this process-snapshot hash check, an attacker could probably exploit a race condition to circumvent the check, and you’d be forever bedeviled by false positives caused by non-malicious code injection from accessibility utilities or security tools.
In general, any security checks from inside the enclave that look at memory in VTL0 are potentially subject to a TOCTOU attack — an attacker can change any values at any time unless they have been copied into VTL1 memory.
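When the enclave must consume VTL0 data at all, the standard discipline is “capture, then validate”: copy the caller’s parameters into VTL1 memory first, and only then check and use them. A minimal sketch of enclave-side code, with a hypothetical Request structure:

```cpp
#include <windows.h>
#include <cstring>

// Hypothetical parameter block passed from VTL0 into the enclave.
struct Request {
    SIZE_T cbData;
    BYTE   data[1024];
};

// Enclave-side (VTL1) handler. The parameter points at VTL0 memory that
// the attacker can rewrite at any moment, so we snapshot it first.
PVOID WINAPI HandleRequest(PVOID untrustedParam) {
    Request local;
    memcpy(&local, untrustedParam, sizeof(local));  // capture into VTL1 memory

    // Validate (and later use) only the VTL1 copy; checking the VTL0
    // original would let the attacker change it after the check passes.
    if (local.cbData > sizeof(local.data))
        return nullptr;

    // ... operate on local.data / local.cbData ...
    return nullptr;
}
```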
Another idea would be to prompt the user: the code inside the enclave could pop up an unspoofable dialog asking “Hey, would you like me to sign this message [data] with your key?” Unfortunately, in the Windows model this isn’t possible — code running inside an enclave can’t show any UI, and even if it could, there’s nothing that would prevent such a confirmation UI from being redressed by VTL0 code. Rats.
Before you reach for an enclave, consider your full threat model and whether using the enclave would mitigate any threats.
For example, token binding uses an enclave to prevent cookie theft: while this doesn’t mitigate the threat of locally-running malware, it does mitigate the threat of XSS/cookie leaks and complicates the lives of malicious insiders on tightly locked-down corporate PCs.
-Eric