Back in October of 2022, this announcement by AMI caught my eye. AMI has contributed a product named “Tektagon Open Edition” to the Open Compute Project (OCP).
Tektagon OpenEdition is an open-source Platform Root of Trust (PRoT) solution with foundational firmware security features that detect platform firmware corruption, recover the firmware and protect firmware integrity. With its open-source code, Tektagon OpenEdition™ augments transparency, resulting in high-quality code […]
I decided to dig in and audit the recently open-sourced code. But first, some background: Tektagon is a hardware root-of-trust (HRoT) that implements Intel PFR 2.0. So… What exactly is PFR?
PFR, or Platform Firmware Resiliency, is a standard defined by everyone’s favorite standards body, NIST, in SP 800-193. The specification describes guidelines that support the resiliency of platform firmware and data against destructive attacks or unauthorized changes. These security properties are upheld by a new HRoT device that implements the PFR logic.
At its core, PFR acknowledges that in addition to the boot firmware (e.g., the BIOS), a platform contains numerous other peripheral devices which execute firmware and therefore also require integrity verification. Examples of these peripherals typically include GPUs, network cards, storage controllers, display controllers, and so on. Many of these peripherals are highly privileged (e.g., DMA capable), and so they are attractive targets for an attacker. It is important that their firmware images are protected from tampering. That is, if an attacker could compromise one of these peripherals by tampering with its firmware, they might be able to:
Although these motivations sound like they are centered around only protecting the integrity of the platform firmware and its data assets, the SP 800-193 specification also describes how PFR is crucial for protecting firmware availability. Here, availability refers to the ability to recover from corrupted flash storage, which might occur due to a failed firmware update, or perhaps, cosmic rays that cause bit flips in flash.
In the PFR specification, these security requirements appear as three guiding principles: protection, detection, and recovery.
This is a somewhat crowded technology space. In addition to AMI’s Tektagon product, many other vendors have created their own PFR (or PFR-like) solutions whose purpose is to help assure device firmware authenticity and availability, further complicating the already complex x86 system boot process. Examples include Microsoft’s Project Cerberus which is used in Azure, Intel PFR, Google Titan, Lattice’s Root of Trust FPGA solution, and more.
PFR introduces a new device, a microcontroller or FPGA, that positions itself as the man-in-the-middle on the flash memory SPI bus. By sitting on the bus, PFR chipsets can interpose all bus transactions. Whenever a device (such as the Baseboard Management Controller (BMC) or Platform Controller Hub (PCH)) reads or writes SPI flash, the PFR chipset proxies that request. This grants PFR the crucial responsibility of verifying the authenticity and integrity of all code and data that resides in the persistent storage media.
However, by interposing buses in this manner, PFR exposes itself to a rather large attack surface. It must read, parse, and verify various binary blobs (firmware and data) that exist in flash. Such parsing can be a tedious and delicate process. If the code is not written defensively (a challenge for even the best C programmers) then memory safety violations may arise. Another concern is race conditions such as time-of-check-time-of-use (TOCTOU) or double fetch problems.
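To make the double fetch concern concrete, here is a minimal sketch of the usual countermeasure: copy the untrusted header into private memory once, validate that copy, and never go back to the shared source for the length. Everything in it (struct blob_header, parse_blob(), MAX_PAYLOAD) is a hypothetical name invented for this illustration and is not taken from any PFR implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 256u /* hypothetical upper bound for this sketch */

/* Hypothetical blob header as it might be laid out in external storage. */
struct blob_header {
    uint32_t magic;
    uint32_t length; /* attacker-controlled if the storage is tampered with */
};

/*
 * Snapshot the header into private memory, validate the snapshot, and use
 * only the snapshot afterwards. Returns 0 on success, -1 on rejection.
 */
int parse_blob(const volatile uint8_t *shared, size_t shared_len,
               uint8_t *out, size_t out_len, uint32_t *payload_len)
{
    struct blob_header hdr;

    if (shared_len < sizeof(hdr))
        return -1;

    /* Single fetch: the header is read from shared memory exactly once. */
    for (size_t i = 0; i < sizeof(hdr); i++)
        ((uint8_t *)&hdr)[i] = shared[i];

    /* All checks run against the private copy, which cannot change. */
    if (hdr.length > MAX_PAYLOAD)
        return -1;
    if (hdr.length > out_len)
        return -1;
    if (hdr.length > shared_len - sizeof(hdr))
        return -1;

    /* The copy loop uses the validated length, not a second fetch. */
    for (uint32_t i = 0; i < hdr.length; i++)
        out[i] = shared[sizeof(hdr) + i];

    *payload_len = hdr.length;
    return 0;
}
```

The vulnerable pattern would instead re-read the length field from shared memory after validating it, which gives an attacker who can modify that memory a window to swap in a larger value between the check and the use.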
The PFR attack surface is also expanded by the fact that it communicates with other devices via I2C or SMBus. The bus typically carries the MCTP and SPDM protocols. Without going into too much detail about these specifications, these protocols are used to:
Within the HRoT, these command handlers may accept variable length arguments, and so memory safety is again required when managing the message queues.
So, with that in mind, I decided to jump into the recently open-sourced AMI Tektagon project and hunt for bugs.
This first vulnerability occurs in the PCH/BMC command handler. This is the same I2C communication interface that was mentioned above. Two of the command handlers violate memory safety.
```c
uint8_t gUfmFifoData[64];
uint8_t gReadFifoData[64];
...
uint8_t gFifoData;
...
static unsigned int mailBox_index;

uint8_t PchBmcCommands(unsigned char *CipherText, uint8_t ReadFlag)
{
    byte DataToSend = 0;
    uint8_t i = 0;

    switch (CipherText[0]) {
    ...
    case UfmCmdTriggerValue:
        if (ReadFlag == TRUE) {
            DataToSend = get_provision_commandTrigger();
        } else {
            if (CipherText[1] & EXECUTE_UFM_COMMAND) {
                ...
            } else if (CipherText[1] & FLUSH_WRITE_FIFO) {
                memset(gUfmFifoData, 0, sizeof(gUfmFifoData));
                gFifoData = 0;
            } else if (CipherText[1] & FLUSH_READ_FIFO) {
                memset(gReadFifoData, 0, sizeof(gReadFifoData));
                gFifoData = 0;
                mailBox_index = 0;
            }
        }
        break;
    case UfmWriteFIFO:
        gUfmFifoData[gFifoData++] = CipherText[1];
        break;
    case UfmReadFIFO:
        DataToSend = gReadFifoData[mailBox_index];
        mailBox_index++;
        break;
    ...
```
Above, the UfmWriteFIFO command can eventually write data past the end of the gUfmFifoData[] array. This may occur if the attacker issues more than 64 commands in sequence without flushing the FIFO by sending a UfmCmdTriggerValue command. Because gFifoData is a uint8_t type, its value tops out at 255, which enables an attacker to overwrite up to 192 bytes past the end of the 64-byte FIFO buffer.

Similarly, the UfmReadFIFO command can read data out-of-bounds by repeated invocations of the command between FIFO flushes. This OOB data appears to be eventually disclosed in the I2C response message in DataToSend. Because mailBox_index is an unsigned int type, this would enable an attacker to disclose a significant quantity of PFR SRAM, albeit relatively slowly due to only 1 byte being exposed at a time.
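For illustration, a bounds-checked version of the two FIFO operations could look like the minimal sketch below. The byte_fifo structure and the fifo_write(), fifo_read(), and fifo_flush() helpers are hypothetical names made up for this example; this is not the Tektagon code, nor the fix that AMI eventually shipped.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIFO_SIZE 64u

/* Hypothetical FIFO wrapper; not Tektagon's actual data structure. */
struct byte_fifo {
    uint8_t data[FIFO_SIZE];
    size_t  write_idx;
    size_t  read_idx;
};

/* Reset contents and both indices, like the flush commands do. */
void fifo_flush(struct byte_fifo *f)
{
    memset(f, 0, sizeof(*f));
}

/* Append one byte. Returns 0 on success, -1 if the FIFO is full. */
int fifo_write(struct byte_fifo *f, uint8_t value)
{
    if (f->write_idx >= FIFO_SIZE)
        return -1; /* refuse instead of writing past the array */
    f->data[f->write_idx++] = value;
    return 0;
}

/* Read the next byte. Returns 0 on success, -1 once all slots were read. */
int fifo_read(struct byte_fifo *f, uint8_t *value)
{
    if (f->read_idx >= FIFO_SIZE)
        return -1; /* refuse instead of disclosing adjacent memory */
    *value = f->data[f->read_idx++];
    return 0;
}
```

With checks like these, the 65th consecutive write or read between flushes fails cleanly rather than touching memory beyond the 64-byte array. How a real handler should report that failure over I2C is a design decision this sketch does not address.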
I estimate that these command processing vulnerabilities can be triggered in three different scenarios:
The next vulnerability occurs when the Tektagon firmware reads a public key from SPI flash. In the linked GitHub issue, I found and reported five instances where this same bug appears throughout the Tektagon source code, but for the sake of brevity, I will focus on just one simple example here.
```c
int get_rsa_public_key(uint8_t flash_id, uint32_t address, struct rsa_public_key *public_key)
{
    int status = Success;
    uint16_t key_length;
    uint8_t exponent_length;
    uint32_t modules_address, exponent_address;

    // Key Length
    status = pfr_spi_read(flash_id, address, sizeof(key_length), &key_length);
    if (status != Success){
        return Failure;
    }

    modules_address = address + sizeof(key_length);
    // rsa_key_module
    status = pfr_spi_read(flash_id, modules_address, key_length, public_key->modulus);
    ...
```
The code above performs two SPI flash reads. The first read operation obtains a size value (key_length) from a public key structure in flash, and the second read operation uses this key_length to obtain the RSA public key modulus.

The bug arises due to lack of input validation. If the contents of external SPI flash were tampered with by an attacker, then key_length may be larger than expected. This length value is not validated before being passed as the size argument to the second pfr_spi_read() call, which can lead to out-of-bounds memory writes of public_key->modulus[].

The modulus buffer is RSA_MAX_KEY_LENGTH (512) bytes in length, and in all locations where get_rsa_public_key() is called, the public_key structure is declared on the stack. Because the Zephyr build config used by Tektagon does not define CONFIG_STACK_CANARIES, such a stack-based memory corruption vulnerability would be highly exploitable.
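The missing check is small. The sketch below shows one way to validate the length before the second read, under stated assumptions: flash_read_sketch() is a hypothetical stand-in that reads from an in-memory image rather than the real pfr_spi_read(), and struct rsa_key_sketch and read_modulus_checked() are likewise invented for this example. Like the original excerpt, it reads the length field in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSA_MAX_KEY_LENGTH 512u

/* Hypothetical key container for this sketch. */
struct rsa_key_sketch {
    uint8_t  modulus[RSA_MAX_KEY_LENGTH];
    uint16_t mod_length;
};

/* Stand-in for a SPI flash read: copies from a memory image. */
static int flash_read_sketch(const uint8_t *image, size_t image_len,
                             size_t address, size_t length, void *dst)
{
    if (address > image_len || length > image_len - address)
        return -1;
    memcpy(dst, image + address, length);
    return 0;
}

/* Read the length, validate it, then read the modulus. 0 on success. */
int read_modulus_checked(const uint8_t *image, size_t image_len,
                         size_t address, struct rsa_key_sketch *key)
{
    uint16_t key_length;

    if (flash_read_sketch(image, image_len, address,
                          sizeof(key_length), &key_length) != 0)
        return -1;

    /* Reject lengths that do not fit the destination buffer. */
    if (key_length == 0 || key_length > sizeof(key->modulus))
        return -1;

    if (flash_read_sketch(image, image_len, address + sizeof(key_length),
                          key_length, key->modulus) != 0)
        return -1;

    key->mod_length = key_length;
    return 0;
}
```

The essential line is the comparison against sizeof(key->modulus): any length taken from flash is treated as attacker-controlled until it has been checked against the size of the buffer it will be used to fill.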
These two vulnerabilities were extremely shallow, and I discovered them both in the same afternoon after first pulling the source code from GitHub. I am fairly certain that other vulnerabilities exist in this code.
(As an aside, you might also be interested to know that Tektagon is based on the Zephyr RTOS, for which we published a research report a few years back, highlighting numerous vulnerabilities in both its implementation and design.)
These bugs are great illustrations of how a “security feature” is not always a “secure feature”. Although PFR aims to improve platform security, it does so at the cost of introducing new attack surfaces. Bugs in these attack surfaces can be abused to achieve privilege escalation by the very same adversaries and threats that PFR is designed to defend against – that is, threats involving maliciously tampered SPI flash contents, and adversaries who have compromised a peripheral device and are seeking to pivot laterally to attack another device firmware.
Think carefully about the threat model of your products, and how adding new features and attack surfaces might affect your overall security posture. As always, we recommend you perform a full assessment of any third-party firmware components before they make it into your product. This is just as true for open source as it is for proprietary code bases, and in particular, new and untested components and technologies.
As of April 6th 2023, these vulnerabilities were fixed in commit d6d935e. No CVEs were issued by AMI.