Circomspect has more passes!

By Fredrik Dahlgren, Principal Security Engineer

TL;DR: We have released version 0.8.0 of Circomspect, our static analyzer and linter for Circom. Since our initial release of Circomspect in September 2022, we have added five new analysis passes, support for tags, tuples, and anonymous components, links to in-depth descriptions of each identified issue, and squashed a number of bugs. Please download the new version and tell us what you think!

New analysis passes

The new analysis passes, added to the tool’s initial nine, check for a range of issues that could occur in Circom code:

Failure to properly constrain intermediate signals
Failure to constrain output signals in instantiated templates
Failure to constrain divisors in division operations to nonzero values
Use of BN254-specific templates from Circomlib with a different curve
Failure to properly constrain inputs to Circomlib’s LessThan circuit

Apart from finding the issue related to the Circomlib LessThan circuit discussed below, these analysis passes would also have caught the “million dollar ZK bug” recently identified by Veridise in the circom-pairing library.

To understand the types of issues that Circomspect can identify, let’s dig into the final example in this list. This analysis pass identifies an issue related to the LessThan circuit implemented by Circomlib, the de facto standard library for Circom. To fully understand the issue, we first need to take a step back and understand how signed values are represented by Circom.

Signed arithmetic in GF(p)

Circom programs operate on variables called signals, which represent elements in the finite field GF(p) of integers modulo a prime number p. It is common to identify the elements in GF(p) with the unsigned integers in the half-open interval [0, p). However, it is sometimes convenient to use field elements to represent signed quantities in the same way that we may use the elements in [0, 2³²) to represent signed 32-bit integers. Mirroring the definition for two’s complement used to represent signed integer values, we define val(x) as follows:

We then say that a is less than b in GF(p) if val(a) < val(b) as signed integers. This means that q = floor(p/2) is the greatest signed value representable in GF(p), and that -q = q + 1 is the least signed value representable in GF(p). It also means, perhaps somewhat surprisingly, that q + 1 is actually less than q. This is also how the comparison operator < is implemented by the Circom compiler.

As usual, we say that a is positive if a > 0 and negative if a < 0. One way to ensure that a value a is nonnegative is to restrict the size (in bits) of the binary representation of a. In particular, if the size of a is strictly less than log(p) - 1 bits, then a must be less than or equal to q and, therefore, nonnegative.

Circomlib’s ‘LessThan’ template

With this out of the way, let’s turn our attention to the LessThan template defined by Circomlib. This template can be used to constrain two input signals a and b to ensure that a < b, and is implemented as follows:

Looking at the implementation, we see that it takes an input parameter n and two input signals in[0] and in[1], and it defines a single output signal out. Additionally, the template uses the Num2Bits template from Circomlib to constrain the output signal out.

The Num2Bits template from Circomlib takes a single parameter n and can be used to convert a field element to its n-bit binary representation, which is given by the array out of size n. Since the size of the binary representation is bounded by the parameter n, the input to Num2Bits is also implicitly constrained to n bits. In the implementation of LessThan above, the expression (1 << n) + in[0] - in[1] is passed as input to Num2Bits, which constrains the absolute value |in[0] - in[1]| to n bits.

To understand the subtleties of the implementation of the LessThan template, let’s first consider the expected use case when both inputs to LessThan are at most n bits, where n is small enough to ensure that both inputs are nonnegative.

We have two cases to consider. If in[0] < in[1], then in[0] - in[1] is a negative n-bit value, and (1 << n) + in[0] - in[1] is a positive n-bit value. It follows that bit n in the binary representation of the input to Num2Bits is 0, and thus out must be equal to 1 - 0 = 1.

On the other hand, if in[0] ≥ in[1], then in[0] - in[1] is a nonnegative n-bit value (since both inputs are positive), and (1 << n) + in[0] - in[1] is a positive (n + 1)-bit value with the most significant bit equal to 1, It follows that bit n in the binary representation of the input to Num2Bits is 1, and out must be given by 1 - 1 = 0.

This all makes sense and gives us some confidence if we want to use LessThan for range proofs in our own circuits. However, things become more complicated if we forget to constrain the size of the inputs passed to LessThan.

Using signals to represent unsigned quantities

To describe the first type of issue that may affect circuits defined using LessThan, consider the case in which signals are used to represent unsigned values like monetary amounts. Say that we want to allow users to withdraw funds from our system without revealing sensitive information, like the total balance belonging to a single user or the amounts withdrawn by users. We could use LessThan to implement the part of the circuit that validates the withdrawn amount against the total balance as follows:

The ValidateWithdrawal template should ensure that users cannot withdraw more than their total balance.

Now, suppose that a malicious user with a zero balance decides to withdraw p - 1 tokens from the system, where p is the size of the underlying prime field. Clearly, this should not be allowed since p - 1 is a ridiculously large number and, in any case, the user has no tokens available for withdrawal. However, looking at the implementation of LessThan, we see that in this case, the input to Num2Bits will be given by (1 << 64) + (p - 1) - (0 + 1) = (1 << 64) - 2 (as all arithmetic is done modulo p). It follows that bit 64 of the binary representation of the input will be 0, and the output from LessThan will be 1 - n2b.out[64] = 1 - 0 = 1. This also means that ValidateWithdrawal will identify the withdrawal as valid.

The problem here is that p - 1 also represents the signed quantity –1 in GF(p). Clearly, -1 is less than 1, and we have not constrained the withdrawn amount to be nonnegative. Adding a constraint restricting the size of the amount to be less than log(p) - 1 bits would ensure that the amount must be positive, which would prevent this issue.

More generally, since the input parameter n to LessThan restricts only the size of the difference |in[0] - in[1]|, we typically cannot use LessThan to prevent overflows and underflows. This is a subtle point that many developers miss. As an example, consider the section on arithmetic overflows and underflows from the zero-knowledge (ZK) bug tracker maintained by 0xPARC. In an earlier version, 0xPARC suggested using LessThan to constrain the relevant signals in an example that was almost identical to the vulnerable ValidateWithdrawal template defined above!

Another vulnerability of this type was found by Daira Hopwood in an early version of the ZK space-conquest game Dark Forest. Here, the vulnerability allowed users to colonize planets far outside the playing field. The developers addressed the issue by adding a range proof based on the Num2Bits template that restricted the size of the coordinates to 31 bits.

Using signals to represent signed quantities

Now, suppose signals are used to represent signed quantities. In particular, let’s consider what would happen if we passed q = floor(p/2) and q + 1 as inputs to LessThan. We will show that even though q + 1 < q according to the Circom compiler, q is actually less than q + 1 according to LessThan. Returning to the input to Num2Bits in the definition of LessThan, we see that if in[0] = q and in[1] = q + 1, the input to Num2Bits is given by the following expression:

(1 << n) + in[0] - in[1] = (1 << n) + q - (q + 1) = (1 << n) - 1

It follows that the nth bit in the binary representation of this value is 0, and the output from LessThan is 1 - n2b.out[n] = 1 - 0 = 1. Thus, q < q + 1 according to LessThan, even though q + 1 < q according to the compiler!

It is worth reiterating here that the input parameter n to LessThan does not restrict the size of the inputs, only the absolute value of their difference. Thus, we are free to pass very large (or very small) inputs to LessThan. Again, this issue can be prevented if the size of both of the inputs to the LessThan template are restricted to be less than log(p) - 1 bits.

Circomspect to the rescue (part 1)

To find issues of this type, Circomspect identifies locations where LessThan is used. It then tries to see whether the inputs to LessThan are constrained to less than log(p) - 1 bits using the Num2Bits template from Circomlib, and it emits a warning if it finds no such constraints. This allows the developer (or reviewer) to quickly identify locations in the codebase that require further review.

The output from the latest version of Circomspect running on the ValidateWithdrawal template defined above.

As shown in the screenshot above, each warning from Circomspect will now typically also contain a link to a description of the potential issue, and recommendations for how to resolve it.

Circomspect to the rescue (part 2)

We would also like to mention another of our new analysis passes. The latest version of Circomspect identifies locations where a template is instantiated but the output signals defined by the template are not constrained.

As an example, consider the ValidateWithdrawal template introduced above. Suppose that we rewrite the template to include range proofs for the input signals amount and total. However, during the rewrite we accidentally forget to include a constraint ensuring that the output from LessThan is 1. This means that users may be able to withdraw amounts that are greater than their total balance, which is obviously a serious vulnerability!

We rewrite ValidateWithdrawal to include range proofs for the two inputs amount and total, but accidentally forget to constrain the output signal lt.out to be 1!

There are examples (like Num2Bits) in which a template constrains its inputs and no further constraints on the outputs are required. However, forgetting to constrain the output from a template generally indicates a mistake and requires further review to determine whether it constitutes a vulnerability. Circomspect will flag locations where output signals are not constrained to ensure that each location can be manually checked for correctness.

Circomspect now generates a warning if it finds that the output signals defined by a template are not properly constrained.

Let’s talk!

We at Trail of Bits are excited about contributing to the growing range of tools and libraries for ZK that have emerged in the last few years. If you are building a project using ZK, we would love to talk to you to see if we can help in any way. If you are interested in having your code reviewed by us, or if you’re looking to outsource parts of the development to a trusted third party, please get in touch with our team of experienced cryptographers.