By Fredrik Dahlgren, Principal Security Engineer
TL;DR: We have released version 0.8.0 of Circomspect, our static analyzer and linter for Circom. Since our initial release of Circomspect in September 2022, we have added five new analysis passes, support for tags, tuples, and anonymous components, links to in-depth descriptions of each identified issue, and squashed a number of bugs. Please download the new version and tell us what you think!
New analysis passes
The new analysis passes, added to the tool’s initial nine, check for a range of issues that could occur in Circom code:
- Failure to properly constrain intermediate signals
- Failure to constrain output signals in instantiated templates
- Failure to constrain divisors in division operations to nonzero values
- Use of BN254-specific templates from Circomlib with a different curve
- Failure to properly constrain inputs to Circomlib’s
LessThan
circuit
Apart from finding the issue related to the Circomlib LessThan circuit discussed below, these analysis passes would also have caught the “million dollar ZK bug” recently identified by Veridise in the circom-pairing
library.
To understand the types of issues that Circomspect can identify, let’s dig into the final example in this list. This analysis pass identifies an issue related to the LessThan
circuit implemented by Circomlib, the de facto standard library for Circom. To fully understand the issue, we first need to take a step back and understand how signed values are represented by Circom.
Signed arithmetic in GF(p)
Circom programs operate on variables called signals, which represent elements in the finite field GF(p)
of integers modulo a prime number p
. It is common to identify the elements in GF(p)
with the unsigned integers in the half-open interval [0, p)
. However, it is sometimes convenient to use field elements to represent signed quantities in the same way that we may use the elements in [0, 232)
to represent signed 32-bit integers. Mirroring the definition for two’s complement used to represent signed integer values, we define val(x)
as follows:
We then say that a is less than b in GF(p)
if val(a) < val(b)
as signed integers. This means that q = floor(p/2)
is the greatest signed value representable in GF(p)
, and that -q = q + 1
is the least signed value representable in GF(p)
. It also means, perhaps somewhat surprisingly, that q + 1
is actually less than q
. This is also how the comparison operator <
is implemented by the Circom compiler.
As usual, we say that a
is positive if a > 0
and negative if a < 0
. One way to ensure that a value a is nonnegative is to restrict the size (in bits) of the binary representation of a
. In particular, if the size of a is strictly less than log(p) - 1
bits, then a
must be less than or equal to q
and, therefore, nonnegative.
Circomlib’s ‘LessThan’ template
With this out of the way, let’s turn our attention to the LessThan template defined by Circomlib. This template can be used to constrain two input signals a
and b
to ensure that a < b
, and is implemented as follows:
Looking at the implementation, we see that it takes an input parameter n and two input signals in[0]
and in[1]
, and it defines a single output signal out. Additionally, the template uses the Num2Bits
template from Circomlib to constrain the output signal out.
The Num2Bits
template from Circomlib takes a single parameter n
and can be used to convert a field element to its n-bit binary representation, which is given by the array out of size n
. Since the size of the binary representation is bounded by the parameter n, the input to Num2Bits
is also implicitly constrained to n
bits. In the implementation of LessThan
above, the expression (1 << n) + in[0] - in[1]
is passed as input to Num2Bits
, which constrains the absolute value |in[0] - in[1]|
to n
bits.
To understand the subtleties of the implementation of the LessThan
template, let’s first consider the expected use case when both inputs to LessThan
are at most n
bits, where n
is small enough to ensure that both inputs are nonnegative.
We have two cases to consider. If in[0] < in[1]
, then in[0] - in[1]
is a negative n-
bit value, and (1 << n) + in[0] - in[1]
is a positive n-bit value. It follows that bit n
in the binary representation of the input to Num2Bits
is 0
, and thus out must be equal to 1 - 0 = 1
.
On the other hand, if in[0] ≥ in[1]
, then in[0] - in[1]
is a nonnegative n-bit value (since both inputs are positive), and (1 << n) + in[0] - in[1]
is a positive (n + 1)
-bit value with the most significant bit equal to 1
, It follows that bit n
in the binary representation of the input to Num2Bits
is 1
, and out must be given by 1 - 1 = 0
.
This all makes sense and gives us some confidence if we want to use LessThan
for range proofs in our own circuits. However, things become more complicated if we forget to constrain the size of the inputs passed to LessThan
.
Using signals to represent unsigned quantities
To describe the first type of issue that may affect circuits defined using LessThan
, consider the case in which signals are used to represent unsigned values like monetary amounts. Say that we want to allow users to withdraw funds from our system without revealing sensitive information, like the total balance belonging to a single user or the amounts withdrawn by users. We could use LessThan
to implement the part of the circuit that validates the withdrawn amount against the total balance as follows:
Now, suppose that a malicious user with a zero balance decides to withdraw p - 1
tokens from the system, where p
is the size of the underlying prime field. Clearly, this should not be allowed since p - 1
is a ridiculously large number and, in any case, the user has no tokens available for withdrawal. However, looking at the implementation of LessThan
, we see that in this case, the input to Num2Bits
will be given by (1 << 64) + (p - 1) - (0 + 1) = (1 << 64) - 2
(as all arithmetic is done modulo p
). It follows that bit 64 of the binary representation of the input will be 0
, and the output from LessThan
will be 1 - n2b.out[64] = 1 - 0 = 1
. This also means that ValidateWithdrawal
will identify the withdrawal as valid.
The problem here is that p - 1
also represents the signed quantity –1
in GF(p)
. Clearly, -1
is less than 1
, and we have not constrained the withdrawn amount to be nonnegative. Adding a constraint restricting the size of the amount to be less than log(p) - 1
bits would ensure that the amount must be positive, which would prevent this issue.
More generally, since the input parameter n
to LessThan
restricts only the size of the difference |in[0] - in[1]|
, we typically cannot use LessThan
to prevent overflows and underflows. This is a subtle point that many developers miss. As an example, consider the section on arithmetic overflows and underflows from the zero-knowledge (ZK) bug tracker maintained by 0xPARC. In an earlier version, 0xPARC suggested using LessThan
to constrain the relevant signals in an example that was almost identical to the vulnerable ValidateWithdrawal
template defined above!
Another vulnerability of this type was found by Daira Hopwood in an early version of the ZK space-conquest game Dark Forest. Here, the vulnerability allowed users to colonize planets far outside the playing field. The developers addressed the issue by adding a range proof based on the Num2Bits
template that restricted the size of the coordinates to 31 bits.
Using signals to represent signed quantities
Now, suppose signals are used to represent signed quantities. In particular, let’s consider what would happen if we passed q = floor(p/2)
and q + 1
as inputs to LessThan
. We will show that even though q + 1 < q
according to the Circom compiler, q
is actually less than q + 1
according to LessThan
. Returning to the input to Num2Bits
in the definition of LessThan
, we see that if in[0] = q
and in[1] = q + 1
, the input to Num2Bits is given by the following expression:
(1 << n) + in[0] - in[1] = (1 << n) + q - (q + 1) = (1 << n) - 1
It follows that the n
th bit in the binary representation of this value is 0
, and the output from LessThan
is 1 - n2b.out[n] = 1 - 0 = 1
. Thus, q < q + 1
according to LessThan
, even though q + 1 < q
according to the compiler!
It is worth reiterating here that the input parameter n
to LessThan
does not restrict the size of the inputs, only the absolute value of their difference. Thus, we are free to pass very large (or very small) inputs to LessThan
. Again, this issue can be prevented if the size of both of the inputs to the LessThan
template are restricted to be less than log(p) - 1
bits.
Circomspect to the rescue (part 1)
To find issues of this type, Circomspect identifies locations where LessThan is used. It then tries to see whether the inputs to LessThan are constrained to less than log(p) - 1
bits using the Num2Bits
template from Circomlib, and it emits a warning if it finds no such constraints. This allows the developer (or reviewer) to quickly identify locations in the codebase that require further review.
As shown in the screenshot above, each warning from Circomspect will now typically also contain a link to a description of the potential issue, and recommendations for how to resolve it.
Circomspect to the rescue (part 2)
We would also like to mention another of our new analysis passes. The latest version of Circomspect identifies locations where a template is instantiated but the output signals defined by the template are not constrained.
As an example, consider the ValidateWithdrawal
template introduced above. Suppose that we rewrite the template to include range proofs for the input signals amount
and total
. However, during the rewrite we accidentally forget to include a constraint ensuring that the output from LessThan
is 1
. This means that users may be able to withdraw amounts that are greater than their total balance, which is obviously a serious vulnerability!
There are examples (like Num2Bits
) in which a template constrains its inputs and no further constraints on the outputs are required. However, forgetting to constrain the output from a template generally indicates a mistake and requires further review to determine whether it constitutes a vulnerability. Circomspect will flag locations where output signals are not constrained to ensure that each location can be manually checked for correctness.
Let’s talk!
We at Trail of Bits are excited about contributing to the growing range of tools and libraries for ZK that have emerged in the last few years. If you are building a project using ZK, we would love to talk to you to see if we can help in any way. If you are interested in having your code reviewed by us, or if you’re looking to outsource parts of the development to a trusted third party, please get in touch with our team of experienced cryptographers.