The increased adoption of financial blockchains has fueled a lot of cryptography research in recent years. One area of high interest is transaction confidentiality which requires hiding investors’ account balances and transaction amounts, while enforcing compliance rules and performing validity checks on all activities.
This blog post will look at the Zether [2] protocol, which uses ElGamal public key encryption to hide transaction amounts and utilizes zero-knowledge proofs to demonstrates the validity of a transaction to stakeholders, namely network validators, investors, and auditors.
At a high level, with this formulation, Alice (sender) transfers x assets to Bob (receiver), by encrypting x with her own and Bob’s public keys, and including CiphertextAlice(x) and CiphertextBob(x) in the transaction. She then, without revealing x, cryptographically proves that CiphertextAlice(x) encrypts a value less than or equal to her account balance. She also proves that CiphertextAlice(x) and CiphertextBob(x) encrypt the same value. When the network validators receive the transaction, they verify the zero-knowledge proofs, and if successful, subtract CiphertextAlice(x) from Alice’s balance and add CiphertextBob(x) to Bob’s balance.
Let’s take a look at the cryptographic components before we discuss the implementation details. We first explain Pedersen Commitment and its role in ElGamal encryption. We then briefly examine relevant zero-knowledge proofs.
In cryptographic protocols there is often a need for a party to prove the knowledge of a privately chosen value, without initially revealing it or being able to change it at a later stage. More specifically, commitment schemes must hide the value, and bind to it such that the committer cannot change the value without changing the commitment.
Let be a cyclic group of prime order p (), for which the decisional Diffie-Hellman (DDH) assumption holds, and let g and h be generators of group . The DDH assumption states that for , the tuple (g, ga, gb, ga.b) is indistinguishable from (g, ga, gb, gc), given that the discrete logarithm is a hard problem.
To commit to the value v in , the committer uniformly samples a random number, r in , and publishes commitment(v, r) = gvhr. This commitment is binding under the DDH assumption since the probability that the committer can compute a (v’, r’) != (v, r) such that commitment(v, r) = commitment(v’, r’) is low. It is also hiding meaning that it is computationally infeasible to calculate v from the commitment, since (with equal probability) it could be any member of .
A key property of this definition is that multiplying commitments to 2 values, (v1, r1) and (v2, r2), results in a commitment to the sum of those values.
This property is referred to as additive homomorphism. It means that without knowledge of the 2 values, (v1, r1) and (v2, r2), one can multiply their corresponding commitments to calculate a commitment to their sum, (v1+v2, r1+r2).
Considering the same group , we can define the ElGamal encryption scheme as follows:
Key Generation: A secret key (x) is randomly sampled from , and the public key (y) is derived as:
Encryption: Knowing the public key (y), anyone can randomly choose a blinding factor (r) and encrypt the value b for the secret key owner:
Decryption: Knowing the secret key, first hb can be recovered as:
Since discrete logarithm is assumed to be hard on a classical computer, there is no solution to calculate b from hb efficiently. Therefore, b is computed by brute-forcing every possible value in hvalue to find the one that matches with hb. This variation of ElGamal encryption scheme is sometimes referred to as the twisted ElGamal encryption [1] or message in the exponent ElGamal encryption.
If you look closely at the second term of the ciphertext (Y), it is a Pedersen commitment to the plaintext and the blinding factor:
As a result one can show that ElGamal encryption is additively homomorphic, meaning that if Ciphertext1 encrypts b1 and Ciphertext2 encrypts b2, to the public key y, encrypts b1 + b2.
Homomorphic encryption has been used to preserve users’ privacy in many applications. Besides its’ use in confidential financial blockchains, it has also been used in collecting medical data to encrypt patients’ biometrics and hide their identity, while keeping the dataset searchable [11].
Another advantage of this property of the ElGamal ciphertexts is that we now can leverage the zero-knowledge proofs that are developed for Pedersen commitments to prove properties of the encrypted value.
In cryptographic applications there is often a need to prove knowledge of a secret, or some property about it, to a verifier without leaking any additional information. This method is called a zero-knowledge proof. It must be complete, sound, and reveal no information (zero knowledge) about the secret, meaning that an honest verifier will be convinced that the prover has the knowledge of the secret (completeness), and a dishonest prover cannot mislead an honest verifier to believe, with non-negligible probability, that they know the secret (soundness).
Let’s first look at a class of zero-knowledge proofs called (Sigma) protocols [6], which consist of 3 rounds of interactions between the prover and the verifier: 1. commitment phase, 2. challenge phase, and 3. response phase. These proofs can be used to show the knowledge of a discrete logarithm, e.g. plaintext for a twisted ElGamal ciphertext.
For example, the following Sigma protocol proves that 2 ciphertexts encrypt the same value to 2 different public keys. This is useful in the context of transaction confidentiality as described in the introduction, where the transaction creator must prove that the number of withdrawn assets (v) from the sender’s account, with public key y1, is equal to the deposited assets into receiver’s account, with public key y2. So it essentially proves that Ciphertext1 = (X1, Y) = (y1r, hvgr) encrypts the same value as Ciphertext2 = (X2, Y) = (y2r, hvgr):
For similar zero-knowledge proofs, regarding ElGamal ciphertexts, see reference [1].
Another class of useful proofs are Bulletproofs [3] which are non-interactive and are used to prove a ciphertext has a specific form, without requiring a trusted setup like zk-SNARKs do (references [7] and [10] give nice overviews of the trusted setup for zk-SNARKs). Bulletproofs’ range proof can be used to show that an ElGamal ciphertext encrypts a value that is within the range of positive numbers. We can utilize it to prove that the sender’s remaining balance is non-negative, and therefore they are not double spending. Although the initial proposal was not as efficient as zk-SNARKs, recently researchers have found ways to aggregate proofs in a transaction to make them smaller and the verification faster. Focusing on performance is key for adoption of zero-knowledge proofs in blockchain applications, as they need to process thousands of transactions per second in order to handle the traffic in real world financial applications. A popular implementation of Bulletproofs over Ristretto and curve-25519 is open-sourced by the Dalek project [4].
This approach to transaction confidentiality is at the core of a few blockchain protocols such as Zether [2], MERCAT [1], and PGC [5]. Its main advantage over zk-SNARKs [7] is that it does not require an initial setup phase to generate a master key [9]. It also has small ciphertexts and proof sizes. Proof validation is fast and researchers are actively looking into ways to aggregate Bulletproofs, so even faster batch verification might be possible. However, there are some challenges in using Zether in real-world applications. The following is a non-exhaustive list of important aspects that should be considered.
In this blog post we explored the possibility of utilizing ElGamal encryption and Zero-Knowledge proofs for transaction confidentiality. While homomorphic additivity properties of the ElGamal encryption make it possible for the network validators to maintain account states, it has some pitfalls that must be carefully considered in practice.