Exploring the True Reasons Behind the Enduring Innovation of Zero-Knowledge Proofs
Written by: LambdaClass
Compiled by: mutourend; Yiping, IOSG Ventures
1. Introduction
Zero-knowledge, succinct, non-interactive arguments of knowledge (zk-SNARKs) are a powerful cryptographic primitive that allows a prover to convince a verifier of the correctness of a statement without revealing any information beyond the statement itself. Due to their applications in verifiable private computation, proofs of correctness of computer program execution, and blockchain scalability, zk-SNARKs have garnered significant attention. We believe, as described in this article, that zk-SNARKs will have a profound impact on shaping the world. zk-SNARKs encompass different types of proof systems that utilize various polynomial commitment schemes, arithmetization schemes, interactive oracle proofs, or probabilistically checkable proofs. However, these fundamental ideas and concepts can be traced back to the mid-1980s.
Since the launch of Bitcoin and Ethereum, the development of zk-SNARKs has accelerated significantly, as they can scale by using zero-knowledge proofs (often referred to as proofs of validity for this specific use case). zk-SNARKs play a crucial role in blockchain scalability. As Ben-Sasson noted, there has been a Cambrian explosion of cryptographic proofs in recent years. Each proof system has its advantages and disadvantages, and specific trade-offs were considered during their design. Advances in hardware, algorithms, new proofs, and tools continuously improve performance and give rise to new systems. Many of these systems are already in use, and we are constantly pushing the boundaries. Will we have a universal proof system suitable for all applications, or will there be several systems tailored to different needs? We believe the likelihood of a single proof system dominating all others is low, for reasons including:
1) Diversity of applications.
2) Different types of constraints (regarding memory, verification time, proof time).
3) Demand for robustness (if one proof system is compromised, there are others).
Even as proof systems have changed significantly, they all provide an important feature: proofs can be verified quickly. Having a layer that verifies proofs and can easily adapt to new proof systems addresses the difficulties associated with changing the base layer (such as Ethereum). This article will outline the different characteristics of SNARKs:
1) Cryptographic assumptions: collision-resistant hash functions, discrete logarithm problem on elliptic curves, knowledge of exponent.
2) Transparent vs trusted setup.
3) Proof time: linear vs superlinear.
4) Verifier time: constant time, logarithmic, sublinear, linear.
5) Proof size.
6) Ease of recursion.
7) Arithmetization schemes.
8) Univariate polynomials vs multivariate polynomials.
This article will explore the origins of SNARKs, some fundamental building blocks, and the rise (and fall) of different proof systems. It does not intend to provide an exhaustive analysis of proof systems. Instead, it focuses only on those that have impacted us. Of course, these developments would not have been possible without the great work and ideas of pioneers in the field.
2. Basics
Zero-knowledge proofs are not new. Definitions, foundations, important theorems, and even significant protocols have been established since the mid-1980s. Some key ideas and protocols used to construct modern SNARKs were proposed in the 1990s (the sumcheck protocol) and even before the advent of Bitcoin (the GKR protocol in 2007). The main issues with using them were related to the lack of strong use cases (the internet was not well-developed in the 1990s) and the computational power required.
1) Zero-Knowledge Proofs: Origins (1985/1989).
The field of zero-knowledge proofs emerged in the academic literature with the paper "The Knowledge Complexity of Interactive Proof Systems" by Goldwasser, Micali, and Rackoff. For discussions on the origins, you can watch the video ZKP MOOC Lecture 1: Introduction and History of ZKP from January 2023. This paper introduced the concepts of completeness, soundness, and zero-knowledge, providing constructions for quadratic residuosity and quadratic non-residuosity.
2) Sumcheck Protocol (1992).
The sumcheck protocol was proposed by Lund, Fortnow, Karloff, and Nisan in their 1992 paper "Algebraic Methods for Interactive Proof Systems." It is one of the most important building blocks for succinct interactive proofs. It helps reduce the requirement of summing evaluations of multivariate polynomials to a single evaluation at a randomly chosen point.
3) Goldwasser-Kalai-Rothblum (GKR) (2007).
The GKR protocol (see the paper "Delegating Computation: Interactive Proofs for Muggles") is an interactive protocol where the prover runs in linear time based on the number of gates in the circuit, while the verifier runs in sublinear time based on the size of the circuit. In the protocol, the prover and verifier agree on a fan-in-two circuit over a finite field of depth d, where layer d corresponds to the input layer and layer 0 corresponds to the output layer. The protocol starts with a statement about the circuit's output, which is simplified to a statement about the values of the previous layer. Using recursion, it can be transformed into a statement about the circuit's inputs, making it easy to check. These reductions are achieved through the sumcheck protocol.
4) KZG Polynomial Commitment Scheme (2010).
In 2010, Kate, Zaverucha, and Goldberg introduced a polynomial commitment scheme using bilinear pairing groups in their paper "Constant-Size Commitments to Polynomials and Their Applications." The commitment consists of a single group element, allowing the committer to efficiently open the commitment for any correct evaluation of the polynomial. Additionally, due to batching techniques, multiple evaluations can be opened. KZG commitments provide one of the fundamental building blocks for several efficient SNARKs, such as Pinocchio, Groth16, and Plonk. It is also central to EIP-4844. For an intuitive understanding of batching techniques, refer to the Mina to Ethereum ZK bridge.
3. Practical SNARKs Using Elliptic Curves
The first practical constructions of SNARKs emerged in 2013. These constructions require a preprocessing step to generate proof and verification keys and are specific to programs/circuits. These keys can be quite large and depend on secret parameters unknown to the parties; otherwise, proofs can be forged. Converting code into provable code requires compiling the code into a polynomial constraint system. Initially, this had to be done manually, which was time-consuming and error-prone. Advances in the field have sought to eliminate some major issues:
1) Having more efficient provers.
2) Reducing the amount of preprocessing.
3) Having universal rather than circuit-specific setups.
4) Avoiding the use of trusted setups.
5) Developing methods to describe circuits using high-level languages instead of manually writing polynomial constraints.
Currently, practical SNARK schemes using elliptic curves include:
1) Pinocchio (2013)
2) Groth 16 (2016)
3) Bulletproofs & IPA (2016)
4) Sonic, Marlin, and Plonk (2019)
5) Lookups (2018/2020)
6) Spartan (2019)
7) HyperPlonk (2022)
8) Folding schemes (2008/2021)
3.1 Pinocchio (2013)
Pinocchio (see the paper "Pinocchio: Nearly Practical Verifiable Computation") is the first practical, usable zk-SNARK. The SNARK is based on quadratic arithmetic programs (QAPs). The proof size was initially 288 bytes. The toolchain for Pinocchio provides a compiler from C code to arithmetic circuits and further transforms it into QAP. The protocol requires the verifier to generate circuit-specific keys. It uses elliptic curve pairings to check equations. The asymptotics of proof generation and key setup are linear with respect to the computational size, while verification time is linear with respect to the size of public inputs and outputs.
3.2 Groth 16 (2016)
The 2016 paper by Groth, "On the Size of Pairing-based Non-interactive Arguments," introduced a new proof system that improved the performance of problems described by R1CS. It has the smallest proof size (only three group elements) and fast verification involving three pairings. It also involves a preprocessing step to obtain a structured reference string. The main drawback is that each program to be proven requires a different trusted setup, which is inconvenient. Groth16 has been used in ZCash. More details can also be found in the blog "An overview of the Groth 16 proof system."
3.3 Bulletproofs & IPA (2016)
One of the weaknesses of KZG PCS is that it requires a trusted setup. In the 2016 paper "Efficient Zero-Knowledge Arguments for Arithmetic Circuits in the Discrete Log Setting," Bootle et al. introduced an efficient zero-knowledge proof system that satisfies inner product relations with Pedersen commitment openings. The inner product proof system has a linear prover, logarithmic communication, and interaction, but linear verification time. It also developed a polynomial commitment scheme that does not require a trusted setup. Halo 2 and Kimchi both adopt the IPA PCS ideas.
3.4 Sonic, Marlin, and Plonk (2019)
Sonic, Plonk, and Marlin addressed the trusted setup issue for each program in Groth16 by introducing universal and updatable structured reference strings. Marlin provides an R1CS-based proof system and is central to Aleo.
Plonk introduced a new arithmetization scheme (later referred to as Plonkish) and uses grand-product checks for copy constraints. Plonkish also allows for the introduction of specialized gates for certain operations, known as custom gates. Several projects have custom versions of Plonk, including Aztec, ZK-Sync, Polygon ZKEVM, Mina's Kimchi, Plonky2, Halo 2, and Scroll. For more information, see the blog "All you wanted to know about Plonk."
3.5 Lookups (2018/2020)
Gabizon and Williamson introduced plookup in 2020, using grand product checks to prove that a value is contained in a precomputed value table. Although lookup arguments were previously proposed in Arya, their construction required determining the multiplicities of lookups, which was inefficient. The PlonkUp paper demonstrated how to introduce plookup arguments into Plonk. The issue with these lookup arguments is that they force the prover to pay for the entire table regardless of the number of lookups performed. This means the cost for large tables is quite substantial, and significant effort has been invested to reduce the prover's cost to the number of lookups used.
Haböck introduced LogUp, which uses logarithmic derivatives to convert product checks into sums of reciprocals. LogUp is crucial for the performance of Polygon's plonky2 ZKEVM (Beyond Limits: Pushing the Boundaries of ZK-EVM), which requires splitting the entire table into multiple STARK modules. These modules must be correctly linked and cross-table lookups are needed to reinforce this operation. The introduction of LogUp-GKR leverages the GKR protocol to enhance the performance of LogUp. Caulk is the first lookup argument where prover time is sublinear with respect to table size, with preprocessing time of O(N log N) and storage of O(N), where N is the table size. Several other schemes have since emerged, such as Baloo, flookup, cq, and caulk+. Lasso proposed several improvements, avoiding committing to the table when it has a given structure. Additionally, the prover in Lasso only pays for the table entries accessed during lookup operations. Jolt uses Lasso to prove the execution of virtual machines through lookups.
3.6 Spartan (2019)
Spartan provides an IOP for circuits described by R1CS, leveraging the properties of multivariate polynomials and the sumcheck protocol. Using an appropriate polynomial commitment scheme, it produces a transparent SNARK with a linear-time prover.
3.7 HyperPlonk (2022)
HyperPlonk is built on the Plonk idea using multivariate polynomials. It does not rely on quotients to check the execution of constraints but instead relies on the sumcheck protocol. It also supports high-degree constraints without compromising the prover's runtime. Since it relies on multivariate polynomials, there is no need to perform FFT, and the prover's runtime is linear with respect to the circuit size. HyperPlonk introduces new permutation IOPs suitable for smaller fields and a sumcheck-based batch opening protocol, which reduces the prover's workload, proof size, and verifier time.
3.8 Folding schemes (2008/2021)
Nova introduced the idea of folding schemes, a new method for achieving incremental verifiable computation (IVC). The concept of IVC can be traced back to Valiant, who demonstrated how to merge two proofs of length k into a proof of length k. The idea is that one can prove the correctness of the execution from step i to step i+1 through recursive proofs, and verify that the transition proof from step i-1 to step i is correct, thereby proving any long-running computation.
Nova handles uniform computations well; later, with the introduction of Supernova, it was extended to handle different types of circuits. Nova uses a relaxed version of R1CS and operates over friendly elliptic curves. Using friendly curve cycles (such as the Pasta curve) to implement IVC is also utilized in Pickles, which is a core module of Mina for achieving succinct state. However, the idea of folding differs from recursive SNARK verification. The concept of accumulators has a deeper connection to the concept of batching proofs. Halo introduced the concept of accumulation as an alternative to recursive proof composition. Protostar provides a non-uniform IVC scheme for Plonk, supporting high-degree gates and vector lookups.
4. SNARKs Using Collision-Resistant Hash Functions
Around the same time as the development of Pinocchio, some ideas for generating circuits/arithmetic schemes emerged to prove the correctness of virtual machine execution. Although developing arithmetic for virtual machines may be more complex or less efficient than writing dedicated circuits for certain programs, it offers the advantage that any program, no matter how complex, can be proven correct by demonstrating its execution in a virtual machine. The ideas in TinyRAM were later improved with the design of the Cairo VM and subsequent virtual machines (such as zk-evms or general zkvms). Using collision-resistant hash functions eliminates the need for trusted setups or elliptic curve operations, but at the cost of longer proofs.
1) TinyRAM (2013)
In "SNARKs for C," a PCP-based SNARK was developed to prove the correctness of C program execution, which was compiled into TinyRAM (a reduced instruction set computer). This computer adopts a Harvard architecture with byte-addressable random access memory. Utilizing non-determinism, the size of the circuit is quasilinear in relation to the computational size, effectively handling arbitrary and data-dependent loops, control flow, and memory access.
2) STARKs (2018)
STARKs were proposed by Ben Sasson et al. in 2018. Their proof size is O(log²n), with fast provers and verifiers, requiring no trusted setup, and are considered post-quantum secure. They were first used by Starkware/Starknet alongside the Cairo virtual machine. Key components include:
Algebraic Intermediate Representation (AIR)
and the FRI protocol (Fast Reed-Solomon Interactive Oracle Proof of Proximity).
STARKs are also used by other projects (Polygon Miden, Risc0, Winterfell, Neptune) or some adaptations of them (ZK-Sync's Boojum, Plonky2, Starky).
3) Ligero (2017)
Ligero introduced a proof system with a proof size of O(√n), where n is the circuit size. It arranges polynomial coefficients in matrix form and uses linear codes.
Brakedown builds on Ligero and introduces the idea of a domain-independent polynomial commitment scheme.
5. Some New Developments in ZKP
The use of different proof systems in production has shown the advantages of each approach and led to new developments. For example, plonkish arithmetization provides a straightforward way to include custom gates and lookup arguments; FRI has demonstrated excellent performance as a PCS, outperforming Plonky. Similarly, using grand product checks in AIR (leading to randomized AIR with preprocessing) has improved its performance and simplified memory access arguments. Hash function-based commitments have become popular—leveraging the speed of hash functions in hardware or introducing new SNARK-friendly hash functions.
1) New Polynomial Commitment Schemes (2023)
With the emergence of efficient SNARKs based on multivariate polynomials (such as Spartan or HyperPlonk), there is increasing interest in new commitment schemes suitable for such polynomials. Binius, Zeromorph, and Basefold have all proposed new forms dedicated to multilinear polynomials. Binius has the advantage of zero overhead for representing data types (where many proof systems use at least 32-bit field elements to represent a single bit) and works over binary fields. Binius commitments are based on Brakedown, designed to be domain-independent. Basefold extends FRI to codes beyond Reed-Solomon, forming a domain-independent PCS.
2) Customizable Constraint Systems (CCS) (2023)
CCS generalizes R1CS while capturing R1CS, Plonkish, and AIR arithmetics without any overhead. Combining CCS with Spartan IOP produces SuperSpartan, which supports high-degree constraints without the prover incurring cryptographic costs that scale with constraint degree. In particular, SuperSpartan generates SNARKs for AIR with linear-time provers.
6. Conclusion
This article has described the progress of SNARKs since their introduction in the mid-1980s. Advances in computer science, mathematics, and hardware, coupled with the introduction of blockchain, have given rise to new, more efficient SNARKs, opening doors to many applications that could change our society. Researchers and engineers have proposed improvements and adjustments to SNARKs based on their needs, focusing on proof size, memory usage, transparent setups, post-quantum security, prover time, and verifier time.
While there were initially two main lines (SNARK and STARK), the boundaries between the two have begun to blur, attempting to combine the advantages of different proof systems. For example, combining different arithmetization schemes with new polynomial commitment schemes. It is expected that new proof systems will continue to emerge, with performance improving accordingly, while some systems that require time to adapt will struggle to keep up with these developments unless these tools can be easily utilized without altering some core infrastructure.