Why is the "small table mode" zkEVM considered more efficient?

Fox Tech
2023-04-07 14:19:01
The "small table mode" zkEVM not only ensures that native Solidity Ethereum developers can seamlessly migrate to zkEVM, but also significantly reduces the redundant costs incurred when encapsulating EVM into the ZK proof system.

Author: Fox Tech CEO Kang Shuiyue, Fox Tech CTO Lin Yanxi

Introduction: The Ethereum Virtual Machine (EVM) is a code execution environment built on the Ethereum blockchain, inside which contract code runs in complete isolation. Its main function is to handle smart contracts within the Ethereum system. Ethereum is said to be Turing complete because developers can use the Solidity language to create applications that run on the EVM and compute anything computable. However, Turing completeness is not enough; people are also trying to encapsulate the EVM within ZK proof systems, and the problem is that this encapsulation generates a great deal of redundancy. The "Small Table Model" zkEVM invented by Fox significantly reduces the redundancy costs incurred when encapsulating the EVM in a ZK proof system, while ensuring that native Solidity Ethereum developers can migrate to zkEVM seamlessly.

Since its inception in 2015, the EVM has been undergoing an epic ZK transformation. This major transformation has two main directions.

The first direction is the so-called zkVM track, which is dedicated to optimizing application performance; compatibility with the Ethereum Virtual Machine is not its primary concern. There are two sub-directions here: one is to create a Domain-Specific Language (DSL), as StarkWare has done in promoting its Cairo language, which is no easy task. The other is to aim for compatibility with existing mature languages, as RISC Zero does in making its zkVM compatible with C++/Rust. The challenge of this track lies in the complexity of the constraints generated by introducing an Instruction Set Architecture (ISA).

The second direction is the so-called zkEVM track, which focuses on compatibility at the level of EVM bytecode: EVM code at the bytecode level and above generates corresponding zero-knowledge proofs through the zkEVM. This allows native Solidity Ethereum developers to migrate to zkEVM at no cost. The main competitors in this track include Polygon zkEVM, Scroll, Taiko, and Fox. The challenge of this track is to manage the redundancy costs generated when encapsulating the EVM, which was never designed with ZK proof systems in mind. After a long period of contemplation and validation, Fox finally found the key to fundamentally reducing the enormous redundancy of the first generation zkEVM: the "Small Table Model" zkEVM.

Data and proof circuits are the two core elements for generating proofs in zkEVM. On one hand, in zkEVM, the prover needs all the data involved in the transactions to prove that the state transitions resulting from the transactions are correct, while the amount of data in the EVM is large and complex in structure. Therefore, how to organize and arrange the data required for the proof is a critical consideration in building an efficient zkEVM. On the other hand, how to efficiently prove (or verify) the validity and correctness of the computation execution through a series of circuit constraints is fundamental to ensuring the security of zkEVM.

We will first discuss the second issue, as it is a concern for all teams designing zkEVM. The essence of this issue is "What exactly do we want to prove?" Currently, the thought process regarding this question is similar among everyone. Since a transaction (or its involved op-code) can vary widely, it seems unrealistic to directly prove that each step of the operation leads to a correct state change in sequence. Therefore, we need to categorize the proofs.

Figure 1: Two generations of zkEVM solutions: Large Table and Small Table

For example, we can group all changes to stack elements together and prove them with a dedicated stack circuit, handle pure arithmetic operations with a dedicated arithmetic circuit, and so on. This way, the situations that each circuit needs to consider become relatively simple. These functional circuits have different names in different zkEVMs: some call them circuits directly, while others call them (sub) state machines, but the essence of the idea is the same.

To clarify the significance of this approach, let's take an example. Suppose we want to prove an addition operation (taking the top two elements from the stack and placing their sum back on top of the stack):

Assuming the original stack is [1,3,5,4,2]

If we do not categorize and split, we would need to prove that after performing the above operation, the stack becomes [1,3,5,6].

However, if we categorize and split, we only need to prove the following (a runnable sketch follows the list):

  • Stack circuit:
      • C1: Prove that [1,3,5,4,2] becomes [1,3,5] after popping 2 and 4.
      • C2: Prove that [1,3,5] becomes [1,3,5,6] after pushing 6.
  • Arithmetic circuit:
      • C3: Given a=2, b=4, c=6, prove that a+b=c.
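
To make this concrete, here is a minimal Python sketch of the three sub-claims. It is illustrative only: a real zkEVM expresses each claim as polynomial constraints over committed witness values, and the function name split_add is a hypothetical name of ours, not Fox's code.

```python
# Illustrative sketch only: a real zkEVM encodes each claim as polynomial
# constraints, not Python assertions. All names here are hypothetical.

def split_add(stack):
    """Decompose one ADD step into the sub-claims C1, C2, C3."""
    # C1 (stack circuit): pop the top two elements.
    a = stack.pop()                 # 2 (top of stack)
    b = stack.pop()                 # 4
    after_pop = list(stack)         # [1, 3, 5]

    # C3 (arithmetic circuit): the only arithmetic fact to prove.
    c = a + b                       # in a real prover, c is a committed witness
    assert a + b == c               # stands in for the constraint a + b = c

    # C2 (stack circuit): push the result back.
    stack.append(c)                 # [1, 3, 5, 6]

    return {"C1": (a, b, after_pop), "C2": (c, list(stack)), "C3": (a, b, c)}

claims = split_add([1, 3, 5, 4, 2])
print(claims["C2"])                 # (6, [1, 3, 5, 6])
```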

It is worth noting that the complexity of the proof is related to the number of various situations that the circuit needs to consider. If we do not categorize and split, the possibilities that the circuit needs to cover will be enormous.

Figure 2: The Large Table Model used in the first generation zkEVM

Once we categorize and split, the situations for each part become relatively simple, thus significantly reducing the difficulty of the proof.

However, categorization and splitting also bring other issues, namely the data consistency problem between different category circuits. For example, in the above case, we actually need to prove the following two things:

  • C4: "The number popped out in C1" = "a and b in C3"
  • C5: "The number pushed in C2" = "c in C3"

To solve this problem, we return to the first question: how do we organize the data involved in the transactions? Let's continue to explore this topic:

An intuitive method is as follows: through tracing, we can break down every step involved in all transactions, know the data involved, and request the portion of data not included in the trace from the nodes. We then arrange it into a large table T as follows:

"First operation" "Data involved in the first operation"

"Second operation" "Data involved in the second operation"


"n-th operation" "Data involved in the n-th operation"

In this way, in the above example, we would have a record of:

"k-th step: Addition" "a=2, b=4, c=6"

And the above C4 can be proven as follows:

  • C4(a): The numbers popped in C1 are consistent with the k-th row of the large table T.
  • C4(b): The a and b in C3 are consistent with the k-th row of the large table T.

C5 is similar. This operation (proving that some elements have appeared in a table) is called lookup. We will not detail the specific algorithm for lookup in this article, but it can be imagined that the complexity of the lookup operation is closely related to the size of the large table T. Therefore, we return to the first question: how do we organize the data that will be used for the proof?
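
As a toy illustration of how the large table T ties the circuits together, the following Python sketch records trace steps in one table and discharges C4 and C5 with naive membership checks. Production systems use lookup arguments (for example, variants of plookup) rather than linear scans, and the table rows and the helper lookup below are hypothetical.

```python
# Toy model of the Large Table approach. A real prover replaces the linear
# scan below with a lookup argument whose cost grows with the size of T.

# The big table T: one row per executed step, holding the op and its data.
T = [
    ("PUSH", {"value": 4}),              # hypothetical earlier steps
    ("PUSH", {"value": 2}),
    ("ADD",  {"a": 2, "b": 4, "c": 6}),  # the k-th step from the example
]

def lookup(table, op, **fields):
    """Naive stand-in for a lookup argument: cost grows with len(table)."""
    return any(o == op and all(d.get(k) == v for k, v in fields.items())
               for o, d in table)

# C4: the numbers popped by the stack circuit (and the a, b used by the
# arithmetic circuit) appear in an ADD row of T.
assert lookup(T, "ADD", a=2, b=4)

# C5: the number pushed by the stack circuit appears as c of that row.
assert lookup(T, "ADD", c=6)
```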

Figure 3: The "Small Table Model" zkEVM invented by Fox

We consider the following series of table constructions:

Table Ta:

"First operation of type a" "Data involved in the first operation of type a"

"Second operation of type a" "Data involved in the second operation of type a"


"m-th operation of type a" "Data involved in the m-th operation of type a"

Table Tb:

"First operation of type b" "Data involved in the first operation of type b"

"Second operation of type b" "Data involved in the second operation of type b"


"m-th operation of type b" "Data involved in the n-th operation of type b"

By constructing multiple small tables in this way, we can direct each lookup straight to the small table corresponding to the operation type of the required data, thus greatly improving efficiency.

A simple example (assuming we can only lookup one element at a time) is that if we want to prove that the letters a~h all exist in [a,b,c,d,e,f,g,h], we would need to perform 8 lookups on a table of size 8. However, if we divide the table into [a,b,c,d] and [e,f,g,h], we only need to perform 4 lookups on each of the two tables of size 4!
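
Under the simplifying assumption that a single lookup costs work proportional to the size of the table being searched (real lookup arguments have subtler cost models), the following sketch reproduces the a~h example and shows the saving from splitting; the helper lookup_cost is hypothetical.

```python
# Simplified cost model: each lookup costs len(table) units of work.
# Real lookup arguments (e.g. plookup) have subtler costs, but the trend
# -- smaller tables mean cheaper lookups -- is the same.

def lookup_cost(queries, table):
    assert all(q in table for q in queries)  # the proofs must still succeed
    return len(queries) * len(table)         # one table scan per query

letters = list("abcdefgh")

# Large table: 8 lookups against one table of size 8.
big = lookup_cost(letters, letters)          # 8 * 8 = 64

# Small tables: route each query to the table that can contain it.
t1, t2 = list("abcd"), list("efgh")
small = lookup_cost(t1, t1) + lookup_cost(t2, t2)  # 4*4 + 4*4 = 32

print(big, small)                            # 64 32: half the work
```

In this model, halving the table size halves the cost of each lookup, and routing queries by operation type means no query ever has to scan an irrelevant table.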

In the FOX layer2 zkEVM, this small-table design is used to improve efficiency. To ensure that complete proofs can be generated in every situation, the specific way the small tables are split must be carefully designed, and the key to the efficiency gain lies in how table contents are classified and how table sizes are balanced. Although implementing a complete zkEVM within this framework requires a massive amount of work, we anticipate that such a zkEVM will achieve breakthrough improvements in performance.

Conclusion: The "Small Table Model" zkEVM invented by Fox significantly reduces the redundancy costs incurred when encapsulating the EVM into a ZK proof system while ensuring that native Solidity Ethereum developers can migrate to zkEVM at no cost. This represents a significant transformation in the structure of zkEVM and will have a profound impact on Ethereum's scalability solutions.
