Temasek Web3 Fund Superscrypt: Proof of Storage Will Unlock a Large Number of Cross-Chain New Use Cases
Original Title: Blockchain Interoperability Part III: Storage Proofs, Powering new cross-chain use cases
Author: Jacob, Superscrypt
Compilation: bayemon.eth, ChainCatcher
In the second part about interoperability, we explored how consensus proofs, as an emerging trust-minimized approach, facilitate bridging between blockchains.
In this article, we will explore storage proofs, which adopt the concept of trust-minimized verification and extend it to historical transaction records on the blockchain. By verifying historical transactions and user activities through these storage proofs, a multitude of cross-chain use cases can be unlocked.
In the second part, we introduced consensus proofs, a trust-minimized method for bridging funds across blockchains. Consensus proofs suit bridging well: bridge users typically want their transactions completed without delay, and consensus proofs continuously track the latest state of the source chain as it syncs.
The concept of "trust-minimized bridging" can also be applied backward in time, using zero-knowledge proofs to verify transactions and data in old blocks. These "historical storage proofs" enable a range of cross-chain use cases. In this article, we will define storage proofs, explain how they work, and discuss their use cases.
Retrieving Historical Data
Blockchain historical data has various uses. It can prove asset ownership and record user behavior and transaction history, and this information can then be fed into on-chain smart contracts or applications. As of now, over 18 million blocks have been written to Ethereum. However, smart contracts can only access the hashes of the latest 256 blocks (roughly the last 50 minutes at Ethereum's 12-second slot time), so "historical data" here refers to information beyond the last 256 blocks.
Today, to access historical data, protocols typically query archive node providers, such as Infura, Alchemy, or other third-party indexers. This means having to trust and rely on them and their data.
[Figure: Historical Data]
However, data retrieval can be accomplished with a relatively lower level of trust by using storage proofs.
Storage proofs are zero-knowledge proofs that can verify historical data stored on the blockchain. More specifically, "storage proofs" can be used to prove that a specific state existed in a past block. They are characterized by not requiring trust in third parties or oracles, but instead embedding trust within the storage proofs themselves.
How do storage proofs help verify that certain data exists in earlier historical blocks? This involves two steps:
- Step 1: Check whether a specific block indeed exists in the chain's historical record, i.e., whether the block is a valid part of the source chain's history.
- Step 2: Check whether specific data is part of the block, i.e., whether specific transaction information is part of that block (this part of the verification can be completed through Merkle inclusion proofs).
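Step 2 can be illustrated with a minimal sketch of a Merkle inclusion proof. It rests on simplifying assumptions: a plain binary Merkle tree and SHA-256 stand in for Ethereum's Merkle-Patricia tries and Keccak-256, and the function names (`merkle_root`, `merkle_proof`, `verify_inclusion`) are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper. SHA-256 is a stand-in; Ethereum itself uses Keccak-256."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a simple binary Merkle tree over a non-empty list of leaves.

    Levels with an odd number of nodes duplicate their last node.
    """
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag is True if the sibling sits on the right."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling > index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify_inclusion(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from the leaf and its siblings, then compare."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root
```

The verifier only needs the leaf, a logarithmic number of sibling hashes, and a root it already trusts, which is why the block itself must first be shown to belong to the chain (Step 1).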
Once the recipient (such as a smart contract on the destination chain) receives and verifies the proof, it can trust the validity of the data and execute the corresponding instructions. This concept can be further extended: validated data can be used to run additional off-chain computations, which then generate another zero-knowledge proof to prove the validity of both the data and the computation.
In short, storage proofs allow for the retrieval of historical on-chain data in a trust-minimized manner. This is crucial because, as outlined in the first part, we believe that in the coming years, Web3 will become increasingly multi-chain and multi-layered. The emergence of various Layer 1s along with rollups and application chains means that users' on-chain activities may occur simultaneously across multiple chains. This further emphasizes the need for trust-minimized interoperability solutions that can maintain the composability of user assets, identities, and transaction histories across multiple domains. This is precisely the problem that storage proofs can help solve.
Use Cases for Storage Proofs
Storage proofs allow smart contracts to check any historical transaction or data as a prerequisite. This provides great flexibility for cross-chain application design.
First, storage proofs can prove any historical data on the source blockchain, such as:
- Account balances and token ownership
- User transaction activities
- Historical prices of asset trades over a specified period
- Real-time asset balances in liquidity pools across different chains
Secondly, storage proofs can be sent to the target chain, unlocking various cross-chain use cases:
- Enabling users to vote on governance proposals at a lower cost on L2
- Allowing NFT holders to mint NFTs and gain community benefits on a new chain
- Rewarding users based on their historical interactions with specific dApps (e.g., airdrops)
- Offering loans with interest rates based on users' comprehensive transaction and credit history
- Recovering dormant accounts
- Calculating historical TWAP for future trades
- Computing more accurate AMM trading prices based on liquidity pools across multiple chains
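To make the TWAP item concrete, here is a small sketch of a time-weighted average price computed from historical price observations. It assumes the observations have already been verified (for example, through storage proofs) and that each price holds until the next observation; the function name `twap` and the data layout are hypothetical.

```python
def twap(observations: list[tuple[int, float]], start: int, end: int) -> float:
    """Time-weighted average price over the window [start, end).

    observations: (timestamp, price) pairs sorted by timestamp. Each price is
    assumed to stay in effect until the next observation.
    """
    if end <= start:
        raise ValueError("end must be after start")
    if not observations or observations[0][0] > start:
        raise ValueError("no observation at or before start")
    weighted_sum = 0.0
    for i, (ts, price) in enumerate(observations):
        next_ts = observations[i + 1][0] if i + 1 < len(observations) else end
        # Clip the interval in which this price was valid to the requested window.
        lo, hi = max(ts, start), min(next_ts, end)
        if hi > lo:
            weighted_sum += price * (hi - lo)
    return weighted_sum / (end - start)
```

For example, with observations `[(0, 100.0), (10, 110.0), (30, 90.0)]`, `twap(observations, 0, 40)` weights the three prices by 10, 20, and 10 time units respectively.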
Essentially, storage proofs allow applications to query and transplant users' on-chain activities and historical records across multiple chains to provide information for smart contracts or applications on another chain.
[Figure: Storage Proofs - Use Cases]
Next, we will explain the working mechanism of storage proofs through a more detailed example.
Detailed Example of Storage Proof Mechanism Use Case
Suppose "X" is a DeFi protocol whose token lives on Ethereum. X is about to publish a governance proposal, and the project team wants to hold the vote on a lower-cost chain to make voting easier for users. Users can only vote if they held X tokens on Ethereum at a specific point in time (i.e., a snapshot, such as block #17,000,000).
How is this currently implemented?
The current method is to query archive nodes to obtain a complete list of token holders who meet the requirements in block #17,000,000. Subsequently, the DAO administrator stores this list in a smart contract on the target chain to determine the final eligible voting list. However, this method has some limitations:
- The list of voters can be very large, and it changes with each snapshot, making the on-chain storage and update costs for each voting proposal very high;
- There is implicit trust in the archive node providers and the data they provide;
- It must be ensured that members managing the DAO do not tamper with the voting list.
How Storage Proofs Achieve This
As explained in the second part, expensive computations can be delegated to off-chain zero-knowledge provers.
The zk prover will generate a concise proof and send it to the target chain for verification. Taking the above DAO voting eligibility as an example:
- The prover generates a zero-knowledge proof that proves block #17,000,000 is part of Ethereum's history (as in Step 1 above).
- After proving the validity of the block, we can use Merkle inclusion proofs to prove that the user held DAO tokens at the time the block was finalized (as in Step 2 above).
[Figure: Historical Data Proof Enables Cross-Chain Voting]
Subsequently, the proof is sent to the smart contract on the target chain for verification. If the verification is successful, the smart contract on L2 will grant the user voting rights.
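The verification on the target chain can be sketched as follows. This is an illustration under stated assumptions, not how any particular provider implements it: the zero-knowledge proof of Step 1 is modeled as a table of already-proven block roots, Step 2 is a plain Merkle inclusion check with SHA-256, and `VotingContract`, `balance_leaf`, and the leaf encoding are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


def h(data: bytes) -> bytes:
    """SHA-256 as a stand-in for Keccak-256."""
    return hashlib.sha256(data).digest()


def balance_leaf(holder: str, balance: int) -> bytes:
    """Hypothetical leaf encoding: holder address plus a 32-byte big-endian balance."""
    return holder.encode() + balance.to_bytes(32, "big")


def verify_inclusion(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from the leaf and its (sibling, sibling_is_right) path."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


@dataclass
class VotingContract:
    snapshot_block: int
    # Result of Step 1: roots of blocks already proven to belong to the source chain.
    proven_roots: dict[int, bytes] = field(default_factory=dict)
    votes: dict[str, bool] = field(default_factory=dict)

    def vote(self, holder: str, balance: int,
             proof: list[tuple[bytes, bool]], support: bool) -> None:
        root = self.proven_roots.get(self.snapshot_block)
        if root is None:
            raise PermissionError("snapshot block has not been proven")
        # Step 2: the claimed balance must be included under the proven root.
        if balance <= 0 or not verify_inclusion(balance_leaf(holder, balance), proof, root):
            raise PermissionError("no valid proof of token holdings at the snapshot")
        if holder in self.votes:
            raise PermissionError("already voted")
        self.votes[holder] = support
```

Note that the contract stores only one root per snapshot rather than a full voter list; each voter brings their own proof.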
There are several advantages to using storage proofs in this process:
- No trust in archive node providers is required;
- The protocol does not need to maintain an expensive on-chain voter list;
- Users do not need to transfer their assets to the target chain.
Setup Required for Storage Proofs
So far, we have abstracted some complexities of storage proofs. However, using storage proofs requires service providers to conduct meticulous initial setups to ensure that storage proofs can be used without trusting the providers. As part of this process, two things will be generated and stored on-chain:
- Zero-knowledge proof of the entire chain ("zk commitment"): The service provider divides all historical blocks on the source chain into contiguous, fixed-size "chunks" using a Merkle tree and generates a zero-knowledge proof for each chunk to verify the grouping. These proofs are then recursively merged until a single final zero-knowledge proof is obtained, which is the "zk commitment" to the entire chain. This proves that the provider has correctly indexed the entire chain's history.
[Figure: zk Commitment Based on Ethereum Historical Information]
- Merkle Mountain Range data structure: The provider also stores the Keccak Merkle roots of chunks of the source chain's block hashes in an on-chain data structure called a Merkle Mountain Range (MMR). This data structure is used because it is easy to query and update, allowing the provider to efficiently prove that a given block exists in the chain's history. An MMR can be built with Keccak256 hashes, Poseidon hashes, or both. Poseidon hashes are more zero-knowledge friendly, which makes it cheaper to run computations on historical data and then prove them valid with zero-knowledge proofs.
[Figure: Merkle Mountain Range (MMR)]
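The two structures described in the list above can be sketched together: block hashes are grouped into fixed-size chunks, each chunk is reduced to a Merkle root, and the chunk roots are appended to an MMR. This is a simplified sketch, assuming SHA-256 in place of Keccak256 or Poseidon, a power-of-two chunk size, and no zero-knowledge proofs; the names `chunk_root`, `MerkleMountainRange`, and `index_chain` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 as a stand-in for Keccak256 or Poseidon."""
    return hashlib.sha256(data).digest()


def chunk_root(block_hashes: list[bytes]) -> bytes:
    """Merkle root over one chunk (chunk size assumed to be a power of two)."""
    level = list(block_hashes)
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


class MerkleMountainRange:
    """Append-only accumulator made of perfect Merkle trees ("peaks")."""

    def __init__(self) -> None:
        self.peaks: list[tuple[int, bytes]] = []  # (height, hash), tallest first
        self.size = 0

    def append(self, leaf: bytes) -> None:
        self.peaks.append((0, h(leaf)))
        self.size += 1
        # Merge equal-height peaks, like carrying in binary addition.
        while len(self.peaks) >= 2 and self.peaks[-1][0] == self.peaks[-2][0]:
            height, right = self.peaks.pop()
            _, left = self.peaks.pop()
            self.peaks.append((height + 1, h(left + right)))

    def root(self) -> bytes:
        """Combine ("bag") the peaks from right to left into one commitment."""
        if not self.peaks:
            raise ValueError("empty MMR")
        acc = self.peaks[-1][1]
        for _, peak in reversed(self.peaks[:-1]):
            acc = h(peak + acc)
        return acc


def index_chain(block_hashes: list[bytes], chunk_size: int) -> MerkleMountainRange:
    """Append the root of every complete chunk; a trailing partial chunk waits."""
    mmr = MerkleMountainRange()
    full = len(block_hashes) - len(block_hashes) % chunk_size
    for start in range(0, full, chunk_size):
        mmr.append(chunk_root(block_hashes[start:start + chunk_size]))
    return mmr
```

Appending a new chunk root touches only the few peaks on the right edge, which is what makes the structure cheap to update as the chain grows. Production designs typically also commit to the number of leaves alongside the bagged root.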
As new blocks are added, the service provider regularly (e.g., hourly or daily) updates the "zk commitment" and the MMR to stay synchronized with the source chain. This keeps the historical blocks linked to the latest 256 blocks accessible from the EVM, so proofs about old data can always be anchored to block hashes that Ethereum itself currently exposes.
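The reason anchoring to a recent block is enough is that every block header commits to the hash of its parent. The sketch below checks that linkage directly; in a storage-proof system this walk would be replaced by a zero-knowledge proof. The `Header` layout is a simplification (Ethereum hashes the RLP-encoded header with Keccak-256), and `links_to_trusted` is a hypothetical helper.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    parent_hash: bytes
    number: int
    state_root: bytes

    def hash(self) -> bytes:
        # Simplified encoding; real headers have more fields and use RLP + Keccak-256.
        payload = self.parent_hash + self.number.to_bytes(8, "big") + self.state_root
        return hashlib.sha256(payload).digest()


def links_to_trusted(headers: list[Header], trusted_hash: bytes) -> bool:
    """Check that headers (oldest first) form an unbroken chain ending at trusted_hash.

    trusted_hash is assumed to be a recent block hash the verifier can read
    directly, e.g. one of the latest 256 blocks.
    """
    if not headers or headers[-1].hash() != trusted_hash:
        return False
    for parent, child in zip(headers, headers[1:]):
        if child.parent_hash != parent.hash() or child.number != parent.number + 1:
            return False
    return True
```

Changing any field of an old header changes its hash, which breaks the `parent_hash` link of the next header, so a forged historical block cannot be connected to the trusted recent hash.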
In the following diagram, we detail how to implement this setup:
In summary, the following outlines how storage proofs work in the DAO voting example we introduced earlier after the setup is complete:
- The service provider creates the "zk commitment" to the entire chain (i.e., Ethereum's transaction history) and the MMR, and stores both on the target chain.
- The service provider offers an API to query historical data on-chain or off-chain.
- The voting dApp on the target chain sends a query to the provider's smart contract to confirm whether the user held DAO tokens at block #17,000,000 on Ethereum.
- The provider then needs to verify two things:
  - The queried block is part of Ethereum's historical record (Step 1 above); a zero-knowledge proof of block inclusion is generated through the MMR.
  - The user held DAO tokens in block #17,000,000 (Step 2 above); the provider generates another zero-knowledge proof to prove that the user held DAO tokens within that block.
- The provider aggregates the generated proofs into a single zero-knowledge proof.
- The aggregated zero-knowledge proof is then sent back to the voting dApp smart contract on the target chain for verification, and if successful, allows the user to vote.
Project Teams Committed to This Field
Several teams are building storage-proof services that let smart contracts access on-chain historical data in a trust-minimized manner.
Axiom, which is currently live on Ethereum, aims to provide access to Ethereum's historical data for smart contracts through zk-based storage proofs. The team is also enhancing the capability to perform off-chain computations based on historical data and proving the correctness of these data and computations in zero-knowledge.
Relic Protocol employs a similar technical approach to Axiom and is operational on Ethereum and zkSync Era. Relic uses Merkle inclusion proofs to prove data inclusion (which differs from Axiom's method of proving Merkle inclusion in zero-knowledge).
Herodotus is working to provide Ethereum historical data for L2. Currently, the testnet is live on Starknet and zkSync Era. With funding from the OP Foundation, the next goals for the Herodotus team are becoming very clear.
Lagrange Labs has introduced fully updatable proofs through its recent ZK MapReduce (ZKMR) innovation. It uses a new vector commitment called Recproofs, extending the concept of updatability to data computation.
Conclusion
In this part, we introduced how storage proofs can verify on-chain historical data without trusting third parties. This makes them an important tool for on-chain composability and cross-chain interoperability.
As total value locked continues to migrate from Ethereum to Layer 2 ecosystems, we expect more expressive applications utilizing on-chain historical data through storage proofs to emerge.
While the verification speed of zero-knowledge proofs is increasing and costs are decreasing, the ongoing cost of generating storage proofs to keep up with on-chain state remains a challenge. The profitability of such services will depend on the volume of queries they receive from applications.
Despite the challenges, the importance of consensus proofs and storage proofs driven by zero-knowledge technology cannot be overstated. We look forward to seeing how these technologies will be used to build a more trust-minimized multi-chain future.