Temasek Web3 Fund Superscrypt: Storage Proofs Will Unlock Many New Cross-Chain Use Cases

Retrieving Historical Data

Blockchain historical data has various uses. It can prove asset ownership and record user behavior and transaction history, and this information can then be fed into on-chain smart contracts or applications. As of now, over 18 million blocks have been written to Ethereum. However, smart contracts can directly access only the hashes of the latest 256 blocks (roughly the last 50 minutes at 12-second block times), so "historical data" here refers to information beyond the last 256 blocks.
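As a rough illustration of the 256-block window, the sketch below models which block numbers a contract can look up directly and how long that window lasts. The 12-second slot time and the function names are assumptions made for this example.

```python
BLOCKHASH_WINDOW = 256  # number of recent block hashes readable from inside the EVM
SLOT_SECONDS = 12       # assumed block interval


def is_directly_accessible(current_block: int, queried_block: int) -> bool:
    """True if the queried block's hash can still be read by a contract
    executing in `current_block` without any external proof."""
    return current_block - BLOCKHASH_WINDOW <= queried_block < current_block


def window_minutes() -> float:
    """Approximate wall-clock length of the directly accessible window."""
    return BLOCKHASH_WINDOW * SLOT_SECONDS / 60


if __name__ == "__main__":
    print(is_directly_accessible(18_000_000, 17_999_900))  # inside the window
    print(is_directly_accessible(18_000_000, 17_000_000))  # needs a storage proof
    print(window_minutes())
```

Anything older than this window is what the rest of the article treats as historical data.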

Today, to access historical data, protocols typically query archive node providers, such as Infura, Alchemy, or other third-party indexers. This means having to trust and rely on them and their data.

Figure: Historical Data

However, storage proofs make it possible to retrieve this data with far weaker trust assumptions.

Storage proofs are cryptographic proofs, often wrapped in zero-knowledge proofs, that can verify historical data stored on the blockchain. More specifically, a storage proof can be used to prove that a specific state existed in a past block. Storage proofs do not require trust in third parties or oracles; instead, the trust is embedded in the proofs themselves.

How do storage proofs help verify that certain data exists in earlier historical blocks? This involves two steps:

Step 1: Check whether a specific block indeed exists in the chain's historical record, for example, whether the block is a valid part of the source chain's history.
Step 2: Check whether specific data is part of the block, i.e., whether specific transaction information is part of that block (this part of the verification can be completed through Merkle inclusion proofs).
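Step 2 can be sketched with a toy Merkle tree. This is a simplified illustration, not how Ethereum encodes its data: it assumes a binary tree with SHA-256 and duplication of the last node on odd levels, whereas Ethereum uses Keccak-256 and Merkle Patricia Tries. All function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a binary Merkle tree over a non-empty list of leaves."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each tagged with `sibling_is_left`."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify_inclusion(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path to the root; only the leaf, proof, and root are needed."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


if __name__ == "__main__":
    txs = [b"tx0", b"tx1", b"tx2"]
    root = merkle_root(txs)
    proof = merkle_proof(txs, 2)
    print(verify_inclusion(b"tx2", proof, root))
```

The verifier never sees the full data set: the proof grows with the depth of the tree rather than the number of leaves, which is what makes inclusion checks cheap enough to run on-chain.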

Once the recipient (such as a smart contract on the destination chain) receives and verifies the proof, it can trust the validity of the data and execute the corresponding instructions. This concept can be extended further: validated data can be used to run additional off-chain computations, which then generate another zero-knowledge proof attesting to the validity of both the data and the computation.

In short, storage proofs allow for the retrieval of historical on-chain data in a trust-minimized manner. This is crucial because, as outlined in the first part, we believe that in the coming years, Web3 will become increasingly multi-chain and multi-layered. The emergence of various Layer 1s along with rollups and application chains means that users' on-chain activities may occur simultaneously across multiple chains. This further emphasizes the need for trust-minimized interoperability solutions that can maintain the composability of user assets, identities, and transaction histories across multiple domains. This is precisely the problem that storage proofs can help solve.

Use Cases for Storage Proofs

Storage proofs allow smart contracts to check any historical transaction or data as a prerequisite. This provides great flexibility for cross-chain application design.

First, storage proofs can prove any historical data on the source blockchain, such as:

Account balances and token ownership
User transaction activities
Historical prices of asset trades over a specified period
Real-time asset balances in liquidity pools across different chains

Secondly, storage proofs can be sent to the target chain, unlocking various cross-chain use cases:

Enabling users to vote on governance proposals at a lower cost on L2
Allowing NFT holders to mint NFTs and gain community benefits on a new chain
Rewarding users based on their historical interactions with specific dApps (e.g., airdrops)
Offering loans at interest rates based on users' comprehensive transaction and credit history
Recovering dormant accounts
Calculating historical TWAP for future trades
Computing more accurate AMM trading prices based on liquidity pools across multiple chains

Essentially, storage proofs allow applications to query and transplant users' on-chain activities and historical records across multiple chains to provide information for smart contracts or applications on another chain.
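To make one of the listed use cases concrete: once a series of historical prices has been proven, a time-weighted average price (TWAP) is simple arithmetic. The sketch below assumes each observed price holds until the next observation; the function name and input format are illustrative assumptions, not taken from any specific protocol.

```python
def twap(observations: list[tuple[int, float]], end_time: int) -> float:
    """Time-weighted average price.

    `observations` is a list of (timestamp, price) pairs sorted by timestamp.
    Each price is assumed to hold until the next observation, and the last
    one holds until `end_time`.
    """
    if not observations:
        raise ValueError("need at least one observation")
    duration = end_time - observations[0][0]
    if duration <= 0:
        raise ValueError("end_time must be after the first observation")

    # Pair each observation with the timestamp at which its price stops applying.
    next_times = [t for t, _ in observations[1:]] + [end_time]
    weighted_sum = sum(
        price * (t_next - t) for (t, price), t_next in zip(observations, next_times)
    )
    return weighted_sum / duration


if __name__ == "__main__":
    # 100 for 60s, 110 for 120s, 90 for 60s: (6000 + 13200 + 5400) / 240
    print(twap([(0, 100.0), (60, 110.0), (180, 90.0)], end_time=240))  # 102.5
```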

Figure: Storage Proofs - Use Cases

Next, we will explain the working mechanism of storage proofs through a more detailed example.

Detailed Example of Storage Proof Mechanism Use Case

Suppose "X" is a DeFi protocol whose token lives on Ethereum. X is about to publish a governance proposal, and the project team wants to hold the vote on a lower-cost chain to make voting easier for users. Users may vote only if they held X tokens on Ethereum at a specific point in time (i.e., a snapshot, such as block #17,000,000).

How is this currently implemented?

The current method is to query archive nodes to obtain a complete list of token holders who meet the requirements in block #17,000,000. Subsequently, the DAO administrator stores this list in a smart contract on the target chain to determine the final eligible voting list. However, this method has some limitations:

The list of voters can be very large, and it changes with each snapshot, making the on-chain storage and update costs for each voting proposal very high;
There is implicit trust in the archive node providers and the data they provide;
The DAO's administrators must be trusted not to tamper with the voter list.

How Storage Proofs Achieve This

As explained in the second part, expensive computations can be delegated to off-chain zero-knowledge provers.

The zk prover will generate a concise proof and send it to the target chain for verification. Taking the above DAO voting eligibility as an example:

The prover generates a zero-knowledge proof that proves block #17,000,000 is part of Ethereum's history (as in Step 1 above).
Once the block is proven valid, a Merkle inclusion proof shows that the user held X tokens as of that block (as in Step 2 above).
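The first of these two steps amounts to showing that an old block is linked, parent hash by parent hash, to a block hash the verifier already trusts. A minimal sketch of that linkage check follows. It is a simplification: real Ethereum headers are RLP-encoded and hashed with Keccak-256, and a zk prover would prove this walk off-chain rather than replay it on-chain. The `Header` fields and function names are assumptions for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    number: int
    parent_hash: bytes
    state_root: bytes

    def hash(self) -> bytes:
        # Stand-in for Keccak-256 over the RLP-encoded header.
        payload = self.number.to_bytes(8, "big") + self.parent_hash + self.state_root
        return hashlib.sha256(payload).digest()


def links_to_trusted(headers: list[Header], trusted_hash: bytes) -> bool:
    """Check that `headers` (oldest first) form an unbroken chain whose
    newest header hashes to `trusted_hash`."""
    if not headers:
        return False
    for older, newer in zip(headers, headers[1:]):
        if newer.number != older.number + 1 or newer.parent_hash != older.hash():
            return False
    return headers[-1].hash() == trusted_hash


def build_chain(start: int, length: int) -> list[Header]:
    """Toy chain used only to exercise the check above."""
    chain, parent = [], bytes(32)
    for n in range(start, start + length):
        header = Header(n, parent, hashlib.sha256(f"state-{n}".encode()).digest())
        chain.append(header)
        parent = header.hash()
    return chain


if __name__ == "__main__":
    chain = build_chain(17_000_000, 5)
    print(links_to_trusted(chain, chain[-1].hash()))
```

Because every header commits to its parent, changing anything in the old block changes every later hash, so a match against one trusted recent hash vouches for the whole range.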

Figure: Historical Data Proof Enables Cross-Chain Voting

Subsequently, the proof is sent to the smart contract on the target chain for verification. If the verification is successful, the smart contract on L2 will grant the user voting rights.

Using storage proofs brings several advantages:

No trust in archive node providers is required;
The protocol does not need to maintain an expensive on-chain voter list;
Users do not need to transfer their assets to the target chain.

Setup Required for Storage Proofs

So far, we have abstracted some complexities of storage proofs. However, using storage proofs requires service providers to conduct meticulous initial setups to ensure that storage proofs can be used without trusting the providers. As part of this process, two things will be generated and stored on-chain:

Zero-knowledge proof of the entire chain ("zk commitment"): The service provider will divide all historical blocks on the source chain into continuous and fixed-size "chunks" using a Merkle Tree and generate zero-knowledge proofs for each chunk to verify the grouping. These proofs are then recursively merged until a final zero-knowledge proof is obtained, which is the "zk commitment" of the entire chain. This proves that the provider has correctly indexed the entire chain's history.
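The chunk-and-merge structure described above can be sketched as follows. Plain SHA-256 hashing stands in for the zero-knowledge proofs a real provider would generate at each step, so this shows only the shape of the recursion, not its soundness. The chunk size and function names are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chunked(items: list[bytes], size: int) -> list[list[bytes]]:
    """Split the history into contiguous, fixed-size chunks (the last may be shorter)."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fold_pairwise(nodes: list[bytes]) -> bytes:
    """Merge a non-empty list of nodes two at a time until one remains."""
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes = nodes + [nodes[-1]]  # pad odd levels by repeating the last node
        nodes = [h(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]


def chain_commitment(block_hashes: list[bytes], chunk_size: int = 4) -> bytes:
    # One root per chunk; in the real setup each chunk gets its own zk proof.
    chunk_roots = [fold_pairwise(chunk) for chunk in chunked(block_hashes, chunk_size)]
    # Recursive merge of the per-chunk results into a single commitment.
    return fold_pairwise(chunk_roots)


if __name__ == "__main__":
    hashes = [h(n.to_bytes(8, "big")) for n in range(10)]
    print(chain_commitment(hashes).hex())
```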

Figure: zk Commitment Based on Ethereum Historical Information

Merkle Mountain Range Data Structure: The provider will also store the Keccak Merkle roots of the source chain's block-hash chunks in an on-chain data structure called a Merkle Mountain Range (MMR). This data structure is used because it is easy to query and update, allowing the provider to efficiently prove that a given block exists in the chain's history. An MMR can be built with Keccak256 hashes, Poseidon hashes, or both. Poseidon hashes are cheaper to prove inside zero-knowledge circuits, which makes it practical to run computations over historical data and prove them valid with zero-knowledge proofs.
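A minimal append-only MMR can be sketched as below, to show why updates are cheap: appending a leaf touches only the current peaks. The sketch assumes SHA-256 instead of Keccak256 or Poseidon and bags the peaks from right to left; actual implementations differ in hashing and bagging order, and the class and method names here are hypothetical.

```python
import hashlib


def h(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()


class MerkleMountainRange:
    """Append-only accumulator that keeps only the peaks of its perfect subtrees."""

    def __init__(self) -> None:
        self.peaks: list[tuple[int, bytes]] = []  # (height, hash), tallest first
        self.size = 0                             # number of leaves appended

    def append(self, leaf_hash: bytes) -> None:
        node = (0, leaf_hash)
        # Merge peaks of equal height, like carrying in binary addition.
        while self.peaks and self.peaks[-1][0] == node[0]:
            height, left = self.peaks.pop()
            node = (height + 1, h(left, node[1]))
        self.peaks.append(node)
        self.size += 1

    def root(self) -> bytes:
        """Bag the peaks right to left into a single commitment."""
        if not self.peaks:
            raise ValueError("empty MMR")
        acc = self.peaks[-1][1]
        for _, peak in reversed(self.peaks[:-1]):
            acc = h(peak, acc)
        return acc


if __name__ == "__main__":
    mmr = MerkleMountainRange()
    for n in range(5):
        mmr.append(hashlib.sha256(n.to_bytes(8, "big")).digest())
    print([height for height, _ in mmr.peaks])  # [2, 0]
    print(mmr.root().hex())
```

With five leaves the structure holds one subtree of four leaves and one single leaf, so the number of peaks always equals the number of 1-bits in the leaf count.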

Figure: Merkle Mountain Range (MMR)

As new blocks are added, the service provider will regularly (e.g., hourly or daily) update the "zk commitment" and MMR to stay synchronized with the source chain. The purpose is to keep the committed history anchored to the latest 256 blocks accessible from the EVM, so that historical data can always be linked to blocks currently available on Ethereum.

In the following diagram, we detail how to implement this setup:

In summary, the following outlines how storage proofs work in the DAO voting example we introduced earlier after the setup is complete:

The service provider creates the "zk commitment" to the entire chain (i.e., Ethereum's history) and the MMR, and stores both on the target chain.
The service provider offers an API to query historical data on-chain or off-chain.
The voting dApp on the target chain sends a query to the provider's smart contract to confirm whether the user held DAO tokens at block #17,000,000 on Ethereum.
To answer the query, the provider must prove two things:
The queried block is part of Ethereum's historical record (Step 1 above); a zero-knowledge proof of block inclusion is generated through the MMR.
The user held DAO tokens in block #17,000,000 (Step 2 above); the provider generates another zero-knowledge proof showing that the user held DAO tokens within that block.
The provider aggregates the generated proofs into a single zero-knowledge proof.
The aggregated zero-knowledge proof is then sent back to the voting dApp smart contract on the target chain for verification, and if successful, allows the user to vote.
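The final step of this flow, as seen from the voting contract on the target chain, can be sketched as follows. The proof verifier is injected as a plain callable because the actual verification logic depends on the provider's proof system. `VotingContract`, `stub_verifier`, and the proof format are hypothetical names used only for illustration.

```python
from typing import Callable

SNAPSHOT_BLOCK = 17_000_000

# (proof, voter address, snapshot block) -> True if the proof is valid
ProofVerifier = Callable[[bytes, str, int], bool]


class VotingContract:
    """Toy model of the voting dApp: a vote counts only with a valid proof."""

    def __init__(self, verify_proof: ProofVerifier) -> None:
        self._verify_proof = verify_proof
        self.votes: dict[str, bool] = {}

    def vote(self, voter: str, support: bool, proof: bytes) -> None:
        if voter in self.votes:
            raise PermissionError("already voted")
        if not self._verify_proof(proof, voter, SNAPSHOT_BLOCK):
            raise PermissionError("invalid storage proof")
        self.votes[voter] = support


def stub_verifier(proof: bytes, voter: str, block_number: int) -> bool:
    # Placeholder for the real zk verifier; accepts a fixed marker only.
    return block_number == SNAPSHOT_BLOCK and proof == b"valid:" + voter.encode()


if __name__ == "__main__":
    contract = VotingContract(stub_verifier)
    contract.vote("0xabc", True, b"valid:0xabc")
    print(contract.votes)
```

Note that the contract stores no voter list at all: eligibility is established per vote by the proof, which is the cost saving the example above describes.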

Project Teams Committed to This Field

Several teams are building infrastructure that lets smart contracts access on-chain historical data in a trust-minimized manner.

Axiom, which is currently live on Ethereum, aims to provide access to Ethereum's historical data for smart contracts through zk-based storage proofs. The team is also enhancing the capability to perform off-chain computations based on historical data and proving the correctness of these data and computations in zero-knowledge.

Relic Protocol employs a similar technical approach to Axiom and is operational on Ethereum and zkSync Era. Relic uses Merkle inclusion proofs to prove data inclusion (which differs from Axiom's method of proving Merkle inclusion in zero-knowledge).

Herodotus is working to provide Ethereum historical data for L2. Currently, the testnet is live on Starknet and zkSync Era. With funding from the OP Foundation, the next goals for the Herodotus team are becoming very clear.

Lagrange Labs has introduced fully updatable proofs through its recent ZK MapReduce (ZKMR) innovation. It uses a new vector commitment called Recproofs, extending the concept of updatability to data computation.

Conclusion

In this part, we introduced how storage proofs can verify on-chain historical data without trusting third parties. This makes them an important tool for on-chain composability and cross-chain interoperability.

As total value locked continues to migrate from Ethereum to Layer 2 ecosystems, we expect more expressive applications utilizing on-chain historical data through storage proofs to emerge.

While the verification speed of zero-knowledge proofs is increasing and costs are decreasing, the ongoing cost of generating storage proofs to keep up with on-chain state remains a challenge. The profitability of such services will depend on the query volume they receive from applications.

Despite the challenges, the importance of consensus proofs and storage proofs driven by zero-knowledge technology cannot be overstated. We look forward to seeing how these technologies will be used to build a more trust-minimized multi-chain future.
