a16z Crypto: Exploring the Impossibility of Implementing Stateless Blockchains

2023-08-25 16:14:37

Collection

Recent studies show that without taking additional measures to manage the "state," no matter how intelligent the stateless blockchain solutions are, they are not feasible.

Scan with WeChat

Original Title: On the impossibility of stateless blockchains

Authors: Miranda Christ, PhD in CS from Columbia University, a16z Summer Intern Researcher & Joseph Bonneau, a16z Research Partner

Compiled by: bayemon.eth, ChainCatcher

As the number of users of blockchain technology increases and transactions occurring on it become more frequent, the amount of information that validators store to verify transactions (the "state") is also increasing. For example, in Bitcoin, the state consists of a set of unspent transaction outputs (UTXOs). In Ethereum, the state includes the account balances of each account as well as the code and storage of each smart contract.

For blockchains with a sufficient number of accounts or UTXOs, this storage burden will make it difficult for the blockchain to support the real daily transactions of a significant portion of users, thereby preventing these users from becoming validators and thus posing a threat to decentralization. Since tools like Merkle Trees and ZK have already helped blockchains achieve incredible goals, we want to turn to cryptography for corresponding solutions.

This is precisely the goal of a "stateless blockchain." However, despite significant work in this area, corresponding solutions are still far from being practical. It has proven to be an inherent lag due to the unbridgeable gap between construction and practicality, which seems to be a natural barrier in the development of any industry. Our latest research indicates that without additional measures to manage the "state," no matter how intelligent the stateless blockchain solution is, it is unfeasible. However, as we mention at the end of this article, even if the results are not practical, we should not be discouraged.

What is "Stateless"

Today, while the state is large, it is manageable. For example, Bitcoin nodes store about 7 GB of data, while Ethereum nodes store about 650 GB of data. However, the storage burden of full nodes grows roughly linearly with the chain's throughput (transactions per second or TPS), yet the current throughput of the main chain is indeed ridiculously low. Given the current design, the state required to truly support daily transactions (tens of thousands to hundreds of thousands of TPS) would be enormous, requiring several TB or even PB of storage space.

The limited performance of the main chain has prompted the search for technical methods to significantly reduce the amount of state required by validators. The best technical approach is to construct a stateless blockchain, where validators only need to store a constant-sized state regardless of transaction throughput. (This term is actually a misunderstanding: the state still exists in a constant size, just small enough to operate normally under any future throughput). Stateless blockchains can significantly reduce the hardware storage requirements, making it easier to run validator nodes, optimistically allowing everyone to run a node on their mobile phones. Since increasing the number of validators enhances the security of the chain, it is crucial to lower the entry barrier for validators.

Although Todd, Buterin, Boneh, and Srinivasan have conducted extensive research on stateless blockchains, there is still some distance to complete realization, and to our knowledge, no stateless blockchain has been successfully deployed. The fundamental problem with all known stateless blockchains is that they require users to store additional data (called "witnesses") to help validators verify transactions involving their accounts. For example, this witness might be a Merkle inclusion proof indicating that the user's account and its balance are included in the global state commitment. When users make transactions, they submit this proof to validators to show that their accounts have sufficient balances.

Unlike immutable private keys, these witnesses change frequently, even for users who do not actively transact, which imposes an unimaginable burden on users. Imagine if you had to continuously monitor all other credit card transactions globally and update some local data accordingly to use your credit card; it would indeed cause unnecessary hassle. Therefore, to make blockchains practically viable, users must be able to remain offline and only interact with the blockchain when submitting transactions. This also means that in various scenarios, including hardware wallets, witnesses cannot be updated at all.

This naturally leads us to a research question: Can we establish a stateless blockchain that does not require (or requires very few) updates to witnesses? To answer this question, we developed a novel theoretical framework (revocable proof system) that generalizes stateless blockchains. Using this framework, we proved a definitive impossibility: it is almost impossible to balance between concise global state and frequent witness updates. Our proof technique is information-theoretic, meaning that future computers will not be powerful enough to solve this problem: the gap between stateless blockchain construction and practicality will never be bridged.

Research Background

To further intuitively understand the "impossibility" mentioned above, we will first describe the natural method of stateless blockchains using Merkle Trees (although Merkle Tree-based solutions are not very efficient). Our goal is for validators to determine whether the transactions submitted by users are valid— for example, whether the user has sufficient account balance for the transaction. In a stateless blockchain scheme, validators store a constant-sized state, and when users make transactions, their transactions must include a witness. Validators can use the current state and the (transaction, witness) pair submitted by the user to verify whether the user has sufficient account balance to make the transaction.

We first construct a Merkle Tree, where each pair (account ID, balance) is a leaf on the tree. The constant-sized state V stored by the validator is the root of this tree and serves as a commitment to the account-balance key-value pairs. Each user must maintain their (account ID, balance) information through a Merkle inclusion proof. The Merkle inclusion proof for the leaf (a,b) consists of the set of nodes (v1, …, vk) on the path from the leaf to the root of the tree. Validators can check whether b is indeed the balance of account a by verifying the proof (v1, …, vk) against their current state V. One feature of Merkle Trees is that if a Merkle inclusion proof for a leaf is given, it is easy to compute the resulting root when the leaf changes. In other words, validators can easily compute the updated state V' that captures the new balance of account a after the transaction is executed.

However, Merkle Trees have two main drawbacks. First, user witnesses are relatively large, growing logarithmically with the total number of accounts in the system. Ideally, they should be of constant size, which can be achieved using schemes like RSA accumulators (Boneh et al. have researched this in the context of stateless blockchains).

The second drawback is harder to avoid: as long as other users transact, the proofs for the account-balance key-value pairs will change. Recall that the proof for a leaf consists of the set of nodes on the path from the leaf to the root. If any other leaf changes, one of these nodes will also change, which presents a practical problem. Most blockchain users wish to passively keep their coins in wallets and only go online when they want to transact. However, in the practical application of such a stateless blockchain, users must constantly monitor the transactions of others to ensure their witnesses are up to date. (While third parties can monitor on behalf of users, this deviates from the standard stateless blockchain model. We will discuss this at the end of this article). In reality, this poses a challenging burden for users in stateless blockchains.

Conclusion: Stateless Blockchains Are Impossible to Achieve

This phenomenon is not a unique flaw of Merkle Trees— all currently known stateless blockchain schemes require users to frequently update their corresponding witness information. More accurately, the number of users who must update their witness information is roughly linearly related to the total number of transactions by all users and is monotonically increasing.

This means that even if user Alice has not made any transactions, her witness information may still need to change with the transactions of other users. As long as the concise state stored by validators is too small to capture the complete state (i.e., the set of all account balances), increasing the scale of concise state storage does not help. We visualize this implicit relationship while also outputting the number of validator changes required daily for blockchains of different throughputs. These graphs show the number of witness changes required for the optimal stateless blockchain, where data universes refer to the total number of accounts under the account model or the total number of UTXOs under the UTXO model.

The core of our proof is an information-theoretic argument. The core principle of information theory formally proposed by Claude Shannon is: if Alice randomly selects an object from a set of size 2n and wants to inform Bob of her choice, she must send at least n bits to Bob. If there exists a stateless blockchain scheme where users do not need to update their witnesses, Alice could inform Bob of her specific choice with less than n bits of cost— however, Shannon proved this is impossible. Thus, we also believe that such a stateless blockchain cannot exist.

For simplicity, we first provide a weakened proof that it is impossible for users to never need to update their witnesses in a stateless blockchain. The key process of the proof involves Alice encoding her information using the stateless blockchain scheme to transmit it to Bob. Initially, assume that Alice and Bob both know all n users' complete account-balance pairs, and that each account balance is at least 1. Additionally, Alice and Bob know the concise state V of the stateless blockchain and the witnesses wi for all account-balance pairs (ai, bi), and they have agreed on a mapping between the information and the account set. Alice will choose a set of accounts A corresponding to her information and use the account balances for consumption, ultimately using the stateless blockchain to convey the set of accounts she selected to Bob, who can understand her information from this account set.

Encoding: Alice computes the updated state V' using the stateless blockchain scheme and sends V' to Bob.

Decoding: For each i, Bob checks Verify(wi, (ai, bi)). Bob outputs the account set B such that Verify(wi, (ai, bi)) = false.

Ultimately, Bob successfully outputs the same set as Alice selected, i.e., B = A. First, note that if Alice spends a coin from account ai, the witness for its old balance should no longer be accepted— otherwise, Alice could double spend. Therefore, for each account ai in A, Verify(wi, (ai, bi)) = false, and Bob will include this account in B. On the other hand, Bob will not include accounts that do not have unchanged balances in B, because the balances of these accounts remain unchanged, and (recall our "weakened" statement to be proven) their witnesses will never change. Thus, B is exactly equal to A.

Finally, we derive a contradiction by calculating the number of bits Alice should have sent to Bob. She could choose from 2n subsets of accounts, so according to Shannon's law, she should send at least n bits to Bob. However, she only sent a constant-sized state V', which is much shorter than n bits.

(Readers familiar with cryptography may notice that we have omitted some details here; for example, the probability of decoding failure can be ignored. Our paper includes the complete proof).

While we describe our proof using stateless blockchains, Alice and Bob could also use various other authenticated data structures (including accumulators, vector commitments, and authenticated dictionaries) to perform similar efficient communication. We formalize this class of data structures using a new abstract concept, which we call a revocable proof system.

Implications of the Conclusion

The above results indicate that we cannot "completely eliminate state with cryptography," and therefore there is no easy solution that can make the establishment of stateless blockchains a reality, allowing users to never update witnesses. The state does not disappear; rather, it shifts from validators to users in the form of frequently updated witnesses.

Currently, there are indeed several other reasonable solutions that deviate from the strict stateless blockchain model. A natural relaxation of this model is to allow a third party (neither a user nor a validator) to be responsible for storing the entire state. This party is called a proof service node (Srinivasan et al. have conducted the most rigorous research on this), which uses the complete state to generate the latest witnesses on behalf of users. Users can then use these witnesses for transactions, just as in a regular stateless blockchain, where validators still only store concise states. The incentive mechanism of this system, especially how users compensate proof service nodes, is an interesting open research direction.

While the discussions so far have mainly focused on L1 blockchains, our results will also impact L2 systems such as rollups. Whether optimistic or zk rollups, they typically generate a larger state value and submit a smaller value to L1 through corresponding means for storage. We hope that these users can withdraw funds directly on L1 by directly publishing account balance witnesses without the cooperation of L2 servers. In our model, this setup is also an instance of a revocable proof system. In fact, it can be said that stateless blockchains have already been implemented in practice in the form of L2 rollups.

Unfortunately, our "impossibility" cannot be directly applied. Users' rollup withdrawal witnesses must frequently change and synchronize; otherwise, almost the entire L2 state must be written to L1. Therefore, current rollup solutions typically set up a data availability committee (sometimes called "validium"), which functions similarly to a "proof service node," helping users calculate new witnesses when they are ready to exit. Our research results indicate that without access to transaction data, users cannot compute the Merkle Tree needed to prove ownership of funds and execute withdrawals, which is a warning that always applies to users in Ethereum documents.

As blockchain systems evolve, developing more effective methods for managing blockchain state will become increasingly important. While our research suggests that the impossibility of stateless blockchains being applied seems negative, this conclusion remains meaningful for blockchain designers. Because it tells them to focus their research efforts elsewhere, ideally helping us find feasible solutions more quickly.

Ethereum Scalability Dynamics and Analysis

Track the dynamics of major Ethereum scaling solutions, interpret the trends and future of various scaling plans.

Topic or theme

ChainCatcher reminds readers to view blockchain rationally, enhance risk awareness, and be cautious of various virtual token issuances and speculations. All content on this site is solely market information or related party opinions, and does not constitute any form of investment advice. If you find sensitive information in the content, please click "Report", and we will handle it promptly.