Vitalik's Blog: What Challenges Does the Ethereum L2-Centric Ecosystem Face?

Vitalik Buterin
2024-06-12 11:21:05
Technically speaking, an L2-centric ecosystem is sharding, but one in which you can build your own shard with your own rules. This is very powerful: it leaves room for enormous creativity and a great deal of independent innovation. However, it also raises some key challenges, especially around coordination.

Author: Vitalik Buterin

Compiled by: Peng Sun, Foresight News

Original Title: How do layer 2s really differ from execution sharding?

Two and a half years ago, in the article "Endgame", I pointed out that the different development paths for blockchains look very similar from a technical perspective. In both cases, there is a very large number of transactions on-chain, and processing them requires (1) a large amount of computation and (2) a large amount of data bandwidth. A regular Ethereum node (like the 2 TB reth archive node I am running on my computer) is not powerful enough to verify such huge amounts of data and computation directly, even with excellent software engineering and Verkle trees. Instead, in both the "L1 sharding" and rollup-centric approaches, ZK-SNARKs are used to verify computation and DAS (data availability sampling) is used to verify data availability. Whether it is L1 sharding or rollups, the DAS is the same, and the ZK-SNARK technology is also the same; the difference is that in one case it is smart contract code and in the other it is a built-in feature of the protocol. In a very real technical sense, Ethereum is doing sharding, and rollups are shards.

This naturally raises the question: what are the differences between the two? One difference is the consequences of code vulnerabilities: in rollups, tokens can be stolen; in sharding, consensus can break. However, I expect that as the protocol stabilizes and formal verification technologies improve, the impact of code vulnerabilities will diminish. So, what other differences exist between these two potentially long-lasting solutions?

Diversity of Execution Environments

In 2019, we briefly discussed the idea of execution environments in Ethereum. Essentially, Ethereum would have different "zones," each of which could set its own rules for how accounts work (including completely different approaches like UTXOs), how the virtual machine operates, and other features. This would allow different approaches to be explored in parts of the stack where that would be hard to achieve if Ethereum tried to do everything itself.

In the end, we abandoned some of the more ambitious plans and retained only the EVM. However, Ethereum L2 (including rollups, validiums, and plasmas) can be said to ultimately serve as execution environments. Currently, we usually focus on EVM-equivalent L2s, but we overlook the diversity brought by many other methods:

● Arbitrum Stylus, which adds a second, WASM-based virtual machine alongside the EVM;

● Fuel, which uses a UTXO-based architecture similar to Bitcoin (but more feature-rich);

● Aztec, which introduces a new language and programming paradigm centered around ZK-SNARK-based privacy-preserving smart contracts.

Image

UTXO-based architecture, source: Fuel documentation

We could try to make the EVM a super virtual machine that encompasses all possible paradigms, but doing so would significantly reduce the efficiency of each function, and it would be better to let these platforms do what they specialize in.

Security Trade-offs: Scalability and Transaction Speed

Ethereum L1 provides very strong security guarantees. If certain data is included in a block finalized on L1, then the entire consensus (including extreme cases of social consensus) will strive to ensure that this data is not modified, that any execution triggered by this data cannot be reverted, and that this data remains accessible. To achieve this security guarantee, Ethereum L1 is willing to accept high costs. As of this writing, transaction fees are relatively low: Layer 2 charges less than 1 cent per transaction, and even basic ETH transfers on L1 are less than 1 dollar. If technological advancements are fast enough, the growth of available block space could keep pace with the increase in demand, meaning these fees might remain low in the future, but they might not. For many non-financial applications (like social media or gaming), even a transaction fee of 0.01 dollars is too high.

However, social media and gaming do not require the same security model as L1. If someone can pay a million dollars to erase the record of a chess game they lost, or to make a tweet appear to have been posted three days later than it actually was, that is acceptable. These applications should therefore not have to bear the same security costs. L2s make this possible by supporting a spectrum of data availability approaches, from rollups to plasma to validiums.

Another trade-off arises around moving assets from one L2 to another. I expect that within the next 5 to 10 years all rollups will be ZK rollups, and that ultra-efficient proof systems like Binius and Circle STARKs with lookups, combined with proof aggregation layers, will make it possible for L2s to provide a finalized state root in every slot. For now, however, we have a complicated mix of optimistic rollups and ZK rollups with different proof time windows. If we had implemented execution sharding in 2021, the security model for keeping shards honest would have been optimistic rollups rather than ZK, meaning L1 would have had to manage systemically complex fraud proof logic on-chain, and moving assets from shard to shard would have required a withdrawal period of up to a week. But like code vulnerabilities, I believe this issue is ultimately temporary.

Transaction speed is the third aspect of security trade-offs and is a more enduring one. Ethereum produces a block every 12 seconds, and it cannot be faster without becoming overly centralized. However, many L2s are exploring ways to compress block times to a few hundred milliseconds. Twelve seconds is not too bad: users typically wait about 6-7 seconds after submitting a transaction to be included in a block (not just 6 seconds, as the next block may not include them). This is comparable to the time I wait when paying with a credit card. However, many applications require faster speeds, and L2 can provide that.
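The 6-7 second figure can be reproduced with a little arithmetic. The following is a minimal sketch, assuming a transaction arrives at a uniformly random moment within a 12-second slot and that each successive block independently includes a pending transaction with some probability. Both the model and the function name `expected_wait` are assumptions made for this illustration, not something taken from the article.

```python
SLOT_SECONDS = 12.0  # Ethereum L1 block interval mentioned in the text


def expected_wait(inclusion_probability: float,
                  slot_seconds: float = SLOT_SECONDS) -> float:
    """Expected seconds between submitting a transaction and its inclusion.

    Illustrative model (an assumption, not the article's derivation):
    - the transaction arrives uniformly at random within a slot, so the
      average wait until the next block is half a slot;
    - each block includes the transaction independently with the given
      probability, so the expected number of blocks that skip it is
      (1 - p) / p, the mean of a geometric distribution.
    """
    if not 0.0 < inclusion_probability <= 1.0:
        raise ValueError("inclusion_probability must be in (0, 1]")
    wait_for_next_block = slot_seconds / 2.0
    expected_missed_blocks = (1.0 - inclusion_probability) / inclusion_probability
    return wait_for_next_block + expected_missed_blocks * slot_seconds


if __name__ == "__main__":
    for p in (1.0, 0.95, 12.0 / 13.0):
        print(f"p = {p:.3f}: about {expected_wait(p):.2f} s")
```

Under this model, guaranteed inclusion in the next block gives exactly 6 seconds, and any per-block inclusion probability between roughly 92% and 100% lands in the 6-7 second range.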

To achieve faster speeds, L2 has a preconfirmation mechanism: L2's own validators commit to including transactions at a specific time with digital signatures, and if the transaction is not included, they are penalized. The StakeSure mechanism further extends this.
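The commit-then-penalize logic described above can be sketched as a toy model. Everything here is hypothetical and for illustration only: the names `Preconfirmation`, `sign_preconfirmation`, and `should_slash` do not come from any real L2, and an HMAC with a shared secret stands in for a real public-key signature (a production system would use something like ECDSA or BLS so that anyone can verify the promise).

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Preconfirmation:
    """A validator's signed promise to include a transaction by a deadline."""
    tx_hash: str
    deadline_block: int
    signature: str


def _message(tx_hash: str, deadline_block: int) -> bytes:
    return f"{tx_hash}:{deadline_block}".encode()


def sign_preconfirmation(secret_key: bytes, tx_hash: str,
                         deadline_block: int) -> Preconfirmation:
    # HMAC is only a stand-in for a digital signature (assumption of this sketch).
    sig = hmac.new(secret_key, _message(tx_hash, deadline_block),
                   hashlib.sha256).hexdigest()
    return Preconfirmation(tx_hash, deadline_block, sig)


def is_authentic(secret_key: bytes, promise: Preconfirmation) -> bool:
    expected = hmac.new(secret_key,
                        _message(promise.tx_hash, promise.deadline_block),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, promise.signature)


def should_slash(secret_key: bytes, promise: Preconfirmation,
                 blocks: Dict[int, Set[str]], current_block: int) -> bool:
    """Return True if the validator provably broke an authentic promise.

    `blocks` maps block number -> set of included transaction hashes.
    """
    if not is_authentic(secret_key, promise):
        return False  # a forged promise cannot be used to penalize anyone
    if current_block < promise.deadline_block:
        return False  # the deadline has not passed yet
    included_in_time = any(
        promise.tx_hash in txs
        for number, txs in blocks.items()
        if number <= promise.deadline_block
    )
    return not included_in_time
```

A user who holds such a promise can treat the transaction as confirmed within milliseconds, on the assumption that the penalty is larger than anything the validator could gain by breaking its word.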

Image

L2 Preconfirmation

Now, we could try to implement all these features on L1. L1 could include a "fast preconfirmation" and "slow final confirmation" system. It could contain different shards with varying security levels. However, this would increase the complexity of the protocol. Additionally, having L1 handle all the work risks overloading the consensus, as many larger-scale or higher-throughput methods carry greater centralization risks or require stronger forms of "governance." If these stronger demands are completed on L1, their impact would ripple through other parts of the protocol. By providing trade-offs through L2, Ethereum can largely avoid these risks.

Benefits of Layer 2 for Organization and Culture

Imagine a country split in two, with one half becoming capitalist and the other half becoming heavily government-run (unlike what happens in reality, assume in this thought experiment that this is not the result of any traumatic war; one day the border simply appeared). In the capitalist half, restaurants are run by various combinations of decentralized ownership, chains, and franchises. In the government-run half, they are all branches of the government, like police stations. On the first day, not much would change. People would largely follow their existing habits, and which habits work and which do not would depend on technical realities like labor skills and infrastructure. A year later, however, you would see huge changes, because different incentive and control structures lead to large changes in behavior, affecting who stays and who leaves, what gets built, what gets maintained, and what gets abandoned.

Industrial organization theory discusses many such distinctions: it not only talks about the differences between government-managed economies and capitalist economies but also about the differences between economies dominated by large franchise owners and those where every supermarket is run by independent entrepreneurs. I believe the distinction between L1-centric ecosystems and L2-centric ecosystems is similar.

Image

The "core developers manage everything" structure has serious problems

I believe the main advantage of Ethereum being an L2-centric ecosystem is the following:

Because Ethereum is an L2-centric ecosystem, you can freely build a sub-ecosystem with your own unique functionalities while still being part of the larger Ethereum.

If you are just building an Ethereum client, you are part of the larger Ethereum, and while you have some room for innovation, it is far less than L2. If you are building a completely independent chain, your creative space is vast, but you also lose the benefits of shared security and shared network effects. L2 strikes a good balance.

An L2 is not only a technical opportunity to experiment with new execution environments and security trade-offs in pursuit of scalability, flexibility, and speed; it is also an incentive mechanism that encourages developers to build and maintain it and motivates the community to support it.

In fact, each L2 is isolated, which also means that deploying new methods is permissionless: there is no need to persuade all core developers that your new method is "safe" for other parts of the entire chain. If your L2 fails, that is your responsibility. Anyone can propose novel ideas (like Intmax's Plasma method), and even if Ethereum core developers are completely unaware, they can continue to build and eventually deploy. L1 functionalities and precompiles are not like this; even in Ethereum, the success or failure of L1 development often depends on politics, to a greater extent than we would like. Regardless of what can theoretically be built, the different incentive mechanisms generated by L1-centric ecosystems and L2-centric ecosystems will ultimately have a significant impact on the content, quality level, and order of what is actually built.

What Challenges Does Ethereum's L2-Centric Ecosystem Face?

Image

The L1 + L2 architecture has serious problems. Image source: Reddit

This L2-centric approach faces a key challenge that L1-centric ecosystems almost do not have to confront: coordination. In other words, while Ethereum has many L2s, the challenge is how to make it still feel like "Ethereum" and maintain Ethereum's network effects, rather than N independent chains. Currently, this situation is lacking in many ways:

● Cross-chain transactions between L2s often require centralized cross-chain bridges, which are very complex for ordinary users. If you have tokens on Optimism, you cannot just paste someone else's Arbitrum address into your wallet to send funds.

● Cross-chain smart contract wallet support is poor, both for personal smart contract wallets and for organizational wallets (including DAOs). If you change a key on one L2, you still need to change it on every other L2.

● Decentralized verification infrastructure is often lacking. Ethereum has finally begun to have decent light clients, like Helios. However, that matters little if all activity happens on L2s that each require their own centralized RPC. In principle, once you have Ethereum block headers, building light clients for L2 is not difficult; but in practice, there has been too little emphasis on this.

The community is working to improve these three aspects. For cross-chain token exchanges, the ERC-7683 standard is a new solution that, unlike existing "centralized cross-chain bridges," has no fixed centralized nodes, tokens, or governance. For cross-chain accounts, most wallets take the approach of using cross-chain replayable messages to update keys in the short term, and keystore rollups in the long term. Light clients for L2 are starting to emerge, such as Beerus for Starknet. Additionally, recent improvements in user experience through next-generation wallets have addressed more fundamental issues, such as allowing users to access DApps without manually switching networks.
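The difference between the two approaches to cross-chain accounts can be illustrated with a toy model. This is a sketch under strong simplifying assumptions: the class names `PerChainWallet`, `Keystore`, and `KeystoreWallet` are hypothetical, and the keystore is modeled as a shared in-memory object, whereas a real keystore design would have each L2 read the key through proofs against state it can verify.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PerChainWallet:
    """Each L2 stores its own copy of the wallet's signing key."""
    keys: Dict[str, str]  # chain name -> current key

    def rotate(self, new_key: str) -> int:
        """Apply the same key-update message on every chain.

        Returns the number of per-chain updates that were needed.
        """
        for chain in self.keys:
            self.keys[chain] = new_key
        return len(self.keys)


@dataclass
class Keystore:
    """Single place where the current key lives (toy stand-in)."""
    key: str


@dataclass
class KeystoreWallet:
    """Wallet instances on every L2 read the key from one keystore."""
    keystore: Keystore
    chains: List[str]

    def key_on(self, chain: str) -> str:
        if chain not in self.chains:
            raise KeyError(chain)
        return self.keystore.key

    def rotate(self, new_key: str) -> int:
        """Update the key once; every chain sees the new value."""
        self.keystore.key = new_key
        return 1


if __name__ == "__main__":
    chains = ["optimism", "arbitrum", "starknet"]
    replayed = PerChainWallet({c: "old-key" for c in chains})
    shared = KeystoreWallet(Keystore("old-key"), chains)
    print("per-chain updates:", replayed.rotate("new-key"))
    print("keystore updates:", shared.rotate("new-key"))
```

In the per-chain model the cost of a key change grows with the number of L2s the wallet lives on, which is what replayable messages try to make less painful; in the keystore model it stays at a single update.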

Image

Rabby multi-chain asset balance overview, which previous wallets could not achieve!

But it must be recognized that an L2-centric ecosystem does face a headwind in trying to coordinate, because an individual L2 has no natural economic incentive to build coordination infrastructure: small L2s will not because they would capture only a small share of the benefit of their contributions; large L2s will not either, because they can gain just as much or more from strengthening their own local network effects. If each L2 only considers its own interests and no one thinks about how it fits into the broader Ethereum system, we will fail, much like the urban utopia depicted in the images above.

It is hard to say that there is a perfect solution to this problem. I can only say that the ecosystem needs to recognize more fully that cross-L2 infrastructure, like L1 clients, development tools, and programming languages, is a type of Ethereum infrastructure that deserves attention and funding. We have Protocol Guild; perhaps we need a Basic Infrastructure Guild.

Conclusion

In public discussions, "L2" and "sharding" are often presented as two opposing strategies for scaling a blockchain. However, when you study the underlying technology, you find a puzzle: the actual underlying approaches to scaling are exactly the same. There is some form of data sharding, there are fraud provers or ZK-SNARK provers, and there are solutions for cross-{rollup, shard} communication. The main difference is who is responsible for building and updating these components, and how much autonomy they have.

An L2-centric ecosystem is, in a very real technical sense, sharding, but one in which you can build your own shard with your own rules. This is incredibly powerful: it leaves room for enormous creativity and a great deal of independent innovation. However, it also raises some key challenges, particularly around coordination. For an L2-centric ecosystem like Ethereum to succeed, it must understand these challenges and tackle them head-on, so that it captures as many of the benefits of an L1-centric ecosystem as possible and comes as close as possible to having the best of both worlds.
