Vitalik's Long Review: The "Paths Not Taken" by Ethereum

vitalik
2022-03-30 13:14:36
Ethereum's biggest challenge lies in balancing two visions: one that values security and a simple, pure blockchain, and another that is a high-performance and feature-rich platform for building advanced applications.

Author: Vitalik, Founder of Ethereum

Original Title: "The roads not taken"

Translation: Mary Ma, Wu Says Blockchain

The Ethereum development community made many decisions in the early stages of Ethereum that had a huge impact on the project's trajectory. In some cases, Ethereum developers consciously made decisions to improve upon areas where we believed Bitcoin had issues. In other areas, we were creating something entirely new, and we just had to come up with something to fill the gaps, but there were many options to choose from. In some places, we needed to weigh the trade-offs between more complex and simpler solutions. Sometimes we chose the simpler option, but at other times, we opted for the more complex one.

This article will focus on these crossroads of Ethereum as I remember them. Many of these features were seriously discussed within the core development circle; some were barely considered but perhaps should have been. Even so, it is necessary to look at what a different Ethereum might have looked like and what we can learn from it.

Should we have adopted a simpler PoS mechanism?

The upcoming Gasper PoS mechanism for Ethereum is a complex system, but it is also a very powerful one. Some of its attributes include:

  • Very strong single-block confirmations: Once a transaction is included in a block, that block usually solidifies within a few seconds to the point that it cannot be reverted unless a large portion of the nodes are dishonest or there is extreme network latency.
  • Economic finality: Once a block is finalized, it cannot be reversed unless an attacker can withstand the loss of millions of ETH being slashed.

  • Very predictable rewards: Validators can reliably earn rewards every epoch (6.4 minutes).

  • Support for a very high number of validators: Unlike most other chains with the above characteristics, the Ethereum beacon chain supports hundreds of thousands of validators (for example, Tendermint provides faster finality than Ethereum but only supports a few hundred validators).
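As a rough back-of-the-envelope check of the figures above, the sketch below derives the 6.4-minute epoch from the beacon chain's 12-second slots and 32-slot epochs, and estimates the minimum stake that would have to be slashed to revert a finalized block, assuming the usual one-third bound. The total stake passed in is a hypothetical input, not a figure from this article.

```python
from fractions import Fraction

SECONDS_PER_SLOT = 12  # beacon chain slot length in seconds
SLOTS_PER_EPOCH = 32   # number of slots in one epoch


def epoch_minutes() -> Fraction:
    """Length of one epoch in minutes."""
    return Fraction(SECONDS_PER_SLOT * SLOTS_PER_EPOCH, 60)


def min_slashed_to_revert(total_staked_eth: int) -> Fraction:
    """Lower bound on slashed ETH if a finalized block is reverted.

    Assumes the standard bound: two conflicting finalized blocks require
    at least 1/3 of the total stake to have signed contradictory votes.
    """
    return Fraction(total_staked_eth, 3)


if __name__ == "__main__":
    print(float(epoch_minutes()))                      # 6.4
    # 10 million ETH staked is a hypothetical figure for illustration.
    print(float(min_slashed_to_revert(10_000_000)))
```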

However, creating a system with these characteristics is difficult. It requires years of research, years of failed experiments, and typically a lot of effort, resulting in a fairly complex output.


If our researchers didn't have to worry so much about consensus and had more free thinking time, then perhaps, just perhaps, rollups could have been invented in 2016. This raises a question: should we really have held our PoS to such high standards? After all, even a simpler and weaker PoS would have been a significant improvement over the PoW status quo.

Many people have a misconception that PoS itself is very complex, but in reality, there are many PoS algorithms that are almost as simple as Satoshi's PoW consensus. NXT's PoS has existed since 2013 and was a ready candidate; while it has some issues, those issues are easily fixable, and we could have had a reasonably viable PoS since 2017, or even from the very beginning. Gasper is more complex than these algorithms simply because it tries to accomplish a lot more. However, if we hadn't aimed so high from the start, we could have focused on achieving a more limited set of goals.

In my view, adopting PoS from the very beginning would have been a mistake; PoW helped distribute the initial issuance, made Ethereum more accessible, and encouraged a community of enthusiasts. But switching to a simpler PoS in 2017, or even in 2020, might have resulted in less environmental damage (and less of the anti-crypto sentiment that such damage provokes), and more research talent would have been free to think about scaling. Would we eventually have had to spend a lot of resources to create a better PoS? Yes, but it increasingly looks like we will end up doing that anyway.

De-complexifying Sharding

Since research on Ethereum sharding began in 2014, it has steadily moved toward less complex designs. First, we had complex sharding with built-in execution and cross-shard transactions; then we simplified the protocol by shifting more responsibility to users, so that in a cross-shard transaction the user had to pay gas separately on both shards; next, we switched to a rollup-centric roadmap, where, from the protocol's perspective, sharding is just data sharding. Finally, with danksharding, the shard fee markets are merged into one, and the final design looks like a non-sharded chain in which data availability sampling makes sharded verification possible behind the scenes.
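To give a feel for why sampling can stand in for downloading everything, here is a minimal sketch of the standard probability argument. It assumes a rate-1/2 erasure code, so that an unrecoverable block has fewer than half of its chunks available, and it assumes independent uniform samples; both are simplifying assumptions, and the function names are hypothetical.

```python
def fooled_probability(samples: int, available_fraction: float = 0.5) -> float:
    """Upper bound on the chance that every random sample succeeds
    even though at most `available_fraction` of the chunks are available."""
    if samples < 0:
        raise ValueError("samples must be non-negative")
    if not 0.0 <= available_fraction < 1.0:
        raise ValueError("available_fraction must be in [0, 1)")
    return available_fraction ** samples


def samples_needed(target: float, available_fraction: float = 0.5) -> int:
    """Smallest number of samples that pushes the bound to `target` or below."""
    if target <= 0.0:
        raise ValueError("target must be positive")
    if not 0.0 < available_fraction < 1.0:
        raise ValueError("available_fraction must be in (0, 1)")
    count, bound = 0, 1.0
    while bound > target:
        bound *= available_fraction
        count += 1
    return count


if __name__ == "__main__":
    print(fooled_probability(10))   # 0.0009765625
    print(samples_needed(1e-9))     # 30
```

Under these assumptions a client needs only a few dozen samples, regardless of block size, which is what lets the chain look non-sharded from the outside.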


But what if we had gone the opposite route? In fact, some Ethereum researchers have deeply explored a more complex sharding system: shards would be chains with their own fork choice rules, child chains would depend on parent chains, cross-shard messages would be routed by the protocol, validators would rotate between shards, and even DApps would be automatically load-balanced across shards.

The problem with this approach is that these forms of sharding are largely just ideas and mathematical models, while danksharding is a complete, almost implementable specification. Therefore, given the circumstances and constraints of Ethereum, I believe that simplifying sharding and scaling back its ambitions was absolutely the right move. That said, more ambitious research also plays a very important role: it identifies promising research directions, and even very complex ideas often have "reasonably simple" versions that still provide a lot of benefits and are likely to significantly influence Ethereum's development (and even layer two protocols) in the coming years.

Choosing Features in the EVM

Setting aside security audits, the EVM specification was essentially ready for launch by mid-2014. However, in the months that followed, we continued to actively explore new features that we believed might be truly important for a decentralized application blockchain. Some features were added to the EVM, while others were not.

  • We considered adding a POST opcode but decided against it. The POST opcode would perform asynchronous calls that would be executed after the transaction is complete.
  • We considered adding an ALARM opcode but decided against it. The ALARM function is similar to POST, except it executes asynchronous calls in a future block, allowing contracts to schedule operations.
  • We added logs, which allow contracts to output records that do not touch the state but can be interpreted by DApp interfaces and wallets. Notably, we also considered allowing ETH transfers to emit logs but decided against it on the grounds that "people will soon switch to smart contract wallets anyway."
  • We considered expanding SSTORE to support byte arrays but decided against it due to concerns about complexity and security.

  • We added precompiles, which are contracts that execute dedicated cryptographic operations using native implementations, which are much cheaper than executing in the EVM.

  • In the months following the launch, we repeatedly considered state rent but never included it. It was simply too complex. Today, people are actively exploring better state expiry solutions, although stateless validation and proposer/builder separation (PBS) mean it is now a much lower priority.

Looking back, most of the decisions not to add features have proven to be very good ones. There was no obvious reason to add a POST opcode. The ALARM opcode would actually be very difficult to implement safely: what happens if everyone in blocks 1…9999 sets an ALARM to execute a lot of code at block 100000? Would that block take hours to process? Would some scheduled operations be pushed to later blocks? But if that happens, what guarantee would ALARM still provide? SSTORE for byte arrays would be difficult to implement safely and would greatly expand worst-case witness sizes.
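The ALARM dilemma can be made concrete with a toy model. The sketch below assumes one possible policy (run scheduled calls first-in-first-out and push whatever does not fit under the block gas limit into the next block); the function name, the gas limit, and the per-alarm gas cost are all hypothetical values chosen for illustration.

```python
def drain_schedule(alarm_gas: list[int], target_block: int,
                   gas_limit: int) -> dict[int, list[int]]:
    """Assign scheduled calls to blocks, deferring any that do not fit.

    Returns a mapping from block number to the gas costs of the alarms
    that actually run in that block.
    """
    schedule: dict[int, list[int]] = {}
    block, used = target_block, 0
    for gas in alarm_gas:
        if gas > gas_limit:
            raise ValueError("a single alarm exceeds the block gas limit")
        if used + gas > gas_limit:
            # The block is full: this alarm slips to the next block.
            block += 1
            used = 0
        schedule.setdefault(block, []).append(gas)
        used += gas
    return schedule


if __name__ == "__main__":
    # 9999 alarms of 1M gas each, all aimed at block 100000 (toy numbers).
    plan = drain_schedule([1_000_000] * 9999, 100_000, 30_000_000)
    print(len(plan))   # 334 blocks are needed to run everything
    print(max(plan))   # 100333: the last alarm runs 333 blocks late
```

With these toy numbers, the alternative to a block that takes hours to process is a scheduling guarantee that is off by hundreds of blocks, which is the trade-off the questions above point at.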

The state rent issue is more challenging: if we had truly implemented some form of state rent from day one, the smart contract ecosystem would not have grown up around a normalized assumption of persistent state. Ethereum would have been harder to build on, but it might have been more scalable and sustainable. At the same time, the state expiry schemes we had back then were indeed much worse than what we have now. Sometimes, good ideas just take years to come to fruition, and there is no better way.

Alternatives to LOG

LOG could have been done differently in two ways.

  1. We could have made ETH transfers automatically emit a LOG. This would have saved exchanges and many other users a lot of effort and software bugs, and would have accelerated everyone's reliance on logs, which, ironically, would have helped the adoption of smart contract wallets.

  2. We could have done without the LOG opcode entirely and turned it into an ERC: there would be a standard contract with a function submitLog that uses the technique from the Ethereum deposit contract to compute the Merkle root of all logs in that block. Either EIP-2929 or block-scoped storage (equivalent to TSTORE but cleared after the block) would make this cheap.
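The second option can be sketched as follows. This is a simplified Python model of the incremental Merkle tree technique used by the deposit contract, not the contract itself: the class name LogAccumulator, the tree depth, and the choice of SHA-256 over the raw log bytes are assumptions, and a real design would reset the accumulator at the start of every block.

```python
import hashlib

DEPTH = 8  # hypothetical tree depth (room for 2**8 - 1 logs per block)


def h(left: bytes, right: bytes) -> bytes:
    """Hash two 32-byte child nodes into their parent."""
    return hashlib.sha256(left + right).digest()


# ZERO_HASHES[i] is the root of an empty subtree of height i.
ZERO_HASHES = [b"\x00" * 32]
for _ in range(DEPTH):
    ZERO_HASHES.append(h(ZERO_HASHES[-1], ZERO_HASHES[-1]))


class LogAccumulator:
    """Keeps only DEPTH nodes yet can produce the root over all logs."""

    def __init__(self) -> None:
        self.branch = [b"\x00" * 32] * DEPTH
        self.count = 0

    def submit_log(self, data: bytes) -> None:
        if self.count >= 2 ** DEPTH - 1:
            raise OverflowError("accumulator is full")
        node = hashlib.sha256(data).digest()
        self.count += 1
        size = self.count
        for height in range(DEPTH):
            if size & 1:
                # This subtree is a left child still waiting for a sibling.
                self.branch[height] = node
                return
            # Completed a right child: merge with the stored left sibling.
            node = h(self.branch[height], node)
            size >>= 1

    def root(self) -> bytes:
        node, size = ZERO_HASHES[0], self.count
        for height in range(DEPTH):
            if size & 1:
                node = h(self.branch[height], node)
            else:
                node = h(node, ZERO_HASHES[height])
            size >>= 1
        return node


if __name__ == "__main__":
    acc = LogAccumulator()
    for payload in (b"transfer:1", b"transfer:2", b"transfer:3"):
        acc.submit_log(payload)
    print(acc.root().hex())
```

Each submission touches at most DEPTH storage slots, which is why cheap repeated access (EIP-2929) or block-scoped storage matters for the cost of this approach.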

We strongly considered the first option but rejected it. The main reason was that it is simpler for logs to come only from the LOG opcode. We also very mistakenly expected that most users would quickly migrate to smart contract wallets, which could explicitly use the opcode to record transfers.

We did not consider the second option, but looking back, it was indeed an option. Its main drawback is the lack of a bloom filter mechanism for quickly scanning logs. But it turns out that the bloom filter mechanism is too slow to be friendly for DApps anyway, so more and more people are now using TheGraph for queries.

Overall, either of these methods could potentially be better than the status quo. Not including LOG would simplify things, but including LOG and automatically recording all ETH transfers would make it more useful.

Today, I might advocate for the eventual removal of the LOG opcode from the EVM.

What if the current EVM had taken a completely different path?

Initially, the EVM had two very different paths to choose from:

  1. Make the EVM a higher-level language with built-in structures like variables, if statements, loops, etc.

  2. Make the EVM a copy of certain existing virtual machines (LLVM, WASM, etc.).

The first path was never seriously considered. The appeal of this path is that it could simplify the compiler and allow more developers to code directly in the EVM. It could also simplify the structure of ZK-EVM. The weakness of this path is that it would make EVM code structurally more complex: it would no longer be a simple list of opcodes but a more complex data structure that must be stored in some way. That said, we missed a win-win opportunity: some changes to the EVM could bring us many benefits while keeping the basic EVM structure intact: prohibiting dynamic jumps, adding some opcodes aimed at supporting subroutines (see also: EIP-2315), only allowing memory access at 32-byte word boundaries, etc.
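To illustrate what prohibiting dynamic jumps could look like in practice, here is a minimal sketch of a deploy-time check. The opcode values are the real EVM ones, but the rule itself (every JUMP or JUMPI must be immediately preceded by a PUSH whose constant lands on a JUMPDEST) is just one hypothetical formulation, not a rule from any EIP.

```python
PUSH1, PUSH32 = 0x60, 0x7F
JUMP, JUMPI, JUMPDEST = 0x56, 0x57, 0x5B


def has_only_static_jumps(code: bytes) -> bool:
    """True if every jump target is a constant pointing at a JUMPDEST."""
    # Decode instructions, skipping over PUSH immediates so that data
    # bytes are never mistaken for opcodes.
    instructions = []  # (offset, opcode, pushed constant or None)
    pc = 0
    while pc < len(code):
        op = code[pc]
        if PUSH1 <= op <= PUSH32:
            width = op - PUSH1 + 1
            value = int.from_bytes(code[pc + 1:pc + 1 + width], "big")
            instructions.append((pc, op, value))
            pc += 1 + width
        else:
            instructions.append((pc, op, None))
            pc += 1

    jumpdests = {offset for offset, op, _ in instructions if op == JUMPDEST}

    for index, (_, op, _) in enumerate(instructions):
        if op in (JUMP, JUMPI):
            if index == 0:
                return False  # nothing was pushed before the jump
            _, _, target = instructions[index - 1]
            if target is None or target not in jumpdests:
                return False  # computed target, or not a JUMPDEST
    return True


if __name__ == "__main__":
    static_code = bytes([0x60, 0x03, 0x56, 0x5B, 0x00])  # PUSH1 3; JUMP; JUMPDEST; STOP
    dynamic_code = bytes([0x35, 0x56, 0x5B])             # CALLDATALOAD; JUMP; JUMPDEST
    print(has_only_static_jumps(static_code))   # True
    print(has_only_static_jumps(dynamic_code))  # False
```

A check like this lets the control flow graph be recovered from the bytecode alone, which is part of what would make analysis and proving simpler.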

The second path has been proposed many times and rejected many times. The arguments in favor of it are usually that it would allow programs to compile from existing languages (C, Rust, etc.) to the EVM. The counterargument has always been that, given Ethereum's unique constraints, it would not actually provide any benefits:

  • Compilers for existing high-level languages often do not care about overall code size, while blockchain code must be heavily optimized to save every byte.

  • We need multiple implementations of the virtual machine, with a hard requirement that no two implementations ever process the same code differently. Security-auditing and verifying this on code we did not write would be much harder.

  • If the virtual machine specification changes, Ethereum would have to either keep updating along with it or fall increasingly out of sync.

Thus, there was probably never a viable path for the EVM that was radically different from what we have today, although there are many smaller details (jumps, 64-bit vs 256-bit, etc.) that would have yielded better results had they been done differently.

Should ETH supply be allocated differently?

The current ETH supply can roughly be represented by this chart from Etherscan:

[Image: Etherscan chart of the ETH supply]

Currently, about half of the ETH was sold in the Ethereum public offering, where anyone could send BTC to a Bitcoin address, and the initial ETH supply allocation was calculated through an open-source script. The rest was primarily generated through mining. The black portion of 12 million ETH marked as "other" actually refers to the pre-mined portion, allocated between the Ethereum Foundation and about 100 early contributors to the Ethereum protocol.

There are two main criticisms of this process:

  • Both the pre-mining and the Ethereum Foundation's control over the public offering funds lack credible neutrality. Some recipient addresses were manually selected through a closed process, and the Ethereum Foundation had to be trusted not to take out loans and recycle funds received in the public offering back into it to obtain more ETH (we did not, nor did anyone claim we did, but even the requirement of being trusted offended some people).

  • Pre-mining overly rewarded very early contributors while leaving too little for later contributors. 75% of the pre-mined ETH was used to reward the work of contributors before the launch, and after the launch the Ethereum Foundation was left with only 3 million ETH. Within 6 months, the need to sell ETH to survive financially had reduced that holding to around 1 million ETH.

To some extent, these issues are related: the desire to minimize the perception of centralization led to a smaller premine, and a smaller premine was depleted more quickly.

This was not the only possible approach. Zcash took a different one: a fixed 20% of each block reward goes to a set of recipients hard-coded in the protocol, and that set is renegotiated every four years (this has happened once so far). This would have been more sustainable, but it would have faced harsher criticism for being overly centralized (the Zcash community seems to be more openly accepting of technocratic leadership than the Ethereum community).

One possible alternative path is similar to the "DAO from day 1" route that is popular in some DeFi projects today. Here is a possible strawman proposal:

  • We agree to allocate 2 ETH from each block reward to a development fund over 2 years.

  • Anyone who buys ETH in the Ethereum public offering can allocate votes to their preferred development funds (for example: "1 ETH from each block reward to the Ethereum Foundation, 0.4 ETH to the Consensys research team, 0.2 ETH to Vlad Zamfir…").

  • Each recipient's share of the development fund equals the median of everyone's votes for that recipient, with the medians scaled proportionally so that they total 2 ETH per block (the median is there to prevent self-dealing: if you vote for yourself, you get nothing unless at least half of the other buyers also mention you).
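The strawman rule above can be sketched in a few lines. This is a minimal illustration under assumptions the proposal does not spell out: every buyer's ballot counts equally (rather than being weighted by the amount purchased), a recipient missing from a ballot counts as a vote of zero, and the function and recipient names are hypothetical.

```python
from fractions import Fraction
from statistics import median


def allocate_dev_fund(votes: list[dict[str, Fraction]],
                      total: Fraction = Fraction(2)) -> dict[str, Fraction]:
    """Median-of-votes allocation, rescaled to `total` ETH per block."""
    recipients = sorted({name for ballot in votes for name in ballot})
    # A recipient not mentioned on a ballot counts as a vote of 0.
    medians = {
        name: median([Fraction(ballot.get(name, 0)) for ballot in votes])
        for name in recipients
    }
    weight = sum(medians.values())
    if weight == 0:
        return {}
    return {name: total * value / weight
            for name, value in medians.items() if value > 0}


if __name__ == "__main__":
    ballots = [
        {"foundation": Fraction(1), "research_team": Fraction(1)},
        {"foundation": Fraction(3, 2), "research_team": Fraction(1, 2)},
        {"self_dealer": Fraction(2)},  # votes everything to itself
    ]
    for name, share in allocate_dev_fund(ballots).items():
        print(name, share)
    # foundation 4/3
    # research_team 2/3
```

The self-dealing ballot has a median of zero across all voters, so it receives nothing, while the two broadly supported recipients split the full 2 ETH.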

The public offering could have been operated by a legal entity that promised to distribute the Bitcoin received during the offering in the same proportions as the ETH development fund (or to burn it, if we really wanted to please Bitcoiners). This could have led to the Ethereum Foundation receiving a large amount of funding, while non-Ethereum Foundation groups also received significant funding (leading to more decentralization of the ecosystem), all without compromising credible neutrality in the slightest. Of course, the main downside is that token voting is really bad, but pragmatically speaking, we can recognize that 2014 was still an early and idealistic time, and the most severe drawbacks of token voting would not have started to manifest until long after the public offering ended.

Would this have been a better idea and set a better precedent? Perhaps! Realistically, though, even if the development fund had been completely credibly neutral, the people who shout about Ethereum's premine today might simply have shouted twice as loudly about the DAO fork instead.

What can we learn from all this?

Overall, sometimes I feel that Ethereum's biggest challenge comes from balancing between two visions: one that values security and a simple, pure blockchain, and another that is a high-performance and feature-rich platform for building advanced applications. Many of the examples above are just one aspect of this: do we want to have fewer features and be more like Bitcoin, or do we want to have more features and be more developer-friendly? Are we concerned about making development funding more neutral, more like Bitcoin, or are we primarily concerned with ensuring that developers receive enough rewards to make Ethereum better?

My personal dream is to try to achieve both visions simultaneously. A base layer whose specifications get smaller every year, along with a powerful developer-friendly ecosystem of advanced applications centered around layer two protocols. That said, reaching such an ideal world will take a long time, and if we can more clearly recognize that it takes time, it may greatly help us to consider the roadmap step by step.

Today, there are many things we cannot change, but there are also many things we can still change, and there remains a solid path to improve functionality and simplicity. Sometimes this path is winding: we need to add some complexity first to achieve sharding, which in turn can enable a lot of layer two scalability on top. That said, reducing complexity is possible, and Ethereum's history has proven this.

  • EIP-150 made the call stack depth limit irrelevant, reducing security concerns for contract developers.

  • EIP-161 eliminated the concept of an "empty account" as something distinct from an account whose fields are all zero.

  • EIP-3529 removed part of the refund mechanism, making gas tokens no longer viable.
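The EIP-150 point can be checked with a short calculation. Under EIP-150 a call can forward at most all but one 64th of the remaining gas, so gas shrinks geometrically with depth and runs out long before the 1024-frame stack limit. The sketch below is a simplified model that assumes each frame does nothing but pay a 700-gas call cost and immediately make the next call; the starting gas is a hypothetical budget.

```python
CALL_STACK_LIMIT = 1024  # EVM call depth limit
CALL_COST = 700          # base gas cost of CALL under EIP-150


def max_call_depth(start_gas: int, call_cost: int = CALL_COST) -> int:
    """Deepest nesting reachable when every frame only makes one call."""
    gas, depth = start_gas, 0
    while gas >= call_cost:
        gas -= call_cost   # pay for the CALL itself
        gas -= gas // 64   # the caller keeps 1/64; the callee gets the rest
        depth += 1
    return depth


if __name__ == "__main__":
    depth = max_call_depth(30_000_000)  # hypothetical gas budget
    print(depth, depth < CALL_STACK_LIMIT)
```

Because the depth limit can no longer be hit in practice, contract developers no longer need to defend against callers who deliberately pre-fill the call stack.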

Ideas in the pipeline, such as Verkle trees, could further reduce complexity. But how to better balance these two visions in the future is something we should start thinking more actively about.
