Vitalik's new work: Potential technical roadmap for Ethereum after The Merge
Original Title: "Possible futures of the Ethereum protocol, part 1: The Merge"
Author: Vitalik Buterin
Compiled by: Tia, Techub News
Initially, "the Merge" referred to the transition from proof of work to proof of stake. Today, Ethereum has been operating as a proof of stake system for nearly two years, and proof of stake has performed exceptionally well in terms of stability, performance, and avoiding centralization risks. However, there are still areas where proof of stake needs improvement.
The roadmap I outlined in 2023 includes several components: improving technical features, such as increasing stability, performance, and accessibility for small validators, as well as economic reforms to address centralization risks. The former is part of "the Merge," while the latter is part of "the Scourge."
This article will focus on the "the Merge" part: What improvements can be made to the technical design of proof of stake, and what are the ways to achieve these improvements?
Please note that this is a list of ideas, not an exhaustive checklist of things that need to be accomplished for proof of stake.
Single Slot Finality and Staking Democratization
What problem are we trying to solve?
Currently, it takes 2-3 epochs (about 15 minutes) to finalize a block, and 32 ETH is required to become a validator. This is due to the need to balance three objectives:
Maximizing the number of validators participating in staking (which directly means minimizing the minimum ETH required for staking)
Minimizing finality time
Minimizing the overhead of running nodes
These three objectives are in conflict: to achieve economic finality (i.e., attackers need to destroy a large amount of ETH to revert a finalized block), each validator needs to sign two messages every time finality is achieved. Therefore, if you have many validators, either you accept a long finality time while all their signatures are processed, or you need very powerful nodes to process all the signatures at once.
Ultimately, all of this serves one goal: attackers must incur enormous costs to succeed. This is what "economic finality" means. If we disregard this goal, we could simply randomly select a committee (as Algorand does) to finalize each slot. However, the problem with this approach is that if an attacker does control 51% of the validators, they can attack at very low cost (reverting finalized blocks, censoring, or delaying finality): only the nodes in the committee can be detected as participating in the attack and punished, whether through slashing or a coordinated minority soft fork. This means attackers could attack the chain repeatedly. Therefore, if we want economic finality, a simple committee-based approach will not work.
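To make this concrete, here is a back-of-the-envelope comparison (all numbers are illustrative assumptions, not protocol constants) of the stake at risk when every validator signs versus when only a sampled committee signs:

```python
# Illustrative comparison: the minimum slashable stake backing a finalized
# block under (a) full-validator-set finality versus (b) an Algorand-style
# sampled committee. Prices and counts below are assumptions for intuition.

ETH_PRICE_USD = 2_500          # assumed price, for illustration only
TOTAL_STAKE_ETH = 33_000_000   # rough order of magnitude of staked ETH

def attack_cost_usd(stake_at_risk_eth: float) -> float:
    """To revert a finalized block, at least 1/3 of the signing stake
    must double-sign and be slashed; that stake is the economic cost."""
    return stake_at_risk_eth / 3 * ETH_PRICE_USD

# (a) Every validator signs: an attacker must burn 1/3 of all stake.
full_set_cost = attack_cost_usd(TOTAL_STAKE_ETH)

# (b) Only a sampled committee signs: only the committee's stake is
# slashable, e.g. 8k validators out of 1M.
committee_cost = attack_cost_usd(TOTAL_STAKE_ETH * (8_000 / 1_000_000))

print(f"full set:  ${full_set_cost / 1e9:.1f}B at risk")
print(f"committee: ${committee_cost / 1e9:.3f}B at risk")
```

The committee's slashable stake is more than two orders of magnitude smaller, which is why an attacker who controls a majority can afford to attack again and again.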
At first glance, we do need all validators to participate.
But in an ideal scenario, we can still achieve economic finality while improving the following two aspects:
Finalizing blocks within a slot (ideally maintaining or even reducing the current 12-second length), rather than 15 minutes
Allowing validators to stake 1 ETH (down from 32 ETH)
These goals can be seen as "bringing Ethereum's performance closer to (more centralized) performance-focused L1s."
But it will still use a finality mechanism with stronger security guarantees to protect all Ethereum users. Today, most users do not benefit from this level of security because they are unwilling to wait 15 minutes; with Single Slot Finality, transactions are finalized almost as soon as they are confirmed. Furthermore, if users and applications no longer have to worry about chain rollbacks, the protocol and surrounding infrastructure can be simplified.
The second goal is aimed at supporting solo stakers (users staking independently rather than relying on institutions). The main barrier preventing more people from solo staking is the 32 ETH minimum requirement. Reducing the minimum requirement to 1 ETH would address this issue, making other factors the primary limitations for individual staking.
However, there is a challenge: Faster finality and more democratized staking conflict with minimizing overhead. This is why we did not adopt Single Slot Finality from the start. However, recent research has proposed some potential solutions to this problem.
What is it and how does it work?
Single Slot Finality refers to a consensus algorithm that finalizes blocks within a single slot. This is not an inherently difficult goal: many algorithms (such as Tendermint consensus) already achieve it with optimal properties. However, Ethereum has a unique property that Tendermint lacks: "inactivity leaks," which allow the chain to keep operating and eventually recover even when more than 1/3 of validators go offline. Fortunately, there are now proposals to modify Tendermint-style consensus to accommodate inactivity leaks.
Single Slot Finality Proposal
The most challenging part is figuring out how to make Single Slot Finality work effectively with a very high number of validators without incurring extremely high operational costs for node operators. Currently, there are several solutions:
- Option 1: Brute force. Strive for better signature aggregation protocols, possibly using ZK-SNARKs, which would effectively allow us to process signatures from millions of validators in each slot.
Horn, one of the optimized aggregation protocol designs
- Option 2: Orbit committees. A new mechanism that randomly selects a medium-sized committee to finalize the chain while keeping the cost of attack high.
One way to think about Orbit SSF is that it opens up a middle-ground option: unlike an Algorand-style committee, it does not give up economic finality entirely, but still keeps attack costs reasonably high, allowing Ethereum to retain enough economic finality for extreme security while gaining the efficiency of single-slot finality.
Orbit leverages the pre-existing heterogeneity in validator deposit sizes to achieve as much economic finality as possible while still giving small validators a corresponding role in participation. Additionally, Orbit uses a slow committee rotation mechanism to ensure high overlap between adjacent quorums, thereby ensuring that its economic finality remains applicable even during committee rotations.
- Option 3: Dual-layer staking. Stakers are divided into two categories, one with a higher deposit requirement and one with a lower deposit requirement. Only the higher-deposit tier participates directly in economic finality. There have been various proposals (see the Rainbow Staking post) on exactly what rights and responsibilities the lower-deposit tier should have. Common ideas include:
Delegating stake to a higher-tier staker
Randomly sampling lower-tier stakers to validate and finalize each block
The right to generate inclusion lists
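A minimal illustration of the committee-sampling idea behind Orbit (a simplified toy, not the actual Orbit spec): if each validator's chance of being selected is proportional to its balance, a medium-sized committee can carry most of the total stake while still including small validators.

```python
import random

# Toy stake-weighted committee sampling (illustrative only, not the Orbit
# spec): each draw is weighted by the validator's balance, so high-balance
# validators are almost always on the committee and the committee's stake
# tracks the full set's, while 32-ETH solo stakers still participate.

def sample_committee(balances: list[int], k: int, seed: int = 0) -> list[int]:
    """Pick k distinct validator indices, each draw weighted by balance."""
    rng = random.Random(seed)
    remaining = list(range(len(balances)))
    committee = []
    for _ in range(k):
        weights = [balances[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        committee.append(pick)
        remaining.remove(pick)
    return committee

# 1000 validators: 50 large consolidated validators plus 950 solo stakers.
balances = [2048] * 50 + [32] * 950
committee = sample_committee(balances, k=100)
stake_on_committee = sum(balances[i] for i in committee)
print(f"committee of 100 holds {stake_on_committee} of {sum(balances)} ETH")
```

Because selection is stake-weighted, slashing the committee still destroys a large fraction of total stake, which is the property that preserves meaningful economic finality.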
What connections exist with existing research?
Pathways to achieve Single Slot Finality (2022): https://notes.ethereum.org/@vbuterin/singleslotfinality
Specific proposals for Ethereum's Single Slot Finality protocol (2023): https://eprint.iacr.org/2023/280
Orbit SSF: https://ethresear.ch/t/orbit-ssf-solo-staking-friendly-validator-set-management-for-ssf/19928
Further analysis of Orbit-style mechanisms: https://notes.ethereum.org/@anderselowsson/Vorbit_SSF
Horn, signature aggregation protocol (2022): https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219
Signature merging for large-scale consensus (2023): https://ethresear.ch/t/signature-merging-for-large-scale-consensus/17386?u=asn
Signature aggregation protocol proposed by Khovratovich: https://hackmd.io/@7dpNYqjKQGeYC7wMlPxHtQ/BykM3ggu0#/
STARK-based signature aggregation (2022): https://hackmd.io/@vbuterin/stark_aggregation
Rainbow Staking: https://ethresear.ch/t/unbundling-staking-towards-rainbow-staking/18683
What else needs to be done? What are the trade-offs?
There are four paths to choose from (we can also take a mixed path):
Maintain the status quo
Orbit SSF
Brute force SSF
SSF with a dual-layer staking mechanism
(1) means doing nothing and keeping things as they are, but this would leave Ethereum with a worse security experience and more centralized staking than it needs to have.
(2) avoids "high-tech" solutions by cleverly rethinking protocol assumptions: we relax the "economic finality" requirement, so we still require attacks to be expensive, but accept that the cost of attack may be perhaps 10 times lower than today (for example, $2.5 billion instead of $25 billion). It is generally believed that Ethereum's current economic finality far exceeds what it needs, and that its main security risks lie elsewhere, so this is arguably an acceptable sacrifice.
The main work is to verify that the Orbit mechanism is secure and has the desired properties, then fully formalize and implement it. Additionally, EIP-7251 (Increase Maximum Effective Balance) allows voluntary validator balance consolidation, which immediately reduces the chain's verification overhead and serves as an effective initial phase for a launch of Orbit.
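EIP-7251 raises the effective-balance cap from 32 ETH to 2048 ETH, and a quick back-of-the-envelope calculation (validator count is a hypothetical round number) shows why consolidation cuts verification overhead:

```python
# Back-of-the-envelope effect of EIP-7251 (Increase Maximum Effective
# Balance): raising the cap from 32 ETH to 2048 ETH lets a large operator
# consolidate many 32-ETH validators into one, shrinking the number of
# signatures the chain must process per epoch.

OLD_MAX = 32      # ETH, pre-EIP-7251 effective-balance cap
NEW_MAX = 2048    # ETH, post-EIP-7251 cap

validators_before = 1_000_000              # hypothetical validator count
consolidation_factor = NEW_MAX // OLD_MAX  # 2048 / 32 = 64

# Idealized upper bound: every operator consolidates maximally.
validators_after = validators_before // consolidation_factor
print(validators_before, "->", validators_after, "validators")
```

This 64x factor is the same one that makes the combined strategies discussed below (reducing the minimum deposit via brute-force aggregation) an easier problem than full brute-force SSF.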
(3) forcibly solves the problem with high-tech solutions. Achieving this requires collecting a large number of signatures (over 1 million) in a very short time (5-10 seconds).
(4) creates a dual-layer staking system without overthinking the mechanism or using high-tech, but it still carries centralization risks. The risks largely depend on the specific rights granted to the lower staking tier. For example:
If lower-tier stakers need to delegate their attestation rights to higher-tier stakers, the delegation could become centralized, ultimately resulting in two highly centralized staking tiers.
If random sampling is required from the lower tier to approve each block, then attackers could spend a minimal amount of ETH to prevent finality.
If lower-tier stakers can only create inclusion lists, the attestation layer could remain centralized, at which point a 51% attack on the attestation layer could censor the inclusion lists themselves.
Multiple strategies can be combined, such as:
(1 + 2): Add Orbit but do not implement Single Slot Finality
(1 + 3): Use brute force technology to reduce the minimum deposit without implementing Single Slot Finality. The required aggregation amount is 64 times less than in pure (3), making the problem easier.
(2 + 3): Execute Orbit SSF with conservative parameters (e.g., a 128k validator committee instead of 8k or 32k) and use brute force technology to make it highly efficient.
(1 + 4): Add Rainbow Staking but do not implement Single Slot Finality.
How does it interact with other parts of the roadmap?
In addition to its other benefits, Single Slot Finality reduces the risk of certain types of multi-block MEV attacks. Furthermore, in a Single Slot Finality world, attester-proposer separation designs and other in-protocol block production pipelines would need to be designed differently.
The weakness of achieving the goal through brute force is that reducing slot time becomes more challenging.
Single Secret Leader Election
What problem are we trying to solve?
Today, which validator will propose the next block can be known in advance. This creates a security vulnerability: attackers can monitor the network, determine which validators correspond to which IP addresses, and launch DoS attacks against them when they are about to propose a block.
What is it and how does it work?
The best way to solve the DoS problem is to hide which validator will produce the next block, at least until the block is actually produced. If we drop the "single" requirement (that exactly one party produces the next block), one solution is to let anyone create the next block, but require their randao reveal to be less than 2^256 / N. On average, only one validator meets this requirement, but sometimes there are two or more, and sometimes none. Combining the "secrecy" requirement with the "single" requirement has therefore long been a challenge.
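This hash-threshold self-election can be sketched as follows (the hashing details and byte encodings are illustrative, not the consensus spec): a validator is eligible if the hash of its randao reveal falls below 2^256 / N, so on average one of N validators qualifies per slot.

```python
import hashlib

# Toy sketch of hash-based self-election (illustrative, not the consensus
# spec): a validator is eligible to propose if the hash of its randao
# reveal falls below 2**256 / N, so on average exactly one of N
# validators qualifies per slot -- but sometimes several, sometimes none.

N = 1_000_000                  # validator count
THRESHOLD = 2**256 // N

def is_eligible(reveal: bytes) -> bool:
    h = int.from_bytes(hashlib.sha256(reveal).digest(), "big")
    return h < THRESHOLD

# Count how many of N (toy) reveals qualify in one slot. The expected
# number is N * (1/N) = 1, but the actual count is random.
eligible = sum(is_eligible(i.to_bytes(32, "big")) for i in range(N))
print(f"{eligible} eligible validator(s) this slot")
```

The count fluctuates around one, which is exactly why this approach satisfies secrecy but not the "single" requirement.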
The Single Secret Leader Election protocol creates a "blind" validator ID for each validator using some cryptographic techniques, then allows many proposers to have the opportunity to shuffle and re-blind the blind ID pool (similar to how mix networks work), thereby solving this problem. In each slot, a random blind ID is selected. Only the owner of that blind ID can generate a valid proof to propose a block, but no one knows which validator corresponds to that blind ID.
Whisk SSLE Protocol
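A toy illustration of Whisk-style blinded trackers, using modular exponentiation in place of elliptic-curve points (insecure toy parameters, for intuition only): a tracker is a pair whose second component is the first raised to the owner's secret k, and anyone can re-blind the pair with a fresh secret without breaking that relationship.

```python
# Toy model of a Whisk-style tracker (r*G, k*r*G), written multiplicatively
# as (base, base**k) mod a prime. A shuffler re-blinds a tracker by raising
# both components to a fresh secret z: the pair looks unrelated to the
# original, but still "belongs" to the same secret k, so only the owner
# can later prove ownership and propose the block. NOT cryptographically
# secure: real Whisk uses elliptic curves and shuffle proofs.

P = 2**127 - 1   # toy prime modulus (far too small for real security)

def make_tracker(k: int, r: int) -> tuple[int, int]:
    base = pow(5, r, P)            # 5 plays the role of the generator G
    return (base, pow(base, k, P))

def reblind(tracker: tuple[int, int], z: int) -> tuple[int, int]:
    """Re-blind a tracker with fresh secret z; preserves the owner's k."""
    a, b = tracker
    return (pow(a, z, P), pow(b, z, P))

def owns(tracker: tuple[int, int], k: int) -> bool:
    a, b = tracker
    return pow(a, k, P) == b       # only the holder of k can show this

t = make_tracker(k=123456789, r=987654321)
t2 = reblind(t, z=55555)           # shuffler re-blinds; pair looks fresh
assert owns(t2, 123456789)         # ...but the owner still recognizes it
assert not owns(t2, 42)            # others cannot claim it
```

After many proposers have shuffled and re-blinded the pool, a randomly drawn tracker identifies no one publicly, yet exactly one validator can produce the ownership proof needed to propose.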
What connections exist with existing research?
Dan Boneh's paper (2020): https://eprint.iacr.org/2020/025.pdf
Whisk (Ethereum-specific proposal, 2022): https://ethresear.ch/t/whisk-a-practical-shuffle-based-ssle-protocol-for-ethereum/11763
Single Secret Leader Election tag on ethresear.ch: https://ethresear.ch/tag/single-secret-leader-election
Simplified SSLE using ring signatures: https://ethresear.ch/t/simplified-ssle/12315
What remains to be done? What are the trade-offs?
Realistically, what remains is to find and implement a protocol simple enough that we are comfortable deploying it on mainnet. We place a high value on Ethereum's simplicity and do not want complexity to grow further. The SSLE implementations we have seen add hundreds of lines of specification code and introduce new assumptions from complex cryptography. Finding a sufficiently efficient quantum-resistant SSLE construction is also an open problem.
Ultimately, it may be the case that the "marginal additional complexity" of SSLE will only drop to a sufficiently low level when we boldly attempt to introduce mechanisms for executing general zero-knowledge proofs into the Ethereum protocol at L1 for other reasons (such as state trees, ZK-EVM).
Another option is to completely ignore SSLE and instead use off-protocol mitigations (such as at the p2p layer) to address the DoS problem.
How does it interact with other parts of the roadmap?
If we add an attester-proposer separation (APS) mechanism, such as execution tickets, then execution blocks (i.e., blocks containing Ethereum transactions) will not need SSLE, as we can rely on dedicated block builders. However, consensus blocks (i.e., blocks containing protocol messages such as attestations, and perhaps inclusion lists) would still benefit from SSLE.
Faster Transaction Confirmation
What problem are we trying to solve?
Shortening Ethereum's transaction confirmation time from 12 seconds to 4 seconds is valuable. Doing so would significantly improve the user experience on both L1 and based rollups while making DeFi protocols more efficient. It would also make it easier for L2s to decentralize, as it would allow a large class of L2 applications to run on based rollups, reducing the need for L2s to build their own committee-based decentralized sequencing.
What is it and how does it work?
There are roughly two technologies here:
Reducing slot time, for example, to 8 seconds or 4 seconds. This does not necessarily mean 4 seconds of finality: finality itself requires three rounds of communication, so we can treat each round of communication as a separate block, which will at least achieve preliminary confirmation after 4 seconds.
Allowing proposers to publish pre-confirmations during the slot. In the extreme, a proposer can include transactions into their block in real time as they see them, immediately publishing a pre-confirmation message for each ("My first transaction is 0x1234…", "My second transaction is 0x5678…"). The case of a proposer publishing two conflicting confirmations can be handled in two ways: (i) slashing the proposer, or (ii) using attesters to vote on which one came first.
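The equivocation case in option (i) can be sketched as follows (the message format and field names are hypothetical, for illustration only): two pre-confirmations from the same proposer for the same slot and position, naming different transactions, constitute self-contained slashing evidence.

```python
from dataclasses import dataclass

# Hypothetical pre-confirmation message (field names are illustrative,
# not a real protocol format). In a real system each message would
# carry the proposer's signature, which is what makes the evidence
# attributable and slashable.

@dataclass(frozen=True)
class PreConfirmation:
    proposer: str
    slot: int
    index: int      # position within the block ("my first tx is ...")
    tx_hash: str    # e.g. "0x1234..."

def find_equivocation(msgs):
    """Return a conflicting pair if the same (proposer, slot, index)
    was pre-confirmed with two different transactions."""
    seen = {}
    for m in msgs:
        key = (m.proposer, m.slot, m.index)
        if key in seen and seen[key].tx_hash != m.tx_hash:
            return (seen[key], m)   # slashable evidence
        seen[key] = m
    return None

msgs = [
    PreConfirmation("val-7", slot=100, index=0, tx_hash="0x1234"),
    PreConfirmation("val-7", slot=100, index=1, tx_hash="0x5678"),
    PreConfirmation("val-7", slot=100, index=0, tx_hash="0xdead"),  # conflict
]
evidence = find_equivocation(msgs)
print("equivocation found:", evidence is not None)
```

Option (ii) avoids slashing by instead having attesters attest to which of the two conflicting messages they saw first.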
What connections exist with existing research?
Based on pre-confirmations: https://ethresear.ch/t/based-preconfirmations/17353
Protocol Enforced Proposer Commitments (PEPC): https://ethresear.ch/t/unbundling-pbs-towards-protocol-enforced-proposer-commitments-pepc/13879
Interleaved periods on parallel chains (2018 idea for achieving low latency): https://ethresear.ch/t/staggered-periods/1793
What remains to be done? What are the trade-offs?
The feasibility of shortening slot time is currently unclear. Even today, many validators in various parts of the world struggle to obtain proofs quickly enough. Attempting a 4-second slot time poses risks of centralizing the validator set, and due to latency, it is impractical to become a validator outside of a few privileged regions.
The weakness of the proposer pre-confirmation approach is that it greatly improves average-case inclusion time but not the worst case: if the current proposer is performing well, your transaction is pre-confirmed in 0.5 seconds instead of being included after (on average) 6 seconds, but if the current proposer is offline or underperforming, you still have to wait a full 12 seconds until the next slot begins and a new proposer takes over.
Additionally, there is the open question of how to incentivize pre-confirmations. Proposers have an incentive to keep their options open as long as possible. If attesters sign off on the timeliness of pre-confirmations, transaction senders could condition part of their fees on immediate pre-confirmation, but this would place an additional burden on attesters and could make it harder for them to remain neutral "dumb pipes."
On the other hand, if we do not attempt to do this and keep the finality time at 12 seconds (or longer), the ecosystem will place more emphasis on L2's pre-confirmation mechanisms, and interactions across L2s will take longer.
How does it interact with other parts of the roadmap?
The effectiveness of proposer pre-confirmations realistically depends on an attester-proposer separation (APS) mechanism, such as execution tickets. Otherwise, the pressure to provide real-time pre-confirmations may be too centralizing for regular validators.
Other Research Areas
51% Attack Recovery
It is generally believed that if a 51% attack occurs (including attacks that cannot be proven cryptographically, such as censorship), the community will coordinate a minority soft fork to ensure that good actors win while bad actors are penalized or slashed for inactivity. However, this level of reliance on the social layer can be considered unhealthy. We can try to reduce reliance on the social layer by making the recovery process as automated as possible.
Complete automation is impossible, because full automation would amount to a consensus algorithm with greater than 50% fault tolerance, and we already know the (very strict) mathematical limits of such algorithms. But we can achieve partial automation: for example, if a client has seen a transaction publicly available for long enough, it can automatically refuse to accept as finalized (or even as the head of the fork choice) a chain that does not include that transaction. A key goal is to ensure that attackers at least cannot achieve a quick, clean victory.
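A minimal sketch of this partial automation (the class, timeout value, and method names are illustrative assumptions, not any client's implementation): the client records when it first saw each transaction, and refuses to treat a chain as finalized if it still excludes a transaction the client has seen publicly for longer than some timeout.

```python
import time

# Sketch of censorship-aware finality acceptance (parameter names and the
# timeout are illustrative, not a real client implementation). The client
# remembers when it first saw each transaction and rejects finality for
# any chain that still excludes a long-seen transaction.

CENSORSHIP_TIMEOUT = 4 * 384   # seconds; e.g. roughly four epochs

class CensorshipWatcher:
    def __init__(self):
        self.first_seen = {}                 # tx_hash -> timestamp

    def observe_tx(self, tx_hash, now=None):
        """Record the first time a transaction was seen on the network."""
        self.first_seen.setdefault(tx_hash, now if now is not None else time.time())

    def accept_as_finalized(self, chain_txs, now=None):
        """Reject finality if any long-seen tx is missing from the chain."""
        now = now if now is not None else time.time()
        for tx, t0 in self.first_seen.items():
            if tx not in chain_txs and now - t0 > CENSORSHIP_TIMEOUT:
                return False                 # looks censored: do not finalize
        return True

w = CensorshipWatcher()
w.observe_tx("0xabc", now=0)
print(w.accept_as_finalized({"0xdef"}, now=10))     # too early to judge: True
print(w.accept_as_finalized({"0xdef"}, now=10_000)) # censorship suspected: False
```

Clients applying a rule like this cannot stop a 51% attack, but they deny the attacker an automatic, immediate win, buying time for the social layer to respond.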
Raising the Quorum Threshold
Currently, a block is finalized as long as 67% of stakers support it. Some argue that this practice is too aggressive. Throughout Ethereum's history, there has only been one (very brief) failure of finality. If this ratio were raised to 80%, the number of additional non-finality periods would be relatively low, but Ethereum would gain security: specifically, many more controversial situations would lead to temporary halts in finality. This seems to be a healthier situation than the "wrong side" immediately winning, whether the wrong side is an attacker or a client with errors.
This also answers the question of "what is the point of solo stakers." Today, most stakers stake through pools, and it seems very unlikely that solo stakers could ever reach 51% of staked ETH. However, getting solo stakers up to a quorum-blocking minority seems achievable if we work at it, especially with an 80% quorum (since blocking would then require only 21%). As long as solo stakers do not participate in a 51% attack (whether finality reversal or censorship), such an attack cannot achieve a "clean victory," and solo stakers would be motivated to help organize a minority soft fork.
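The arithmetic behind this argument is simple: with a finality quorum q, any coalition controlling more than 1 - q of the stake can block finality, so raising q from 67% to 80% shrinks the blocking share from just over a third to just over a fifth.

```python
# Quorum arithmetic for the argument above: at quorum q, finality needs
# q of the stake to sign, so a coalition holding more than (1 - q) can
# withhold signatures and block finality.

def blocking_minority(quorum: float) -> float:
    """Smallest stake share able to prevent finality at a given quorum."""
    return 1 - quorum

for q in (2 / 3, 0.80):
    print(f"quorum {q:.0%}: more than {blocking_minority(q):.0%} "
          f"of stake can block finality")
```

At an 80% quorum, a 21% solo-staker share is enough to deny finality to a chain produced by a 51% attacker, forcing the dispute into the open rather than letting the attack finalize cleanly.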
Quantum Attack Resistance
Metaculus currently estimates, albeit with large error bars, that quantum computers will likely start breaking cryptography sometime in the 2030s.
Quantum computing experts, such as Scott Aaronson, have recently begun to take the possibility of quantum computers working in the medium term more seriously. This will impact the entire Ethereum roadmap: it means that every part of the Ethereum protocol currently relying on elliptic curves will need some alternative based on hashes or other quantum resistance. This particularly means we cannot assume that we will be able to rely on the excellent performance of BLS aggregation indefinitely to handle signatures from large validator sets. This demonstrates that conservatism in performance assumptions for proof of stake design is reasonable and is also a reason to more actively develop quantum-resistant alternatives.