Vitalik's EthCC Speech Transcript: How Should Ethereum Be Optimized for the Future?
Speaker: Vitalik Buterin, Founder of Ethereum
Compiled by: 0xxz, Golden Finance
EthCC7 was recently held in Brussels, where the organizers invited Ethereum founder Vitalik to give a keynote speech.
Notably, 2024 marks the 10th anniversary of Ethereum's ICO. After Vitalik's speech, the three core founders of Ethereum, Vitalik Buterin, Joseph Lubin, and Gavin Wood, took a commemorative photo together once again.
This article covers the keynote speech by Ethereum founder Vitalik at EthCC7.
Speech Topic
Strengthening L1: Optimizing Ethereum to become a highly reliable, trustworthy, and permissionless Layer 2 foundation layer.
Ethereum Vision Spectrum
I believe there is a possible spectrum of different roles that the Ethereum base layer could play in the ecosystem over the next five to ten years. You can think of it as a spectrum from left to right.
On the left side of the spectrum, it essentially tries to be a very minimalist base layer, primarily serving as a proof validator for all L2s. It may also provide the ability to transfer ETH between different L2s. But other than that, that's about it.
On the right side of the spectrum, it essentially refocuses on dApps primarily running on L1, while L2 is only used for some very specific and high-performance transactions.
There are some interesting options in the middle of the spectrum. The vision of Ethereum as an L2 foundation layer sits at the second position from the left. The far left represents an extreme version, where we completely abandon the entire execution-client part of Ethereum, keep only the consensus part, and add some zero-knowledge proof validators, essentially turning the entire execution layer into a Rollup.
What I mean is that the very extreme options are on the left, while toward the right the base layer can still primarily serve L2s but also try to provide more functionality for them. One idea in this direction is to further reduce Ethereum's slot time, currently 12 seconds, possibly down to 2-4 seconds. The purpose of this is to make based rollups viable as the main way for L2s to operate. Right now, if you want an L2 to have a top-notch user experience, you need your own pre-confirmations, which means either a centralized sequencer or your own decentralized sequencer. If L1 consensus becomes faster, L2s will no longer need to do this. And if you really want to go further and enhance L1's scalability directly, then the demand for L2s also decreases.
So, this is a spectrum. Currently, I am mainly focused on the second-from-left version, but the things I suggest here also apply to the other visions, and the suggestions here do not actually hinder them. I think this is an important point.
Ethereum's Robustness Advantages
One major advantage of Ethereum is that it has a large and relatively decentralized staking ecosystem.
On the left side of the image is the hash rate distribution of all Bitcoin mining pools, and on the right side is the staking distribution of Ethereum stakers.
The distribution of Bitcoin hash rate is currently not very good, with two mining pools combined accounting for over 50% of the hash rate, and four mining pools combined accounting for over 75%.
In contrast, Ethereum's situation is actually better than what the chart shows, because the second-largest, gray area is actually unidentified stake, meaning it could be a combination of many people, and there may even be many independent stakers in there. The blue part, Lido, is actually a strange, loosely coordinated structure made up of 37 different node operators. So Ethereum actually has a relatively decentralized staking ecosystem that performs quite well.
We can make many improvements in this area, but I think it is still valuable to recognize this. This is one of the unique advantages we can truly build upon.
Ethereum's robustness advantages also include:
Having a multi-client ecosystem: there is the Geth execution client as well as non-Geth execution clients, and the non-Geth clients' combined share now even exceeds Geth's; a similar situation holds among consensus clients;
International community: People from many different countries, including projects, L2s, teams, etc.;
Multi-centered knowledge ecosystem: there is the Ethereum Foundation, the client teams, and even teams like Paradigm's Reth team, which has recently been taking on more open-source leadership;
A culture that values these attributes.
So, the Ethereum ecosystem already has these very strong advantages as a foundation layer. I think this is very valuable and should not be easily given up. I could even say that there are clear steps that can be taken to further advance these advantages and even compensate for our weaknesses.
Where Ethereum L1 Falls Short of High Standards and How to Improve?
This is a poll I conducted about six months ago on Farcaster: If you have not done solo staking, what is preventing you from doing so?
I can repeat this question in this venue: who here is doing solo staking? If you are not solo staking, who thinks the 32 ETH threshold is the biggest obstacle? Who thinks the difficulty of running a node is the biggest obstacle? Who thinks the biggest obstacle is not being able to put your ETH into DeFi protocols at the same time? Who thinks the biggest obstacle is the worry that putting your private key on a running node makes it easier to steal?
It can be seen that the two most commonly agreed-upon obstacles are: the minimum requirement of 32 ETH and the difficulty of operating a node. Recognizing this is always important.
Many times, when we start to dig into how to maximize people's ability to reuse their staked collateral in DeFi protocols, we find that a large number of people do not use DeFi protocols at all. So let's focus on the main issues: what can we do to try to solve these problems?
Starting from running a validating node, or rather, starting from the 32 ETH threshold. In fact, these two issues are related because they are both functions of the number of validators in Ethereum's Proof of Stake.
Today we have about 1 million validator entities, each with a deposit of 32 ETH, so if the minimum requirement is changed to 4 ETH, then we would have 8 million or possibly over 8 million, maybe 9 million or 10 million validators. If we want to reduce it to 100,000 validators, then the minimum requirement might need to rise to around 300 ETH.
So, this is a trade-off, and Ethereum has historically tried to sit in the middle of it. However, if we can find ways to improve, we gain extra room that can be spent either on lowering the minimum requirement or on making it easier to run a node.
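As a rough worked example of that arithmetic (a sketch assuming, as in the talk, roughly 32 million ETH staked in total and that every validator stakes exactly the minimum):

```python
# Sketch of the validator-count vs. minimum-deposit trade-off, assuming roughly
# 32M ETH staked in total and that every validator stakes exactly the minimum.
TOTAL_STAKED_ETH = 32_000_000  # ~1M validators x 32 ETH, as in the talk

def validator_count(min_deposit_eth: float) -> int:
    """Upper bound on the validator count for a given minimum deposit."""
    return int(TOTAL_STAKED_ETH / min_deposit_eth)

def min_deposit_for(target_validators: int) -> float:
    """Minimum deposit needed to keep the validator set at a target size."""
    return TOTAL_STAKED_ETH / target_validators

for deposit in (32, 16, 4, 1):
    print(f"minimum {deposit:>2} ETH -> up to {validator_count(deposit):>10,} validators")
print(f"100,000 validators -> minimum deposit of ~{min_deposit_for(100_000):.0f} ETH")
```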
In fact, I now believe that aggregating signatures is not even the main difficulty of running a node. Initially, we might focus more on reducing the minimum requirement, but ultimately both will be involved.
So, there are two technologies that can improve both aspects.
One technique allows finality without requiring every validator to sign. Essentially, you need some form of random sampling, sampling enough nodes to achieve significant economic security.
Now, I believe we have far more than enough economic security. The cost of conducting a 51% attack, calculated in terms of the amount of ETH to be slashed, is one-third of 32 million ETH, about 11 million ETH. Who would spend 11 million ETH to destroy the Ethereum blockchain? Not even the U.S. government would want to.
These sampling techniques change the picture. It is like having a house where the front door is protected by four layers of steel, but the windows are cheap glass that can easily be broken with a baseball bat. I think Ethereum is somewhat like this: to conduct a 51% attack you must lose 11 million ETH, but in reality there are many other ways to attack the protocol, and it is those defenses we should really be strengthening. So, conversely, if finality relies on a sampled subset of validators, the protocol is still secure enough, and you can genuinely increase the level of decentralization.
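To make those two numbers concrete, here is a hedged sketch: the slashing cost of reverting finality, and, for the sampling idea, the chance that an attacker with a given share of stake captures a two-thirds supermajority of a randomly sampled committee (a plain binomial model with illustrative committee sizes, not the actual protocol parameters):

```python
# Hedged sketch of the two quantities above: (1) the slashing cost of reverting
# finality (one third of all staked ETH), and (2) the probability that an attacker
# holding a given fraction of stake controls >= 2/3 of a randomly sampled committee.
# The committee model is a plain binomial approximation with illustrative sizes.
TOTAL_STAKED_ETH = 32_000_000

def finality_attack_cost(total_staked: float) -> float:
    # Reverting finality requires at least 1/3 of the stake to be slashable.
    return total_staked / 3

def p_committee_captured(attacker_share: float, committee_size: int) -> float:
    """P[attacker controls >= 2/3 of a uniformly sampled committee]."""
    threshold = (2 * committee_size) // 3 + 1
    p, q = attacker_share, 1.0 - attacker_share
    prob_k = q ** committee_size          # probability of exactly 0 attacker seats
    total = 0.0
    for k in range(1, committee_size + 1):
        prob_k *= (committee_size - k + 1) / k * (p / q)   # binomial recurrence
        if k >= threshold:
            total += prob_k
    return total

print(f"slashing cost of reverting finality: ~{finality_attack_cost(TOTAL_STAKED_ETH):,.0f} ETH")
for size in (128, 256, 512):
    print(f"committee of {size}: capture probability {p_committee_captured(0.33, size):.1e} "
          f"for an attacker with 33% of stake")
```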
The second technique is better signature aggregation. You can do advanced things with STARKs so that, instead of supporting 30,000 signatures per slot, we could ultimately support many more. That's the first part.
The second part is making running nodes easier.
The first step is history expiry, and there has already been a lot of progress in this area with EIP-4444.
The second step is stateless clients. Verkle has been around for a long time; another possible option is a binary hash tree using a STARK-friendly hash function such as Poseidon. Once you have this, verifying an Ethereum block no longer requires a hard drive. After that, you can also add a Type 1 ZKVM that can STARK-verify an entire Ethereum block, so you can verify arbitrarily large blocks by downloading the data, or even just data-availability-sampling it, and then verifying a single proof.
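As a toy sketch of the stateless idea (assuming a simplified key-value state, SHA-256 as a stand-in for a STARK-friendly hash like Poseidon, and a naive binary Merkle tree rather than Ethereum's actual structures), a verifier can check accessed state against a root using only a witness, with no local state database:

```python
# Toy sketch of stateless verification: state is a flat key-value map committed to
# with a naive binary Merkle tree; SHA-256 stands in for a STARK-friendly hash.
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])                       # pad odd levels
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_branch(leaves: list[bytes], index: int) -> list[bytes]:
    """Sibling hashes from a leaf up to the root (the 'witness' for that leaf)."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        branch.append(level[index ^ 1])
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return branch

def verify_branch(root: bytes, leaf: bytes, index: int, branch: list[bytes]) -> bool:
    node = leaf
    for sibling in branch:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root

# A stateless verifier holds only the pre-state root; the block comes with a witness
# (the accessed leaves plus their branches), so no local state database is needed.
state = {f"acct{i}": f"balance={i * 10}" for i in range(8)}
keys = sorted(state)
leaves = [h(k.encode(), state[k].encode()) for k in keys]
root = merkle_root(leaves)

idx = keys.index("acct3")
witness = merkle_branch(leaves, idx)
assert verify_branch(root, leaves[idx], idx, witness)
print("accessed state verified against the state root without a state database")
```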
If this is done, running nodes will become easier. One very annoying thing today is that if you want to change your hardware or software setup, you usually either have to start from scratch and lose a day, or do something very risky like putting your keys in two places, which could get you slashed. If we have stateless clients, you no longer need to do this.
You can simply start a new independent client, shut down the old one, move the keys over, and start the new one. You would only lose one epoch.
Once we have the ZKVM, the hardware requirements will essentially drop to almost zero.
So, the 32 ETH threshold and the difficulty of running nodes are both problems that can be solved technically. I believe doing so has many other benefits: it genuinely improves people's ability to solo stake, gives us a better solo-staking ecosystem, and avoids the risks of staking centralization.
Proof of Stake also has other challenges, such as risks related to liquid staking and MEV-related risks. These are also important issues that need to be continuously considered. Our researchers are looking into these.
Recovering from a 51% Attack
I have really started to think seriously and rigorously about this. Surprisingly, many people do not think about this topic at all and just treat it as a black box.
What would happen if a 51% attack really occurred?
Ethereum could face a 51% attack, Bitcoin could face a 51% attack, and a government could also face a 51% attack, such as buying off 51% of politicians.
One issue is that you do not want to rely solely on prevention; you also want a recovery plan.
A common misconception is that a 51% attack is only about reversing finality. People focus on this because it is what Satoshi emphasized in the white paper: you can double spend. After I buy a private jet, I conduct a 51% attack, get my Bitcoin back, and still keep the jet and fly around.
In reality, a more realistic attack might involve making deposits on exchanges and then reverting them, or breaking DeFi protocols.
However, reversal is not actually the worst thing. The biggest risk we should be concerned about is censorship: 51% of the nodes stop accepting blocks from the other 49%, or from any node that tries to include a certain type of transaction.
Why is this the biggest risk? Because a finality reversal triggers slashing: there is immediate, on-chain, verifiable evidence that at least one-third of the nodes did something very, very wrong, and they get punished.
A censorship attack, however, is not programmatically attributable; there is no immediate programmatic evidence of who did something bad. If you are an online node, you can see that a certain transaction has not been included for 100 blocks, but we haven't even written software to perform such checks.
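As a hedged illustration of what such a check might look like (using web3.py against a node's standard JSON-RPC; the endpoint and transaction hash below are placeholders, and this is a crude heuristic rather than a real censorship monitor):

```python
# Hedged sketch of a censorship check: has a transaction we saw in the mempool been
# included within the last N blocks? Uses web3.py; the RPC endpoint and transaction
# hash are placeholders, and this is a crude heuristic, not a real monitoring tool.
from web3 import Web3
from web3.exceptions import TransactionNotFound

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # assumes a local node

def included_within(tx_hash: str, first_seen_block: int, window: int = 100) -> bool:
    """True if the transaction was mined within `window` blocks of when we first saw it."""
    try:
        receipt = w3.eth.get_transaction_receipt(tx_hash)
    except TransactionNotFound:
        return False
    return receipt.blockNumber <= first_seen_block + window

head = w3.eth.block_number
tx_hash = "0x" + "00" * 32          # placeholder transaction hash
if not included_within(tx_hash, first_seen_block=head - 100):
    print(f"{tx_hash} not included within 100 blocks -- possible censorship signal")
```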
Another challenge of censorship is that an attacker can escalate gradually: delay the transactions and blocks they don't like by 30 seconds, then by a minute, then by two minutes, so you don't even have consensus on when to respond.
So, I say that in fact, censorship is a greater risk.
There is an argument in blockchain culture that if an attack occurs, the community will unite, and they will obviously perform a minority soft fork and cut off the attacker.
This may be true today, but it relies on many assumptions about coordination, ideology, and various other things, and it is unclear how reliable that will be in ten years. So what many other blockchain communities do is say: attacks like censorship are fundamentally not attributable by the protocol, so we have to rely on social consensus anyway; let's just rely on social consensus and proudly acknowledge that we will use it to solve our problems.
In fact, I advocate moving in the opposite direction. We know that a fully coordinated automatic response, automatically forking away from a censoring majority attacker, is mathematically impossible. But we can get as close to that as possible.
You can create a fork that, under some assumptions about network conditions, actually brings along at least a majority of the honest online nodes. The argument I want to convey here is that we want the response to a 51% attack to be as automated as possible.
If you are a validator, your node should run software that detects when transactions or certain validators are being censored and, when that happens, automatically counter-censors the majority chain, so that all honest nodes, because of the code they are running, automatically coordinate on the same minority soft fork.
Of course, there are again mathematical impossibility results; at least anyone who is offline at the time will not be able to distinguish who is right and who is wrong.
There are many limitations, but the closer we get to this goal, the less work social consensus needs to do.
Imagine what would actually happen in a 51% attack. It won't be that, suddenly at some point, Lido, Coinbase, and Kraken publish a blog post at 5:46 basically saying, "Hey guys, we are now conducting censorship."
What will actually happen is a social media war and various other attacks unfolding at the same time. If a 51% attack does occur, and by the way, we should not assume that Lido, Coinbase, and Kraken will still be the ones in power in ten years, the Ethereum ecosystem will become increasingly mainstream, and it needs to be highly adaptable to that. We want the burden at the social layer to be as light as possible, which means the technical layer needs to at least put forward an obvious winning candidate: if people want to fork away from a chain that is censoring, they should rally around a minority soft fork.
I advocate that we conduct more research and propose a very specific suggestion.
Proposal: Raise the Quorum Threshold to 75% or 80%
I believe that the quorum threshold can be raised from the current two-thirds to around 75% or 80%.
The basic argument is that if a malicious chain, such as a censoring chain, gets finalized, recovering from it becomes very, very difficult. On the other hand, if you raise the quorum ratio, what is the risk? With a quorum of 80%, it is no longer 34% of nodes going offline that can stop finality, but 21%.
This has risks, so let's see how it plays out in practice. As far as I know, we have only once had finality stop for about an hour because more than one-third of nodes went offline. Has there ever been an incident with 20% to 33% of nodes offline? At most once, possibly never. Because in practice very few validators go offline, I actually think the risk of doing this is quite low. The benefit is essentially that the threshold an attacker needs to reach rises significantly, and in the case of client bugs, there are far more scenarios in which the chain enters safe mode, giving people time to genuinely cooperate and figure out the problem.
If the quorum threshold is raised from 67% to 80%, then the share a single client must control before it can finalize on its own also rises from 67% to 80%, and the value that a minority client provides really starts to increase.
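The arithmetic behind these thresholds is simple; a small sketch (approximate figures, ignoring inactivity-leak dynamics):

```python
# Arithmetic behind the quorum proposal (approximate; ignores inactivity leaks):
# with quorum fraction q, finality halts once more than (1 - q) of validators are
# offline, while reverting finality requires two conflicting q-quorums, which must
# overlap in at least (2q - 1) of the stake -- and that overlap gets slashed.
def liveness_halt_share(quorum: float) -> float:
    return 1.0 - quorum

def finality_revert_share(quorum: float) -> float:
    return 2.0 * quorum - 1.0

for q in (2 / 3, 0.75, 0.80):
    print(f"quorum {q:.0%}: finality halts above {liveness_halt_share(q):.0%} offline, "
          f"reverting finality costs ~{finality_revert_share(q):.0%} of stake; "
          f"a censoring chain needs {q:.0%} to finalize")
```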
Other Censorship Concerns
The other censorship-related items are inclusion lists, or some alternative to inclusion lists. The whole multiple-parallel-proposers idea, if it works, could even become an alternative to inclusion lists. You also need account abstraction, or some form of account abstraction, within the protocol.
The reason you need it is that today smart contract wallets do not really benefit from inclusion lists; they do not really benefit from any form of protocol-level censorship-resistance guarantee.
If there is account abstraction within the protocol, then they will benefit. So, there are many things, and actually many of these things are valuable in both the L2-centered vision and the L1-centered vision.
I believe that among the different ideas I discussed, about half are specifically targeted at an Ethereum that focuses on L2s, but the other half basically applies to both: to the vision where Ethereum is a foundation layer for users who live on L2s, and to the vision where L1 directly serves user-facing applications.
Ubiquitous Use of Light Clients
In many ways, how we actually interact with this space is somewhat disappointing for something that is supposed to be decentralized and trustless. Who in this room is running a light client validating consensus on their computer? Very few. Who uses Ethereum through a browser wallet that trusts Infura? In five years, I hope to see those numbers of raised hands reversed. I want to see wallets that do not trust Infura. We need to integrate light clients.
Infura can continue to provide data. I mean, if you do not need to trust Infura, that is actually beneficial for Infura because it makes it easier for them to build and deploy infrastructure, but we have tools to remove trust requirements.
What we can do is have end users run something like the Helios light client. It should run directly in the browser, directly validating Ethereum consensus. If they then want to verify something on-chain, to actually interact with the chain, they just verify a Merkle proof directly.
If this is done, you actually gain a certain level of trustlessness in your interaction with Ethereum. This is for L1. Additionally, we also need an equivalent solution for L2.
On L1 there are block headers, state, sync committees, and consensus. If you validate the consensus and you know what the block header is, you can walk a Merkle branch to see what the state is. So how do we provide light-client security guarantees for L2s? The L2's state root is there: if it is a based rollup, there is a smart contract that stores the L2 block headers; or, if you have pre-confirmations, there is a smart contract that stores who the pre-confirmers are, so you know who they are and can then listen for signatures from a two-thirds subset of them.
So, once you have the Ethereum block header, there is a fairly simple trust chain of hashes, Merkle branches, and signatures that you can verify, and you can achieve light client verification. The same goes for any L2.
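A conceptual sketch of that trust chain (with SHA-256 and naive binary Merkle trees standing in for Ethereum's real hashes, tries, and encodings; the structures and names are illustrative only):

```python
# Conceptual sketch of the L2 light-client trust chain, with SHA-256 and naive
# binary Merkle trees standing in for Ethereum's real hashes, tries, and encodings.
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def verify_branch(root: bytes, leaf: bytes, index: int, branch: list[bytes]) -> bool:
    node = leaf
    for sibling in branch:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root

def check_l2_account(l1_state_root, rollup_slot_proof, l2_account_proof) -> bool:
    """Chain two proofs: L1 state root -> L2 state root stored in the rollup
    contract -> an account leaf under that L2 state root."""
    l2_root, idx1, branch1 = rollup_slot_proof
    account_leaf, idx2, branch2 = l2_account_proof
    return (verify_branch(l1_state_root, h(l2_root), idx1, branch1)
            and verify_branch(l2_root, account_leaf, idx2, branch2))

def enough_preconfirmations(signers: set, preconfirmers: list) -> bool:
    """For pre-confirmations: accept once >= 2/3 of the known pre-confirmer set has
    signed (actual signature verification omitted in this sketch)."""
    return 3 * len(signers & set(preconfirmers)) >= 2 * len(preconfirmers)

# Tiny hand-built demo: two-leaf L2 state tree, two-leaf L1 state tree.
alice_leaf = h(b"alice", b"balance=5")
l2_root = h(alice_leaf, h(b"bob", b"balance=7"))          # L2 state root
l1_state_root = h(h(l2_root), h(b"some-other-contract"))  # L1 state root

assert check_l2_account(
    l1_state_root,
    (l2_root, 0, [h(b"some-other-contract")]),
    (alice_leaf, 0, [h(b"bob", b"balance=7")]),
)
assert enough_preconfirmations({"p1", "p2"}, ["p1", "p2", "p3"])
print("L2 account verified from the L1 state root; pre-confirmation threshold met")
```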
I have brought this up to people in the past, and many times the reaction is, "Wow, that's interesting, but what's the point? Many L2s are just multisigs, so why not just trust the multisig?"
Fortunately, as of last year, this is actually no longer the case. Optimism and Arbitrum have reached Stage 1 as rollups, which means they actually have proof systems running on-chain, with a security council that can override them in case of bugs, but the security council has to reach a very high voting threshold, like 75% of 8 members, and Arbitrum will increase that to 15 members. So, in the case of Optimism and Arbitrum, they are not just multisigs: they have actual proof systems, and those proof systems hold the majority of the power, at least when it comes to deciding which chain is correct or incorrect.
EVM goes even further; I believe it doesn't even have a security council, so it is completely trustless. We are really starting to make progress in this area, and I know many other L2s are also advancing. So L2s are not just multisigs, and the concept of light clients for L2s is actually starting to make sense.
Today, we can already verify Merkle branches; we just need to write the code. Tomorrow, we can also verify ZKVM, so you can fully verify Ethereum and L2 in a browser wallet.
Who wants to be a trustless Ethereum user in a browser wallet? Awesome. Who would rather be a trustless Ethereum user on their phone? On a Raspberry Pi? On a smartwatch? From a space station? We will solve that too. So what we need is the equivalent of an RPC configuration that contains not only which servers you talk to but also actual light-client verification instructions. This is a goal we can strive to achieve.
Quantum Resistance Strategy
The timeline for the arrival of quantum computing is shrinking. Metaculus expects quantum computers to arrive in the early 2030s, and some believe it will be even sooner.
So we need a quantum resistance strategy. We do have a quantum resistance strategy. There are four parts of Ethereum that are vulnerable to quantum computing, each with natural alternatives.
The quantum-resistant alternative to Verkle trees is a STARKed Poseidon hash or, if we want to be more conservative, Blake. For consensus signatures, we currently use BLS aggregate signatures, which can be replaced with STARK aggregate signatures. Blobs use KZG, which can be replaced with separate encoding plus Merkle trees and STARK proofs. User accounts currently use ECDSA over secp256k1, which can be replaced with hash-based signatures together with account abstraction and aggregation, ERC-4337 smart contract wallets, and so on.
Once we have these, users can set their own signature algorithms, essentially using hash-based signatures. I believe we really need to start considering actually building hash-based signatures so that user wallets can easily upgrade to hash-based signatures.
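To illustrate what "hash-based signatures" means, here is a minimal Lamport one-time signature in Python; real deployments would use more compact schemes (Winternitz chains, Merkle trees of one-time keys, SPHINCS+-style designs) plus aggregation, but the underlying security assumption, just a hash function, is the same:

```python
# Minimal Lamport one-time signature, as an illustration of "hash-based signatures".
# Production schemes would be far more compact; the security assumption is the same:
# only a hash function, which is believed to resist quantum attacks.
import hashlib
import os

def sha(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def keygen():
    # 256 pairs of random secrets; the public key is their hashes.
    sk = [(os.urandom(32), os.urandom(32)) for _ in range(256)]
    pk = [(sha(a), sha(b)) for a, b in sk]
    return sk, pk

def sign(sk, message: bytes):
    digest = sha(message)
    bits = [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]
    # Reveal one secret of each pair, chosen by the corresponding message bit.
    return [sk[i][bit] for i, bit in enumerate(bits)]

def verify(pk, message: bytes, sig) -> bool:
    digest = sha(message)
    bits = [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]
    return all(sha(sig[i]) == pk[i][bit] for i, bit in enumerate(bits))

sk, pk = keygen()
msg = b"quantum-resistant account, one-time use only"
sig = sign(sk, msg)
assert verify(pk, msg, sig)
print("Lamport signature verified; the key must never be reused")
```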
Protocol Simplification
If you want a robust foundation layer, the protocol needs to be simple. It should not have 73 random hooks and weird backward-compatibility rules that exist only because some random guy named Vitalik proposed some random stupid idea in 2014.
So it is valuable to genuinely try to simplify and start eliminating technical debt. Logs are currently based on Bloom filters, which do not work well and are not fast enough, so logs need to be improved. On the statelessness side we are already doing this kind of thing, essentially limiting the amount of state access per block.
Ethereum currently has an incredible collection of formats, with RLP, SSZ, and the APIs; ideally, we should use only SSZ, but at the very least get rid of RLP and move the state to binary Merkle trees. Once we have binary Merkle trees, all of Ethereum sits on binary Merkle trees.
Fast finality, single-slot finality (SSF), and cleaning up unused precompiles such as the MODEXP precompile, which often leads to consensus bugs. If we could remove it and replace it with high-performance Solidity code, that would be great.
Conclusion
Ethereum, as a robust foundation layer, has very unique advantages, including some that Bitcoin does not have, such as consensus decentralization and significant research on recovery from 51% attacks.
I believe it is necessary to truly strengthen these advantages while recognizing and correcting our shortcomings to ensure we meet very high standards. These ideas are fully compatible with a positive L1 roadmap.
One of the things I am most satisfied with regarding Ethereum, especially the core development process, is that our ability to work in parallel has greatly improved. This is a strength; we can actually work on many things in parallel. So caring about these topics does not actually affect the ability to improve the L1 and L2 ecosystems. For example, improving the L1 EVM to make cryptography easier. Currently, verifying Poseidon hash in the EVM is too expensive. 384-bit cryptography is also too expensive.
So there are some ideas on top of EOF, like SIMD opcodes, EVM-MAX, and so on. There is an opportunity to attach this kind of high-performance coprocessor to the EVM. This is better for Layer 2, because they can verify proofs more cheaply, and also better for Layer 1 applications, because privacy protocols using zk-SNARKs become cheaper.
Who has used privacy protocols? Who would prefer to use privacy protocols that cost 40 instead of 80? More people. The second group can use it on Layer 2, while Layer 1 can achieve significant cost savings.
The "Three Giants" of Ethereum Reunite
2024 marks the 10th anniversary of Ethereum's ICO, and the EthCC in 2024 invited all three core founders of Ethereum, Vitalik Buterin, Joseph Lubin, and Gavin Wood, to attend.
After Vitalik's speech, they were invited to take a commemorative photo together:
The three giants shake hands again.