Vitalik Buterin: Ethereum needs to complete three transformations: L2, wallets, privacy
Original Title: "The Three Transitions"
Author: Vitalik Buterin
Translation: MK, MarsBit
As Ethereum transitions from a young experimental technology to a mature tech stack that can truly provide an open, global, and permissionless experience for ordinary users, this stack needs to undergo three major technical transitions, roughly occurring simultaneously:
- L2 scaling transition - Everyone migrates to rollups
- Wallet security transition - Everyone migrates to smart contract wallets
- Privacy transition - Ensuring that privacy-preserving fund transfers are provided, and ensuring that all other tools being developed (social recovery, identity, reputation) can protect privacy
The ecosystem transition triangle: you can only pick three out of three.
Without the first, Ethereum will fail, because every transaction costs $3.75 ($82.48 if we have another bull market), and every product aimed at the mass market will inevitably abandon the chain and adopt centralized workarounds for everything.
Without the second, Ethereum will fail, as users will be unwilling to store their funds (and non-financial assets), and everyone will migrate to centralized exchanges.
Without the third, Ethereum will fail, as all transactions (and POAPs, etc.) will be publicly visible to anyone, which is an excessive sacrifice of privacy for many users, and everyone will migrate to centralized solutions that at least have some hidden data.
For the above reasons, these three transitions are crucial. However, addressing these issues requires strong coordination, making them challenging. Not only does it require improvements to the protocol's functionality, but in some cases, the way we interact with Ethereum needs quite fundamental changes, requiring deep changes in applications and wallets.
These three transitions will fundamentally change the relationship between users and addresses
In the world of L2 scaling, users will exist across many L2s. Are you a member of ExampleDAO, which is on Optimism? Then you have an account on Optimism! Do you hold a CDP in the stablecoin system on ZkSync? Then you have an account on ZkSync! Have you ever tried some applications that happen to be on Kakarot? Then you have an account on Kakarot! The days of users having only one address are gone forever.
I have ETH in four places, according to my Brave Wallet view. Yes, Arbitrum and Arbitrum Nova are different. Don't worry, this will get more complicated over time!
Smart contract wallets add more complexity, making it harder to have the same address across L1 and the various L2s. Today, most users use externally owned accounts, whose address is literally a hash of the public key used to verify signatures, so nothing changes between L1 and L2. With smart contract wallets, however, keeping a single address becomes more difficult. Although a lot of work has gone into making addresses hash-equivalent across networks, most notably CREATE2 and the ERC-2470 singleton factory, achieving this perfectly is very challenging. Some L2s (e.g., "type 4 ZK-EVMs") are not quite EVM-equivalent, often compiling from Solidity or an intermediate assembly instead, which prevents hash equivalence. And even if you can achieve hash equivalence, the possibility of wallets changing ownership through key changes creates other unintuitive consequences.
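To make the hash-equivalence point concrete, here is a minimal sketch of how a CREATE2-style address is derived. The formula (hash of `0xff`, deployer address, salt, and the hash of the init code, keeping the last 20 bytes) follows the CREATE2 rule; everything else is an assumption made for illustration. In particular, Ethereum uses Keccak-256, which is not in Python's standard library, so `hashlib.sha3_256` is used purely as a stand-in and the resulting bytes will not match real on-chain addresses. The factory address and bytecode strings are hypothetical placeholders.

```python
import hashlib


def stand_in_hash(data: bytes) -> bytes:
    # ASSUMPTION: stand-in only. Ethereum uses Keccak-256, which pads
    # differently from the standardized SHA3-256 used here.
    return hashlib.sha3_256(data).digest()


def create2_address(deployer: bytes, salt: bytes, init_code: bytes) -> bytes:
    """CREATE2-style derivation: H(0xff ++ deployer ++ salt ++ H(init_code))[12:]."""
    if len(deployer) != 20 or len(salt) != 32:
        raise ValueError("deployer must be 20 bytes and salt 32 bytes")
    preimage = b"\xff" + deployer + salt + stand_in_hash(init_code)
    return stand_in_hash(preimage)[12:]


# Hypothetical inputs: the same factory address and salt on every chain.
FACTORY = bytes.fromhex("11" * 20)
SALT = (1).to_bytes(32, "big")

evm_bytecode = b"wallet init code compiled to EVM bytecode"
other_bytecode = b"same wallet compiled to a different instruction set"

addr_l1 = create2_address(FACTORY, SALT, evm_bytecode)
addr_l2_equivalent = create2_address(FACTORY, SALT, evm_bytecode)
addr_l2_type4 = create2_address(FACTORY, SALT, other_bytecode)

print(addr_l1 == addr_l2_equivalent)  # same inputs give the same address
print(addr_l1 == addr_l2_type4)       # different init code gives a different address
```

The address depends only on the deployer, the salt, and the init code, never on the chain. That is why identical factories and bytecode give identical addresses, and why a chain that compiles the wallet to different bytecode breaks the equivalence.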
Privacy requires each user to have even more addresses, and may even change the kinds of addresses we deal with. If stealth address proposals gain widespread adoption, a user may no longer have just a few addresses, or one address per L2, but potentially one address per transaction. Other privacy schemes, including existing ones like Tornado Cash, change how assets are stored in a different way: many users' funds sit in the same smart contract (and thus at the same address). To send funds to a specific user, senders will need to rely on the privacy scheme's own internal addressing system.
As we can see, each of the three transitions weakens the "one user ~= one address" mental model in a different way, and some of these effects feed back into the complexity of executing the transitions. Two particularly complex points are:
- If you want to pay someone, how do you obtain the information needed to pay them?
- If users store many assets in different places across different chains, how do they perform key changes and social recovery?
The three transitions relate to on-chain payments (and identity)
I have coins on Scroll, and I want to pay for coffee (if "I" literally refers to me as the author of this article, then "coffee" is of course a metonym for "green tea"). You are selling me coffee, but you are only prepared to accept coins on Taiko. What should I do?
There are basically two solutions:
- The receiving wallet (which could belong to a merchant or just an ordinary individual) tries hard to support every L2 and has some automated functionality for consolidating funds asynchronously.
- The receiver provides their L2 alongside their address, and the sender's wallet automatically routes the funds to the target L2 through some cross-L2 bridging system.
Of course, these solutions can be combined: the receiver provides a list of L2s they are willing to accept, and the sender's wallet calculates the payment, which may involve direct sending (if they are lucky) or through a cross-L2 bridging path.
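The combined approach can be sketched as a small routing decision in the sender's wallet. This is only an illustration under simplifying assumptions: balances are plain integers, bridge fees and liquidity are ignored, and the `Route` type and `plan_payment` function are hypothetical names, not part of any wallet API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Route:
    kind: str          # "direct" or "bridge"
    source_chain: str  # where the sender's funds come from
    target_chain: str  # where the receiver gets paid


def plan_payment(balances: Dict[str, int], accepted: List[str], amount: int) -> Optional[Route]:
    """Pick a direct payment if possible, otherwise a cross-L2 bridge path."""
    # Lucky case: the sender already holds enough on a chain the receiver accepts.
    for chain in accepted:
        if balances.get(chain, 0) >= amount:
            return Route("direct", chain, chain)
    # Otherwise bridge from any chain with enough funds to the receiver's first choice.
    if accepted:
        for chain, balance in balances.items():
            if balance >= amount:
                return Route("bridge", chain, accepted[0])
    return None  # no single chain holds enough funds


# The coffee example: coins on Scroll, merchant accepts only Taiko.
print(plan_payment({"Scroll": 10}, ["Taiko"], 5))
```

A real wallet would also compare bridge costs and possibly combine balances from several chains; the point here is only that the receiver's accepted-L2 list becomes an input to the payment.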
But this is just one example of the key challenges introduced by the three transitions: a simple act like paying someone starts to require more information than just a single 20-byte address.
Fortunately, the transition to smart contract wallets does not impose a significant burden on the address system, but some technical issues in other parts of the application stack still need to be addressed. Wallets need to be updated so that they do not send only 21000 gas along with a transaction and, even more importantly, so that the payment-receiving side of a wallet tracks not only ETH transfers from EOAs but also ETH sent by smart contract code. Applications that rely on the assumption that address ownership is immutable (e.g., NFTs that ban smart contracts in order to enforce royalties) will have to find other ways to achieve their goals. Smart contract wallets will also make some things easier: in particular, if someone receives only a non-ETH ERC20 token, they will be able to use ERC-4337 paymasters to pay gas fees with that token.
On the other hand, privacy again raises major challenges that we have not truly solved yet. The original Tornado Cash did not introduce these issues because it did not support internal transfers: users could only deposit into the system and withdraw from it. Once you can perform internal transfers, users will need to use the internal addressing scheme of the privacy system. In practice, a user's "payment information" will need to include (i) some form of "spending public key," a commitment to a secret that the receiver can use to spend, and (ii) a way for the sender to send encrypted information that only the receiver can decrypt, to help the receiver discover the payment.
Stealth address protocols rely on the concept of a meta-address, which works as follows: one part of the meta-address is a blinded version of the recipient's spending key, and another part is the recipient's encryption key (although a minimal implementation can set these two keys to be the same).
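The meta-address idea can be illustrated with a toy version of the minimal variant, where the spending key and the encryption key are the same. This is a sketch under loud assumptions: it works in the multiplicative group modulo a Mersenne prime rather than on the elliptic curve Ethereum actually uses, the parameters are chosen for readability rather than security, and all function names are hypothetical.

```python
import hashlib
import secrets

# ASSUMPTION: toy group for illustration only (not secp256k1, not secure parameters).
P = 2**127 - 1  # a Mersenne prime
G = 3


def keygen():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)


def _tweak(shared_secret: int) -> int:
    # Hash the Diffie-Hellman shared secret into an exponent.
    digest = hashlib.sha256(shared_secret.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % (P - 1)


def sender_derive(meta_pub: int):
    """Sender side: derive a fresh one-time public key from the recipient's meta-address."""
    eph_priv, eph_pub = keygen()
    s = _tweak(pow(meta_pub, eph_priv, P))
    stealth_pub = (meta_pub * pow(G, s, P)) % P
    return eph_pub, stealth_pub  # eph_pub is published so the recipient can find the payment


def recipient_recover(priv: int, eph_pub: int) -> int:
    """Recipient side: recompute the same tweak and obtain the matching private key."""
    s = _tweak(pow(eph_pub, priv, P))
    return (priv + s) % (P - 1)


recipient_priv, meta_address = keygen()
eph_pub, stealth_pub = sender_derive(meta_address)
stealth_priv = recipient_recover(recipient_priv, eph_pub)
print(pow(G, stealth_priv, P) == stealth_pub)  # only the recipient can derive this key
```

Both sides reach the same tweak because they compute the same shared secret, so the sender can produce a fresh public key per payment while only the recipient can derive the corresponding private key.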
The key lesson here is that in a privacy-focused ecosystem, users will have both spending public keys and encryption public keys, and a user's "payment information" will need to include both. Beyond payments, there are other good reasons to expand in this direction. For example, if we want Ethereum-based encrypted email, users will need to publicly provide some form of encryption key. In the "EOA world," we could reuse account keys for this, but in a secure smart contract wallet world, we should probably have more explicit functionality for it. This would also help make Ethereum-based identities more compatible with non-Ethereum decentralized privacy ecosystems, the most prominent example being PGP keys.
The three transitions and key recovery
In a world where a user may have multiple addresses, the default way to implement key changes and social recovery is to have users perform recovery procedures individually for each address. This can be done with one click: wallets can include software to execute recovery procedures simultaneously across all user addresses. However, even with such a user experience simplification, naive multi-address recovery presents three issues:
- Unrealistic gas fees: This is self-evident.
- Counterfactual addresses: Addresses whose smart contract has not yet been deployed (in practice, this means accounts you have not yet sent funds from). As a user, you could have a potentially unbounded number of counterfactual addresses: one or more on each L2, including L2s that do not exist yet, plus a whole other unbounded set of counterfactual addresses arising from stealth address schemes.
- Privacy: If users deliberately have many addresses to avoid linking them together, they certainly do not want to publicly link all their addresses by recovering them at the same time or nearly the same time!
Solving these issues is difficult. Fortunately, there is a fairly elegant solution that performs quite well: an architecture that separates verification logic from asset holding.
Each user has a key vault contract that exists in one location (possibly on mainnet or on a specific L2). The user then has addresses on different L2s, where the verification logic of each address is a pointer to the key vault contract. Spending from these addresses requires a proof into the key vault contract that shows the current (or, more realistically, a very recent) spending public key.
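The separation of verification logic from asset holding can be sketched with two small classes. This is a simplified model, not a contract design: reading the vault object directly stands in for verifying a cross-chain proof of the vault's state, public keys are plain strings, and the class names are hypothetical.

```python
class KeyVault:
    """Stand-in for the single contract that holds the current spending public key."""

    def __init__(self, spending_pubkey: str) -> None:
        self.spending_pubkey = spending_pubkey

    def rotate(self, new_pubkey: str) -> None:
        # A key change or social recovery happens here, once.
        self.spending_pubkey = new_pubkey


class PointerAccount:
    """An address on some L2 whose verification logic is a pointer to the vault."""

    def __init__(self, chain: str, vault: KeyVault, balance: int) -> None:
        self.chain = chain
        self.vault = vault
        self.balance = balance

    def spend(self, amount: int, signer_pubkey: str) -> bool:
        # ASSUMPTION: a direct read replaces the cross-chain state proof.
        if signer_pubkey != self.vault.spending_pubkey:
            return False
        if amount > self.balance:
            return False
        self.balance -= amount
        return True


vault = KeyVault("pubkey-old")
accounts = [PointerAccount("Optimism", vault, 10), PointerAccount("ZkSync", vault, 10)]

vault.rotate("pubkey-new")  # one recovery action in one place
print([acct.spend(1, "pubkey-old") for acct in accounts])  # old key rejected everywhere
print([acct.spend(1, "pubkey-new") for acct in accounts])  # new key accepted everywhere
```

Because every account defers to the vault, a single rotation takes effect on all of them at once, which is what removes the need to run recovery separately per address.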
Proofs can be achieved in several ways:
- Direct read-only L1 access inside the L2. L2s can be modified to give them a way to read L1 state directly. If the key vault contract is on L1, contracts inside the L2 can then access the key vault "for free."
- Merkle branches. Merkle branches can prove L1 state to an L2, or L2 state to L1, or the two can be combined to prove part of one L2's state to another L2. The main weakness of Merkle proofs is the high gas cost due to proof length: a proof may require 5 kB, although this will drop to less than 1 kB in the future thanks to Verkle trees.
- ZK-SNARKs. You can reduce data costs by using a ZK-SNARK of a Merkle branch instead of the branch itself. Off-chain aggregation techniques can be built (e.g., on top of EIP-4337) so that a single ZK-SNARK verifies all cross-chain state proofs in a block.
- KZG commitments. L2s, or schemes built on top of them, can introduce a sequential addressing system that allows state proofs within this system to be only 48 bytes long. As with ZK-SNARKs, a multi-proof scheme can combine all these proofs into a single proof per block.
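For the Merkle branch option, the following sketch shows what such a proof is and how its size grows with tree depth. It is a deliberately simplified model: a binary SHA-256 tree with a power-of-two number of leaves. Ethereum's actual state tree is a hexary Merkle Patricia trie using Keccak-256, so real proofs are considerably larger than the ones produced here.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Return all tree levels, from hashed leaves up to the single root."""
    n = len(leaves)
    if n == 0 or n & (n - 1):
        raise ValueError("leaf count must be a power of two in this sketch")
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def merkle_branch(levels, index):
    """Collect the sibling hash at each level: this list is the proof."""
    branch = []
    for level in levels[:-1]:
        branch.append(level[index ^ 1])
        index //= 2
    return branch


def verify_branch(root, leaf, index, branch):
    """Recompute the root from one leaf and its branch."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


leaves = [f"account-{i}".encode() for i in range(8)]
levels = merkle_levels(leaves)
root = levels[-1][0]
branch = merkle_branch(levels, 5)

print(verify_branch(root, leaves[5], 5, branch))
print(len(branch) * 32, "bytes of proof for", len(leaves), "leaves")
```

The verifier needs only the root, which is how one chain can check a claim about another chain's state without holding that state. The proof length is proportional to the tree depth, which is the cost the ZK-SNARK and KZG options aim to reduce.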
If we want to avoid making a proof for every transaction, we can implement a more lightweight scheme that requires a cross-L2 proof only for recovery. Spending from an account depends on a spending key whose corresponding public key is stored in that account, but recovery requires a transaction that copies over the current spending public key from the key vault. Funds in counterfactual addresses are safe even if your old keys are not: "activating" a counterfactual address, turning it into a working contract, requires a cross-L2 proof that copies over the current spending public key. A thread on the Safe forum describes how a similar architecture might work.
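The lightweight variant can be sketched in the same spirit. Here each account stores its own copy of the spending public key and consults the key vault only on activation or recovery. As before, this is a toy model with hypothetical names, and copying from the vault object stands in for a cross-L2 proof.

```python
class KeyVault:
    """Stand-in for the contract holding the current spending public key."""

    def __init__(self, spending_pubkey: str) -> None:
        self.spending_pubkey = spending_pubkey

    def rotate(self, new_pubkey: str) -> None:
        self.spending_pubkey = new_pubkey


class CachedKeyAccount:
    def __init__(self, chain: str, balance: int) -> None:
        self.chain = chain
        self.balance = balance
        self.local_pubkey = None  # None means counterfactual: not yet activated

    def sync_from_vault(self, vault: KeyVault) -> None:
        # Used both to activate and to recover.
        # ASSUMPTION: a direct copy replaces the cross-L2 proof.
        self.local_pubkey = vault.spending_pubkey

    def spend(self, amount: int, signer_pubkey: str) -> bool:
        # Ordinary spending checks only the locally stored key: no proof needed.
        if self.local_pubkey is None or signer_pubkey != self.local_pubkey:
            return False
        if amount > self.balance:
            return False
        self.balance -= amount
        return True


vault = KeyVault("pubkey-old")
active = CachedKeyAccount("Optimism", 10)
active.sync_from_vault(vault)
dormant = CachedKeyAccount("Kakarot", 10)  # counterfactual, never activated

vault.rotate("pubkey-new")
dormant.sync_from_vault(vault)  # activation copies the current key
print(dormant.spend(1, "pubkey-old"))  # a leaked old key cannot touch dormant funds
active.sync_from_vault(vault)   # explicit recovery transaction
print(active.spend(1, "pubkey-new"))
```

The trade-off is visible in the model: an already-activated account keeps honoring its locally stored key until a recovery transaction updates it, in exchange for not needing a proof on every spend.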
To add privacy to such a scheme, we simply encrypt the pointer and do all of the proving inside ZK-SNARKs.
With more work (e.g., building on existing work as a starting point), we could also strip away much of the complexity of ZK-SNARKs and create a simpler KZG-based scheme.
These schemes can become complex. On the plus side, there are many potential synergies between them. For example, the concept of a "key vault contract" could also be a solution to the "address" challenges mentioned in the previous section: if we want users to have persistent addresses that do not change every time they update a key, we could put stealth meta-addresses, encryption keys, and other information into the key vault contract and use the address of the key vault contract as the user's "address."
Many secondary infrastructures need updates
Using ENS is expensive. Today, in June 2023, the situation is not too bad: the transaction fee is significant, but it is still comparable to the ENS domain fee. Registering zuzalu.eth cost me about $27, of which $11 was transaction fees. However, if we have another bull market, fees will skyrocket. Even without the price of ETH rising, gas fees returning to 200 gwei would raise the transaction fee of a domain registration to $104. So if we want people to actually use ENS, especially for use cases like decentralized social media where users demand nearly free registration (and the ENS domain fee is not an issue, because these platforms offer their users subdomains), we need ENS to work on L2.
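The fee figures above can be cross-checked with a short calculation. It assumes that the registration uses the same amount of gas and that the ETH price is unchanged, so the dollar fee scales linearly with the gas price. The gas price at the time of registration is not stated in the text; the value computed below is only what the two quoted figures imply.

```python
total_cost_usd = 27.0       # total paid for zuzalu.eth (from the text)
tx_fee_usd = 11.0           # transaction fee portion (from the text)
fee_at_200_gwei_usd = 104.0 # projected fee at 200 gwei (from the text)

domain_fee_usd = total_cost_usd - tx_fee_usd

# Linear scaling: fee_usd is proportional to gas price, so the gas price
# implied by the two quoted fees is 200 * 11 / 104.
implied_gwei = 200.0 * tx_fee_usd / fee_at_200_gwei_usd


def tx_fee_at(gwei: float) -> float:
    """Projected registration fee in USD at a given gas price (same gas, same ETH price)."""
    return tx_fee_usd * gwei / implied_gwei


print(f"ENS domain fee portion: ${domain_fee_usd:.2f}")
print(f"implied gas price at registration: {implied_gwei:.1f} gwei")
print(f"projected fee at 200 gwei: ${tx_fee_at(200):.2f}")
```

Under these assumptions the quoted numbers correspond to a gas price of roughly 21 gwei at registration time, so a return to 200 gwei is a bit less than a tenfold increase in the transaction fee.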
Fortunately, the ENS team has already started taking action, and ENS on L2 is actually happening! ERC-3668 (also known as the "CCIP standard"), along with ENSIP-10, provides a way to automatically verify ENS subdomains on any L2. The CCIP standard requires setting up a smart contract that describes how to verify L2 data proofs, and domain names (e.g., Optinames using ecc.eth) can be placed under the control of such a contract. Once the CCIP contract controls ecc.eth on L1, accessing some subdomain.ecc.eth will automatically involve looking up and verifying proofs (e.g., Merkle branches) of L2 state that actually stores that specific subdomain.
Obtaining proofs in practice involves accessing a series of URLs stored in the contract, which admittedly feels centralized, although I would argue it is not: it is a 1-of-N trust model. Invalid proofs are caught by the verification logic in the CCIP contract's callback function, so as long as even one of the URLs returns a valid proof, there is no issue. The list of URLs could contain dozens of entries.
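The 1-of-N trust model can be sketched as a loop that tries each gateway and accepts the first response that passes verification. This is an illustrative model only: gateways are plain Python callables standing in for HTTP requests to the stored URLs, and the `verify` function stands in for the contract's callback logic. None of these names come from ERC-3668 itself.

```python
from typing import Callable, Iterable, Optional


def fetch_verified(
    gateways: Iterable[Callable[[], Optional[bytes]]],
    verify: Callable[[bytes], bool],
) -> Optional[bytes]:
    """Return the first proof that passes verification, or None if every gateway fails."""
    for gateway in gateways:
        try:
            proof = gateway()
        except Exception:
            continue  # an offline gateway costs liveness for this attempt, not safety
        if proof is not None and verify(proof):
            return proof
        # An invalid proof is simply rejected; a lying gateway cannot forge state.
    return None


# Hypothetical gateways for illustration.
def offline_gateway():
    raise ConnectionError("gateway unreachable")


def lying_gateway():
    return b"forged-proof"


def honest_gateway():
    return b"valid-proof"


def toy_verify(proof: bytes) -> bool:
    # ASSUMPTION: stands in for checking a Merkle branch against a known state root.
    return proof == b"valid-proof"


print(fetch_verified([offline_gateway, lying_gateway, honest_gateway], toy_verify))
print(fetch_verified([offline_gateway, lying_gateway], toy_verify))
```

Safety comes entirely from `verify`, so the gateways need not be trusted; a longer URL list only improves the chance that at least one of them answers honestly.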
The work of ENS CCIP is a successful example and should be seen as a sign that the kind of radical reform we need is possible. But more application-level reforms are needed. Some examples include:
- Many dapps rely on users providing off-chain signatures. This is straightforward for externally owned accounts (EOAs). ERC-1271 provides a standardized way for smart contract wallets to do the same. However, many dapps still do not support ERC-1271; they need to.
- Dapps that use "Is this an EOA?" to distinguish between users and contracts (e.g., to prevent transfers or enforce royalties) will break. In general, I recommend not trying to find a purely technical solution here; figuring out whether a particular transfer of cryptographic control is a transfer of beneficial ownership is a difficult problem that may not be solvable without some off-chain, community-driven mechanisms. Most likely, applications will have to rely less on preventing transfers and more on techniques like Harberger taxes.
- How wallets interact with spending and encryption keys will need to improve. Currently, wallets often use deterministic signatures to generate application-specific keys: signing a standard nonce (e.g., the hash of the application's name) with an EOA's private key produces a deterministic value that cannot be generated without the private key, so it is technically secure. However, these techniques are "opaque" to the wallet, preventing it from implementing user-interface-level security checks. In a more mature ecosystem, signing, encryption, and related functionality will need to be handled more explicitly by wallets.
- Light clients (e.g., Helios) will need to verify L2s, not just L1. Today, light clients focus on checking the validity of L1 headers (using the light client sync protocol) and verifying Merkle branches of L1 state and transactions rooted in the L1 header. Tomorrow, they will also need to verify proofs of L2 state rooted in the state root stored on L1 (a more advanced version of this would actually look at L2 pre-confirmations).
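The deterministic-signature technique for application-specific keys mentioned in the list above can be sketched as follows. One loud assumption: real wallets produce an ECDSA signature with the EOA key, which needs a secp256k1 library; here HMAC-SHA256 stands in as a deterministic keyed function with the same relevant property (same key and message always give the same output, and the output cannot be computed without the key). The function names are hypothetical.

```python
import hashlib
import hmac


def deterministic_sign(private_key: bytes, message: bytes) -> bytes:
    # STAND-IN: HMAC-SHA256 replaces a deterministic ECDSA signature.
    return hmac.new(private_key, message, hashlib.sha256).digest()


def derive_app_key(private_key: bytes, app_name: str) -> bytes:
    """Derive an application-specific secret from the account key and the app's name."""
    nonce = hashlib.sha256(app_name.encode("utf-8")).digest()  # the "standard nonce"
    signature = deterministic_sign(private_key, nonce)
    return hashlib.sha256(signature).digest()


account_key = b"\x01" * 32  # hypothetical private key for illustration
mail_key = derive_app_key(account_key, "example-mail-app")
chat_key = derive_app_key(account_key, "example-chat-app")

print(mail_key == derive_app_key(account_key, "example-mail-app"))  # reproducible
print(mail_key == chat_key)  # distinct per application
```

From the wallet's point of view this is just a request to sign an arbitrary-looking message, which is why it cannot tell that an encryption key is being created and cannot show the user a meaningful prompt.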
Wallets need to protect assets and data
Today, wallets are in the business of protecting assets. Everything lives on-chain, and the only thing a wallet needs to protect is the private key that currently guards those assets. If you change your key, you can safely publish your previous private key on the internet the next day. In a zero-knowledge proof world, however, this is no longer true: the wallet is not just protecting authentication credentials, it is also holding your data.
We have seen the first signs of such a world in Zupass, the ZK-SNARK-based identity system used at Zuzalu. Users have a private key that they use to authenticate to the system, and which can be used to make basic proofs such as "prove that I am a Zuzalu resident, without revealing which one." However, the Zupass system is also starting to have other applications built on top of it, the most notable being stamps (Zupass's version of POAPs).
One of my many Zupass stamps proves that I am a proud member of Team Cat.
The key feature that stamps offer over POAPs is that stamps are private: you hold the data locally, and you only ZK-prove a stamp (or some computation over your stamps) to someone when you want them to have that information about you. But this adds risk: if you lose that data, you lose your stamps.
Of course, the problem of holding data can be reduced to the problem of holding a single encryption key: a third party (or even the chain) can hold an encrypted copy of the data. This has the convenient advantage that the actions you take do not change the encryption key, so they do not require any interaction with the system that keeps your encryption key safe. But even so, if you lose your encryption key, you lose everything. Conversely, if someone sees your encryption key, they can see everything encrypted with that key.
The de facto solution of Zupass is to encourage people to store their keys on multiple devices (e.g., laptops and phones), as the likelihood of losing all devices simultaneously is low. We can go further by using secret sharing to store keys, splitting the keys among multiple guardians.
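Splitting a key among guardians with secret sharing can be illustrated with a compact Shamir scheme, in which any `threshold` of the shares reconstruct the key and fewer reveal nothing about it. This is a teaching sketch: the field size, the integer encoding of the key, and the function names are assumptions, and a production system should use an audited library.

```python
import secrets

PRIME = 2**127 - 1  # field modulus; the secret must be smaller than this


def split_secret(secret: int, threshold: int, shares: int):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for coeff in reversed(coeffs):  # Horner's rule
            acc = (acc * x + coeff) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


encryption_key = 123456789  # hypothetical key, encoded as an integer
guardian_shares = split_secret(encryption_key, threshold=3, shares=5)

print(recover_secret(guardian_shares[:3]) == encryption_key)
print(recover_secret([guardian_shares[0], guardian_shares[2], guardian_shares[4]]) == encryption_key)
```

With a 3-of-5 split, losing two guardians does not lose the key, and no two guardians acting alone can learn it.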
This social recovery through MPC is not a sufficient solution for wallets, as it means that not only current guardians but also previous guardians could collude to steal your assets, which is an unacceptably high risk. However, leaking privacy is often a smaller risk than completely losing assets, and if someone needs a highly privacy-protecting use case, they can accept a higher loss risk by not backing up the associated keys that need privacy protection.
To avoid overwhelming users with a complex multi-recovery-path system, wallets that support social recovery may need to manage both asset recovery and encryption key recovery simultaneously.
Back to the identity issue
A common theme in these changes is that the concept of an "address" that represents "you" on-chain must change radically. "Instructions on how to interact with me" are no longer just an ETH address; they must, in some form, include some combination of multiple addresses across multiple L2s, stealth meta-addresses, encryption keys, and other data.
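As a rough picture of what such "instructions on how to interact with me" might contain, here is a hypothetical record type. The field names and placeholder values are illustrative assumptions, not a proposed standard or an existing ENS record format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class InteractionProfile:
    """Hypothetical bundle of everything a sender needs to know about a recipient."""

    addresses: Dict[str, str] = field(default_factory=dict)  # L2 name -> address there
    meta_address: Optional[str] = None       # for privacy-preserving payments
    encryption_pubkey: Optional[str] = None  # for encrypted messages
    extra: Dict[str, str] = field(default_factory=dict)      # any other published data

    def address_on(self, chain: str) -> Optional[str]:
        return self.addresses.get(chain)

    def supports_private_payments(self) -> bool:
        return self.meta_address is not None


# Placeholder values only; none of these are real addresses or keys.
bob = InteractionProfile(
    addresses={"Optimism": "addr-on-optimism", "Taiko": "addr-on-taiko"},
    meta_address="meta-address-placeholder",
    encryption_pubkey="encryption-key-placeholder",
)

print(bob.address_on("Taiko"))
print(bob.address_on("Scroll"))
print(bob.supports_private_payments())
```

Whatever form the record takes, the sender's wallet has to look it up and interpret it, which is why where the record lives matters as much as what it contains.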
One way to achieve this is to make ENS your identity: your ENS record can contain all this information, and if you send someone bob.eth (or bob.ecc.eth, or…), they can look up and understand everything about how to pay and interact with you, including in more complex cross-domain and privacy-protecting ways.
However, this ENS-centered approach has two weaknesses:
- It binds too many things to your name. Your name is not you; your name is just one of your many attributes. You should be able to change your name without having to move your entire identity profile and update a bunch of records in many applications.
- You cannot have trustless counterfactual names. A key UX feature of any blockchain is the ability to send coins to someone who has not yet interacted with the chain. Without such functionality, there is a chicken-and-egg problem: interacting with the chain requires paying transaction fees, and paying fees requires… already having coins. ETH addresses, including smart contract addresses created with CREATE2, have this feature. ENS names do not, because if two Bobs both decide off-chain that they are bob.ecc.eth, there is no way to choose which one of them gets the name.
One possible solution is to put more things into the key vault contract described in the architecture earlier in this article. The key vault contract can contain various pieces of information about you and how to interact with you (with CCIP, some of that information can live off-chain), and users can use their key vault contract as their primary identifier. The actual assets they receive, however, would be stored in all kinds of different places. Key vault contracts are not tied to a name, and they are counterfactual-friendly: you can generate an address that can provably only be initialized by a key vault contract with certain fixed initial parameters.
Another category of solutions involves abandoning the concept of user-facing addresses altogether, in a similar spirit to the Bitcoin payment protocol. One idea is to rely more heavily on direct communication channels between the sender and the receiver; for example, the sender can send a claim link (as an explicit URL or a QR code), and the receiver can use that link to accept the payment in whatever way they wish.
Whether the sender or receiver takes action first, relying more on wallets to directly generate the latest payment information in real-time can reduce friction. That said, persistent identifiers are convenient (especially with ENS), and in practice, the assumption of direct communication between the sender and receiver is a very tricky issue, so we may see a combination of different technologies.
In all these designs, it is crucial to keep things both decentralized and understandable to users. We need to ensure that users can easily access an up-to-date view of their current assets and of the messages intended for them. These views should rely on open tools rather than proprietary solutions. It will take hard work to keep more complex payment infrastructure from turning into an opaque "tower of abstraction" that developers struggle to understand and adapt to new contexts. Despite the challenges, achieving scalability, wallet security, and privacy for ordinary users is essential to Ethereum's future. This is not just about technical feasibility; it is about actual accessibility for ordinary users. We need to rise to this challenge.
Special thanks to Dan Finlay, Karl Floersch, David Hoffman, and the Scroll and SoulWallet teams for their feedback, review, and suggestions.