Interpreting Vitalik's Blog: Is the Next Stop for Web3 Infrastructure Encapsulation or Extension?
Original Title: "Encapsulation or Extension? Exploring the Next Stop of Web3 Infrastructure: An Interpretation of Vitalik's Enshrinement Blog"
Original Author: CP, Artela CTO & co-founder
Last week, Vitalik published a blog post titled "Should Ethereum be okay with enshrining more things in the protocol?", sharing his thoughts on which foundational functionalities needed by upper-layer applications Ethereum should "enshrine" in the protocol, and exploring a framework for deciding how to enshrine more of them.
This is a key issue faced by every platform system: should the core team "enshrine" (encapsulate) critical upper-layer application functionalities into the underlying layer, or should application developers "extend" these functionalities at the application layer? As infrastructure approaches large-scale growth, the "encapsulation vs. extension" choice becomes crucial and will be one of the key design decisions determining whether it can be widely adopted.
In the past six months, several major infrastructures have launched significant technical updates: Uniswap introduced the Hook mechanism to support the extension of custom functionalities in pools; MetaMask wallet launched Snaps to allow developers to add user extensions; Ethereum is also facing the dilemma of "encapsulation vs. extension."
This article will explore the design trade-offs of Web3 infrastructures regarding "encapsulation vs. extension," as well as some personal thoughts on this issue concerning public chain infrastructure.
What Problems Does Ethereum Face? Encapsulation or Extension
On the issue of "encapsulation vs. extension," Ethereum has consistently chosen "extension."
The design philosophy of Ethereum is rooted in Unix: create a minimalistic and universal kernel, and let developers realize user needs at the application layer. The key technology that makes this possible is the EVM, whose Turing-complete smart contract language enables developers to customize their applications at the application layer.
This model seems reasonable, but it does not work as well on a "decentralized Unix." One important reason is that the EVM's Turing completeness is not so "complete" in practice. Under the gas mechanism and a limited opcode set, a program must finish its task with a limited set of opcodes within a bounded number of steps, which prevents applications from being as versatile as Web2 programs in the Unix user layer. Many advanced capabilities required by dApps cannot be satisfied by the EVM. Rollups and AA wallets, for example, can operate without modifying L1, but they remain MVP products whose efficiency and user experience are still far from their goals.
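To make the gas constraint concrete, here is a minimal sketch of a gas-metered stack machine. It is not the EVM: the three opcodes, the cost table, and the `run` function are hypothetical and chosen only for illustration. It shows that every step is charged, so even a program that would loop forever is cut off once its gas budget is spent.

```python
class OutOfGas(Exception):
    """Raised when a program cannot pay for its next step."""


# Illustrative per-opcode costs for this toy machine (an assumption of the sketch).
GAS_COST = {"PUSH": 3, "ADD": 3, "JUMP": 8}


def run(program, gas_limit):
    """Run a list of (opcode, argument) pairs, charging gas for every step.

    Returns the final stack and the gas left over.
    """
    stack = []
    gas = gas_limit
    pc = 0  # program counter
    while pc < len(program):
        op, arg = program[pc]
        cost = GAS_COST[op]
        if gas < cost:
            raise OutOfGas(f"need {cost} gas at step {pc}, only {gas} left")
        gas -= cost
        if op == "PUSH":
            stack.append(arg)
            pc += 1
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == "JUMP":
            pc = arg  # unconditional jump, enough to build a loop
    return stack, gas


# A bounded program finishes and returns its unused gas.
result_stack, gas_left = run([("PUSH", 1), ("PUSH", 2), ("ADD", None)], gas_limit=100)

# An endless loop is legal to write, but it is halted by the gas limit.
try:
    run([("JUMP", 0)], gas_limit=100)
    loop_halted = False
except OutOfGas:
    loop_halted = True
```

The language itself can express the loop, but the metering guarantees it stops, which is why heavy computation tends to be pushed out of contracts and into the protocol layer.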
The option left to developers is the EIP process: they rely on the core Ethereum team to "encapsulate" important functionalities into the underlying layer for use at the application layer.
EVM-based "extension" cannot meet the development needs of applications, so Ethereum now needs to seriously consider how to "encapsulate."
However, for decentralized infrastructure, encapsulating upper-layer application functionalities is not that simple. It is not just about integrating a piece of code; behind it lies the biggest challenge of decentralized systems: governance. "Encapsulation" means that the core team, in addition to development and maintenance, must also take on the governance of these functionalities, which carries the risk of undermining Ethereum's trust model and introducing potential issues affecting sustainable development.
The resulting effect is easy to observe: the number of functionalities the core team can "encapsulate" is limited, and each one must first gain widespread consensus from the community on its importance. Implementation is also slow, often taking years.
This also means that if the functionality you need is not a foundational one with widespread consensus, Ethereum may never accommodate you. You may even end up building your own application chain, incurring high development and operational costs and losing the composability that makes the world of smart contracts so attractive.
On the issue of "encapsulation vs. extension," Ethereum still lacks a clear solution. As Vitalik mentioned, how to make "encapsulation" proceed "in an orderly manner" is still being explored: a framework is needed to determine which functionalities to encapsulate and how to encapsulate them.
What Else Can We Learn from Unix? Native Extension!
On the issue of "encapsulation vs. extension," Ethereum needs the core team to encapsulate functionalities primarily because the EVM's extension capabilities are insufficient. Let's think from another angle: if we enhance the extensibility of the application layer, could that solve a significant part of the problem? For example, application developers could customize the underlying functionalities needed for their applications without waiting for the core team to encapsulate them.
We also know that Ethereum has drawn many design philosophies from Unix, so let's continue to look for insights within the Unix system.
Commercial operating systems based on Unix target the PC market and face more diverse demands at the application layer, including extension needs from enterprise usage scenarios. However, these commercial operating systems do not place a heavy "encapsulation" burden on the core team; they provide sufficient extensibility for applications, allowing most functionalities to be implemented by users themselves.
Take Mac OS X as an example. General-purpose operating systems distinguish between kernel mode and user mode, with user applications typically running in user mode and using functionalities provided by kernel-mode programs. A simple (but incomplete) comparison is that smart contracts above the EVM are equivalent to user-mode applications, while the Ethereum protocol layer is equivalent to kernel mode.
However, Mac OS X allows application developers to deploy programs into kernel mode independently to extend kernel functionalities without requiring the Mac OS X core team to encapsulate them on a case-by-case basis. The core mechanisms provided by Mac OS X are "kernel extensions" and "system extensions," which allow developers to develop kernel extensions under certain security models, using higher-privilege functionalities to accomplish tasks that pure user-mode applications cannot.
The question this raises is whether the "Kernel Extension" model is feasible in a "decentralized Unix." Its model is illustrated in the following diagram.
On blockchain protocols, in addition to supporting "smart contracts," another type of program called "Native Extension" could be supported, which:
1) Has more access rights to the underlying protocol APIs than smart contracts.
2) Runs in an execution environment that is significantly more efficient than the EVM.
3) Is isolated from the underlying protocol and does not affect its stability.
4) Therefore, in terms of governance, does not need to be maintained by the core team but can be maintained and deployed by the application team.
If this model can technically satisfy the above four points, it seems to solve many problems: application developers can customize the underlying functionalities needed for their applications without waiting for the core team to encapsulate them.
Let's tentatively call this the "Native Extension" paradigm and see whether there are any traces of it in existing Web3 infrastructures.
Hook, Hook, Hooks…
In the software world, great tools are often born from great use cases. As a DeFi infrastructure, Uniswap is at a critical stage in becoming a "platform," and it offers an impressive design for the "encapsulation vs. extension" problem: the Hook. Developers can use Hooks to add extensions to pools without permission, achieving a diversified pool experience without the core team constantly upgrading functionalities through "encapsulation."
The Hook mechanism shares several properties with the Native Extension described above:
Hooks can cut into the execution lifecycle of the pool and can access runtime data, which is a higher level of access.
Hooks and pools are two independent contracts, and the security of Hooks does not affect the pool.
In terms of governance, Hooks can be developed and deployed by third-party developers without permission, and they are not globally activated; rather, different pools can bind different Hooks as needed to activate custom features.
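The three properties above can be sketched in a few lines. This is a simplified illustration, not Uniswap's implementation: the `Pool` class, the fee-free constant-product formula, and the `MaxTradeHook` example are all assumptions made for the sketch, and only the idea of callbacks before and after a swap mirrors the Hook design.

```python
class Pool:
    """A toy constant-product pool (x * y = k, no fees) with an optional hook."""

    def __init__(self, reserve_x, reserve_y, hook=None):
        self.reserve_x = reserve_x
        self.reserve_y = reserve_y
        self.hook = hook  # bound per pool, not activated globally

    def swap_x_for_y(self, amount_in):
        # The hook cuts into the lifecycle and sees runtime data before the swap.
        if self.hook is not None:
            self.hook.before_swap(self, amount_in)

        amount_out = self.reserve_y * amount_in // (self.reserve_x + amount_in)
        self.reserve_x += amount_in
        self.reserve_y -= amount_out

        # And again after the swap, with the result available.
        if self.hook is not None:
            self.hook.after_swap(self, amount_in, amount_out)
        return amount_out


class MaxTradeHook:
    """A hypothetical third-party hook: cap trade size and count swaps."""

    def __init__(self, max_amount_in):
        self.max_amount_in = max_amount_in
        self.swap_count = 0

    def before_swap(self, pool, amount_in):
        if amount_in > self.max_amount_in:
            raise ValueError("trade exceeds the cap set by this pool's hook")

    def after_swap(self, pool, amount_in, amount_out):
        self.swap_count += 1


plain_pool = Pool(1000, 1000)                          # no hook bound
capped_pool = Pool(1000, 1000, hook=MaxTradeHook(50))  # custom behavior for this pool only
```

The pool code never changes when a new hook is written, and a pool without a hook behaves exactly as before, which is the governance point: customization is opt-in per pool.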
Hooks are a small but elegant design that solves the extensibility problem of pools. The application-layer infrastructure has taken the lead in applying these concepts; let's continue to look at the ideas of more complex underlying protocols (L1/L2).
Extension Ideas from New Public Chain Projects
Ethereum is in trouble, so let's see what ideas Layer2 projects dedicated to extending Layer1 have.
Arbitrum Stylus allows application developers to encapsulate precompiled contracts themselves!
As most readers know, the EVM can extend its functionalities through "precompiled contracts." Their code does not run inside the EVM but is integrated into the nodes and runs at the underlying level. For example, a new cryptographic algorithm that is too complex and expensive to compute in the EVM can be implemented as a precompiled contract, which application contracts can then call. However, the elevated permissions of precompiled contracts are not open to application developers; adding one still requires the Ethereum development team to "encapsulate" it through an EIP.
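A rough sketch of how a node might route calls to precompiled contracts follows. The dispatch function and the `contracts` mapping are hypothetical simplifications; the two addresses used do match Ethereum's SHA-256 (0x02) and identity (0x04) precompiles, but real clients also handle gas accounting and many other details omitted here.

```python
import hashlib


def sha256_precompile(data: bytes) -> bytes:
    # Runs as native node code instead of being interpreted opcode by opcode.
    return hashlib.sha256(data).digest()


def identity_precompile(data: bytes) -> bytes:
    # Returns its input unchanged.
    return data


# A fixed table shipped with the node software. Adding an entry means changing
# the protocol itself, which is why it goes through the core team and an EIP.
PRECOMPILES = {
    0x02: sha256_precompile,
    0x04: identity_precompile,
}


def call(address: int, data: bytes, contracts: dict) -> bytes:
    """Route a call to native code if the address is a precompile.

    `contracts` maps addresses to Python callables that stand in for
    interpreted contract bytecode in this sketch.
    """
    if address in PRECOMPILES:
        return PRECOMPILES[address](data)
    return contracts[address](data)


digest = call(0x02, b"abc", contracts={})
```

From the caller's point of view a precompile looks like any other contract address; the difference is entirely in who is allowed to add one.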
Arbitrum Stylus proposes the "EVM+" paradigm, allowing Layer2 to pursue EVM equivalence/compatibility while enabling developers to break through the limitations of the EVM and deploy higher-performance precompiled contracts without permission. Its implementation principle is to add a WASM execution environment at the execution layer to dynamically load and run WASM contracts, providing an efficiency that is an order of magnitude higher than the EVM and supporting multiple programming languages.
This is one implementation that can ease Ethereum's "encapsulation" dilemma: demands for extending the EVM no longer have to wait for the core team to encapsulate them. The core team can focus on maintaining the execution layer's extension environment, while the introduction, development, and governance of new functionalities can be left to the application layer.
However, Stylus is still in its early stages: many challenges of this model have yet to surface, and the problems it can solve are limited. Currently, it only supports dynamically encapsulating precompiled contracts, while Ethereum faces many encapsulation challenges beyond precompiled contracts. The good news is that this is one of the closest implementations of the Native Extension paradigm we can see today. It represents a new generation of infrastructure that, with long-term ecosystem development in mind, introduces extensible designs to address the "encapsulation" challenges it will encounter as its ecosystem scales.
Native Extension: A "Modular Encapsulation" Approach!
After reviewing these Web2 and Web3 infrastructure projects and looking back at the "encapsulation vs. extension" issue, we can see a clear idea: enhance extension capabilities so that developers can encapsulate the functionalities they need in a modular way.
This is the important role the Native Extension paradigm plays in infrastructure: by enhancing the extensibility of the infrastructure, it returns the choice to developers, allowing them to freely encapsulate and extend the functionalities they need in a modular way without affecting the stability of the core protocol.
Ethereum is trying to improve the efficiency of "encapsulation," while Arbitrum Stylus is liberating precompiled contracts. Looking further, public chains can also fully liberate the creativity of the application layer through the Native Extension paradigm, just as Uniswap V4 has done for pools.
A New L1 Public Chain Based on the Native Extension Paradigm: Artela
Here, let's switch perspectives; "we" refers to the team I am part of as CTO: Artela. We would like to share our thoughts and actions on this issue.
On the Artela blockchain, in addition to the EVM, we have also implemented a WASM execution environment. On one hand, it can run stateful programs, similar to stateful precompiled contracts; on the other hand, it supports a mechanism similar to Hooks, allowing them to be triggered at multiple points in the lifecycle of block and transaction processing. This means it is not only used for encapsulating precompiled contracts like Arbitrum Stylus, but can also customize the execution process of transactions and blocks, achieving broader functionality encapsulation. For example, a WASM Native Extension can be triggered during the transaction validation phase to use new algorithms for recognizing and validating transactions. In Artela these Hooks are called Join Points, and these Native Extensions are called Aspects rather than Smart Contracts. The idea is similar to AoP (Aspect-oriented Programming): dynamically injecting new functionalities at various Join Points in the running blockchain system!
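The Join Point idea can be sketched as a small dispatcher. Everything named here is hypothetical: the `JoinPoint` members, the `Chain` class, and its `bind` and `process` methods are stand-ins for illustration and are not Artela's actual API. The sketch only shows extensions being bound per contract and triggered at fixed points of transaction processing.

```python
from enum import Enum


class JoinPoint(Enum):
    # Hypothetical lifecycle points for this sketch.
    VERIFY_TX = "verify_tx"
    PRE_TX_EXECUTE = "pre_tx_execute"
    POST_TX_EXECUTE = "post_tx_execute"


class Chain:
    """A toy transaction pipeline that triggers bound extensions at join points."""

    def __init__(self):
        self._bindings = {}  # (contract, join point) -> list of aspect callables
        self.executed = []   # contracts whose transactions were executed

    def bind(self, contract, join_point, aspect):
        # An application team binds an aspect to its own contract only.
        self._bindings.setdefault((contract, join_point), []).append(aspect)

    def _trigger(self, join_point, tx):
        for aspect in self._bindings.get((tx["to"], join_point), []):
            aspect(tx)  # an aspect may raise to reject the transaction

    def process(self, tx):
        self._trigger(JoinPoint.VERIFY_TX, tx)        # e.g. a custom validation algorithm
        self._trigger(JoinPoint.PRE_TX_EXECUTE, tx)
        self.executed.append(tx["to"])                # stand-in for EVM execution
        self._trigger(JoinPoint.POST_TX_EXECUTE, tx)


def require_session_key(tx):
    # A hypothetical custom verification rule injected at VERIFY_TX.
    if tx.get("session_key") != "valid":
        raise ValueError("rejected by custom verification aspect")
```

A transaction sent to a contract with no bindings passes through the pipeline untouched, so the core processing path stays the same for everyone who has not opted in.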
As a specific example, we have asked investors and Web2 institutions what the biggest obstacle to large-scale asset migration to Web3 is, and the most discussed issue is security! Web2-level risk control technologies ensure the safety of billions of assets, but they struggle to enter the Web3 tech stack. Carl, who comes from the aerospace field at NASA, has also voiced similar opinions: "Why We Need Runtime Protection and Aspect."
Runtime Protection is a core means of security and risk control. In today's Web3, we can see a number of strong security companies offering static audits, formal verification, real-time monitoring, and transaction front-running. These seem to be all the methods available, yet they are still far from Web2-level real-time risk control. The root problem is that security measures built around the mempool are limited: once a transaction leaves the mempool, they are powerless. If, in the transaction execution phase after the mempool, there were an extension capability that allowed security experts to deploy runtime-level security strategies, the security level would be elevated. Aspects give developers the ability to extend security deep into the execution layer!
Developers can deploy Aspects that serve only their projects to customize the protocol layer functionalities they need. For example, if a transaction could potentially lead to a large amount of funds being stolen, it could be intercepted in the Aspect.
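As a rough illustration of such an interception, the sketch below runs a check after a transaction executes and rolls the state back if too much was drained. The state layout, the `outflow_guard` rule, and `execute_with_guard` are assumptions made for this example and do not reflect Artela's real interfaces.

```python
class Reverted(Exception):
    """Raised by a guard to signal that the transaction must be rolled back."""


def outflow_guard(account, max_fraction):
    """Build a post-execution check limiting how much one transaction may drain."""

    def check(before, after):
        drained = before[account] - after[account]
        if drained > before[account] * max_fraction:
            raise Reverted(f"{drained} drained from {account} in one transaction")

    return check


def execute_with_guard(state, tx, guard):
    """Apply `tx` to `state`, then let the guard inspect the before/after balances.

    Returns True if the transaction is kept, False if it was rolled back.
    """
    before = dict(state)  # snapshot of balances before execution
    tx(state)             # stand-in for executing the transaction
    try:
        guard(before, state)
    except Reverted:
        state.clear()
        state.update(before)  # restore the snapshot
        return False
    return True


def withdraw(amount):
    # A hypothetical transaction moving funds out of the project's vault.
    def tx(state):
        state["vault"] -= amount
        state["recipient"] += amount

    return tx


balances = {"vault": 1000, "recipient": 0}
guard = outflow_guard("vault", max_fraction=0.5)

small_kept = execute_with_guard(balances, withdraw(100), guard)  # within the limit
large_kept = execute_with_guard(balances, withdraw(800), guard)  # exceeds the limit
```

The check sees the actual effect of execution rather than a prediction made at the mempool stage, which is the difference between runtime protection and pre-execution monitoring.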
Developers can also deploy public Aspects to encapsulate foundational functionalities that can be reused by multiple projects. For example, a certain Aspect implements specific algorithms and transaction types, making AA wallets more programmable and composable, and other developers can also enable this Aspect to utilize this underlying feature in their projects.
For Artela, our ideas have become increasingly firm along the way:
Allow developers to solve problems through Native Extension at the application layer without permission, rather than waiting for the core public chain team to encapsulate.
Enable large institutions and funds with Web2 backgrounds to stake on the blockchain (by introducing Web2-level runtime risk control enhancements).
Provide developers with a good ecological environment to break through boundaries (EVM may soon reach its ceiling, while EVM + Native Extension may have more potential).
Create an ideal home for dApps that want to migrate more business logic on-chain, such as full-chain games and RWA.
We can see that Ethereum is still figuring out how to "encapsulate" application features, and has no plan to relieve its own "encapsulation" pressure by returning creativity to developers. For the potential next-generation innovators who dare to explore breakthroughs in decentralized applications, this situation is quite restrictive: they need a robust decentralized network, yet they feel constrained by it. This is why we are committed to building a new L1 public chain based on the Native Extension paradigm, allowing infrastructure to keep up with the pace of innovation.
Import Web2
Finally, I will conclude this article with these two words. Although the decentralized Web3 stack and the Web2 stack represent completely different ways of thinking at the code level, this does not prevent us from seeking treasures in the library of Web2 in terms of design philosophy and development history. Keep building!