Proof of Validator: A Key Security Piece on the Road to Ethereum Scalability
Written by: Deep Tide TechFlow
Today, a new concept quietly emerged in the Ethereum research forum: Proof of Validator.
This protocol mechanism allows network nodes to prove that they are Ethereum validators without revealing their specific identities.
What does this have to do with us?
Generally, the market fixates on the surface narratives that Ethereum's technical innovations produce, and rarely digs into the technology itself ahead of time. Take the Shanghai upgrade, the Merge, the transition from PoW to PoS, and scaling: what the market remembers are the narratives of LSD, LSDFi, and re-staking.
But let's not forget that performance and security are paramount for Ethereum. The former determines the upper limit, while the latter determines the lower limit.
It is evident that, on one hand, Ethereum has been actively promoting various scaling solutions to enhance performance; on the other hand, along the path of scaling, in addition to honing its internal capabilities, it also needs to guard against external attacks.
For instance, if validator nodes were attacked and data became unavailable, every narrative and scaling solution built on Ethereum's staking logic could be affected. Yet risks of this kind lurk in the background: end users and speculators rarely perceive them, and sometimes do not even care.
The Proof of Validator, which this article will discuss, may be a key piece of the security puzzle on Ethereum's scaling journey.
Since scaling is imperative, reducing the risks inherent in the scaling process is an unavoidable security question, and one that concerns everyone in the community.
Therefore, it is worth laying out the full picture of the newly proposed Proof of Validator. Because the original forum post is fragmented and highly technical, and touches on many scaling solutions and concepts, the Deep Tide Research Institute has consolidated it with the necessary background material to interpret the context, necessity, and potential impact of Proof of Validator.
Data Availability Sampling: The Breakthrough for Scaling
Bear with us: before formally introducing Proof of Validator, we need to clarify the current logic of Ethereum's scaling and the risks it may entail.
The Ethereum community is actively promoting multiple scaling plans. Among them, Data Availability Sampling (DAS) is regarded as the most critical technology.
The principle is to split complete block data into several "samples," allowing nodes in the network to validate the complete block by obtaining only a few samples relevant to themselves.
This greatly reduces the storage and computational load on each node. To use a more familiar analogy, it is like polling: by interviewing a small, random slice of people, one can infer the situation of the entire population.
Specifically, the implementation of DAS can be summarized as follows:
- Block producers split block data into multiple samples.
- Each network node only receives a few samples of interest, rather than the complete block data.
- Network nodes can randomly sample different samples to verify the availability of the complete block data.
Through this sampling, even if each node processes only a small amount of data, collectively they can fully verify the data availability of the entire blockchain. This can significantly increase block size and achieve rapid scaling.
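To make the flow concrete, here is a minimal Python sketch. The helper names are our own illustration, not part of any Ethereum client, and real DAS additionally erasure-codes the data and commits to it with KZG commitments, which we omit here:

```python
import random

def split_into_samples(block_data: bytes, sample_size: int = 32) -> list:
    """Split raw block data into fixed-size samples (toy version; real DAS
    first erasure-codes the data so that missing pieces are recoverable)."""
    return [block_data[i:i + sample_size]
            for i in range(0, len(block_data), sample_size)]

def availability_check(samples: list, k: int = 8) -> bool:
    """A node randomly requests k samples; if any request fails (None),
    it refuses to treat the block as available."""
    picks = random.sample(range(len(samples)), k=min(k, len(samples)))
    return all(samples[i] is not None for i in picks)

block = bytes(random.getrandbits(8) for _ in range(1024))     # stand-in block data
samples = split_into_samples(block)
print("honest producer:", availability_check(samples))        # True

# A producer withholding half the samples is caught with high probability:
withheld = [s if i % 2 else None for i, s in enumerate(samples)]
print("withholding producer:", availability_check(withheld))  # very likely False
```

The point of the toy: each node checks only k samples, yet a producer hiding any meaningful fraction of the block gets caught almost immediately once many nodes sample independently.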
However, there is a key issue with this sampling scheme: where to store the massive samples? This requires a complete decentralized network to support it.
Distributed Hash Table: The Home for Samples
This presents an opportunity for the Distributed Hash Table (DHT) to shine.
DHT can be viewed as a massive distributed database that uses hash functions to map data to an address space, with different nodes responsible for storing and retrieving data from different address segments. It can be used to quickly find and store samples among a vast number of nodes.
Specifically, after DAS splits block data into multiple samples, these samples need to be distributed across different nodes in the network for storage. DHT can provide a decentralized method to store and retrieve these samples, with the basic idea being:
- Use a consistent hash function to map samples to a vast address space.
- Each node in the network is responsible for storing and providing data samples within a specific address range.
- When a sample is needed, one can look up the corresponding address via hashing and find the node responsible for that address range to retrieve the sample.
For example, each sample can be hashed to an address according to certain rules, with node A responsible for addresses 0-1000 and node B responsible for addresses 1001-2000.
Thus, the sample at address 599 would be stored in node A. When this sample is needed, one can look up address 599 through the same hash and find node A in the network, from which to retrieve the sample.
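A minimal sketch of that lookup in Python, using the same node A / node B ranges as the example above (the hash-to-address rule here is our own simplification; production DHTs such as Kademlia instead use XOR distance over node IDs):

```python
import hashlib

ADDRESS_SPACE = 2000
# Each node owns a contiguous address range, exactly as in the example above.
NODES = {"node A": range(0, 1001), "node B": range(1001, 2001)}

def sample_address(sample: bytes) -> int:
    """Deterministically hash a sample into the address space."""
    digest = hashlib.sha256(sample).digest()
    return int.from_bytes(digest, "big") % (ADDRESS_SPACE + 1)

def responsible_node(address: int) -> str:
    """Find the node that owns this address (a linear scan is fine for a toy)."""
    for name, addr_range in NODES.items():
        if address in addr_range:
            return name
    raise KeyError(address)

sample = b"some DAS sample"
addr = sample_address(sample)
print(f"sample maps to address {addr}, stored on {responsible_node(addr)}")
# Anyone re-hashing the same sample gets the same address, hence the same node.
```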
This approach breaks the limitations of centralized storage: it avoids single points of failure, greatly improves fault tolerance, and enhances network scalability. That is precisely the network infrastructure DAS needs for sample storage. In addition, a DHT can help defend against attacks such as the "sample hiding" attack mentioned in DAS.
The Pain Point of DHT: Sybil Attacks
However, DHT also has a fatal weakness: it faces the threat of Sybil attacks. Attackers can spin up a huge number of fake nodes in the network, drowning out the genuine nodes around them.
To draw an analogy, it is like an honest vendor surrounded by stalls full of counterfeits: users struggle to find the real product. In this way, attackers can control the DHT network and render samples unavailable.
For example, to obtain the sample at address 1000, one needs to find the node responsible for that address. However, when surrounded by thousands of fake nodes created by attackers, requests will be continuously directed to the fake nodes, preventing access to the actual node responsible for that address. The result is that the sample cannot be obtained, and both storage and verification fail.
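To see how cheaply an attacker can "surround" an address, here is a toy simulation. The XOR distance rule mimics Kademlia-style DHTs, and all the numbers are arbitrary illustrations of our own:

```python
import random

TARGET = 1000      # address of the sample we want to fetch
ID_BITS = 16

random.seed(0)
honest = [random.getrandbits(ID_BITS) for _ in range(100)]
# The attacker cheaply grinds thousands of node IDs right next to the target:
sybils = [TARGET ^ i for i in range(1, 5001)]
sybil_set = set(sybils)

# A Kademlia-style lookup returns the 20 nodes closest to the target by XOR distance.
closest = sorted(honest + sybils, key=lambda node_id: node_id ^ TARGET)[:20]
print("sybils among the 20 closest nodes:",
      sum(node_id in sybil_set for node_id in closest))   # almost certainly 20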
To solve this problem, a high-trust network layer needs to be established on DHT, consisting solely of validator nodes. However, the DHT network itself cannot identify whether a node is a validator.
This severely hinders DAS and Ethereum's scaling. What methods can be employed to resist this threat and ensure the trustworthiness of the network?
Proof of Validator: A ZK Solution to Safeguard Scaling Security
Now, let's return to the focus of this article: Proof of Validator.
Today, in the Ethereum technical forum, George Kadianakis, Mary Maller, Andrija Novakovic, and Suphanat Chunhapanya jointly proposed this solution.
The overall idea: if there is a way to ensure that only real validators can join the DHT described in the previous section, then a malicious actor wishing to mount a Sybil attack must first stake a significant amount of ETH, substantially raising the economic cost of wrongdoing.
In more familiar terms: I want to confirm that you are one of the good guys without learning who you are, while still being able to keep the bad guys out.
In a scenario like this, where something must be proven while revealing as little as possible, zero-knowledge proofs are the natural tool.
Thus, Proof of Validator (PoV) can be used to establish a highly trustworthy DHT network composed solely of honest validating nodes, effectively resisting Sybil attacks.
The basic idea is to have each validator node register a public key on the blockchain and then use zero-knowledge proof technology to prove that it knows the private key corresponding to a registered public key. This is akin to proving you hold a valid ID without actually handing the ID over.
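At its simplest, "proving you know the private key behind a public key" can be done with a Schnorr-style sigma protocol. The sketch below is a toy over a tiny multiplicative group, purely for intuition; the actual proposal works with validators' registered keys inside a full ZK proof system, not this bare protocol:

```python
import hashlib
import secrets

# Toy group: p = 2q + 1 with both prime; g generates the order-q subgroup.
p, q, g = 2039, 1019, 4

sk = secrets.randbelow(q)    # the validator's private key
pk = pow(g, sk, p)           # the public key registered on-chain

def prove(sk: int, pk: int) -> tuple:
    """Prove knowledge of sk, made non-interactive via Fiat-Shamir."""
    r = secrets.randbelow(q)                   # fresh secret nonce
    commitment = pow(g, r, p)
    c = int.from_bytes(hashlib.sha256(f"{pk}:{commitment}".encode()).digest(),
                       "big") % q
    response = (r + c * sk) % q
    return commitment, response

def verify(pk: int, commitment: int, response: int) -> bool:
    c = int.from_bytes(hashlib.sha256(f"{pk}:{commitment}".encode()).digest(),
                       "big") % q
    # g^response == commitment * pk^c (mod p) iff the prover knew sk.
    return pow(g, response, p) == commitment * pow(pk, c, p) % p

print(verify(pk, *prove(sk, pk)))   # True, and sk itself is never revealed
```

Note that this bare protocol still reveals which registered public key is being proven; hiding that link is exactly the extra step PoV adds, as described below.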
Additionally, regarding the resistance of validating nodes to DoS (Denial of Service) attacks, PoV also aims to hide the identities of validators at the network layer. In other words, the protocol does not want attackers to be able to discern which DHT node corresponds to which validating node.
So how exactly does this work? The original post used a lot of mathematical formulas and derivations, which we will not elaborate on here; instead, we provide a simplified version:
In practical implementation, Merkle trees or lookup tables can be used. With a Merkle tree, for example, a node proves that its registered public key appears in the list of public keys committed to by the tree, and then proves that its network-layer communication key is correctly derived from that registered key. The entire process is carried out in zero knowledge, without revealing the actual identity.
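For intuition, here is the plain (non-zero-knowledge) version of the Merkle membership step; the helper names are our own, and in PoV the same check runs inside a zero-knowledge circuit, so the verifier learns only that the key is in the tree, never which leaf (i.e. which validator) it is:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])            # duplicate last node if odd
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves: list, index: int) -> list:
    """Collect sibling hashes on the path from a leaf up to the root."""
    proof, level = [], [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2))   # (sibling, am-I-right-child)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_membership(leaf: bytes, proof: list, root: bytes) -> bool:
    node = h(leaf)
    for sibling, is_right in proof:
        node = h(sibling + node) if is_right else h(node + sibling)
    return node == root

validator_keys = [f"pubkey-{i}".encode() for i in range(8)]   # registered keys
root = merkle_root(validator_keys)
proof = merkle_proof(validator_keys, 5)
print(verify_membership(validator_keys[5], proof, root))      # True
```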
Skipping these technical details, the ultimate effect of PoV is:
Only authenticated nodes can join the DHT network, significantly increasing its security and effectively resisting Sybil attacks, preventing samples from being deliberately hidden or modified. PoV provides a reliable foundational network for DAS, indirectly assisting Ethereum in achieving rapid scaling.
However, PoV is still in the theoretical research stage, and its practical implementation remains uncertain.
Nonetheless, the researchers behind the post have already benchmarked the scheme at a small scale, and the results show that both generating the proposed ZK proofs and verifying them are efficient. It is worth noting that their test rig was merely a laptop with a five-year-old Intel i7 processor.
Finally, while much work remains before PoV reaches practical deployment, it represents an important step toward greater scalability for blockchains. As a key component in Ethereum's scaling roadmap, it deserves the industry's continued attention.