Paradigm: Exploring the Relationship Between MEV-Boost and Consensus Mechanisms

Paradigm
2023-05-01 19:49:52
This article aims to explore the interaction between mev-boost and consensus, reveal the subtleties of the Ethereum proof-of-stake mechanism, and list some possible directions for moving forward.

Title: Time, slots, and the ordering of events in Ethereum Proof-of-Stake

Authors: Georgios Konstantopoulos, Mike Neuder, Paradigm

Compiled by: wesely, GWEI Research

On April 2, a malicious Ethereum network participant exploited a vulnerability in mev-boost-relay to steal $20 million from an MEV searcher (see Flashbots' post-mortem analysis). In the following days, developers addressed the bug by releasing five patches, which, combined with existing network latency and validator strategies, led to a brief period of instability on the Ethereum network on April 6, visible as a spike in block reorganizations (reorgs). Reorganizations are detrimental to network health because they reduce the block production rate and weaken settlement guarantees.

This article aims to explore the interaction between mev-boost and consensus, revealing the subtleties within Ethereum's Proof-of-Stake mechanism and outlining some potential paths forward. It is motivated by the attack on searchers and the temporary network instability that followed.

What is mev-boost? Why is it important?

mev-boost is a protocol designed by Flashbots and the community to mitigate the negative impact of Maximal Extractable Value (MEV) on the Ethereum network.

There are three roles in mev-boost:

  • Relays - Mutually trusted auctioneers that connect proposers to block builders.
  • Builders - Sophisticated entities that construct blocks to maximize MEV for themselves and proposers.
  • Proposers - Ethereum Proof-of-Stake validators.

The rough sequence of events for each block is as follows:

  • Builders create a block from transactions received from users, searchers, or other (private or public) order flow.
  • Builders submit the block to the relay.
  • The relay verifies the block's validity and calculates how much it pays the proposer.
  • The relay sends a "blinded" header and the payment value to the proposer of the current slot.
  • The proposer evaluates all bids received and signs the blinded header associated with the highest payment.
  • The proposer sends the signed header back to the relay.
  • The relay publishes the block using its local beacon node and returns it to the proposer. Rewards are distributed to the builder and the proposer through transactions within the block and the block reward.
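The bid-selection step in the sequence above can be sketched as follows. This is a minimal illustration, not the mev-boost implementation: the `Bid` type, its field names, and the relay names are hypothetical, and a real proposer also verifies relay signatures and handles timeouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    relay: str        # hypothetical relay identifier
    header_root: str  # stand-in for the blinded header
    value_wei: int    # payment promised to the proposer


def select_best_bid(bids):
    """Return the bid with the highest payment, or None if there are no bids."""
    return max(bids, key=lambda b: b.value_wei, default=None)


# Illustrative values only.
bids = [
    Bid("relay-a", "0xaaa", 50_000_000_000_000_000),
    Bid("relay-b", "0xbbb", 72_000_000_000_000_000),
    Bid("relay-c", "0xccc", 61_000_000_000_000_000),
]
best = select_best_bid(bids)
print(best.relay, best.value_wei)
```

The proposer sees only the header and the payment value; the block body stays with the relay until the signed header comes back.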

The relay acts as a trusted third party that facilitates a fair exchange of block space (from proposers) for MEV-extracting transaction ordering (from builders). Relays protect builders from MEV theft, in which a proposer copies a builder's transactions to capture the MEV itself instead of leaving it to the searcher or builder that discovered it. Relays protect proposers by confirming the validity of builder blocks, processing the hundreds of blocks submitted per slot on the proposer's behalf, and guaranteeing the accuracy of the payment made to the proposer.

mev-boost is critical protocol infrastructure because it gives all proposers democratic access to MEV without requiring them to establish trust with builders or searchers, which contributes to Ethereum's long-term decentralization.

Ethereum's fork choice rules and mev-boost

Before delving into the attack and the responses to it, let's first look at Ethereum's Proof-of-Stake (PoS) mechanism and its associated fork choice rule. The fork choice rule allows the network to reach consensus on the chain head. According to "Ethereum Reorgs After The Merge":

The fork choice rule is a function evaluated by the client that takes the blocks and other messages it has seen as input and outputs what the "canonical chain" is. The need for a fork choice rule arises because there may be multiple valid chains to choose from (e.g., when two competing blocks with the same parent are published simultaneously).

One lesser-known aspect of fork choice rules is their relationship with time, which has significant implications for block production.

Slots and sub-slot periods

In Ethereum PoS, time is divided into 12-second increments called slots. The PoS algorithm randomly designates a validator to propose a block for each slot; this validator is known as the proposer. In the same slot, other validators are assigned to attest to (vote for) the block they consider the chain head, which they determine by applying the fork choice rule to their local view. The 12-second interval is divided into three phases, each lasting 4 seconds.

The events that occur within a slot are as follows, where t=0 indicates the start of the slot.

[Figure 1: timeline of the events within a slot]

The most critical moment in the slot is the attestation deadline at t=4. If an attesting validator has not seen a block by the attestation deadline, it votes for the previously accepted chain head (according to the fork choice rule). The earlier a block is proposed, the more time it has to propagate, and the more attestations it accumulates (as more validators see it before the attestation deadline).
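The slot arithmetic described above can be sketched in a few lines. This is a rough illustration under stated assumptions: the phase labels are informal names rather than specification identifiers, and the genesis timestamp in the example is a made-up value.

```python
SECONDS_PER_SLOT = 12
PHASE_SECONDS = 4  # three equal phases per slot

# Informal labels for the three 4-second phases (not spec identifiers).
PHASES = ("block propagation", "attestation", "aggregation")


def slot_and_offset(unix_time, genesis_time):
    """Return (slot number, seconds elapsed since the start of that slot)."""
    elapsed = unix_time - genesis_time
    if elapsed < 0:
        raise ValueError("time is before genesis")
    return divmod(elapsed, SECONDS_PER_SLOT)


def phase_at(offset):
    """Name the phase for an offset in [0, 12) seconds into the slot."""
    return PHASES[int(offset // PHASE_SECONDS)]


def seen_before_deadline(block_arrival_offset):
    """True if a block arrives before the t=4 deadline."""
    return block_arrival_offset < PHASE_SECONDS


GENESIS = 1_000_000  # hypothetical genesis timestamp
slot, offset = slot_and_offset(GENESIS + 63, GENESIS)
print(slot, offset, phase_at(offset), seen_before_deadline(offset))
```

A block that reaches a validator at an offset of 3 seconds can still earn that validator's vote; one arriving at 4.2 seconds cannot.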

From a network health perspective, the optimal time to publish a block is t=0 (as the specification requires). However, because the value of a block increases monotonically over time, proposers are incentivized to delay publication to allow more MEV to accumulate. See "Timing Games in Proof-of-Stake" and this discussion for further details.

Historically, proposers could publish blocks after the attestation deadline, even close to the end of the slot, as long as the next proposer observed the block before building the block for the subsequent slot. Because a parent block inherits the weight of its children and the fork choice rule terminates at a leaf node, publishing late carried no penalty. To help steer rational behavior (delayed block publication) toward honest behavior (timely publication), "honest reorganization" was implemented.

Proposer Boost and Honest Reorganization

Two new concepts were introduced to consensus clients that have critical implications for the attestation deadline.

  • Proposer Boost (PR) - Aims to reduce the effectiveness of reorganizations carried out through balancing attacks by granting the proposer's block a fork choice "boost" equal to 40% of the full attestation weight. Importantly, this boost lasts only for one slot.
  • Honest Reorganization (PR) - Builds on proposer boost by allowing an honest proposer to use it to forcibly reorganize a block that has less than 20% attestation weight. This is implemented in Lighthouse and Prysm (since the v4.0-Capella release). The change is optional because it is a local decision made by proposers and does not affect validator behavior. Therefore, there was no coordinated effort to roll it out simultaneously across all clients, nor was it tied to any specific hard fork.

Note that honest reorganization is avoided in certain special cases:

  • At epoch boundary blocks
  • If the chain is not finalizing
  • If the chain head is not from the slot immediately before the slot of the reorganizing block

The third condition ensures that an honest reorganization only ever removes a single block from the chain, acting as a circuit breaker that lets the chain keep producing blocks during periods of extreme network latency. It also reflects proposers' reduced confidence in their view of the network under those conditions, since they can no longer be certain that their proposer-boosted block will be considered canonical.
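The threshold and the three exceptions above combine into a single local decision for the proposer. The sketch below is a simplification and not client code: the function and parameter names are hypothetical, `chain_is_finalizing` stands for the second exception, and treating the first slot of a 32-slot epoch as the "epoch boundary" is an assumption.

```python
REORG_WEIGHT_THRESHOLD = 0.20  # head weaker than this may be reorganized
SLOTS_PER_EPOCH = 32


def should_attempt_honest_reorg(head_weight_fraction, head_slot,
                                proposal_slot, chain_is_finalizing):
    """Decide whether the proposer of `proposal_slot` builds on the head's
    parent instead of the head itself (simplified sketch)."""
    if head_weight_fraction >= REORG_WEIGHT_THRESHOLD:
        return False  # head has enough attestation weight
    if proposal_slot % SLOTS_PER_EPOCH == 0:
        return False  # assumed meaning of "epoch boundary"
    if not chain_is_finalizing:
        return False  # do not add instability to an unhealthy chain
    if head_slot != proposal_slot - 1:
        return False  # only ever remove a single block
    return True


print(should_attempt_honest_reorg(0.19, 99, 100, True))  # weak, late head
print(should_attempt_honest_reorg(0.25, 99, 100, True))  # head strong enough
```

Because every check is evaluated from the proposer's own view, two proposers with different views of the network can reach different decisions, which is why the strategy could be rolled out without coordination.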

The following diagram illustrates how honest behavior changes to implement reorganization strategies.

[Figure 2: honest proposer behavior with and without the honest reorganization strategy]

In this case, let b1 represent a late block. Because of the delay, b1 has only 19% of the attestation weight for slot n. The remaining 81% goes to the parent block HEAD, as many validators did not see b1 before the attestation deadline.

Without honest reorganization, the proposer in slot n+1 treats b1 as the chain head and builds a child block b2 on top of it. Although b1 has only 19% attestation weight, the proposer makes no effort to reorganize it. During slot n+1, b2 has the proposer boost, and assuming it is delivered on time, it becomes canonical by accumulating most of the attestations for that slot.

With honest reorganization, the situation is quite different. The proposer for slot n+1 sees that b1's 19% attestation weight is below the reorganization threshold, so it builds b2 with HEAD as its parent, forcibly reorganizing b1 out of the chain. At the attestation deadline of slot n+1, honest validators compare the relative weights of b2 (40% from proposer boost) and b1 (19%). All clients implement proposer boost, so b2 is considered the chain head and accumulates the attestations for slot n+1.
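The weight comparison in this example can be written out directly. This is a deliberately simplified sketch: it compares only the two sibling blocks, ignores tie-breaking and any attestations b2 gathers during slot n+1, and the function name is hypothetical.

```python
PROPOSER_BOOST = 0.40  # fraction of full attestation weight


def head_after_reorg_attempt(late_block_weight, boost=PROPOSER_BOOST):
    """Return which sibling an honest validator prefers at the deadline of
    slot n+1: the late block b1 or the boosted block b2 built on HEAD."""
    return "b2" if boost > late_block_weight else "b1"


print(head_after_reorg_attempt(0.19))  # the scenario from the text
print(head_after_reorg_attempt(0.45))  # a hypothetical, well-attested b1
```

The second call shows why the 20% reorganization threshold sits well below the 40% boost: a proposer that tried to reorganize a block holding more weight than the boost would lose the comparison and see its own block orphaned.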

Relay and Beacon Node Fixes Against Unbundling Attacks

In the unbundling attack on April 2, the proposer exploited a relay vulnerability by sending an invalid signed header to the relay. In the following days, relay and core development teams released several software patches to mitigate the risk of repeat attacks. The five main changes are as follows:

  1. Relay Changes:
  • Check a database of known malicious proposers (used in production only by the ultra sound relay, and since removed).
  • Check whether a complete block has already been delivered to the P2P network for that slot.
  • Introduce a uniformly random delay of 0-500 ms before publishing blocks (since removed from all relays).
  2. Beacon Node Changes (only applicable to relay beacon nodes):
  • Validate the beacon block before broadcasting it.
  • Check for equivocations on the network before publishing the block.

The combination of these changes led to consensus instability, which was exacerbated by the fact that most validators now run the honest reorganization strategy described above.

Unintended Consequences

Each of the five changes above adds latency to the hot path of relay block publication, increasing the likelihood that a relay block is broadcast too late to meet the attestation deadline. The following diagram shows the sequence of these five checks and how the added latency can push block publication past the attestation deadline.

Before these checks were implemented, a signed header arriving well after t=0 (e.g., at t=3) typically posed no problem: relay overhead was very low, so the block would still be published before t=4.

However, with the increased delay time introduced by these five patches, relays may now be partially responsible for delayed broadcasts. Let's examine block publication under the following hypothetical scenario.

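[Figure: hypothetical block publication timeline in which the relay checks push the broadcast past the attestation deadline]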

The relay receives the signed header from the proposer at t=3. At t=4 the relay is still performing its checks, so the broadcast occurs after the attestation deadline. In this case, the proposer sending the signed header late and the additional delay introduced by the relay together cause the block to miss the attestation deadline. Without honest reorganization, such blocks would likely still enter the chain: as Figure 2 (the honest reorganization diagram) shows, subsequent honest proposers would not deliberately reorganize a late block. With honest reorganization, however, a block that misses the attestation deadline is reorganized by the next proposer.
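The timing budget in this scenario is simple arithmetic. The sketch below uses made-up per-check delays (the source gives no measured values), ignores network propagation time, and models only the 0-500 ms uniform random delay analytically; all function names are hypothetical.

```python
ATTESTATION_DEADLINE_MS = 4000


def broadcast_time_ms(header_arrival_ms, check_delays_ms):
    """Time in the slot at which the relay can start broadcasting."""
    return header_arrival_ms + sum(check_delays_ms)


def misses_deadline(header_arrival_ms, check_delays_ms):
    """True if the broadcast starts after the t=4 deadline."""
    return broadcast_time_ms(header_arrival_ms, check_delays_ms) > ATTESTATION_DEADLINE_MS


def miss_probability_uniform(slack_ms, max_delay_ms=500):
    """Probability that a uniform(0, max_delay_ms) delay exceeds the slack."""
    if slack_ms <= 0:
        return 1.0
    if slack_ms >= max_delay_ms:
        return 0.0
    return (max_delay_ms - slack_ms) / max_delay_ms


# Hypothetical delays in milliseconds, for illustration only.
before_patches = [100]
after_patches = [200, 300, 250, 150, 200]

print(misses_deadline(3000, before_patches))  # header at t=3, low overhead
print(misses_deadline(3000, after_patches))   # header at t=3, five checks
print(miss_probability_uniform(slack_ms=200))  # 200 ms left before t=4
```

Under these assumed numbers, a header that was comfortably on time before the patches misses the deadline afterwards, and with only 200 ms of slack the random delay alone causes a miss more often than not.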

As a result, the number of forked blocks surged in the days following the attack.

[Figure: number of forked blocks in the days following the attack (Metrika data)]

Metrika's data over two weeks showed that, in the worst case, 13 blocks (4.3%) were reorganized within a single hour, about five times more than normal. The sharp increase in forked blocks became evident as relays rolled out the various changes. Thanks to a tremendous community effort from relay operators and core developers, many of the changes were rolled back once their impact was understood, and the network returned to a healthy state.
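The 4.3% figure follows from the slot length: an hour contains 3600 / 12 = 300 slots, so 13 reorganized blocks are 13 of 300 slots. A quick check, assuming the percentage is taken over slots rather than over blocks actually produced:

```python
SECONDS_PER_SLOT = 12
SECONDS_PER_HOUR = 3600

slots_per_hour = SECONDS_PER_HOUR // SECONDS_PER_SLOT
reorged_blocks = 13  # worst hour in the data cited above

share = reorged_blocks / slots_per_hour
print(f"{slots_per_hour} slots per hour, {share:.1%} reorganized")
```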

As of today, the most useful changes are the beacon node's block validation and its equivocation check before broadcasting. A malicious proposer can no longer execute the attack by sending an invalid header to the relay, and the relay beacon node now confirms that it has not seen an equivocating block before publishing its own. Nevertheless, the relay remains vulnerable to the more general equivocation attack described in "Equivocation attacks in mev-boost and ePBS".

So what should we do?

In this article, we highlighted how mev-boost works and why it matters to Ethereum consensus. We also detailed some lesser-known, time-related aspects of the Ethereum fork choice rule. Using the unbundling attack and the developers' responses as a case study, we emphasized the potential fragility of the time-dependent parts of the fork choice rule and their impact on network stability.

In light of this, the research community should assess what constitutes an "acceptable" number of reorganizations, and weigh the risks posed by equivocation attacks in general, to determine whether mitigation measures are necessary.

Additionally, several future directions are actively being explored:

  • Implementing "headlock" to protect mev-boost from equivocation attacks. This would require changes to consensus client software and may necessitate specification changes to extend the attestation deadline.
  • Increasing the number and visibility of bug bounty programs targeting mev-boost software vulnerabilities.
  • Expanding simulation software to explore how sub-slot timing affects network stability. This can be used to evaluate how adjusting the attestation deadline could reduce reorganizations.
  • Optimizing the block publication path on relays to reduce unnecessary delays. This is already under research.
  • Recognizing that mev-boost is a core protocol feature and incorporating it into consensus clients, i.e., enshrined PBS (ePBS). Two-slot ePBS is susceptible to equivocation attacks, so implementing "headlock" remains an option.
  • Adding more hive and/or specification tests that cover latency and attestation deadline issues.
  • Encouraging relay client diversity by building alternative implementations of the relay specification.
  • Considering adjustments to equivocation penalties, keeping in mind that even slashing the full 32 ETH stake may not deter malicious behavior when significant MEV opportunities exist.

Overall, we are excited about the renewed energy surrounding the MEV and mev-boost ecosystem. Through the unbundling attack and the mitigation measures that followed, we have gained insight into the critical relationship between latency, mev-boost, and consensus mechanisms; we hope the protocol continues to strengthen as a result.

Special thanks to Bert Miller, Danny Ryan, Alex Stokes, Francesco D'Amato, Michael Sproul, Terence Tsao, Frankie, Joachim Neu, Chris Hager, Matt Garnett, Charlie Noyes, and samczsun for their feedback on this article, as well as Achal Srinivasan.
