CertiK: Sui's Latest Vulnerability "Hamster Wheel", Technical Details and In-Depth Analysis

CertiK
2023-06-29 18:06:29
This article shares the technical details of the "hamster wheel" attack discovered by the CertiK Skyfall team, explaining how this new type of attack exploits critical vulnerabilities to completely shut down the Sui network.

Author: CertiK

Previously, the CertiK team discovered a series of denial-of-service vulnerabilities on the Sui blockchain. Among these vulnerabilities, a new and significantly impactful one stands out. This vulnerability can cause Sui network nodes to be unable to process new transactions, effectively shutting down the entire network.

Just last Monday, CertiK received a $500,000 bug bounty from Sui for discovering this major security vulnerability. CoinDesk, a prominent U.S. crypto news outlet, reported on the incident, and other major media outlets followed with related coverage.

This security vulnerability has been vividly dubbed the "Hamster Wheel": its unique attack method differs from currently known attacks, as the attacker only needs to submit a payload of about 100 bytes to trigger an infinite loop in the Sui validator nodes, preventing them from responding to new transactions.

Moreover, the damage caused by the attack can persist even after the network is restarted, and it can automatically propagate within the Sui network, causing all nodes to be unable to process new transactions, akin to hamsters endlessly running on a wheel. Therefore, we refer to this unique type of attack as the "Hamster Wheel" attack.

After discovering the vulnerability, CertiK reported it to Sui through the bug bounty program. Sui promptly responded, confirming the severity of the vulnerability and actively taking corresponding measures to fix the issue before the mainnet launch. In addition to fixing this specific vulnerability, Sui also implemented preventive mitigation measures to reduce the potential damage that the vulnerability could cause.

To thank the CertiK team for their responsible disclosure, Sui awarded the CertiK team a $500,000 bounty.

The following text will disclose the details of this critical vulnerability from a technical perspective, clarifying the root cause and potential impact of the vulnerability.

Vulnerability Details

The Key Role of Verifiers in Sui

Blockchains based on the Move language, such as Sui and Aptos, rely primarily on static verification to defend against malicious payloads. Through static verification, Sui checks the validity of a user-submitted payload before a contract is published or upgraded. The verifier provides a series of checkers that ensure structural and semantic correctness, and only a contract that passes these checks can be executed in the Move virtual machine.

Threats from Malicious Payloads on Move Chains

The Sui chain provides a new storage model and interface on top of the original Move virtual machine, resulting in a customized version of the Move virtual machine. To support the new storage primitives, Sui introduced a series of additional, customized checks for the security verification of untrusted payloads, covering properties such as object safety and global storage access. These custom checks align with Sui's unique features, so we refer to them as the Sui verifiers.

[Figure: Sui's order of payload checks]

As shown in the figure above, most checks in the verifier perform structural safety verification on the CompiledModule (the representation of the user-provided contract payload that will be executed). For example, the "Duplicate Checker" ensures that there are no duplicate entries in the runtime payload, and the "Limit Checker" ensures that the length of each field in the runtime payload stays within the allowed limits.

In addition to structural checks, the verifier's static checks require more complex analysis to ensure that untrusted payloads are sound at the semantic level.
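The two structural checks named above can be pictured with a minimal Python sketch. This is an illustration only, not Sui's code: the dictionary layout, the field names (struct_names, function_names) and the numeric limits are all hypothetical stand-ins for the real CompiledModule tables.

```python
def check_duplicates(entries):
    """Return True when no entry appears more than once."""
    seen = set()
    for entry in entries:
        if entry in seen:
            return False
        seen.add(entry)
    return True


def check_limits(module, limits):
    """Return True when every limited field is within its allowed length."""
    return all(len(module.get(field, [])) <= cap for field, cap in limits.items())


def verify_structure(module, limits):
    """Run the structural checks and report the first failure, if any."""
    if not check_limits(module, limits):
        return "limit exceeded"
    for field, entries in module.items():
        if not check_duplicates(entries):
            return f"duplicate entry in {field}"
    return "ok"


# Hypothetical payload and limits, chosen only for illustration.
module = {"struct_names": ["Coin", "Vault"], "function_names": ["mint", "mint"]}
limits = {"struct_names": 200, "function_names": 1000}
print(verify_structure(module, limits))
```

Checks of this kind only inspect the shape of the payload, which is why they are cheap and always terminate, in contrast to the iterative semantic analysis discussed next.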

Understanding the Abstract Interpreter of Move:

Linear and Iterative Analysis

The abstract interpreter provided by Move is a framework specifically designed for executing complex security analyses on bytecode through abstract interpretation. This mechanism allows for a more refined and accurate verification process, where each verifier is allowed to define their unique abstract state for analysis.

At the start of execution, the abstract interpreter constructs a control flow graph (CFG) from the compiled module. Each basic block in these CFGs maintains a set of states, namely "pre-state" and "post-state." The "pre-state" provides a snapshot of the program state before the execution of a basic block, while the "post-state" describes the program state after the execution of the basic block.

When the abstract interpreter does not encounter any back jumps (or loops) in the control flow graph, it follows a simple linear execution principle: each basic block is analyzed in sequence, and the pre-state and post-state are calculated based on the semantics of each instruction in the block. The result is an accurate snapshot of the state at each basic block level during program execution, helping to verify the program's security properties.

[Figure: Workflow of the Move abstract interpreter]

However, when loops are present in the control flow, the process becomes more complex. A loop means the control flow graph contains a back edge: the source of the back edge is the post-state of the current basic block, and its target (the loop header) is the pre-state of a previously analyzed basic block. The abstract interpreter must therefore carefully merge the states of the two basic blocks connected by the back edge.

If the merged state is found to differ from the existing pre-state of the loop header basic block, the abstract interpreter will update the state of the loop header basic block and restart the analysis from that block. This iterative analysis process will continue until the pre-state of the loop header stabilizes. In other words, this process repeats until the pre-state of the loop header basic block no longer changes between iterations. Reaching a fixed point indicates that the loop analysis is complete.
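The iterative analysis described above can be sketched as a small fixed-point loop. This is a simplified worklist version written in Python for illustration, not the Move abstract interpreter itself; the function names, the set-based state and the transfer semantics for BB3 are assumptions made for the example.

```python
from enum import Enum


class JoinResult(Enum):
    UNCHANGED = 0
    CHANGED = 1


def analyze(successors, entry, initial_state, transfer, join):
    """Iterate over a CFG until every block's pre-state stops changing."""
    pre = {entry: initial_state}   # pre-state per basic block
    worklist = [entry]
    visits = 0                     # number of block analyses performed
    while worklist:
        block = worklist.pop(0)
        visits += 1
        post = transfer(block, pre[block])
        for succ in successors.get(block, []):
            if succ not in pre:
                # First time this block is reached: adopt the incoming state.
                pre[succ] = post
                worklist.append(succ)
                continue
            merged, result = join(pre[succ], post)
            pre[succ] = merged
            if result is JoinResult.CHANGED and succ not in worklist:
                # The pre-state changed, so the block must be re-analyzed.
                worklist.append(succ)
    return pre, visits


def union_join(old, incoming):
    """Join for a set-based state; reports CHANGED only if the set grew."""
    merged = old | incoming
    result = JoinResult.CHANGED if merged != old else JoinResult.UNCHANGED
    return merged, result


def transfer(block, state):
    # Hypothetical semantics: block "BB3" records that variable "x" was written.
    return state | {"x"} if block == "BB3" else state


# BB1 -> BB2 -> BB3, with a back edge BB3 -> BB2 forming a loop.
successors = {"BB1": ["BB2"], "BB2": ["BB3"], "BB3": ["BB2"]}
pre_states, visits = analyze(successors, "BB1", frozenset(), transfer, union_join)
print(pre_states["BB2"], visits)
```

In this example the loop is analyzed twice before the pre-state of BB2 stabilizes, for five block analyses in total. Termination depends entirely on the join reporting CHANGED only when the merged state really differs from the old one.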

Sui IDLeak Verifier:

Customized Abstract Interpretation Analysis

Unlike the original Move design, Sui's blockchain platform introduces a unique object-centric global storage model. A notable feature of this model is that any data structure with the key ability (which allows it to be used as an index for on-chain storage) must have an ID type as its first field. The ID field is immutable and cannot be transferred to other objects, because each object must have a globally unique ID. To enforce these properties, Sui has built a set of custom analysis logic on top of the abstract interpreter.

The IDLeak verifier, implemented as id_leak_verifier, works in conjunction with the abstract interpreter. It has its own AbstractDomain, called AbstractState. Each AbstractState consists of AbstractValues corresponding to the local variables. Each AbstractValue tracks the state of one local variable to record whether an ID variable is brand new.

During struct packing, the IDLeak verifier only allows a brand new ID to be packed into a structure. Through abstract interpretation, the IDLeak verifier tracks the local data flow state in detail to ensure that no existing ID is transferred to another struct object.

Inconsistency in State Maintenance of Sui IDLeak Verifier

The IDLeak verifier integrates with the Move abstract interpreter by implementing the AbstractState::join function. This function plays an indispensable role in state management, particularly in merging and updating state values.

Let’s examine these functions in detail to understand their operations:

In AbstractState::join, the function takes another AbstractState as input and attempts to merge its local state with the local state of the current object. For each local variable in the input state, it compares the value of that variable with its current value in the local state (defaulting to AbstractValue::Other if not found). If these two values are not equal, it sets a "changed" flag as a basis for whether the final state merge result has changed and updates the local variable value in the local state by calling AbstractValue::join.

In AbstractValue::join, the function compares its value with another AbstractValue. If they are equal, it returns the incoming value. If not, it returns AbstractValue::Other.

However, this state maintenance logic contains a hidden inconsistency issue. Although AbstractState::join returns a result indicating that the merge state has changed (JoinResult::Changed) based on the differences between the new and old values, the merged updated state value may still remain unchanged.

This inconsistency arises from the order of operations: the determination of a changed state in AbstractState::join occurs before the state update (AbstractValue::join), and this determination does not reflect the true result of the state update.

Additionally, in AbstractValue::join, AbstractValue::Other plays a decisive role in the merge result. For example, if the old value is AbstractValue::Other and the new value is AbstractValue::Fresh, the updated state value will still be AbstractValue::Other, even though the new and old values differ, resulting in no change in the updated state itself.
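The behavior described above can be reproduced in a few lines. The following is a Python stand-in written from the prose description, not the actual Rust source; the method name join_buggy and the use of integer keys for local variables are assumptions for illustration.

```python
from enum import Enum


class AbstractValue(Enum):
    FRESH = "Fresh"
    OTHER = "Other"

    def join(self, other):
        # Equal values merge to themselves; any disagreement collapses to OTHER.
        return self if self == other else AbstractValue.OTHER


class JoinResult(Enum):
    UNCHANGED = "Unchanged"
    CHANGED = "Changed"


class AbstractState:
    def __init__(self, local_values):
        self.locals = dict(local_values)

    def join_buggy(self, incoming):
        """Merge incoming into self, deciding 'changed' BEFORE the merge."""
        changed = False
        for local, new_value in incoming.locals.items():
            old_value = self.locals.get(local, AbstractValue.OTHER)
            # The flag compares the two inputs, not the old and merged values.
            if new_value != old_value:
                changed = True
            self.locals[local] = old_value.join(new_value)
        return JoinResult.CHANGED if changed else JoinResult.UNCHANGED


state = AbstractState({0: AbstractValue.OTHER})
result = state.join_buggy(AbstractState({0: AbstractValue.FRESH}))
print(result, state.locals[0])
```

The join reports Changed, yet local 0 is still Other after the merge, which is exactly the mismatch between the flag and the stored state.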

[Figure: Example of the inconsistency in state joining]

This introduces an inconsistency: the result of merging the basic block states is determined to be "changed," but the merged state value itself has not changed. During the abstract interpretation analysis process, this inconsistency can have serious consequences. We revisit the behavior of the abstract interpreter when loops appear in the control flow graph (CFG):

When encountering a loop, the abstract interpreter adopts an iterative analysis approach to merge the states of the back jump target basic block and the current basic block. If the merged state changes, the abstract interpreter will restart the analysis from the jump target.

However, if the abstract interpretation's merge operation incorrectly marks the state merge result as "changed," while the values of the internal variables of the state have not changed, it can lead to endless re-analysis, resulting in an infinite loop.

Further Exploiting the Inconsistency

Triggering an Infinite Loop in the Sui IDLeak Verifier

By exploiting this inconsistency, an attacker can construct a malicious control flow graph that drives the IDLeak verifier into an infinite loop. This carefully constructed control flow graph consists of three basic blocks: BB1, BB2, and BB3. Notably, we intentionally introduce a back edge from BB3 to BB2 to create a loop.

[Figure: Malicious CFG and state that lead to an infinite loop inside the IDLeak verifier]

The process starts from BB2, where a specific local variable's AbstractValue is set to ::Other. After executing BB2, the flow transfers to BB3, where the same variable is set to ::Fresh. At the end of BB3, there is a back jump edge that jumps back to BB2.

During the abstract interpretation analysis of this example, the aforementioned inconsistency plays a crucial role. When the back edge is processed, the abstract interpreter attempts to join the post-state of BB3 (variable is "::Fresh") with the pre-state of BB2 (variable is "::Other"). The AbstractState::join function notices the difference between the new and old values and sets the "changed" flag, indicating that BB2 needs to be re-analyzed.

However, the dominant behavior of "::Other" in AbstractValue::join means that after merging, the actual value of the state variable in BB2 remains "::Other," and the result of the state merge has not changed.

Thus, once this loop begins, the verifier keeps re-analyzing BB2 and all of its successor basic blocks (here, BB3) indefinitely. The infinite loop consumes all available CPU cycles on the node, preventing it from processing new transactions, and the condition persists even after the validator node is restarted.

By exploiting this vulnerability, an attacker can make validator nodes loop endlessly like hamsters on a wheel, unable to process new transactions. Therefore, we refer to this unique type of attack as the "Hamster Wheel" attack.
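The endless re-analysis can be simulated safely by adding an iteration cap. The sketch below is a Python model of the BB2/BB3 loop built from the description in this article, not the proof-of-concept bytecode; the helper names and the cap are hypothetical, and the real verifier had no such cap on this pass.

```python
OTHER, FRESH = "Other", "Fresh"


def join_value(old, new):
    # Equal values are kept; any disagreement collapses to Other.
    return old if old == new else OTHER


def join_state_buggy(pre, incoming):
    """Merge incoming into pre; the 'changed' flag is decided before merging."""
    changed = False
    for local, new in incoming.items():
        old = pre.get(local, OTHER)
        if new != old:
            changed = True
        pre[local] = join_value(old, new)
    return changed


def analyze_loop(max_rounds):
    """Re-analyze BB2 -> BB3 -> BB2 until the join reports no change.

    Returns (rounds_used, converged). The cap exists only so that this
    simulation terminates.
    """
    pre_bb2 = {0: OTHER}
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        post_bb2 = dict(pre_bb2)
        post_bb2[0] = OTHER          # BB2 sets local 0 to Other
        post_bb3 = dict(post_bb2)
        post_bb3[0] = FRESH          # BB3 sets local 0 to Fresh
        if not join_state_buggy(pre_bb2, post_bb3):
            return rounds, True      # fixed point reached
    return rounds, False             # cap hit, no fixed point


print(analyze_loop(1000))
```

However large the cap is made, the simulation uses every round and never reports convergence, because each pass rejoins Fresh into Other and flags a change that did not happen.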

The "Hamster Wheel" attack can effectively cause the Sui validator to stall, leading to a complete paralysis of the entire Sui network.

Having understood the cause and triggering process of the vulnerability, we constructed a concrete example using the following Move bytecode, successfully triggering the vulnerability in a real environment simulation:

This example demonstrates how to trigger the vulnerability in a real environment through carefully constructed bytecode. Specifically, an attacker can trigger an infinite loop in the IDLeak verifier, consuming all CPU cycles of Sui nodes with just about 100 bytes of payload, effectively preventing new transaction processing and leading to a denial of service for the Sui network.

The Persistent Harm of the "Hamster Wheel" Attack in the Sui Network

Sui's bug bounty program has strict criteria for assessing the severity of vulnerabilities, primarily based on the level of harm to the entire network. Vulnerabilities that meet the "critical" rating must cause the entire network to shut down and effectively obstruct new transaction confirmations, requiring a hard fork to fix the issue; if the vulnerability only causes some network nodes to deny service, it can be rated at most as "medium" or "high" severity.

The "Hamster Wheel" vulnerability discovered by the CertiK Skyfall team can cause the entire Sui network to shut down and requires an official release of a new version for upgrade and repair. Based on the severity of this vulnerability, Sui ultimately rated it as "critical." To further understand the serious impact caused by the "Hamster Wheel" attack, it is necessary to understand the complex architecture of Sui's backend system, especially the full flow of publishing or upgrading a contract on-chain.

[Figure: Overview of transaction submission interaction in Sui]

Initially, user transactions are submitted through the front-end RPC, and after basic validation, they are passed to the backend service. The Sui backend service is responsible for further validating the incoming transaction payload. After successfully verifying the user's signature, the transaction is transformed into a transaction certificate (containing transaction information and Sui's signature).

These transaction certificates are fundamental components of the Sui network's operation and can be propagated among the validator nodes in the network. For contract publish or upgrade transactions, before they can go on-chain, the validator nodes call the Sui verifier to check the structural and semantic validity of the payload carried by these certificates. It is at this critical verification stage that the infinite loop vulnerability can be triggered.

When this vulnerability is triggered, it causes the validation process to be indefinitely interrupted, effectively obstructing the system's ability to process new transactions and leading to a complete shutdown of the network. To make matters worse, this situation persists even after node restarts, meaning that traditional mitigation measures are far from sufficient. Once triggered, the vulnerability results in "persistent damage," leaving a lasting impact on the entire Sui network.

Sui's Solution

After receiving feedback from CertiK, Sui promptly confirmed the vulnerability and released a patch to address this critical flaw. The patch ensures consistency between state changes and the changed flag, eliminating the critical impact caused by the "Hamster Wheel" attack.

To eliminate the aforementioned inconsistency, Sui's fix makes a small but crucial adjustment to the AbstractState::join function. The patch removes the logic that decided the merge result before executing AbstractValue::join. Instead, it first executes AbstractValue::join to merge the state, and then sets the changed flag by comparing the final merged value with the original state value (old_value).

In this way, the reported merge result stays consistent with the actual update, preventing infinite loops during the analysis.
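The shape of the corrected logic can be sketched as follows. This is a Python illustration of the ordering described above (merge first, then compare against old_value), not the actual Rust patch; the function names are hypothetical.

```python
OTHER, FRESH = "Other", "Fresh"


def join_value(old, new):
    # Equal values are kept; any disagreement collapses to Other.
    return old if old == new else OTHER


def join_state_fixed(pre, incoming):
    """Merge incoming into pre; 'changed' reflects the stored result."""
    changed = False
    for local, new in incoming.items():
        old_value = pre.get(local, OTHER)
        merged = join_value(old_value, new)   # merge first
        if merged != old_value:               # then compare with old_value
            changed = True
        pre[local] = merged
    return changed


pre_bb2 = {0: OTHER}
print(join_state_fixed(pre_bb2, {0: FRESH}), pre_bb2)
```

With this ordering, joining Fresh into Other reports no change, so the loop header is not re-analyzed. A value can move from Fresh to Other at most once, which bounds the number of re-analyses and guarantees that the fixed point is reached.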

In addition to fixing this specific vulnerability, Sui also deployed mitigation measures to reduce the impact of future verifier vulnerabilities. According to Sui's response in the bug report, the mitigation measures involve a feature called Denylist.

"However, validators have a node configuration file that allows them to temporarily reject certain categories of transactions. This configuration can be used to temporarily prohibit the processing of publish and package upgrade transactions. Since this bug occurs when the Sui verifier runs before signing a publish or package upgrade transaction, the denylist will stop the verifier from running and discard malicious transactions. Temporarily denying these transaction types is a 100% effective mitigation measure (although it will temporarily interrupt service for those attempting to publish or upgrade code).

By the way, we have had this TX denylist configuration for some time, but we have also added a similar mechanism for certificates as a follow-up mitigation to the "verifier infinite loop" vulnerability you reported earlier. With this mechanism, we will have greater flexibility against such attacks: we will use the certificate denylist configuration to make the validator forget bad certificates (breaking the infinite loop) and use the TX denylist configuration to prohibit publish/upgrade, thus preventing the creation of new malicious attack transactions. Thank you for prompting us to think about this issue!

The verifier has a limited number of "ticks" (different from gas) for bytecode verification before signing transactions. If all bytecode published in a transaction cannot be verified within this many ticks, the validator will refuse to sign the transaction, preventing it from executing on the network. Previously, the metering only applied to a selected set of complex verifier passes. To address this issue, we will extend metering to every verifier pass to ensure that the work performed by the verifier during each tick of the verification process is constrained. We have also fixed the potential infinite loop bug in the IDLeak verifier."

-- From Sui developers regarding the vulnerability fix
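The tick metering mentioned in Sui's response can be pictured with a minimal sketch: every unit of verification work charges a budget, and the transaction is not signed once the budget is exhausted. This is an illustrative Python model only; the class and function names and the one-tick-per-step cost are assumptions, not Sui's metering API.

```python
class MeterExceeded(Exception):
    """Raised when verification uses more ticks than its budget allows."""


class Meter:
    def __init__(self, max_ticks):
        self.max_ticks = max_ticks
        self.used = 0

    def charge(self, ticks=1):
        self.used += ticks
        if self.used > self.max_ticks:
            raise MeterExceeded(f"exceeded {self.max_ticks} ticks")


def verify_with_meter(step, meter):
    """Call step() until it returns True, charging one tick per call."""
    while True:
        meter.charge()
        if step():
            return True


def should_sign(step, max_ticks):
    """Sign only if verification finishes within the tick budget."""
    try:
        return verify_with_meter(step, Meter(max_ticks))
    except MeterExceeded:
        return False


# A verification step that never converges, like the looping analysis.
print(should_sign(lambda: False, 10_000))
```

Under such a scheme a verifier pass that never reaches a fixed point simply runs out of ticks, so the node refuses the transaction instead of spinning forever.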

In summary, the Denylist allows validators to temporarily block exploitation of vulnerabilities in the verifier by disabling the publish or upgrade process, effectively preventing potential damage from malicious transactions. While the Denylist mitigation is in effect, nodes keep operating at the cost of temporarily giving up the ability to publish or upgrade contracts.

Conclusion

In this article, we shared the technical details of the "Hamster Wheel" attack discovered by the CertiK Skyfall team, explaining how this new type of attack exploits a critical vulnerability to cause a complete shutdown of the Sui network. Additionally, we carefully examined Sui's timely response to fix this critical issue and shared the methods for vulnerability remediation and subsequent mitigation of similar vulnerabilities.
