Million Dollar Minefield: In-Depth Analysis of DeFi Asset Authorization Vulnerabilities
This article is from Amber Group, authored by Wu Jiazhi. The original title is: "Exploiting Primitive Finance Approval Flaws."
Event Summary: On February 24, a vulnerability analysis report regarding Primitive Finance (a derivatives protocol on the Ethereum chain) drew attention within the community. The report described three white hat attacks (hacker attacks aimed at identifying security vulnerabilities) and the principles behind the vulnerabilities. More than a month later, on April 14, a team led by Dr. Wu Jiazhi, a blockchain security expert from Amber Group, discovered a wallet address with over 1 million dollars in assets (500 WETH) at risk. After reproducing the attack locally, the team contacted the Primitive project team through Immunefi (a DeFi bug bounty platform) and successfully assisted potential victims in resetting WETH authorization to avert the crisis. This article will introduce how the team exploited this vulnerability in a simulated environment and how they identified potential victim wallet addresses through blockchain data analysis.
Principle: Gaps in Smart Contracts
In the current architecture of the EVM (Ethereum Virtual Machine) and ERC-20 (a token standard for Ethereum smart contracts), a contract that receives tokens has no callback mechanism that lets its code observe an ERC-20 transfer. For example, when Alice sends 100 XYZ tokens to Bob, Bob's XYZ balance is updated in the XYZ contract. But how does Bob know that his XYZ has increased? He can check Etherscan (an Ethereum explorer) or his wallet app, which automatically retrieves the latest balance from Ethereum nodes. If Alice sends 100 XYZ to a smart contract, Charlie, how does Charlie know that its XYZ balance has increased?
In fact, Charlie cannot learn its new balance at the moment it receives the 100 XYZ, because the transfer executes in the XYZ contract, not in Charlie's contract. Once deployed, a smart contract is passive: like operating-system code, it sits at an address and does nothing until it is called. To work around this, the ERC-20 standard provides a widely used mechanism: approve()/transferFrom().
For example, when Alice needs to deposit 100 XYZ tokens into Charlie, she can authorize Charlie to use her 100 XYZ in advance. At this point, Charlie's deposit() function can actively withdraw 100 XYZ from Alice's wallet in a single transaction via transferFrom() and update the state of the Charlie contract (e.g., increasing Alice's XYZ deposit balance cXYZ). To reduce friction, many DApps even allow users to authorize an unlimited amount of XYZ to the project address, enabling subsequent transferFrom() calls to succeed directly, eliminating the need for multiple authorization clicks and fees, effectively whitelisting Charlie. This approach leaves a potential risk; if Charlie acts maliciously or is attacked, Alice's assets could be in danger.
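The approve()/transferFrom() flow described above can be illustrated with a small in-memory model. This is a minimal sketch, not contract code: the `Token` and `Charlie` classes, the string addresses, and the explicit `caller` argument (standing in for msg.sender) are all assumptions made for illustration.

```python
class Token:
    """Toy in-memory model of an ERC-20 token (hypothetical, not real contract code)."""

    def __init__(self):
        self.balances = {}    # holder -> balance
        self.allowances = {}  # (owner, spender) -> remaining allowance

    def mint(self, to, amount):
        self.balances[to] = self.balances.get(to, 0) + amount

    def approve(self, caller, spender, amount):
        # The owner (caller) lets `spender` pull up to `amount` of their tokens.
        self.allowances[(caller, spender)] = amount

    def transfer_from(self, caller, owner, to, amount):
        # `caller` plays the role of msg.sender and needs a sufficient allowance.
        allowed = self.allowances.get((owner, caller), 0)
        if allowed < amount or self.balances.get(owner, 0) < amount:
            raise PermissionError("insufficient allowance or balance")
        self.allowances[(owner, caller)] = allowed - amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount


class Charlie:
    """Toy deposit contract: pulls tokens and records the deposit in one call."""

    ADDRESS = "charlie"

    def __init__(self, token):
        self.token = token
        self.deposits = {}  # depositor -> recorded deposit (the "cXYZ" balance)

    def deposit(self, caller, amount):
        # Pull the tokens from the caller, then update Charlie's own bookkeeping.
        self.token.transfer_from(self.ADDRESS, caller, self.ADDRESS, amount)
        self.deposits[caller] = self.deposits.get(caller, 0) + amount


xyz = Token()
xyz.mint("alice", 100)
charlie = Charlie(xyz)

xyz.approve("alice", Charlie.ADDRESS, 100)  # step 1: Alice authorizes Charlie
charlie.deposit("alice", 100)               # step 2: Charlie pulls and records
print(xyz.balances, charlie.deposits)
```

Because Charlie initiates the transfer itself, it can update its own state in the same call, which is exactly what a plain transfer to Charlie cannot offer.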
An incident involving Bancor on June 18, 2020, showed how a flawed smart contract holding user authorizations can be exploited to cause asset loss. As shown in the code below, safeTransferFrom(), despite being named as a safe version of transferFrom, was inadvertently declared a public function. This allowed anyone to use the Bancor contract to transfer any amount (value) of any asset (token) from any user (from) to any address (to).
Simply put, if Alice had previously used Bancor and authorized Bancor unlimited access to her DAI, as soon as her DAI balance was greater than zero, a hacker could immediately transfer her DAI away.
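The effect of exposing such a helper publicly can be reproduced in a toy model. The sketch below is an illustration under stated assumptions, not Bancor's code: `VulnerableContract`, `PatchedContract`, the address strings, and the `caller` argument (standing in for msg.sender) are hypothetical names.

```python
MAX_UINT256 = 2**256 - 1  # the usual "unlimited" approval amount


class Token:
    """Toy ERC-20 ledger; the first argument of each method plays msg.sender."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.allowances = {}

    def approve(self, caller, spender, amount):
        self.allowances[(caller, spender)] = amount

    def transfer_from(self, caller, owner, to, amount):
        if self.allowances.get((owner, caller), 0) < amount:
            raise PermissionError("allowance too low")
        if self.balances.get(owner, 0) < amount:
            raise ValueError("balance too low")
        self.allowances[(owner, caller)] -= amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount


class VulnerableContract:
    ADDRESS = "vulnerable"

    def safe_transfer_from(self, caller, token, owner, to, amount):
        # Flaw: publicly callable, and nothing ties `owner` to `caller`.
        token.transfer_from(self.ADDRESS, owner, to, amount)


class PatchedContract:
    ADDRESS = "patched"

    def safe_transfer_from(self, caller, token, owner, to, amount):
        # Models an internal helper: only the contract itself may invoke it.
        if caller != self.ADDRESS:
            raise PermissionError("internal only")
        token.transfer_from(self.ADDRESS, owner, to, amount)


dai = Token({"alice": 1000})
dai.approve("alice", VulnerableContract.ADDRESS, MAX_UINT256)
dai.approve("alice", PatchedContract.ADDRESS, MAX_UINT256)

# Against the patched contract, an outsider's call is rejected.
try:
    PatchedContract().safe_transfer_from("mallory", dai, "alice", "mallory", 1000)
    blocked = False
except PermissionError:
    blocked = True

# Against the vulnerable one, Mallory spends the allowance Alice gave the contract.
VulnerableContract().safe_transfer_from("mallory", dai, "alice", "mallory", 1000)
print(blocked, dai.balances)
```

The allowance belongs to the contract, so any public path that reaches transferFrom() with caller-chosen arguments effectively hands that allowance to everyone.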
Diagnosis: How Did the Hacker Bypass the "Security Check"?
According to the vulnerability analysis report mentioned above, Primitive's external function flashMintShortOptionsThenSwap() has a similar vulnerability, but it cannot be exploited as directly as the Bancor one. The attacker needs to forge two ERC-20 token contracts and a Uniswap liquidity pool, and then initiate a Uniswap flash swap to bypass the msg.sender == address(this) check marked in the diagram below. It sounds complicated, but for an experienced hacker it is not too difficult.
Why does Primitive need to implement an interface like flashMintShortOptionsThenSwap()? It has a specific use case. In the openFlashLong() function, the call to flashMintShortOptionsThenSwap() is encoded into the parameters of a Uniswap flash swap. After the flash swap at line 1371, it is invoked by the callback function uniswapV2Call(). Since uniswapV2Call() is part of the Primitive contract, the call passes the msg.sender == address(this) check mentioned above.
It is worth noting that in the openFlashLong() function, line 1360 uses msg.sender, meaning that under normal circumstances Primitive can only spend the funds of the caller itself. However, the attacker can call the Primitive contract's uniswapV2Call() directly with a forged pair and forged params, in a manner similar to line 1371, and thereby bypass the check in flashMintShortOptionsThenSwap(). Since params is completely under the attacker's control in this case, the msg.sender at line 1360 can be replaced with any wallet address that has previously authorized Primitive, allowing the transferFrom() call in flashMintShortOptionsThenSwap() to steal assets.
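The bypass can be sketched in a simplified model. The Python names below mirror the functions discussed above but are not Primitive's code, and the sketch makes a loud simplification: the forged pair and tokens that make the callback reachable on-chain are omitted, so the callback here accepts any caller, and the stolen funds go straight to a hypothetical `recipient` field in `params`.

```python
class Token:
    """Toy ERC-20 ledger; the first argument of each method plays msg.sender."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.allowances = {}

    def approve(self, caller, spender, amount):
        self.allowances[(caller, spender)] = amount

    def transfer_from(self, caller, owner, to, amount):
        if self.allowances.get((owner, caller), 0) < amount:
            raise PermissionError("allowance too low")
        if self.balances.get(owner, 0) < amount:
            raise ValueError("balance too low")
        self.allowances[(owner, caller)] -= amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount


class Pair:
    ADDRESS = "pair"

    def swap(self, primitive, params):
        # A flash swap hands control back to the borrower through the callback.
        primitive.uniswap_v2_call(self.ADDRESS, params)


class Primitive:
    ADDRESS = "primitive"

    def __init__(self, token):
        self.token = token

    def open_flash_long(self, caller, pair, amount):
        # Intended path: the payer is always the caller (msg.sender).
        params = {"payer": caller, "recipient": pair.ADDRESS, "amount": amount}
        pair.swap(self, params)

    def uniswap_v2_call(self, caller, params):
        # Simplification: the forged pair is omitted, so any caller is accepted.
        # The contract then calls itself, so the inner msg.sender is address(this).
        self.flash_mint_short_options_then_swap(self.ADDRESS, params)

    def flash_mint_short_options_then_swap(self, caller, params):
        if caller != self.ADDRESS:
            raise PermissionError("msg.sender != address(this)")
        self.token.transfer_from(
            self.ADDRESS, params["payer"], params["recipient"], params["amount"]
        )


weth = Token({"victim": 500, "user": 10})
primitive = Primitive(weth)
weth.approve("victim", Primitive.ADDRESS, 2**256 - 1)
weth.approve("user", Primitive.ADDRESS, 2**256 - 1)

primitive.open_flash_long("user", Pair(), 10)  # normal use: user pays with own funds

forged = {"payer": "victim", "recipient": "attacker", "amount": 500}
try:
    primitive.flash_mint_short_options_then_swap("attacker", forged)
    direct_call_blocked = False
except PermissionError:
    direct_call_blocked = True  # the self-call check stops the direct route

primitive.uniswap_v2_call("attacker", forged)  # the callback route does not
print(direct_call_blocked, weth.balances)
```

The check on the inner function only proves that the contract called itself. It says nothing about who supplied the parameters, which is the gap the forged `params` exploits.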
Tracking: Identifying Potential Victims
If a hacker happens to know that a "big player" has authorized a problematic contract, they can easily exploit this vulnerability to steal a large amount of funds from the victim. However, finding such accounts is difficult with a block explorer alone, especially when the contract has been deployed for a long time and has a large user base. The volume of data involved is too large to search manually on Etherscan.
Google Cloud Public Datasets (datasets hosted by Google on BigQuery) can play a role here. Since every successful approve() call emits an Approval() event on Ethereum, we can use BigQuery (Google's cloud data warehouse for large-scale data analysis) to find all such events and keep only the ones we are interested in, such as all events where _spender is the Primitive contract.
Below is the actual SQL statement we used on GCP to identify potential victims. In line five, we specify the Ethereum dataset and the table that records events. Line seven filters for Approval() events, and line eight filters for a specific _spender. Additionally, line six restricts the query to blocks after the Primitive contract was deployed, significantly reducing the amount of data scanned by BigQuery. Such SQL optimizations are reflected directly in your GCP bill.
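The same three filters (event type, spender, and block range) can be sketched outside SQL. The Python below is an illustration over hand-made sample logs, not the query the team ran: the log dictionary layout, the helper names, and all addresses are assumptions. The topic constant is the keccak-256 hash of Approval(address,address,uint256).

```python
# keccak-256 of "Approval(address,address,uint256)", used as topic 0 of the event
APPROVAL_TOPIC = "0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925"
# keccak-256 of "Transfer(address,address,uint256)", only used for a negative sample
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def pad_address(addr):
    """Left-pad a 20-byte hex address to the 32-byte form used in indexed topics."""
    return "0x" + addr[2:].lower().rjust(64, "0")


def topic_to_address(topic):
    """Recover the 20-byte address from a 32-byte topic."""
    return "0x" + topic[-40:]


def approvals_for_spender(logs, spender, from_block):
    """Keep ERC-20 Approval logs at or after `from_block` whose spender matches."""
    wanted = pad_address(spender)
    matches = []
    for log in logs:
        topics = log["topics"]
        if log["block_number"] < from_block:
            continue  # before the contract existed: nothing to find
        # ERC-20 Approval has exactly 3 topics: signature, owner, spender.
        if len(topics) != 3 or topics[0] != APPROVAL_TOPIC:
            continue
        if topics[2].lower() != wanted:
            continue
        matches.append({
            "token": log["address"],
            "owner": topic_to_address(topics[1]),
            "value": int(log["data"], 16),
            "block_number": log["block_number"],
        })
    return matches


# Hypothetical sample data; none of these addresses are real.
SPENDER = "0x" + "ab" * 20
OTHER = "0x" + "cd" * 20
OWNER = "0x" + "11" * 20
TOKEN = "0x" + "ee" * 20
UNLIMITED = "0x" + "f" * 64


def make_log(block, topic0, owner, spender, data=UNLIMITED):
    return {"address": TOKEN, "block_number": block, "data": data,
            "topics": [topic0, pad_address(owner), pad_address(spender)]}


logs = [
    make_log(90, APPROVAL_TOPIC, OWNER, SPENDER),   # too early
    make_log(120, APPROVAL_TOPIC, OWNER, SPENDER),  # match
    make_log(130, APPROVAL_TOPIC, OWNER, OTHER),    # different spender
    make_log(140, TRANSFER_TOPIC, OWNER, SPENDER),  # different event
]
found = approvals_for_spender(logs, SPENDER, from_block=100)
print(found)
```

Requiring exactly three topics also excludes ERC-721 approvals, which share the same event signature but index the token ID as a fourth topic.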
Next, we can further optimize the SQL query to exclude accounts that have already reset their authorization through approve(_spender, 0), resulting in the final account list. With the final list, we used a script to monitor these accounts and issue alerts when these risky accounts received a large amount of assets, as this could potentially lead to significant losses.
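The two follow-up steps, dropping accounts whose most recent approval is zero and alerting on large exposed balances, might look like the sketch below. The record layout, function names, threshold, and sample data are assumptions for illustration, not the team's script.

```python
def still_exposed(approvals):
    """Return {(token, owner): allowance} for accounts whose latest approval is non-zero."""
    latest = {}
    # Replay events in chain order so that later approvals overwrite earlier ones.
    for ev in sorted(approvals, key=lambda e: (e["block_number"], e["log_index"])):
        latest[(ev["token"], ev["owner"])] = ev["value"]
    return {key: value for key, value in latest.items() if value > 0}


def find_alerts(exposed, balances, threshold):
    """List (token, owner, amount_at_risk) where the drainable amount reaches threshold."""
    alerts = []
    for (token, owner), allowance in exposed.items():
        # An attacker can take at most min(allowance, current balance).
        at_risk = min(allowance, balances.get((token, owner), 0))
        if at_risk >= threshold:
            alerts.append((token, owner, at_risk))
    return alerts


MAX_UINT256 = 2**256 - 1

# Hypothetical approval history toward one spender.
approvals = [
    {"token": "WETH", "owner": "alice", "value": MAX_UINT256, "block_number": 100, "log_index": 0},
    {"token": "WETH", "owner": "bob", "value": MAX_UINT256, "block_number": 101, "log_index": 3},
    {"token": "WETH", "owner": "bob", "value": 0, "block_number": 150, "log_index": 1},  # reset
    {"token": "WETH", "owner": "carol", "value": 5, "block_number": 120, "log_index": 0},
]

# Hypothetical current balances, as a monitoring bot might fetch them from a node.
balances = {("WETH", "alice"): 500, ("WETH", "bob"): 900, ("WETH", "carol"): 300}

exposed = still_exposed(approvals)
alerts = find_alerts(exposed, balances, threshold=100)
print(alerts)
```

In this sample Bob is safe despite his large balance because he reset his approval, and Carol's small allowance caps her exposure below the threshold.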
On a Wednesday morning, the bot issued an alert that a potential victim received nearly 500 WETH worth over one million dollars at 5:24 AM Beijing time on April 13. Compared to the previously disclosed three white hat attacks, if this victim were successfully attacked, the amount lost would exceed the total of the earlier three cases.
At 9:32 AM Beijing time, we urgently contacted Immunefi, the operator of the bug bounty program for the Primitive project, and demonstrated how we (re)exploited this vulnerability to steal 500 WETH from the victim in a simulated environment, providing evidence including the screenshot below.
With the help of the Primitive team, the potential victim reset the WETH authorization at 10:03 AM, averting the crisis.
Two days later, the Primitive team awarded us a bounty for the finding and publicly thanked us. The bounty was donated to CryptoRelief (a relief fund dedicated to assisting with the COVID-19 pandemic in India) before the publication of this article.
Reproduction: Setting Up and Exploiting the Vulnerability
The first step in exploiting the vulnerability is to prepare two ERC20 contracts: Redeem and Option.
The Redeem contract is a standard ERC20; we only need to expose the mint() interface based on OpenZeppelin's implementation to control the token quantity, as shown below:
The Option contract will be relatively more complex. From the code snippet below, we can see that we need to deliberately construct some global variables (e.g., redeemToken) and public functions (e.g., getBaseValue()), which will be used in Primitive's business logic. Additionally, we need to pass three parameters to initialize the Option contract:
- redeemToken: The address of the earlier constructed Redeem contract
- underlyingToken: The asset contract address held by the target account
- beneficiary: The address of the beneficiary, which is the target address to which the victim's assets will be transferred after a successful attack
It is important to note the mintOptions() function. As seen from the code above, it will directly send all underlyingToken to the beneficiary address. This is because the internal function mintOptionWithUnderlyingBalance() will send underlyingToken to the Option token contract when called by flashMintShortOptionsThenSwap() and will mint Option tokens through mintOptions(). Therefore, in the forged Option contract, we can treat mintOptions() as a withdrawal call, sending underlyingToken to the beneficiary (i.e., the address initiating the attack) for repaying the flash loan funds later.
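The role of the forged Option contract can be modeled in a few lines. This is a rough Python sketch, not the contract used in the reproduction: `FakeOption`, its return value, the placeholder value in `get_base_value()`, and the address strings are assumptions. Only the three constructor parameters and the sweep behavior of mintOptions() come from the description above.

```python
class Token:
    """Toy token ledger with an open mint(), like the forged Redeem token."""

    def __init__(self):
        self.balances = {}

    def mint(self, to, amount):
        self.balances[to] = self.balances.get(to, 0) + amount

    def transfer(self, sender, to, amount):
        if self.balances.get(sender, 0) < amount:
            raise ValueError("balance too low")
        self.balances[sender] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount


class FakeOption:
    """Forged Option: exposes what the caller expects, but sweeps the funds out."""

    ADDRESS = "fake-option"

    def __init__(self, redeem_token, underlying_token, beneficiary):
        self.redeem_token = redeem_token          # the forged Redeem token
        self.underlying_token = underlying_token  # the asset held by the target
        self.beneficiary = beneficiary            # where swept funds end up

    def get_base_value(self):
        # Placeholder (assumption): a constant, just so the call succeeds.
        return 1

    def mint_options(self, receiver):
        # Instead of minting anything, forward every underlying token this
        # contract currently holds to the beneficiary.
        amount = self.underlying_token.balances.get(self.ADDRESS, 0)
        self.underlying_token.transfer(self.ADDRESS, self.beneficiary, amount)
        return amount


weth = Token()
redeem = Token()
option = FakeOption(redeem, weth, beneficiary="attacker")

# Stand-in for the victim's underlying tokens arriving at the Option
# contract just before mintOptions() is called.
weth.mint(FakeOption.ADDRESS, 500)
swept = option.mint_options("primitive")
print(swept, weth.balances)
```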
Next, we create a Uniswap liquidity pool using the Redeem and Option tokens we just created. The address of this pool will receive the funds transferred from the victim's wallet. Each Uniswap pool holds two assets of equal value, in this case WETH and Redeem (i.e., Option.redeemToken()). To complete the exploit, we must inject liquidity into the pool. Redeem is our own token, so we can mint as much of it as we need, but what about WETH?
With the help of flash loans, we can essentially utilize an unlimited amount of funds as long as we ensure that we can repay the funds in a single transaction. In this case, we borrowed funds equivalent to 99.7% of the victim's total assets using Aave V2's flash loan to deposit into the aforementioned liquidity pool.
According to Aave's design, a callback function executeOperation() needs to be implemented to execute operations after obtaining the loaned funds (e.g., calling Lib.trigger()), and finally, through an approve() call, authorize the Aave contract to withdraw the flash loan assets and fees.
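The borrow, callback, and repay cycle can be sketched as a toy model. The code below is not Aave's interface: `Ledger`, `flash_loan`, and `Borrower` are hypothetical, the 9 basis point fee is an assumed parameter, and the move from "victim" inside the callback is only a stand-in for whatever the exploit (e.g., Lib.trigger()) yields.

```python
PREMIUM_BPS = 9  # assumed flash-loan fee in basis points (0.09%)
WETH = 10**18    # 1 WETH expressed in wei


class Reverted(Exception):
    """Signals that the whole transaction must be rolled back."""


class Ledger:
    def __init__(self, balances):
        self.balances = dict(balances)
        self.allowances = {}

    def approve(self, owner, spender, amount):
        self.allowances[(owner, spender)] = amount

    def move(self, src, dst, amount):
        if self.balances.get(src, 0) < amount:
            raise Reverted("balance too low")
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0) + amount


def flash_loan(ledger, receiver, amount):
    """Lend `amount`, run the callback, then pull back principal plus fee."""
    snapshot = (dict(ledger.balances), dict(ledger.allowances))
    premium = amount * PREMIUM_BPS // 10_000
    owed = amount + premium
    try:
        ledger.move("pool", receiver.address, amount)
        receiver.execute_operation(ledger, amount, premium)
        if ledger.allowances.get((receiver.address, "pool"), 0) < owed:
            raise Reverted("repayment not approved")
        ledger.move(receiver.address, "pool", owed)
    except Reverted:
        # Atomicity: a failed repayment undoes everything, including the loan.
        ledger.balances, ledger.allowances = snapshot
        raise
    return premium


class Borrower:
    address = "attacker"

    def __init__(self, proceeds):
        self.proceeds = proceeds

    def execute_operation(self, ledger, amount, premium):
        # Stand-in for the exploit: the proceeds land in the borrower's account.
        ledger.move("victim", self.address, self.proceeds)
        # Authorize the lender to pull back the loan plus the fee.
        ledger.approve(self.address, "pool", amount + premium)


loan = 500 * WETH * 997 // 1000  # 99.7% of the victim's 500 WETH

ok = Ledger({"pool": 1000 * WETH, "victim": 500 * WETH})
fee = flash_loan(ok, Borrower(proceeds=500 * WETH), loan)

failed = Ledger({"pool": 1000 * WETH, "victim": 500 * WETH})
try:
    flash_loan(failed, Borrower(proceeds=0), loan)  # cannot cover the fee
    reverted = False
except Reverted:
    reverted = True
print(fee, reverted)
```

The second run shows why the loan is risk-free for the lender: if the borrower cannot cover principal plus fee, every balance returns to its starting value.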
Conclusion
In the EVM-based smart contract world, approve()/transferFrom() has long been an inherent source of risk. DeFi users should check regularly whether their wallet address still allows other contracts to spend their assets, and periodically reset those authorizations. Project teams need to invest more time and effort in testing their code from every possible angle, even simulating attacks, because they are programming with the real money of every user.
About the Author
Wu Jiazhi is a blockchain security expert at Amber Group, a global leader in crypto financial intelligence services. He received his Ph.D. in Computer Science from North Carolina State University in the United States, where he studied under Professor Jiang Xuxian, a leader in Android security. During his studies in the U.S., he engaged in system security research, focusing primarily on virtualization security and Android system security. Dr. Wu is influential in the global Android security field, having published multiple research papers and gained extensive experience with Android system vulnerabilities. He moved into blockchain security in 2017 and served as the head of DVP (Decentralized Vulnerability Platform), the world's first decentralized anonymous testing platform, which calls on white hat hackers across the internet to identify vulnerabilities in open-source underlying code.