Exploring the design space and challenges of implementing oracles in DeFi protocols
Written by: Adrian Chow
Abstract:
- Oracles are indispensable for securing the locked value of DeFi protocols. Of the total locked value of $50 billion in DeFi, $33 billion is secured by oracles.
- However, the inherent time delay in oracle price updates leads to a subtype of value extraction known as Oracle Extractable Value (OEV), which includes front-running, arbitrage, and inefficient liquidations.
- A growing number of designs now aim to prevent or mitigate the negative externalities of OEV, each with its own trade-offs. This article discusses the choices and trade-offs of existing designs and proposes two new concepts, covering their value propositions, unresolved issues, and development bottlenecks.
Introduction
Oracles can be considered one of the most important pieces of infrastructure in today's DeFi. They are an essential part of most DeFi protocols, which rely on price feeds to settle derivative contracts, liquidate under-collateralized positions, and more. Currently, oracles secure $33 billion in value, accounting for at least two-thirds of the total locked value of $50 billion on-chain. However, for application developers, integrating oracles brings significant design trade-offs and value loss from front-running, arbitrage, and inefficient liquidations. This article categorizes this value loss as Oracle Extractable Value (OEV), outlines its key issues from the perspective of applications, and attempts to clarify, based on industry research, the key considerations for safely and reliably integrating oracles into DeFi protocols.
Oracle Extractable Value (OEV)
This section assumes that the reader has a basic understanding of oracle functions and the distinction between push-based and pull-based oracles. Individual oracles may provide different price feeds. For an overview, classification, and definitions, please refer to the appendix.
Most applications that use oracle price feeds only need to read prices. Decentralized exchanges that run their own pricing models use oracle feeds as reference prices; lending protocols need only a single price read when collateral is deposited, to set initial parameters such as loan-to-value ratios and liquidation prices. Outside of edge cases like long-tail assets with infrequent updates, the delay in oracle price updates is not a significant concern for these read-only uses. For them, the most important considerations are the accuracy of the price contributors and the degree of decentralization of the oracle provider.
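As a concrete illustration of such a read-only use, the sketch below derives a maximum borrow and a liquidation price from a single oracle price read. All names and thresholds here are illustrative, not taken from any specific protocol:

```python
# Hypothetical sketch: deriving loan parameters from one oracle price read.
# The LTV and liquidation-threshold values are illustrative only.

def loan_parameters(collateral_amount: float, oracle_price: float,
                    max_ltv: float = 0.80, liq_threshold: float = 0.85):
    """Compute max borrow and liquidation price from a single price read."""
    collateral_value = collateral_amount * oracle_price
    max_borrow = collateral_value * max_ltv
    # Liquidation price: the collateral price at which the debt reaches the
    # liquidation threshold, assuming the maximum borrow was taken.
    liquidation_price = max_borrow / (collateral_amount * liq_threshold)
    return max_borrow, liquidation_price

# 10 ETH of collateral with the oracle reading $2,000:
max_borrow, liq_price = loan_parameters(10.0, 2000.0)
```

Once these parameters are set, the position's health only needs to be re-checked against later price reads; the initial setup itself is insensitive to small oracle delays.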
However, if the delay in price updates is a significant consideration, more attention should be paid to how oracles interact with applications. In such cases, these delays can lead to value extraction opportunities, namely front-running, arbitrage, and liquidations. This subtype of MEV is referred to as OEV. Before discussing various implementation schemes and their trade-offs, we will outline the different forms of OEV.
Arbitrage
Oracle front-running and arbitrage are colloquially referred to as "toxic flow" in derivative protocols because these trades occur under conditions of information asymmetry, often extracting risk-free profits at the expense of liquidity providers. OG DeFi protocols like Synthetix have been dealing with this issue since 2018 and have tried various solutions over time to mitigate these negative externalities.
Let’s illustrate with a simple example: a decentralized perpetuals exchange, xyz, uses the Chainlink ETH/USD price feed in its ETH/USD market:
Figure 1: Example of arbitrage using the Chainlink oracle
Although the above is an oversimplified example that ignores factors like slippage, fees, and capital requirements, it illustrates how the deviation threshold leads to insufficient price granularity and the opportunities that creates. Searchers can monitor the lag between spot markets and Chainlink's on-chain storage and extract risk-free value from liquidity providers (LPs).
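The deviation-threshold gap can be sketched as follows. The 0.5% threshold and the prices are illustrative, assumed values, not Chainlink's actual parameters for any feed:

```python
# Illustrative sketch: a push oracle with a deviation threshold leaves the
# on-chain price stale while spot drifts within the threshold, and a
# searcher can trade against the stale price. Threshold is an assumption.

DEVIATION_THRESHOLD = 0.005  # 0.5%, illustrative

def oracle_updates(onchain_price: float, spot_price: float) -> bool:
    """The push oracle only writes a new price once deviation >= threshold."""
    return abs(spot_price - onchain_price) / onchain_price >= DEVIATION_THRESHOLD

def stale_price_edge(onchain_price: float, spot_price: float) -> float:
    """Searcher's per-unit edge: trade at the stale on-chain price and hedge
    at spot. Zero if the oracle would already have updated."""
    if oracle_updates(onchain_price, spot_price):
        return 0.0
    return abs(spot_price - onchain_price)

# Spot drifts to $2,009 while the feed still reads $2,000 (0.45% < 0.5%):
edge = stale_price_edge(2000.0, 2009.0)  # $9/ETH of risk-free edge
```

The smaller the threshold, the smaller this edge, but more frequent updates also cost the oracle network more gas, which is exactly the granularity trade-off the example highlights.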
Front-running
Front-running is similar to arbitrage and is another form of value extraction, in which searchers monitor oracle updates in the mempool and execute trades against the stale on-chain price before the new price is committed. This gives searchers the opportunity to trade at prices favorable to their direction just before the oracle updates.
Decentralized exchanges like GMX have long been victims of toxic front-running; approximately 10% of protocol profits were lost to front-running before GMX's oracle updates were routed through KeeperDAO.
What if we only adopt a pull-based model?
One of Pyth's value propositions is that using Pythnet, built on the Solana architecture, publishers can push price updates to the network every 300 milliseconds, maintaining low-latency price feeds. Therefore, when applications query prices through Pyth's API, they can retrieve the latest prices, update them to the on-chain storage of the target chain, and execute any downstream operations in the application logic in a single transaction.
As mentioned above, applications can directly query the latest price updates from Pythnet, update on-chain storage, and complete all relevant logic in a single transaction. Doesn't this effectively solve the issues of front-running and arbitrage?
Not quite - Pyth's update model gives users the ability to choose which price to use in a transaction, which can lead to adversarial selection (another term for toxic flow). Prices written to on-chain storage must be more recent than the price already stored, but users can still choose any price that satisfies this constraint - meaning arbitrage opportunities persist, since a searcher can observe a newer price and still choose to submit an older one. Pyth's documentation suggests guarding against this attack vector with staleness checks that ensure prices are recent - however, the threshold must leave enough buffer for the update transaction to land in a subsequent block. How do we determine the optimal time threshold?
Let’s analyze the decentralized exchange xyz, which now uses the Pyth ETH/USD price feed, with a staleness check time of 20 seconds, meaning the timestamp of the Pyth price must be within 20 seconds of the block timestamp executing the downstream transaction:
Figure 2: Example process of front-running using Pyth
An intuitive fix is to simply lower the staleness threshold, but too low a threshold can cause transactions to fail unpredictably due to variable block times, hurting user experience. Since Pyth's price feeds rely on bridging, sufficient buffer is needed to a) allow time for Wormhole guardians to attest to prices, and b) allow the target chain to process the transaction and include it in a block. The next section elaborates on these trade-offs.
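A staleness check of the kind described above can be sketched as follows. This is a minimal, hypothetical version in contract-style pseudologic, not Pyth's actual API; the 20-second window comes from the xyz example:

```python
# Minimal sketch of a staleness check, assuming the 20-second window from
# the example above. Function and field names are illustrative.

STALENESS_WINDOW = 20  # seconds, from the xyz example

def check_price_fresh(price_timestamp: int, block_timestamp: int) -> None:
    """Revert-style check: the supplied price must be recent enough."""
    age = block_timestamp - price_timestamp
    if age > STALENESS_WINDOW:
        raise ValueError(f"stale price: {age}s old exceeds {STALENESS_WINDOW}s")
    # Note: within the window a user may still pick ANY qualifying price,
    # so the check bounds, but does not eliminate, adversarial selection.

check_price_fresh(price_timestamp=1_700_000_000,
                  block_timestamp=1_700_000_015)  # 15s old: accepted
```

The comment in the body is the crux of the trade-off: the window caps how stale a submitted price can be, but every price inside the window remains a valid choice for the trader.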
Liquidations
Liquidations are a core part of any leveraged protocol, and the granularity of price updates is crucial in determining liquidation efficiency.
With threshold-based push oracles, when the spot price crosses a position's liquidation threshold but the move is not large enough to trigger the oracle's deviation-based update, the lack of price granularity can cause liquidation opportunities to be missed. This brings negative externalities in the form of market inefficiency.
When a liquidation occurs, applications typically pay out a portion of the liquidated collateral, sometimes as a reward to the user who initiated the liquidation. For example, in 2022, Aave paid out $37.9 million in liquidation bonuses on mainnet. This tends to overcompensate third parties and produces poor user outcomes. Additionally, when extractable value is on the table, the ensuing gas wars leak value out of the application and into the MEV supply chain.
Design Space and Considerations
In light of the above issues, the following discusses various implementation schemes based on push, pull, and alternative designs, evaluating each for its effectiveness against the aforementioned problems and the trade-offs involved; these trade-offs can take the form of additional centralization and trust assumptions, or degraded user experience.
Oracle-specific Order Flow Auctions (OFA)
Order Flow Auctions (OFA) have emerged as a solution to eliminate the negative externalities generated by MEV. Broadly speaking, an OFA is a generalized third-party auction service: users send orders (trades or intents), and searchers extracting MEV bid for the exclusive right to execute their strategies against that order flow. A significant portion of the winning bid is returned to users to compensate them for the value their orders create. Recently, OFA adoption has surged, with over 10% of Ethereum transactions flowing through private channels (private RPCs/OFAs), and this share is expected to keep growing.
Figure 3: Merged daily private Ethereum transaction count. Source: Blocknative
For oracle updates, the challenge of using a generalized OFA is that the oracle cannot know in advance whether a given rule-based update will generate any OEV; when it does not, routing the update through an auction only adds delay. On the other hand, the simplest way to streamline OEV capture and minimize delay is to hand all oracle order flow to a single dominant searcher - but this clearly introduces significant centralization risk, potentially fostering rent-seeking and censorship and degrading user experience.
Figure 4: General OFA vs. Oracle-specific OFA
With an oracle-specific OFA, the oracle's existing rule-based price updates still occur through the public mempool as before. This keeps oracle price updates, along with any resulting extractable value, at the application layer. As a byproduct, it also lets searchers pay to trigger data feed updates on demand, improving price granularity without imposing the cost of more frequent updates on oracle nodes.
Oracle-specific OFAs are ideal for liquidations: they enable more granular price updates, maximize the capital returned to liquidated borrowers, reduce the bonuses protocols pay to liquidators, and keep the value extracted from bidders within the protocol for redistribution to users. They also partially - though not completely - address front-running and arbitrage. Under perfect competition, a first-price sealed-bid auction should drive the winning bid close to the value of the opportunity net of the blockspace cost of executing it; value that would otherwise be captured by front-running the feed is instead captured in the auction, and the finer price granularity shrinks the remaining arbitrage opportunities.
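The auction economics above can be sketched in a toy first-price sealed-bid model. Everything here - names, the fee split, the bid values - is an illustrative assumption, not any live OFA's design:

```python
# Toy sketch of an oracle-specific order flow auction: searchers submit
# sealed bids for the right to trigger a price update, and the winner's
# bid (minus a fee) is rebated to the application layer. Illustrative only.

def run_sealed_bid_auction(bids: dict, protocol_fee: float = 0.10):
    """First-price sealed bid: the highest bidder wins and pays their bid."""
    winner = max(bids, key=bids.get)
    winning_bid = bids[winner]
    to_application = winning_bid * (1 - protocol_fee)  # rebated to app/users
    return winner, winning_bid, to_application

# Under competition, bids approach opportunity value minus execution cost:
opportunity_value, blockspace_cost = 1000.0, 50.0
bids = {"searcher_a": 900.0,
        "searcher_b": opportunity_value - blockspace_cost}
winner, paid, rebate = run_sealed_bid_auction(bids)
```

In the competitive limit, `paid` converges to the $950 of net extractable value, so almost all of the OEV flows back to the protocol instead of leaking to the MEV supply chain.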
Currently, implementing an oracle-specific OFA means either integrating a third-party auction service (like OEV-Share) or building an auction service into the application. Inspired by Flashbots, API3 uses an OEV relay: an API that provides DoS protection for the auction. The relay collects meta-transactions from oracles, organizes and aggregates searchers' bids, and redistributes proceeds trustlessly, without taking custody of the bids. When a searcher wins, they can only update the data feed by transferring the bid amount to a protocol-owned proxy contract, which then updates the price feed using the signed data supplied by the relay.
Figure 5: API3's OEV relay
Additionally, protocols can forgo intermediaries and establish their own bidding services to capture all extractable value from OEV. BBOX is an upcoming protocol that aims to embed bidding into its liquidation mechanism to capture OEV and return it to applications and their users.
Running a Centralized Node or Keeper
An early anti-OEV idea from the first wave of decentralized perpetuals exchanges was to run a centralized Keeper network, aggregating prices from third-party sources (like centralized exchanges) and using a Chainlink-style data feed as a fallback or circuit breaker. This model was popularized by GMX v1 and its many forks, with the main value proposition that, since the Keeper network is operated by a single entity, front-running can be prevented entirely.
While this addresses many of the issues above, the centralization concerns are obvious. A centralized Keeper system can decide execution prices without properly verifying the pricing sources or aggregation methods. In GMX v1, the Keeper was not an on-chain or transparent mechanism but a program running on centralized servers, signing with team-controlled addresses. The Keeper's core role is not only to execute orders but also to "decide" the trade price according to its preset definitions, without verifying the authenticity or source of the execution price used.
Automated Keeper Networks and Chainlink Data Streams
To address the centralization risks of a single-operator Keeper network, a more decentralized automation network can be built on third-party service providers. Chainlink Automation is such a product, which, together with Chainlink Data Streams - a new pull-based, low-latency oracle - provides this service. The product launched recently and is still in closed beta, but GMX v2 already uses it and serves as a reference for systems adopting this design.
At a high level, Chainlink Data Streams consists of three main components: the Data DON (Decentralized Oracle Network), the Automation DON, and an on-chain verification contract. The Data DON is an off-chain data network whose architecture for maintaining and aggregating data is similar to Pythnet's. The Automation DON is a keeper network secured by the same node operators as the Data DON, used to pull prices from the Data DON and post them on-chain. Finally, the verification contract checks the correctness of the off-chain signatures.
Figure 6: Chainlink Data Streams architecture
The above diagram illustrates the transaction flow for invoking open trading functionality, where the Automation DON is responsible for obtaining prices from the Data DON and updating on-chain storage. Currently, direct queries to the Data DON endpoints are limited to whitelisted users, allowing protocols to choose to offload Keeper maintenance work to the Automation DON or run their own Keeper. However, as the product development lifecycle progresses, this is expected to gradually shift to a permissionless structure.
In terms of security, the trust assumptions relying on the Automation DON are the same as those using the Data DON alone, representing a significant improvement over single Keeper designs. However, if the authority to update price feeds is given to the Automation DON, then value extraction opportunities can only be left to nodes within the Keeper network. This, in turn, means that the protocol will trust Chainlink node operators (primarily institutions) to maintain their social reputation and not front-run users, similar to trusting Lido node operators not to monopolize block space due to their market share.
Pull-based: Delayed Settlement
One of the biggest changes in Synthetix perps v2 is the introduction of Pyth price feeds for settling perpetual contracts. This allows orders to be settled at Chainlink or Pyth prices, provided their deviations do not exceed predefined thresholds and the timestamps pass the staleness checks. However, as mentioned above, merely switching to pull-based oracles does not solve all OEV-related issues for protocols. To address front-running, a "last look" pricing mechanism can be introduced in the form of delayed orders, which effectively splits users' market orders into two parts:
Trade #1: Submit an "intent" to open a market order on-chain, providing standard order parameters such as size, leverage, collateral, and slippage tolerance. An additional Keeper fee must also be paid to reward the Keeper for executing Trade #2.
Trade #2: The Keeper receives the order submitted in Trade #1, requests the latest Pyth price feed, and calls the Synthetix execution contract in a single transaction. The contract checks predefined parameters such as timeliness and slippage, and if all pass, the order is executed, the on-chain price storage is updated, and the position is established. The Keeper charges a fee to compensate for the gas used in operating and maintaining the network.
This implementation does not give users the opportunity to adversarially select the price submitted on-chain, effectively eliminating the protocol's front-running and arbitrage opportunities. The trade-off is user experience: executing a market order now requires two transactions, and users must compensate the Keeper for gas while also bearing the cost of updating the oracle's on-chain storage. Previously this was a fixed fee of 2 sUSD, but it has recently changed to a dynamic fee based on the Optimism gas oracle plus a premium that varies with layer 2 activity. Either way, this can be seen as a solution that sacrifices trader experience to improve LP profitability.
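The two-transaction flow above can be sketched as follows. The parameter names, the 15-second staleness window, and the check logic are illustrative assumptions, not Synthetix's actual contract interface:

```python
# Sketch of the two-step "delayed order" flow: tx #1 commits the order,
# tx #2 (from a keeper) supplies a fresh oracle price that the contract
# re-validates before filling. All names/thresholds are illustrative.

from dataclasses import dataclass

MAX_PRICE_AGE = 15  # seconds, illustrative staleness window

@dataclass
class PendingOrder:
    size: float          # positive = long, negative = short
    max_slippage: float  # e.g. 0.005 = 0.5%
    commit_price: float  # reference price at submission (tx #1)
    keeper_fee: float    # paid to whoever executes tx #2

def execute_order(order: PendingOrder, oracle_price: float,
                  price_age: int) -> float:
    """Tx #2: validate the keeper-supplied price, then fill at it."""
    if price_age > MAX_PRICE_AGE:
        raise ValueError("stale price")
    slippage = abs(oracle_price - order.commit_price) / order.commit_price
    if slippage > order.max_slippage:
        raise ValueError("slippage exceeded")
    return oracle_price  # position opened at the keeper-supplied price

order = PendingOrder(size=1.0, max_slippage=0.005,
                     commit_price=2000.0, keeper_fee=2.0)
fill = execute_order(order, oracle_price=2004.0, price_age=5)  # 0.2% move
```

Because the trader never chooses the fill price in tx #1, there is nothing to adversarially select; the cost is the extra transaction and the keeper fee.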
Pull-based: Optimistic Settlement
Because delayed orders impose additional network fees on users (proportional to a layer 2's DA fees), we brainstormed another order settlement model, "optimistic settlement," which could reduce user costs while maintaining decentralization and protocol security. As the name suggests, this mechanism lets traders execute market trades atomically: the system optimistically accepts all prices, then gives searchers a window to submit evidence of maliciously placed orders. This section outlines the different versions of this concept, our thought process, and the unresolved issues.
Our initial idea was to establish a mechanism allowing users to submit prices when opening market orders through parsePriceFeedUpdates, then allowing users or any third party to submit settlement transactions using the price feed data, completing the trade at that price upon transaction confirmation. During settlement, any negative difference between the two prices would be accounted for as slippage in the user's profit and loss statement. The advantages of this approach include alleviating the cost burden on users and reducing the risk of front-running. Users would no longer bear the premium for rewarding Keepers, and since they would not know the settlement price when submitting orders, the risk of front-running remains manageable. However, this still introduces a two-step settlement process, which is one of the drawbacks we identified in Synthetix's delayed settlement model. In most cases, if the volatility during the order placement and settlement does not exceed the system-defined profitable front-running threshold, the additional settlement transaction may be unnecessary.
Another way around these issues is to let the system optimistically accept orders and then open a permissionless questioning period, during which evidence can be submitted proving that the gap between the price timestamp and the block timestamp allowed for profitable front-running.
The specific operation is as follows:
Users create orders based on current market prices. They then transmit the price along with embedded Pyth price feed byte data as part of the order creation transaction.
The smart contract optimistically verifies and stores this information.
After the order is confirmed on-chain, there is a questioning period during which searchers can submit evidence of adversarial selection - i.e., that the trader used stale price feed data with the intent to arbitrage the system. If the system accepts the evidence, the price difference is applied as slippage to the trader's execution price, and the excess value is rewarded to the Keeper.
After the questioning period ends, the system considers all prices valid.
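The steps above can be sketched as a toy flow. The 60-second questioning period, the evidence rule, and all names are our illustrative assumptions for a concept that, as noted below, still has open design questions:

```python
# Toy sketch of optimistic settlement: orders fill immediately, and during
# a questioning period a keeper may submit evidence that the trader's
# price was stale relative to the true price at fill. Illustrative only.

from dataclasses import dataclass

QUESTIONING_PERIOD = 60  # seconds, illustrative

@dataclass
class SettledOrder:
    fill_price: float      # price the trader supplied and filled at
    price_timestamp: int   # timestamp of the supplied oracle price
    settled_at: int        # block timestamp of the fill
    challenged: bool = False

def submit_evidence(order: SettledOrder, true_price_at_fill: float,
                    now: int) -> float:
    """Returns the extra slippage applied to the trader if evidence lands."""
    if now > order.settled_at + QUESTIONING_PERIOD:
        return 0.0  # window closed; the order is final
    diff = abs(true_price_at_fill - order.fill_price)
    if diff > 0:
        order.challenged = True
    return diff  # applied to the trader's PnL; part rewarded to the keeper

order = SettledOrder(fill_price=2000.0, price_timestamp=100, settled_at=105)
penalty = submit_evidence(order, true_price_at_fill=2010.0, now=130)
```

A real design would also have to net the penalty against fees already paid and define what counts as the "true" price, which is exactly the first open question below.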
This model has two advantages. It lowers users' costs: they pay gas only for order creation and the oracle update in the same transaction, with no additional settlement transaction. And it deters front-running, protecting the integrity of liquidity pools, with a healthy Keeper network economically incentivized to submit evidence of front-running to the system.
However, before putting this idea into practice, several issues remain to be resolved:
- Defining "adversarial selection": How does the system distinguish between users submitting expired prices due to network delays and users intentionally arbitraging? A preliminary idea could be to measure volatility during the staleness check period (e.g., 15 seconds); if volatility exceeds the net execution fee, the order could be flagged as a potential exploit.
- Setting an appropriate questioning period: Considering that toxic order flows may only be open for a short time, what is an appropriate time window for Keepers to question prices? Batch proofs may be more cost-effective, but given the unpredictability of order flows over time, it is challenging to determine the timing for batch proofs to ensure all price information is validated or has sufficient time to be questioned.
- Economic rewards for Keepers: To make submitting evidence reasonable for economically incentivized Keepers, the relevant rewards for submitting winning evidence must exceed the gas costs associated with submitting evidence. This assumption may not hold due to varying order sizes.
- Is there a need to establish a similar mechanism for closing orders? If so, how would it affect user experience?
- Ensuring that "unreasonable" slippage does not fall on users: In flash crash scenarios, there may be significant price differences between order creation and on-chain confirmation. Some form of backup or circuit breaker may be needed, possibly considering the use of Pyth's EMA price to ensure price stability before use.
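The volatility heuristic from the first open question could be sketched as below. The 15-second window is the figure from the question itself; the rule and function name are our assumptions:

```python
# Sketch of the heuristic for flagging adversarial selection: flag an order
# when the realized price range over the staleness window exceeds the net
# execution fee the trader paid to capture it. Illustrative only.

def flag_adversarial(prices_in_window: list, net_execution_fee: float) -> bool:
    """Flag if the worst-case move within the staleness window (e.g. 15s)
    exceeds what the trader paid in fees."""
    if len(prices_in_window) < 2:
        return False  # not enough samples to measure volatility
    worst_case_move = max(prices_in_window) - min(prices_in_window)
    return worst_case_move > net_execution_fee

# Quiet window: a $2 range against a $5 fee is not worth exploiting.
quiet = flag_adversarial([2000.0, 2001.0, 1999.0], 5.0)
# Volatile window: a $20 range against a $5 fee is flagged for review.
volatile = flag_adversarial([2000.0, 2020.0, 2005.0], 5.0)
```

This only separates profitable from unprofitable staleness; distinguishing intent (network delay vs. deliberate arbitrage) would still need further signals.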
ZK Co-processors - Another Form of Data Consumption
Another direction worth exploring is ZK co-processors, which are designed to access on-chain state for complex off-chain computation while producing a proof of how that computation was executed, verifiable permissionlessly. Projects like Axiom enable contracts to query historical blockchain data, perform computations off-chain, and submit ZK proofs demonstrating that the results are correctly derived from valid on-chain data. Co-processors open up the possibility of building custom, manipulation-resistant TWAP oracles using historical prices from multiple DeFi-native liquidity sources (like Uniswap + Curve).
Compared to traditional oracles that can only access the latest asset price data, ZK co-processors will expand the range of data securely available to dApps (Pyth does provide EMA prices for developers to use as reference checks for the latest prices). This allows applications to introduce more business logic that works in synergy with historical blockchain data, enhancing protocol security or improving user experience.
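The kind of computation a co-processor could prove is illustrated below with a plain TWAP over historical (timestamp, price) samples; the proof machinery is omitted entirely and the sample data is made up:

```python
# Sketch of the TWAP computation a ZK co-processor could prove over
# historical on-chain data. Proof generation/verification is omitted;
# the sample history is illustrative.

def twap(samples: list) -> float:
    """Time-weighted average price over (timestamp, price) samples.
    Each price is weighted by the time until the next sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    weighted_sum, total_time = 0.0, 0
    for (t0, p0), (t1, _) in zip(samples, samples[1:]):
        dt = t1 - t0
        weighted_sum += p0 * dt
        total_time += dt
    return weighted_sum / total_time

# A single manipulated sample (5000) is heavily diluted by the window:
history = [(0, 2000.0), (12, 2001.0), (24, 5000.0), (36, 2002.0),
           (48, 2001.0)]
price = twap(history)  # far below the 5000 spike
```

This time-weighting is what gives TWAPs their manipulation resistance: moving the average requires holding a distorted price for a large fraction of the window, which is expensive.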
However, ZK co-processors are still in early development, and there are still some bottlenecks, such as:
- The acquisition and computation of large amounts of blockchain data in a co-processor environment may require longer proof times.
- Providing only blockchain data does not address the need for secure communication with non-Web3 applications.
Oracle-less Solutions - The Future of DeFi?
Another approach is to design primitives from scratch that eliminate the need for external price feeds, removing DeFi's dependency on oracles altogether. Recent developments in this area use AMM LP tokens as the pricing mechanism. The core idea is that a constant-function market maker's LP position represents a claim on two assets in preset weights, together with an automatic pricing formula for those two tokens (i.e., x * y = k). By utilizing LP tokens (as collateral, as the unit a loan is denominated in, or, in recent use cases, by moving v3 LP positions across price ranges), a protocol can obtain the information that would normally require an oracle. Thus, a new wave of oracle-less designs, free from the challenges above, has emerged. Examples of applications in this direction include:
Panoptic is building a perpetual, oracle-less options protocol on Uniswap v3 concentrated liquidity positions. Since a concentrated liquidity position converts entirely into one of its two assets once the spot price moves outside its range, a liquidity provider's payoff closely resembles that of a put seller. The options market then works by liquidity providers depositing assets or LP positions, while options buyers and sellers borrow that liquidity and move it in or out of range, generating dynamic option-like payoffs. Since loans are denominated in LP positions, no oracle is needed at settlement.
Infinity Pools is leveraging Uniswap v3's concentrated liquidity positions to establish a no-liquidation, oracle-less leveraged trading platform. Uniswap v3 liquidity providers can lend their LP tokens, while traders deposit some collateral, borrow LP tokens, and redeem the relevant assets for their directional trades. The loans at redemption will be priced in either the underlying or quoted asset, depending on the price at redemption, and can be directly calculated by checking the LP composition on Uniswap, eliminating the need for oracles.
Timeswap is building a fixed-term, no-liquidation, oracle-less lending platform. It is a tri-party market consisting of lenders, borrowers, and liquidity providers. Unlike traditional lending markets, it employs "time-based" liquidation rather than "price-based" liquidation. In decentralized exchanges, liquidity providers are automatically set to always buy from sellers and sell to buyers; whereas in Timeswap, liquidity providers always lend to borrowers and borrow from lenders, playing a similar role in the market. They also bear the responsibility for loan defaults and are prioritized in receiving seized collateral as compensation.
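The common thread in these designs is that a constant-product position itself encodes price: for x * y = k, the marginal price is simply y / x, so a protocol holding the LP position can read price from the pool's own composition. A minimal sketch of that identity (illustrative names, no specific protocol):

```python
# Sketch of why x*y=k LP positions encode price information: the pool's
# marginal price is y/x, so composition alone recovers price - no oracle.

import math

def lp_composition(k: float, price: float):
    """Reserves (x, y) of an x*y=k pool when the marginal price y/x = price."""
    x = math.sqrt(k / price)
    y = math.sqrt(k * price)
    return x, y

def implied_price(x: float, y: float) -> float:
    """The pool's own marginal price, read from its reserves."""
    return y / x

# A pool with invariant k = 1,000,000 trading ETH at $2,500:
x, y = lp_composition(k=1_000_000.0, price=2500.0)  # 20 ETH, 50,000 USD
```

Uniswap v3 positions complicate this with ticks and range bounds, but the same principle holds within a position's active range.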
Conclusion
Pricing data remains a crucial input for many decentralized applications, and over time the total value secured by oracles continues to grow, further affirming their product-market fit. This article aims to inform readers about the OEV-related challenges we face today, as well as the design space of implementations based on push and pull oracles, AMM liquidity positions, and off-chain co-processors.