Vitalik: How do centralized exchanges prove their funds?

vitalik
2022-11-20 17:50:14
Collection
This article will delve into the historical attempts to bring trading closer to a trustless model, the limitations of these technologies, and some updated and more powerful ideas relying on ZK-SNARKs and other advanced technologies.

Author: Vitalik

Compiled by: Dong Yiming, ChainCatcher

Whenever a major centralized exchange blows up, a common question arises: can we use cryptographic techniques to solve the problem? Exchanges can create cryptographic proofs indicating that the funds they hold on-chain are sufficient to cover their liabilities to users, rather than relying solely on "legal" methods such as government licenses, auditors, and the personal backgrounds of individuals governing and operating the exchange.

Exchanges can establish a system where funds cannot be withdrawn without the consent of depositors. Potentially, we can explore the entire spectrum between "aspirational good actor CEXs that don't do bad things" and "on-chain DEXs that can't do bad things," but are currently inefficient and leak privacy. This article will delve into historical attempts to bring trading closer to trustlessness, the limitations of these technologies, and some updated and more robust ideas relying on ZK-SNARKs and other advanced techniques.

Balance Sheet and Merkle Tree : Old-School Proof of Solvency

Exchanges have been trying to use cryptographic methods to prove they are not deceiving their users for a long time. In 2011, the largest Bitcoin exchange at the time, MtGox, proved it had funds by sending a transaction that transferred 424242 BTC to a pre-announced address. By 2013, discussions began on how to address the other side of the problem: proving the total scale of customer deposits. If you prove that customer deposits equal X ("Proof of Liabilities") and prove ownership of the private key for X coins ("Proof of Assets"), then you have a proof of solvency: you have demonstrated that the exchange has enough funds to repay all depositors.

The simplest way to prove deposits is to simply publish a list of (username, balance) pairs. Each user can check if their balance is included in the list, and anyone can check the complete list to see (i) that each balance is non-negative, and (ii) that the total amount equals the liabilities. Of course, this compromises privacy, so we can slightly modify the scheme: publish a list of (hash(username, salt), balance) pairs and privately send each user their salt value. But even this would leak balances and the patterns of balance changes. The desire to protect privacy led us to the next invention: Merkle Tree technology.

image

Green: Charlie node. Blue: David node, which is also the node Charlie will receive as part of his proof. Yellow: Root node, publicly displayed to everyone.

Merkle tree technology involves placing the customer balance sheet into a Merkle Sum Tree. In a Merkle sum tree, each node is a (balance, hash) pair. The underlying leaf nodes represent individual customer balances and salted username hash values. In each higher-level node, the balance is the sum of the two balances below, and the hash is the hash of the two nodes below. Merkle sum proofs, like Merkle proofs, are a "branch" of the tree, consisting of sibling nodes along the path from leaf to root.

The exchange will send each user a Merkle sum proof of their balance to prove that their balance is included correctly. Users will then receive assurance that their balance is correctly included in the total. A simple code example can be found ++here++**.

The privacy leakage in this design is much lower than in a fully public list, and can be further reduced by shuffling branches each time the root is published, but some privacy leakage still exists. Charlie can learn that someone's balance is 164 ETH, that two users have a combined balance of 70 ETH, and so on. An attacker controlling many accounts could still glean a significant amount of information about exchange users.

An important subtlety of this scheme is the possibility of negative balances: what if an exchange has 1390 ETH in customer balances but only 890 ETH in reserves, and tries to cover the shortfall by adding a -500 ETH balance under some fake account on the tree? It turns out that this possibility does not undermine the scheme, although this is precisely why we need a Merkle sum tree rather than a regular Merkle tree. Suppose Henry is a fake account controlled by the exchange, and the exchange places -500 ETH there.

image

Greta's proof verification will fail: the exchange will have to give her Henry's -500 ETH node, which she will reject because it is invalid. Eve and Fred's verifications will also fail because the total ETH amount above Henry's intermediate node is -230, making it invalid too! To escape theft, the exchange would have to hope that no one checks their balance proofs in the entire right half of the tree.

If the exchange can identify users worth 500 ETH, believing that these users either won't bother checking the proof or won't be believed when complaining about never receiving the proof, then they can confidently escape the penalty for theft. However, the exchange could also exclude these users from the tree, achieving the same effect.

Thus, if the goal is merely to achieve proof of liabilities, Merkle tree technology is essentially as good as a proof-of-liabilities scheme. However, its privacy properties are still not ideal. You could use Merkle trees in a more clever way, such as by ++making each satoshi or wei a separate leaf++, but ultimately, with more modern technologies, there are better ways to achieve this.

Improving Privacy and Robustness with ZK-SNARKs

ZK-SNARKs are a powerful technology. The role of ZK-SNARKs in cryptography may be akin to the role of transformers in artificial intelligence: it is such a powerful general-purpose technology that it will completely crush a host of problems in specific application technologies developed decades ago. Therefore, of course, we can use ZK-SNARKs to greatly simplify and improve the privacy of liability proof protocols.

The simplest thing we can do is to place all users' deposits into a Merkle tree (or more simply, a ++KZG commitment++) and use ZK-SNARKs to prove that all balances in that tree are non-negative and sum to some claimed value. If we add a layer of hashing for privacy, the Merkle branch (or KZG proof) given to each user will not reveal any other user's balance.

image

Using KZG commitments is one way to avoid privacy leakage, as there is no need to provide "sibling nodes" as proof; a simple ZK-SNARK can be used to prove the sum of balances and that each balance is non-negative.

We can use a specially designed ZK-SNARK to prove the sum and non-negativity of the balances in the aforementioned KZG. Here is a simple example of how to do this. We introduce an auxiliary polynomial that "establishes the bits of each balance" (for example, we assume the balances are integers), and tracks a running total with an offset every 16 positions, so that its sum is zero only when the actual total matches the claimed total. If z is a -128th root of unity, we can prove the following equation.

image

The first valid setup value is 0 0 0 0 0 0 0 0 0 0 1 2 5 10 20 -165 0 0 0 0 0 0 0 0 1 3 6 12 25 50 -300 …

For further explanations on how to convert such equations into polynomial checks and then into ZK-SNARKs, see my ++article on ZK-SNARKs++ for further explanations ++here++ and ++here++. This is not an optimal protocol, but it does show that these types of cryptographic proofs are not so esoteric these days!

With just a few additional formulas, such a constraint system can adapt to more complex environments. For example, in a leveraged trading system, individual users having negative balances is acceptable, provided they have enough other assets to cover the funds with some collateral. A SNARK can be used to prove this more complex constraint, reassuring users that the exchange will not put their funds at risk through ++secret exemptions++ from other users' rules.

In the longer term, this ZK proof of liabilities may not only apply to customer deposits at exchanges but could also extend to broader lending. Anyone taking out a loan would place a record in a polynomial or tree containing that loan, and the root of that structure would be published on-chain. This would allow anyone seeking a loan to provide lenders with a ZK proof that they have not borrowed too much from other loans. Ultimately, legal innovations could even make loans that have been committed in this way have a higher priority than those that have not been committed. This leads us in the same direction as the idea discussed in ++"Decentralized Society: Finding Web3's Soul"++: establishing a concept of negative reputation or collateral on-chain through some form of "soulbound token."

Proof of Assets

The simplest version of proof of assets is the protocol we saw above: to prove you hold X coins, you simply move X coins in a transaction at a pre-agreed time or in a transaction where the data field contains "these funds belong to Binance." To avoid paying transaction fees, you can sign an off-chain message instead; both Bitcoin and Ethereum have standards for off-chain signed messages.

This simple asset proof technique has two practical issues.

  • "Cold Storage" Handling
  • Double Use of Collateral

For security reasons, most exchanges keep the vast majority of customer funds in "cold storage": on offline computers, where transactions need to be manually signed and transferred to the internet. The cold storage setup I once used for personal funds involved a permanently offline computer that generated a QR code containing signed transactions, which I could scan with my phone. Modern exchange protocols are even crazier, often involving multiparty computations across several devices. Given this setup, even an additional piece of information to prove control over an address is an expensive operation!

A transaction can take several paths:

  • Retain a few publicly long-term used addresses. The exchange will generate several addresses, publish a proof for each address to prove ownership, and then reuse those addresses. This is by far the simplest scheme, although it does add some constraints on how to protect security and privacy.
  • Set up many addresses and randomly prove a few. The exchange might have many addresses, perhaps even using each address only once, and retiring them after a transaction. In this case, the exchange may have a protocol to randomly select a few addresses from time to time that must be "opened" to prove ownership. Some exchanges have already done similar things with auditors, but in principle, this technology could turn into a fully automated program.
  • More complex ZKP options. For example, an exchange could set all its addresses to a 1/2 multisignature, where each address's key is different, and the other is a blind version of some "significant" emergency backup key, stored in some complex but very high-security way, such as a 12/16 multisignature. To protect privacy and avoid exposing its entire set of addresses, the exchange could even run a zero-knowledge proof on the blockchain that proves the total balance of all addresses in this format.

Another major issue is preventing double use of collateral. Exchanges can easily move collateral back and forth between each other for reserve proof, which would allow them to pretend to be solvent when they are not. Ideally, proof of solvency should be done in real-time, updating the proof after each block. If this is not feasible, a suboptimal choice would be to coordinate a fixed schedule between different exchanges, for example, proving reserves every Tuesday at 14:00 UTC.

The final question is: Can asset proof be done on fiat currency? Exchanges do not just hold cryptocurrencies; they also hold fiat currency within the banking system. Here, the answer is: yes, but such a procedure will inevitably rely on a "fiat" trust model: banks themselves can prove balances, auditors can prove balance sheets, and so on. Given that fiat cannot be cryptographically verified, this is the best that can be done within that framework, but it is still worth doing.

Another approach is to cleanly separate an entity running the exchange and handling assets-backed stablecoins like USDC from another entity that manages the cash flow in and out between cryptocurrencies and the traditional banking system (USDC itself). Since USDC's "liabilities" are just ERC20 tokens on-chain, proof of liabilities is "free," requiring only proof of assets.

Plasma and Validiums: Can We Make CEXs Unconstrained?

Suppose we want to go further: we do not want to merely prove that the exchange has funds to repay users. Instead, we want to completely prevent exchanges from stealing users' funds.

The first significant attempt in this regard was Plasma, a scaling solution popular in the Ethereum research circles in 2017 and 2018. Plasma works by splitting balances into a set of individual "coins," each assigned an index and located at a specific position in the Merkle tree of Plasma blocks. Effectively transferring a coin requires placing the transaction in the correct position in the tree, while the root of the tree will be published on-chain.

image

A simplified schematic of a version of Plasma. Coins are stored in a smart contract, enforcing the rules of the Plasma protocol upon withdrawal.

OmiseGo attempted to create a decentralized exchange based on this protocol, but since then, they have turned to other ideas—so has the Plasma group itself, which is now the optimistic EVM rollup project ++Optimism++.

The technical limitations of Plasma envisioned in 2018 (for example, ++proof of coin fragmentation++) are not worth looking at. Since the peak of Plasma discourse in 2018, ZK-SNARKs have become more feasible in scaling-related use cases, as mentioned above, ZK-SNARKs change everything.

A more modern version of the Plasma idea is what Starkware calls ++validium++: essentially the same as ZK-rollup, except that the data is stored off-chain. This structure can be used for many use cases, and one can imagine any centralized server needing to run some code and prove its correct execution. Within a validium, operators have no way to steal funds, although depending on the implementation details, some amount of user funds may be stuck if the operator disappears.

All of this is really good: CEXs and DEXs are far from being a binary opposition; it turns out there is a whole set of options, including various forms of hybrid centralization, where you can gain some benefits like efficiency while still having many cryptographic guardrails to prevent centralized operators from engaging in most forms of abuse.

image

However, in the right half of this design space, we still need to discuss the most fundamental issue: Handling User Errors. So far, the most important type of error is: what to do if a user forgets their password, loses their device, gets hacked, or otherwise loses access to their account?

Exchanges can address this issue: first with email recovery, and if that fails, through more complex forms of recovery via KYC. However, to be able to resolve such issues, exchanges need to truly have control over the coins. To be able to recover user account funds for good reasons, exchanges need the power, which could also be used for bad reasons to steal user account funds. This is an unavoidable trade-off.

The ideal long-term solution is to rely on self-custody, supplemented by ++multisignature++ and social recovery wallets to help users handle emergencies. But in the short term, there are two obvious alternatives with significantly different costs and benefits.

image

Conclusion: Better Exchanges in the Future

In the short term, there are two obvious "categories" of exchanges: custodial exchanges and non-custodial exchanges. Today, the latter category is simply DEXs like Uniswap, and in the future, we may also see cryptographically "constrained" CEXs, where users' funds are held in something akin to validium smart contracts. We may also see semi-custodial exchanges that we trust with fiat rather than cryptocurrencies.

Both types of exchanges will continue to exist, and the simplest backward-compatible way to enhance the security of custodial exchanges is to increase proof of reserves. This includes a combination of proof of assets and proof of liabilities. There are technical challenges in developing good protocols for both, but we should also make progress in both areas as much as possible and open-source the software and processes so that all exchanges can benefit.

In the longer term, I hope we get closer to all exchanges being non-custodial, at least in terms of cryptocurrencies. Wallet recovery will exist, and for new users dealing with small amounts, as well as institutions that need such arrangements for legal reasons, there may need to be highly centralized recovery options, but this can be done at the wallet layer rather than within the exchange itself. The interaction methods with platforms like ++magic.link++ and ++Polymarket++ are an example of this approach. In terms of fiat, the flow between the traditional banking system and the cryptocurrency ecosystem can be accomplished through asset-backed stablecoins (like USDC) with local cash inflow/outflow processes. However, it will take some time before we fully achieve this goal.

Special thanks to Balaji Srinivasan, as well as discussions with staff from Coinbase, Kraken, and Binance.

ChainCatcher reminds readers to view blockchain rationally, enhance risk awareness, and be cautious of various virtual token issuances and speculations. All content on this site is solely market information or related party opinions, and does not constitute any form of investment advice. If you find sensitive information in the content, please click "Report", and we will handle it promptly.
ChainCatcher Building the Web3 world with innovators