a16z: How to Verify On-Chain Machine Learning Algorithms Using Zero-Knowledge Proofs?
Original Title: Checks and balances: Machine learning and zero-knowledge proofs
Original Author: Elena Burger, a16z
Compiled by: DeFi之道
In recent years, zero-knowledge proofs on the blockchain have primarily been used for two key purposes: (1) to scale computation-constrained networks by processing transactions off-chain and verifying results on the mainnet; (2) to protect user privacy by implementing shielded transactions, where only those with decryption keys can view them. In the context of blockchain, these features are clearly desirable: decentralized networks (like Ethereum) cannot increase throughput or block size without imposing unbearable demands on the processing power, bandwidth, and latency of validators (hence the need for validity rollups), and all transactions are visible to anyone (thus requiring on-chain privacy solutions).
However, zero-knowledge proofs are also useful for a third type of functionality: efficiently verifying that any type of computation (not just computations instantiated in the off-chain EVM) has been executed correctly. This has significant implications beyond the blockchain.
Now, improvements in the ability of zero-knowledge proofs to succinctly verify arbitrary computation allow users to demand from every digital product, and from machine learning models in particular, the same degree of trustlessness and verifiability that blockchains guarantee. The high demand for blockchain computation has incentivized zero-knowledge proof research, producing modern proof systems with smaller memory footprints and faster proving and verification times, and making it possible to verify certain small machine learning algorithms on-chain today.
So far, we may have all experienced the potential of interacting with a very powerful machine learning product. A few days ago, I used GPT-4 to help me create an AI that can consistently beat me at chess. It felt like a poetic encapsulation of the past few decades of progress in machine learning: it took IBM developers twelve years to create Deep Blue, a model that ran on a 32-node IBM RS/6000 SP computer, could evaluate nearly 200 million chess moves per second, and defeated chess champion Garry Kasparov in 1997. In contrast, I spent a few hours, with minimal coding on my part, to create a program that can beat me.
Admittedly, I doubt that the AI I created could defeat Garry Kasparov in chess, but that’s not the point. The point is that anyone who plays around with GPT-4 may have a similar experience of gaining superpowers: you can create something that approaches or surpasses your own capabilities with very little effort. We are all IBM researchers; we are all Garry Kasparov.
Clearly, this is both exciting and somewhat daunting. For anyone working in the cryptocurrency industry, the natural reaction (after marveling at what machine learning can do) is to consider potential paths to centralization and how to decentralize those paths to form a network that people can transparently audit and own. Current models are created by consuming vast amounts of publicly available text and data, but only a few people control and own these models. More specifically, the question is not "Does AI have immense value?" but rather "How do we build these systems so that anyone interacting with them can reap their economic benefits and ensure that their data is used in a privacy-respecting manner if they wish?"
Recently, there have been calls to pause or slow down the development of major AI projects like ChatGPT. Halting progress is probably not the solution: a better approach is to promote open-source models and, in cases where model providers wish to keep their weights or data private, to secure them with privacy-preserving zero-knowledge proofs that are on-chain and fully auditable. Today, the latter use case around private model weights and data is not yet feasible on-chain, but advances in zero-knowledge proof systems will make it possible in the future.
Verifiable and Ownable Machine Learning
The chess AI I built using ChatGPT seems relatively harmless at present: it outputs a program with fairly consistent behavior and does not use data that infringes on valuable intellectual property or violates privacy. But what happens when we want assurance that the model we are told is running behind an API is indeed the one that was executed? Or if I want to feed certified data into an on-chain model and be sure that the data really comes from a legitimate source? And what if I want to ensure that the "person" submitting the data is actually a human and not a bot attempting to Sybil-attack my network? Zero-knowledge proofs, with their ability to succinctly represent and verify arbitrary programs, offer a solution.
It is important to note that, in the context of on-chain machine learning, the primary use of zero-knowledge proofs is to verify correct computation. In other words, what makes zero-knowledge proofs, and more specifically SNARKs (Succinct Non-Interactive Arguments of Knowledge), most useful in the machine learning context is their succinctness property. While zero-knowledge proofs can also shield the prover (and the data it processed) from a prying verifier, privacy-enhancing technologies such as Fully Homomorphic Encryption (FHE), functional encryption, or Trusted Execution Environments (TEEs) are better suited to letting an untrusted prover run computations over private input data (a deeper exploration of these technologies is beyond the scope of this article).
Let’s take a step back and understand the types of machine learning applications that can be represented with zero-knowledge (for a deeper dive into zero-knowledge, please refer to our articles on improvements in zero-knowledge proof algorithms and hardware, check out Justin Thaler's research on SNARK performance, or look at our zero-knowledge textbook). Zero-knowledge proofs typically represent programs as arithmetic circuits: using these circuits, the prover generates a proof from public and private inputs, and the verifier ensures that the output of this statement is correct through mathematical calculations—without gaining any information about the private inputs.
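To make the circuit framing concrete, here is a toy, purely illustrative Python sketch (not any real proof system): a tiny "circuit" made of one multiplication gate and one addition gate over a prime field, private inputs known only to the prover, and a direct check that the constraints hold. A real SNARK replaces this explicit witness check with a short proof, but the computation being verified is represented as the same kind of object.

```python
# Illustrative only: a toy arithmetic "circuit" checked directly, with no
# succinct proof. Real SNARK backends (Halo 2, Plonky2, ...) compile such
# constraints and let a verifier check a short proof instead of the witness.

P = 2**61 - 1  # a prime modulus; real systems use specific SNARK-friendly fields

def check_circuit(public_out: int, private_x: int, private_w: int) -> bool:
    """Constraints for the statement: public_out == (x * w + 5) mod P,
    where x and w are private inputs known only to the prover."""
    g1 = (private_x * private_w) % P   # multiplication gate
    g2 = (g1 + 5) % P                  # addition gate
    return g2 == public_out % P        # output constraint

# The prover knows the private inputs; the verifier sees only `public_out`
# plus, in a real system, a proof that some satisfying witness exists.
assert check_circuit(public_out=3 * 7 + 5, private_x=3, private_w=7)
```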
We are still in the very early stages of using on-chain zero-knowledge proofs to verify computations, but improvements in algorithms are expanding the feasible range. Here are five ways to apply zero-knowledge proofs in machine learning.
1. Model Authenticity: You want assurance that the model an entity claims to have run is indeed the one that was executed. An example is a case where a model sits behind an API and the provider of that model has multiple versions, say a cheaper, less accurate one and a more expensive, higher-performing one. Without proofs, you have no way of knowing whether the provider served you the cheaper model when you actually paid for the more expensive version (for instance, because the provider wants to save on server costs and increase its profit margins).
To achieve this, you would provide a separate proof for each model instance. One practical approach is the functional-commitment framework of Dan Boneh, Wilson Nguyen, and Alex Ozdemir, a SNARK-based zero-knowledge commitment scheme that lets a model owner commit to a model, so that users can feed their data into it and receive verification that the committed model is the one that actually ran (a minimal sketch of this commit-then-prove flow appears after this item). Some applications built on Risc Zero (a general-purpose STARK-based virtual machine) also accomplish this. Other research by Daniel Kang, Tatsunori Hashimoto, Ion Stoica, and Yi Sun has shown that verified inference can be run on ImageNet-scale models with 92% accuracy (on par with the highest-performing non-zero-knowledge-verified ImageNet models).
However, simply receiving proof that the committed model was run may not be sufficient. A model may not accurately represent a given program, so one would also want the committed model to be audited by a third party. Functional commitments allow the prover to prove that it used a committed model, but they do not guarantee anything about the model that was committed. If zero-knowledge proofs can be made performant enough to prove training (see example #4 below), we may one day begin to obtain those guarantees as well.
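The sketch below is not the Boneh-Nguyen-Ozdemir construction; it is only a rough Python illustration of the shape of that commit-then-prove flow, with a hash standing in for the cryptographic commitment and hypothetical `snark_prove` / `snark_verify` placeholders (shown only in comments) where a real proof system would sit. The same pattern also underlies the model-integrity use case in the next item.

```python
# A stand-in for the commit-then-prove flow around model authenticity.
# hashlib replaces a real commitment scheme; `snark_prove` / `snark_verify`
# are hypothetical placeholders for an actual SNARK backend.
import hashlib
import json

def commit_to_model(weights: dict) -> str:
    """Publish a binding commitment to the model's parameters."""
    serialized = json.dumps(weights, sort_keys=True).encode()
    return hashlib.sha256(serialized).hexdigest()

def run_model(weights: dict, x: float) -> float:
    """A toy 'model': y = w * x + b."""
    return weights["w"] * x + weights["b"]

weights = {"w": 2.0, "b": 0.5}
commitment = commit_to_model(weights)   # published once by the model provider

# For a user's input, the provider would return the output together with a
# proof that it came from the committed model, conceptually:
#   proof = snark_prove(run_model_circuit, weights, x, y, commitment)
#   assert snark_verify(proof, x, y, commitment)
y = run_model(weights, x=3.0)
print(commitment, y)
```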
2. Model Integrity: You want assurance that the same machine learning algorithm is run in the same way on different users' data. This is particularly useful in areas where you do not want arbitrary bias applied, such as credit scoring decisions and loan applications. You could use functional commitments for this as well: you commit to a model and its parameters and allow people to submit data, and the output verifies that the model was run with the committed parameters against each user's data. Alternatively, the model and its parameters can be made public, allowing users themselves to prove that they applied the appropriate model and parameters to their (certified) data. This could be particularly useful in the medical field, where laws require certain patient information to remain confidential. In the future, this could enable a medical diagnostic system that learns and improves from fully private, real-time user data.
3. Certification: You want to integrate certification from externally verified parties (e.g., any digital platform or hardware device that can generate digital signatures) into models running on-chain or any other type of smart contract. To do this, you would use zero-knowledge proofs to verify signatures and include the proof as input to the program. Anna Rose and Tarun Chitra recently hosted an episode of the Zero-Knowledge podcast featuring Daniel Kang and Yi Sun, where they explored the latest advancements in this area.
Specifically, Daniel and Yi recently published research exploring how to verify whether images captured by cameras with certified sensors have undergone transformations such as cropping, scaling, or limited occlusion, which is useful when you want to prove that an image has not been deep-faked but has indeed undergone some legitimate editing. Dan Boneh and Trisha Datta have also conducted similar research using zero-knowledge proofs to verify the provenance of images.
More broadly, any digitally certified information is a candidate for this form of verification: Jason Morton is developing the EZKL library (more on this below), which he describes as "giving the blockchain vision." Any signed endpoint (e.g., Cloudflare's SXG service, third-party notaries) produces verifiable digital signatures, which can be very useful for proving provenance and authenticity from a trusted source.
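As a rough illustration of the certification idea, the sketch below uses the Python `cryptography` package to check an ECDSA signature over some sensor data before that data is handed to a model. The key pair and payload are made up for the example, and the check runs off-chain in ordinary code; in the zero-knowledge setting, the same signature verification would instead be expressed as constraints inside the circuit.

```python
# Checking a hardware/platform attestation before trusting the data.
# Keys and data here are made up; in the ZK setting this verification
# would be encoded inside the circuit rather than run directly.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Stand-in for a certified sensor's key pair (in practice only the public
# key and a signed payload would be available to the verifier).
sensor_key = ec.generate_private_key(ec.SECP256R1())
sensor_pub = sensor_key.public_key()

image_bytes = b"raw pixels from the certified camera"
signature = sensor_key.sign(image_bytes, ec.ECDSA(hashes.SHA256()))

def is_attested(data: bytes, sig: bytes) -> bool:
    try:
        sensor_pub.verify(sig, data, ec.ECDSA(hashes.SHA256()))
        return True
    except InvalidSignature:
        return False

assert is_attested(image_bytes, signature)
# Only now would `image_bytes` be fed into the (hypothetical) on-chain model.
```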
4. Distributed Inference or Training: You want to perform machine learning inference or training in a distributed manner and let people submit data to a public model. To do this, you might deploy an existing model on-chain, or architect an entirely new network, and use zero-knowledge proofs to compress the model. Jason Morton's EZKL library is building a way to ingest ONNX and JSON files and convert them into ZK-SNARK circuits (a minimal model-export sketch follows this item). A recent demonstration at ETH Denver showed that this technology can be used to create an on-chain, image-recognition-based treasure hunt game, where the game's creator uploads a photo and generates a proof of the image, and players upload their own images; the verifier checks whether a player's uploaded image is sufficiently similar to the proof generated by the creator. EZKL can now verify models of up to 100 million parameters, which puts ImageNet-sized models (which have 60 million parameters) within range for on-chain verification.
Other teams, such as Modulus Labs, are benchmarking different proof systems for on-chain inference. Modulus's benchmarks cover up to 18 million parameters. In terms of training, Gensyn is building a distributed computing system where users can input public data and train models through a distributed node network while verifying the correctness of the training.
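For a sense of what the ingestion pipeline mentioned above starts from, here is a minimal PyTorch-to-ONNX export with a made-up toy model and file name; the resulting ONNX file is the kind of artifact a circuit compiler such as EZKL consumes, and EZKL's own commands and settings are deliberately not guessed at here.

```python
# Minimal export of a small PyTorch model to ONNX. The ONNX file (plus an
# example input) is what a ZK circuit compiler such as EZKL would ingest;
# the EZKL-specific steps themselves are not shown.
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))

    def forward(self, x):
        return self.net(x)

model = TinyClassifier().eval()
dummy_input = torch.randn(1, 16)

torch.onnx.export(
    model,
    dummy_input,
    "tiny_classifier.onnx",   # artifact a downstream ZK toolchain would ingest
    input_names=["input"],
    output_names=["output"],
)
```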
5. Proof of Personhood: You want to verify that someone is a unique individual without compromising their privacy. To do this, you would create a verification method, such as a biometric scan or a way of submitting a government ID in encrypted form. You would then use zero-knowledge proofs to check that someone has been verified, without revealing any information about that person's identity, whether that identity is fully identifying or pseudonymous, like a public key.
Worldcoin does this through its proof-of-personhood protocol, which ensures Sybil resistance by generating a unique iris code for each user. Crucially, the private key created for a WorldID (as well as the other private keys for the Worldcoin user's crypto wallet) is completely separate from the iris code generated locally by the project's eye-scanning devices. This separation fully decouples biometric identifiers from any form of user key that could be attributed to a person. Worldcoin also allows applications to embed an SDK that lets users log in with their WorldID, and it uses zero-knowledge proofs for privacy by allowing an application to check that a person holds a WorldID without enabling tracking of individual users (for more details, see this blog post).
This example fights weaker, more malicious forms of AI with the privacy-preserving features of zero-knowledge proofs, so it is quite different from the other examples above: here, you prove that you are a real human, and not a bot, without revealing any information about yourself.
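The sketch below is not Worldcoin's actual protocol; it only illustrates, with made-up hash-based stand-ins, the general nullifier pattern that privacy-preserving identity systems use: an identity secret is registered as a commitment, and a per-application nullifier lets an app check that this is a registered person it has not seen before, without linking the user across applications. The hashes stand in for the commitments and for the zero-knowledge proof that would normally hide the secret.

```python
# Not Worldcoin's protocol: a toy sketch of the nullifier pattern used by
# privacy-preserving identity systems. Hashes stand in for commitments and
# for the ZK proof that would normally hide `identity_secret`.
import hashlib
import secrets

def h(*parts: bytes) -> str:
    return hashlib.sha256(b"|".join(parts)).hexdigest()

registered_commitments = set()   # maintained by the identity registry
seen_nullifiers = set()          # maintained by one application

# Enrollment: the user registers a commitment to a locally generated secret.
identity_secret = secrets.token_bytes(32)
registered_commitments.add(h(identity_secret))

# Login to one app: reveal only an app-scoped nullifier. In a real system a
# ZK proof shows the nullifier matches *some* registered commitment without
# revealing which one; here membership is checked directly for simplicity.
def login(secret: bytes, app_id: bytes) -> bool:
    if h(secret) not in registered_commitments:
        return False
    nullifier = h(secret, app_id)
    if nullifier in seen_nullifiers:
        return False                 # this person already signed up here
    seen_nullifiers.add(nullifier)
    return True

assert login(identity_secret, b"app-1")
assert not login(identity_secret, b"app-1")   # blocked: duplicate person
```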
Model Architecture and Challenges
Breakthroughs in proof systems for implementing SNARKs (Succinct Non-Interactive Arguments of Knowledge) have been a key driver for putting many machine learning models on-chain. Some teams are creating custom circuits within existing architectures (including Plonk, Plonky2, Air, etc.). On the custom-circuit front, Halo 2 has become a popular backend, used both in Daniel Kang's work and in Jason Morton's EZKL project. Halo 2's prover times are quasilinear, proof sizes are typically just a few kilobytes, and verifier times are constant. Perhaps more importantly, Halo 2 has strong developer tooling, which makes it a preferred SNARK backend for developers. Other teams, such as Risc Zero, are pursuing general-VM strategies. Still others are building custom frameworks using Justin Thaler's highly efficient proof systems based on sum-check protocols.
Proof generation and verification times certainly depend on the hardware used to generate and check proofs, as well as on the size of the circuit being proven. But the key point is that, regardless of the program being represented, the proof itself remains relatively small, so the burden on the verifier checking it is bounded. There is some nuance here, however: for proof systems like Plonky2 that use FRI-based commitment schemes, proof sizes can grow, unless the proof is ultimately wrapped in a pairing-based SNARK such as Plonk or Groth16, whose proofs do not grow with the complexity of the statement being proven.
The implication for machine learning models is that once a proof system that accurately represents a model has been designed, actually verifying its outputs will be very cheap. The most important considerations for developers are prover time and memory: representing the model in a way that can be proven relatively quickly, with proofs ideally only a few kilobytes in size. To prove the correct execution of a machine learning model in zero knowledge, you need to encode the model architecture (layers, nodes, and activation functions), parameters, constraints, and matrix multiplication operations as circuits. This involves breaking these components down into arithmetic operations that can be performed over a finite field.
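To make that last point concrete, here is a toy sketch of a matrix-vector product carried out the way a circuit would do it: real-valued weights are encoded as fixed-point integers in a prime field, the computation uses only modular multiply-adds, and the result is decoded back. The field modulus and scale factor are arbitrary choices for illustration, not parameters of any particular proof system.

```python
# Toy fixed-point encoding of a matrix-vector product over a prime field.
# Modulus and scale are arbitrary illustration choices, not taken from any
# specific SNARK backend.
P = 2**31 - 1   # a Mersenne prime standing in for a SNARK field
SCALE = 2**8    # fixed-point scale: value ~ integer / SCALE

def encode(x: float) -> int:
    return round(x * SCALE) % P

def decode_product(e: int) -> float:
    """Decode a value that is the product of two encoded operands."""
    signed = e - P if e > P // 2 else e    # map back to a signed integer
    return signed / (SCALE * SCALE)

W = [[0.5, -1.25], [2.0, 0.75]]
x = [1.5, -0.5]

# Circuit-style computation: multiply-adds over the field, nothing else.
y_encoded = [
    sum(encode(W[i][j]) * encode(x[j]) for j in range(2)) % P
    for i in range(2)
]
y_approx = [decode_product(e) for e in y_encoded]

exact = [sum(W[i][j] * x[j] for j in range(2)) for i in range(2)]
print(y_approx, exact)   # close, up to fixed-point rounding error
```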
This field is still in its infancy, and accuracy and fidelity can suffer in the process of converting a model into a circuit. When a model is represented as an arithmetic circuit, the model parameters, constraints, and matrix multiplication operations mentioned above may need to be approximated and simplified, and some precision may be lost when arithmetic operations are encoded as elements of the proof's finite field (or the cost of generating proofs without these optimizations would be overwhelming with current zero-knowledge frameworks). In addition, the parameters and activations of machine learning models are typically encoded as 32-bit values for precision, but today's zero-knowledge proofs cannot represent 32-bit floating-point operations in the required arithmetic circuit form without enormous overhead. Developers may therefore choose to use quantized machine learning models, whose 32-bit parameters have already been converted to 8-bit precision. These models are much easier to represent as zero-knowledge proofs, but the model being verified may be a coarse approximation of the higher-quality initial model.
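To show what the 32-bit-to-8-bit step looks like in isolation, here is a short numpy sketch of generic affine (scale and zero-point) quantization with made-up weights; real toolchains choose scales per layer or per channel, and the exact scheme varies by framework.

```python
# Generic affine quantization of float32 weights to int8, with made-up
# values; the final print shows the precision lost to the 8-bit encoding.
import numpy as np

weights = np.array([-1.73, -0.02, 0.41, 2.96], dtype=np.float32)

# Map the observed range of the weights onto the int8 range [-128, 127].
scale = (weights.max() - weights.min()) / 255.0
zero_point = np.round(-128 - weights.min() / scale)

q = np.clip(np.round(weights / scale + zero_point), -128, 127).astype(np.int8)
dequantized = (q.astype(np.float32) - zero_point) * scale

print(q)                       # the integer values a ZK circuit would see
print(dequantized - weights)   # error introduced by the 8-bit approximation
```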
At this stage, it is a game of catch-up: as zero-knowledge proofs become more optimized, machine learning models keep growing more complex. There are already some promising areas of optimization: proof recursion can reduce overall proof size by allowing one proof to be used as an input to the next, enabling proof compression. There are also emerging frameworks, such as Linear A's fork of the Apache Tensor Virtual Machine (TVM), which includes a converter for transforming floating-point numbers into zero-knowledge-friendly integer representations. Finally, we at a16z crypto are optimistic about future work that will make representing 32-bit integers in SNARKs more feasible.
Two Definitions of "Scale"
Zero-knowledge proofs achieve scalability through compression: SNARKs allow you to mathematically represent an extremely complex system (like a virtual machine or a machine learning model) such that the cost of verifying it is less than the cost of running it. Machine learning, on the other hand, achieves scalability through expansion: today's models improve with more data, parameters, and GPUs/TPUs involved in the training and inference processes. Centralized companies can run servers at nearly unlimited scale, charging monthly fees for API calls and covering their operational costs.
The economic reality of blockchain networks is almost the opposite: developers are incentivized to optimize their code to make it feasible and cheap to run on-chain. This asymmetry has enormous advantages: it creates an environment that requires improving the efficiency of proof systems. We should seek to demand the same benefits in machine learning that blockchain provides, namely verifiable ownership and a shared sense of reality.
While blockchain incentivizes the optimization of zk-SNARKs, every area related to computation will benefit.
Acknowledgments: Justin Thaler, Dan Boneh, Guy Wuollet, Sam Ragsdale, Ali Yahya, Chris Dixon, Eddy Lazzarin, Tim Roughgarden, Robert Hackett, Tim Sullivan, Jason Morton, Peiyuan Liao, Tarun Chitra, Brian Retford, Daniel Kang, Yi Sun, Anna Rose, Modulus Labs, DC Builder.