zkML, the next grand narrative after artificial intelligence?
Original author: hitesh.eth
Original compilation: Frank, Foresight News
zkML might be the next grand narrative after artificial intelligence.
However, many people find zkML hard to grasp. In this article, I will explain it as simply as I can.
What is zkML?
In short, zkML = ZKP + ML
Where: ZKP = Zero-Knowledge Proof, ML = Machine Learning.
So: zkML = Zero-Knowledge Machine Learning
In a nutshell, zkML uses ZKP technology to prove that a machine learning model computed its output correctly, without disclosing sensitive data such as the model's parameters or the data used during training.
So what is a machine learning model? A machine learning model is a computer program that, after being trained on a large amount of data, can make predictions about new inputs.
For example, ChatGPT is built on a large language model, which is one kind of machine learning model.
What is inference? Inference is the process of taking a user's prompt, working out its context, and running a trained model to produce a result.
Let’s take ChatGPT as an example:
The first step of the inference process is to provide the input. For example, we enter the prompt "Write a crypto rap song in the style of Drake."
In the second step, ChatGPT analyzes the context, "crypto rap song in the style of Drake." It then runs the trained model on the prompt, draws on patterns learned from the training data, and produces a crypto rap song in the style of Drake as output.
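To make "running a trained model" concrete, here is a minimal sketch of inference in Python. It assumes a tiny logistic-regression model rather than a large language model, and the weights are made-up values standing in for whatever training would have produced. The point is only that inference is a fixed computation from input to output.

```python
import math

# Hypothetical parameters, standing in for the result of training.
WEIGHTS = [0.8, -0.5]
BIAS = 0.1

def infer(features):
    """Forward pass of a tiny logistic-regression model: input -> score in (0, 1)."""
    z = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

# Inference: the trained parameters stay fixed, only the input changes.
prediction = infer([1.0, 2.0])
print(prediction)
```

zkML is about proving that a computation like infer was carried out faithfully, without revealing values such as WEIGHTS.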
What can zkML do?
Throughout the inference process, there are two potential privacy issues that may leak sensitive data:
Membership Inference attacks: Attackers analyze the model's output to infer whether a specific data point was part of the training data.
Model Inversion attacks: By constructing specific prompts, attackers attempt to reconstruct fragments of the training data from the output.
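A membership inference attack can be sketched with a toy example. The code below assumes a deliberately overfitted model whose confidence is highest on the exact points it was trained on. The data, the confidence function, and the 0.95 threshold are all hypothetical. Real attacks are statistical and far less clear-cut.

```python
import math

# Hypothetical private training set (1-D points) that the model has memorized.
TRAINING_DATA = [0.5, 2.0, 3.7]

def model_confidence(x):
    """Toy overfitted model: confidence decays with distance to the nearest training point."""
    nearest = min(abs(x - t) for t in TRAINING_DATA)
    return math.exp(-nearest)

def guess_is_member(x, threshold=0.95):
    """Attacker's rule: unusually high confidence suggests x was in the training set."""
    return model_confidence(x) >= threshold

# The attacker only queries the model and never sees TRAINING_DATA directly.
guesses = {x: guess_is_member(x) for x in [2.0, 2.6, 3.7]}
print(guesses)
```

Here the attacker learns something about the training set purely from the outputs, which is why this counts as a leak during inference.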
How can zkML help with this? zkML allows for inference on sensitive data without exposing the training data itself.
This is achieved with ZK proof systems such as Plonky2 and Halo2. Plonky2 is often cited as one of the fastest ZK proof systems currently available.
With zkML, the training data stays with the party that generates the proof, so attackers cannot directly access it.
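One building block behind this is a commitment: the model owner publishes a short fingerprint of the model instead of the model itself. The sketch below shows only that piece, using a salted SHA-256 hash from the Python standard library. It is not a zero-knowledge proof, and the weights and helper name are hypothetical. In a zkML system, a proof system such as those named above would additionally prove that an output came from the committed model, so the weights never need to be revealed.

```python
import hashlib
import json
import secrets

def commit(weights, salt):
    """Salted SHA-256 commitment to the weights; the salt makes small models hard to guess."""
    payload = json.dumps(weights).encode() + salt
    return hashlib.sha256(payload).hexdigest()

# Model owner: keeps weights and salt private, publishes only the commitment.
private_weights = [0.8, -0.5, 0.1]  # hypothetical values
salt = secrets.token_bytes(16)
public_commitment = commit(private_weights, salt)

# Anyone later given the weights and salt can check them against the commitment.
matches = commit(private_weights, salt) == public_commitment
tampered = commit([0.8, -0.5, 0.2], salt) == public_commitment
print(matches, tampered)
```

The commitment pins down which model is being used, and a change to even one weight produces a different fingerprint.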
Current Development Status of zkML
As of now, zkML is still in its early stages, with several startups working on building zkML infrastructure.
Among them, RISC Zero is collaborating with Spice AI to create a complete zkML solution for developers.
Ingonyama is developing hardware specifically for ZK technology, which may lower the barrier to entry into the ZK field. zkML may also be used in the model training process.
Modulus is using zkML to bring artificial intelligence inference on-chain. They currently have six partners who are building different zkML use cases:
For example, Upshot has built a price prediction model, Worldcoin is using Modulus for private identity verification, and AI Arena is using zkML in the economic model of its game.
Privacy-preserving blockchain projects, such as Oasis Protocol, Secret Network, and Aleo, are also exploring zkML-based use cases within their ecosystems. Additionally, NOYA.ai is using zkML to build omnichain DeFi strategies.
OraProtocol is building a trustless machine learning inference protocol based on ZK, allowing developers to use zkML inference to build any decentralized application powered by machine learning and secured by Ethereum.
The zkML narrative is still in its infancy, but I expect a hype cycle to form around it in the coming months of this bull market, so now is a good time to track the field closely and prepare accordingly.