Hack VC Partner: 8 Real Advantages of AI + Crypto

Hack VC
2024-06-24 23:52:36
An analysis of the crypto x AI concept and its real challenges and opportunities: which ideas are empty promises, and which are feasible?

Author: Ed Roman, Managing Partner at Hack VC

Compiled by: 1912212.eth, Foresight News

AI + Crypto is one of the most attention-grabbing frontier areas in the cryptocurrency market of late, spanning decentralized AI training, GPU DePINs, and censorship-resistant AI models.

Behind these dazzling advancements, we can't help but ask: is this a genuine technological breakthrough or just riding the hype? This article will clear the fog for you, analyzing the crypto x AI vision and discussing the real challenges and opportunities within it, revealing which are hollow promises and which are truly feasible.

Vision #1: Decentralized AI Training

The issue with on-chain AI training is that it requires high-speed communication and coordination between GPUs, because neural networks rely on backpropagation during training. Nvidia has two innovations for this (NVLink and InfiniBand). These technologies make GPU communication extremely fast, but they are local-only technologies, applicable only to GPU clusters located within a single data center (50+ Gbps speeds).

If a decentralized network is introduced, training would suddenly slow down by several orders of magnitude due to the added network latency and reduced bandwidth. That speed is impractical for AI training compared to the throughput of Nvidia's high-speed interconnects within a data center.
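
To make the size of that gap concrete, here is a back-of-the-envelope sketch in Python. Apart from the 50 Gbps figure quoted above, every number is an illustrative assumption rather than a measurement: a hypothetical 7-billion-parameter model, 16-bit gradients, and a 100 Mbps home uplink. The estimate ignores latency, gradient compression, and any overlap of communication with computation.

```python
# Back-of-the-envelope estimate; the constants below are illustrative assumptions.

def gradient_sync_seconds(num_params: int, bytes_per_param: int, bandwidth_gbps: float) -> float:
    """Seconds to push one full copy of the gradients over a link (latency ignored)."""
    total_bits = num_params * bytes_per_param * 8
    return total_bits / (bandwidth_gbps * 1e9)

NUM_PARAMS = 7_000_000_000   # hypothetical 7B-parameter model
BYTES_PER_PARAM = 2          # assuming 16-bit gradients
DATACENTER_GBPS = 50.0       # the "50+ Gbps" figure quoted above
HOME_GBPS = 0.1              # assumed 100 Mbps consumer uplink

datacenter_s = gradient_sync_seconds(NUM_PARAMS, BYTES_PER_PARAM, DATACENTER_GBPS)
home_s = gradient_sync_seconds(NUM_PARAMS, BYTES_PER_PARAM, HOME_GBPS)
slowdown = home_s / datacenter_s

print(f"data center: {datacenter_s:.2f} s per sync")
print(f"home link:   {home_s:.0f} s per sync")
print(f"slowdown:    {slowdown:.0f}x")
```

Under these assumptions, a single gradient exchange is roughly 500 times slower over the home link, and that is before network latency and the many exchanges in a training run are taken into account.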

Note that the following innovations may bring hope for the future:

  • Large-scale distributed training over InfiniBand is happening, as NVIDIA itself supports distributed non-local training over InfiniBand through the NVIDIA Collective Communications Library. However, it is still in its infancy, and adoption metrics are yet to be determined. The bottleneck imposed by the laws of physics over distance still applies, so local training over InfiniBand remains much faster.
  • Some new research on decentralized training has been published, which reduces communication synchronization time, potentially making decentralized training more practical in the future.
  • Intelligent sharding and scheduling of model training help improve performance. Similarly, new model architectures may be specifically designed for future distributed infrastructures (Gensyn is researching in these areas).

The data aspect of training is also challenging. Any AI training process involves handling vast amounts of data. Typically, models are trained on centralized secure data storage systems that have high scalability and performance. This requires the transfer and processing of several TB of data, and this is not a one-time cycle. Data is often noisy and contains errors, so it must be cleaned and transformed into a usable format before training the model. This phase involves repetitive tasks of standardization, filtering, and handling missing values. All of these face severe challenges in a decentralized environment.

Training is also iterative, which does not lend itself well to Web3. OpenAI went through thousands of iterations to achieve its results. In an AI team, a data scientist's most basic workflow includes defining objectives, preparing data, and analyzing and organizing that data to extract important insights and make it suitable for modeling. Machine learning models are then developed to solve the defined problem, and their performance is validated using test datasets. The process is iterative: if the current model does not perform as expected, experts return to the data collection or model training phase to improve results. Imagine how this process would unfold in a decentralized environment, where adapting existing frameworks and tools to Web3 is itself a challenge.

Another issue with training AI models on-chain is that this market is much less interesting than inference. Currently, training large language models requires significant GPU computing resources. In the long run, however, inference will become the primary use case for GPUs. Consider how many large language models need to be trained to meet global demand, compared to the number of customers using those models: which is greater?

Vision #2: Achieving Consensus Through Overly Redundant AI Inference Computation

Another challenge regarding crypto and AI is verifying the accuracy of AI inference, as you cannot fully trust a single centralized party to perform inference operations, and there is a potential risk of nodes behaving improperly. This challenge does not exist in Web2 AI, as there is no decentralized consensus system.

The solution is redundant computation, allowing multiple nodes to repeat the same AI inference operation, enabling operation in a trustless environment and avoiding single points of failure.

However, the problem with this approach is the extreme shortage of high-end AI chips. The wait time for high-end NVIDIA chips can be several years, leading to price increases. If you require AI inference to be re-executed multiple times across nodes, the high costs will multiply, making it unfeasible for many projects.
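
The trade-off can be sketched in a few lines of Python. This is a minimal illustration, not any particular protocol's mechanism: the node outputs, the 4-cent price, and both helper functions are hypothetical, and it assumes inference outputs are deterministic enough to compare for equality.

```python
from collections import Counter

def majority_result(results: list[str]) -> str:
    """Return the output reported by a strict majority of nodes."""
    winner, votes = Counter(results).most_common(1)[0]
    if votes * 2 <= len(results):
        raise ValueError("no strict majority among node outputs")
    return winner

def redundant_cost(cost_per_inference_cents: int, replicas: int) -> int:
    """Total cost when every replica re-runs the same inference."""
    return cost_per_inference_cents * replicas

# Hypothetical outputs from three nodes; the third node misbehaves.
node_outputs = ["label:cat", "label:cat", "label:dog"]
agreed = majority_result(node_outputs)
total_cents = redundant_cost(cost_per_inference_cents=4, replicas=len(node_outputs))
print(agreed, total_cents)
```

The misbehaving node is outvoted, but the bill scales linearly with the number of replicas: three nodes means paying three times for one answer.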

Vision #3: Near-Term Web3-Specific AI Use Cases

Some suggest that Web3 should have its own unique AI use cases that specifically target Web3 customers. Examples include Web3 protocols that use AI to risk-score DeFi pools, Web3 wallets that suggest new protocols to users based on their wallet history, or Web3 games that use AI to control non-player characters (NPCs).

In the short term, this is a nascent market where use cases are still being explored. Some challenges include:

  • Because market demand is still in its infancy, Web3-native use cases require fewer potential AI deals.
  • There are fewer customers: Web3 customers are several orders of magnitude fewer than Web2 customers, so the market is less fragmented.
  • Customers themselves are less stable, as they are often underfunded startups, and some may fade away over time. Web3 AI service providers catering to Web3 customers may need to re-acquire part of their customer base to replace those that have disappeared, making it extremely challenging to scale their business.

In the long run, we are very optimistic about Web3-native AI use cases, especially as AI agents become more prevalent. We envision a future where any given Web3 user will have a multitude of AI agents to assist them in completing tasks.

Vision #4: Consumer-Grade GPU DePIN

There are many decentralized AI computing networks that rely on consumer-grade GPUs rather than data centers. Consumer GPUs are well-suited for low-end AI inference tasks or consumer use cases that are flexible in terms of latency, throughput, and reliability. However, for serious enterprise use cases (which constitute the majority of important markets), customers require networks with higher reliability compared to home machines, and if they have more complex inference tasks, they typically need higher-end GPUs. Data centers are better suited for these more valuable customer use cases.

Note that we believe consumer-grade GPUs are suitable for demonstrations and for individuals and startups that can tolerate lower reliability. However, these customers are of lower value, so we believe that DePINs tailored for Web2 enterprises will be more valuable in the long run. Thus, GPU DePIN projects have evolved from primarily using consumer-grade hardware in their early stages to now having A100/H100 and cluster-level availability.

Reality: Actual Use Cases of Crypto x AI

Now we turn to use cases that can provide real benefits. These are the real wins where crypto x AI can add significant value.

Real Benefit #1: Serving Web2 Customers

McKinsey estimates that, across the 63 use cases it analyzed, generative AI could add the equivalent of $2.6 trillion to $4.4 trillion annually. By comparison, the total GDP of the UK in 2021 was $3.1 trillion. This would increase the impact of all AI by 15% to 40%. If we also consider the impact of embedding generative AI into software currently used for tasks beyond those use cases, the estimated impact would roughly double.

If you run the numbers on the above estimates, the total market value of global AI (beyond generative AI) could reach several tens of trillions of dollars. In contrast, the total value of all cryptocurrencies today (including Bitcoin and all altcoins) is only about $2.7 trillion. So let's face it: in the short term, the vast majority of customers needing AI will be Web2 customers, because the Web3 customers who actually need AI will make up only a small portion of that $2.7 trillion (consider that BTC is part of this market, and Bitcoin itself does not need or use AI).
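
The calculation behind this comparison can be reconstructed roughly as follows. This is a sketch, not McKinsey's own method: it assumes the 15% to 40% uplift is measured against the existing (non-generative) AI impact, and that the smallest uplift pairs with the largest baseline and vice versa. All figures are in trillions of dollars per year.

```python
GEN_AI_LOW, GEN_AI_HIGH = 2.6, 4.4     # generative AI estimate quoted above
UPLIFT_LOW, UPLIFT_HIGH = 0.15, 0.40   # relative increase in overall AI impact
CRYPTO_MARKET_CAP = 2.7                # total crypto value quoted above

# Assumption: uplift = generative AI impact / existing AI impact.
baseline_low = GEN_AI_HIGH / UPLIFT_HIGH    # about 11
baseline_high = GEN_AI_LOW / UPLIFT_LOW     # about 17.3

# Total AI impact, first as stated, then with the generative share doubled
# (the "roughly double" scenario for AI embedded in other software).
total_low = baseline_low + GEN_AI_LOW
total_high = baseline_high + GEN_AI_HIGH
total_high_doubled = baseline_high + 2 * GEN_AI_HIGH

print(f"implied existing AI impact: {baseline_low:.1f} to {baseline_high:.1f}")
print(f"total AI impact:            {total_low:.1f} to {total_high_doubled:.1f}")
print(f"multiple of crypto value:   {total_low / CRYPTO_MARKET_CAP:.1f}x "
      f"to {total_high_doubled / CRYPTO_MARKET_CAP:.1f}x")
```

Under these assumptions the total AI impact lands somewhere between roughly $14 trillion and $26 trillion a year, or about five to ten times the value of the entire crypto market.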

Web3 AI use cases are just beginning, and it is still unclear how large this market will be. But one thing is certain: for the foreseeable future, it will be only a small fraction of the Web2 market. We believe Web3 AI still has a bright future, but this simply means that, for now, the most powerful application of Web3 AI is serving Web2 customers.

Examples of Web2 customers that could benefit from Web3 AI include:

  • Vertical-specific software companies built from the ground up to be AI-centric (e.g., Cedar.ai or Observe.ai)
  • Large enterprises fine-tuning models for their own purposes (e.g., Netflix)
  • Rapidly growing AI providers (e.g., Anthropic)
  • Software companies integrating AI into existing products (e.g., Canva)

These are relatively stable customer personas, as the customers are usually large and valuable. They are unlikely to go out of business quickly, and they represent a huge potential customer base for AI services. Web3 AI services that serve Web2 customers will benefit from this stable customer base.

But why would Web2 customers want to use the Web3 stack? The next part of this article elaborates on this situation.

Real Benefit #2: Reducing GPU Usage Costs Through GPU DePIN

GPU DePIN aggregates underutilized GPU computing power (the most reliable of which comes from data centers) and makes it available for AI inference. A simple analogy is "Airbnb for GPUs."

The reason we are excited about GPU DePIN is that, as mentioned above, NVIDIA chips are in short supply, while there are currently wasted GPU cycles that could be used for AI inference. The hardware owners have already incurred sunk costs and their equipment is underutilized, so these partially idle GPUs can be offered at a much lower cost than the status quo, since doing so is effectively "found money" for the hardware owners.

Examples include:

  • AWS machines. If you want to rent an H100 from AWS today, you must commit to a one-year lease due to limited market supply. This creates waste, as you probably will not use the GPU seven days a week, 365 days a year.
  • Filecoin mining hardware. Filecoin has a large subsidized supply but not much actual demand. Filecoin has never found true product-market fit, so Filecoin miners face the risk of going out of business. These machines are equipped with GPUs that can be repurposed for low-end AI inference tasks.
  • ETH mining hardware. When Ethereum transitioned from PoW to PoS, this quickly released a large amount of hardware that can be repurposed for AI inference.

Note that not all GPU hardware is suitable for AI inference. One obvious reason is that older GPUs do not have the amount of GPU memory required for LLMs, although there are some interesting innovations that can help here. For example, Exabits' technology loads active neurons into GPU memory and inactive neurons into CPU memory, predicting which neurons need to be active or inactive. This allows low-end GPUs to handle AI workloads even with limited GPU memory, which effectively makes them more useful for AI inference.
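
The general idea of splitting neurons between GPU and CPU memory can be illustrated with a toy placement routine. This is not Exabits' implementation, which the article does not describe; the function name, the neuron labels, and the predicted activation frequencies are all hypothetical.

```python
def place_neurons(activation_freq: dict[str, float], gpu_slots: int) -> tuple[list[str], list[str]]:
    """Put the neurons predicted to fire most often on the GPU, the rest on the CPU."""
    ranked = sorted(activation_freq, key=activation_freq.get, reverse=True)
    return ranked[:gpu_slots], ranked[gpu_slots:]

# Hypothetical predictor output: fraction of requests in which each neuron is active.
predicted = {"n0": 0.91, "n1": 0.05, "n2": 0.64, "n3": 0.02, "n4": 0.33}

# Assume the low-end GPU only has room for two of the five neurons.
on_gpu, on_cpu = place_neurons(predicted, gpu_slots=2)
print("GPU:", on_gpu)
print("CPU:", on_cpu)
```

The quality of such a scheme depends almost entirely on the predictor: every wrong guess means a neuron has to be served from slower CPU memory.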

Web3 AI DePINs need to evolve their products over time and provide enterprise-level services such as single sign-on, SOC 2 compliance, service level agreements (SLA), etc. This is similar to the services that current cloud service providers offer to Web2 customers.

Real Benefit #3: Censorship-Resistant Models to Avoid OpenAI Self-Censorship

There has been much discussion about AI censorship regimes. For example, Turkey temporarily banned OpenAI (it later reversed course once OpenAI improved its compliance). We find national-level censorship regimes uninteresting, as countries need to adopt AI to remain competitive.

OpenAI also engages in self-censorship. For instance, OpenAI does not handle NSFW content, nor will it predict the next presidential election. We believe these AI use cases are not only interesting but also represent a huge market, yet OpenAI avoids this market for political reasons.

Open source is a great solution because GitHub repositories are not influenced by shareholders or boards. Venice.ai is one example, promising to protect privacy and operate in a censorship-resistant manner. Web3 AI can take this to the next level by running these open-source software (OSS) models on lower-cost GPU clusters to perform inference. For these reasons, we believe OSS + Web3 is an ideal combination to pave the way for censorship-resistant AI.

Real Benefit #4: Avoiding Sending Personally Identifiable Information to OpenAI

Large enterprises have privacy concerns regarding their internal data. For these customers, trusting a third party like OpenAI to hold this data can be difficult.

For these enterprises, having their internal data suddenly appear on a decentralized network in Web3 may seem even more concerning (on the surface). However, there are innovations in privacy-enhancing technologies for AI:

  • Trusted Execution Environments (TEE), such as Super Protocol
  • Fully Homomorphic Encryption (FHE), such as Fhenix.io (a portfolio company of Hack VC) or Inco Network (both supported by Zama.ai), and Bagel's PPML

These technologies are still evolving, and performance is expected to keep improving with the upcoming release of zero-knowledge (ZK) and FHE ASICs. The long-term goal is to protect enterprise data during model fine-tuning. With the emergence of these protocols, Web3 may become a more attractive place for privacy-preserving AI computation.

Real Benefit #5: Leveraging the Latest Innovations from Open Source Models

Over the past few decades, open-source software has been steadily eating into the market share of proprietary software. We view an LLM as simply a form of proprietary software that is ripe for disruption by OSS. Notable challengers include Llama, RWKV, and Mistral.ai. Over time, this list will undoubtedly continue to grow (a more comprehensive list can be found on Openrouter.ai). By leveraging Web3 AI (powered by OSS models), one can capitalize on these new innovations.

We believe that over time, the global development workforce of open source combined with cryptocurrency incentives can drive rapid innovation in open-source models and the agents and frameworks built on top of them. One example of an AI agent protocol is Theoriq. Theoriq utilizes OSS models to create a composable AI agent interconnection network that can be assembled to create higher-level AI solutions.

We are confident in this because, over time, most innovations in "developer software" have gradually been overtaken by OSS. Microsoft was once a proprietary software company, and now it is the number one contributor to GitHub. There is a reason for this: if you look at how Databricks, PostgreSQL, MongoDB, and others have disrupted proprietary databases, you see OSS disrupting an entire industry, which makes the precedent here very compelling.

However, there is a catch. One tricky aspect of open-source large language models (OSS LLMs) is that OpenAI has begun signing paid data licensing agreements with organizations such as Reddit and The New York Times. If this trend continues, OSS LLMs may find it harder to compete because of the financial barrier to acquiring data. Nvidia may double down on confidential computing as an enabler of secure data sharing. Time will tell how this plays out.

Real Benefit #6: Achieving Consensus Through Random Sampling with High Slashing Costs or Through ZK Proofs

One of the challenges of Web3 AI inference is verification. Validators are assumed to have an opportunity to cheat on their results in order to earn fees, so verifying inferences is an important safeguard. Note that such cheating has not actually occurred yet, as AI inference is still in its infancy, but it is inevitable unless measures are taken to deter it.

The standard Web3 approach is to have multiple validators repeat the same operation and compare results. As mentioned earlier, the main challenge with this approach is that AI inference is very expensive due to the current shortage of high-end Nvidia chips. Given that Web3's appeal lies in providing lower-cost inference through underutilized GPU DePINs, redundant computation would severely undermine that value proposition.

A more promising solution is to generate ZK proofs for off-chain AI inference computation. In this case, a succinct ZK proof can be verified to determine whether a model was trained correctly or whether inference was executed correctly (known as zkML). Examples include Modulus Labs and ZKonduit. Since ZK operations are computationally intensive, the performance of these solutions is still in its early stages. However, we expect the situation to improve with the release of ZK hardware ASICs in the near future.

Another promising idea is a somewhat "optimistic" sampling-based approach to AI inference. In this model, only a small portion of the results generated by validators is verified, but the economic cost of slashing is set high enough that a validator caught cheating faces a strong economic deterrent. This way, you save on redundant computation.
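
The economics of the sampling approach can be sketched with a simple expected-value model. This is an illustration under stated assumptions, not any protocol's actual parameters: a validator gains a fixed amount each time it cheats undetected, each result is audited independently with a fixed probability, and a detected cheat costs a fixed slashing penalty. The function names and numbers are hypothetical.

```python
def min_slash_to_deter(gain_per_cheat: float, audit_probability: float) -> float:
    """Penalty above which cheating has negative expected value.

    Cheating pays `gain_per_cheat` with probability (1 - p) and costs the
    slash with probability p, so it is unprofitable when
    p * slash > (1 - p) * gain_per_cheat.
    """
    if not 0.0 < audit_probability <= 1.0:
        raise ValueError("audit_probability must be in (0, 1]")
    return gain_per_cheat * (1.0 - audit_probability) / audit_probability

def compute_overhead(audit_probability: float) -> float:
    """Compute spent per result relative to running the inference once."""
    return 1.0 + audit_probability

GAIN = 1.0          # hypothetical profit from one undetected cheat, in fee units
AUDIT_RATE = 0.05   # assume 5% of results are re-executed

threshold = min_slash_to_deter(GAIN, AUDIT_RATE)
overhead = compute_overhead(AUDIT_RATE)
print(f"slash must exceed {threshold:.1f} fee units")
print(f"compute overhead: {overhead:.2f}x")
```

With a 5% audit rate, the penalty only needs to exceed about 19 times the gain from a single cheat, while the extra compute is 5% rather than the 200% or more that full triple redundancy would cost.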

Another promising idea is watermarking and fingerprinting solutions, such as the one proposed by Bagel Network. This is similar to how Amazon Alexa provides quality assurance for AI models within its millions of devices.

Real Benefit #7: Saving on Fees (OpenAI's Margin) Through OSS

The next opportunity that Web3 brings to AI is cost democratization. So far, we have discussed saving GPU costs through DePIN. But Web3 also offers the opportunity to save on the profit margins of centralized Web2 AI services (such as OpenAI, which has over $1 billion in annual revenue as of this writing). These savings come from using OSS models instead of proprietary models, since the model creators are not trying to profit.

Many OSS models will remain completely free, providing the best economics for customers. However, some OSS models may attempt to monetize as well. Consider that only 4% of all models on Hugging Face are trained by companies with budgets to help subsidize them. The remaining 96% are trained by the community. This group (96% of Hugging Face) has real underlying costs (including compute costs and data costs). Therefore, these models will need to be monetized in some way.

There are several proposals for monetizing OSS models. One of the most interesting is the concept of an "Initial Model Offering" (IMO), which involves tokenizing the model itself, reserving a portion of the tokens for the team, and directing some of the model's future revenue streams to token holders, although there are certainly legal and regulatory hurdles here.
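
A pro-rata revenue split of this kind might look like the following sketch. It is purely illustrative: the holder names, token balances, and the 20% revenue share are made-up values, and a real implementation would live in a smart contract rather than in Python. Amounts are kept as integer cents and basis points to avoid rounding surprises.

```python
def distribute_revenue(revenue_cents: int, holder_share_bps: int,
                       balances: dict[str, int]) -> dict[str, int]:
    """Split a share of model revenue among token holders, pro rata by balance."""
    pool = revenue_cents * holder_share_bps // 10_000
    total_tokens = sum(balances.values())
    if total_tokens == 0:
        raise ValueError("no tokens outstanding")
    # Integer division rounds down; any remainder stays undistributed.
    return {holder: pool * amount // total_tokens for holder, amount in balances.items()}

# Hypothetical token allocation: 30% reserved for the team, the rest sold.
balances = {"team": 300, "alice": 500, "bob": 200}

# Assume 20% (2,000 basis points) of $1,000 in usage revenue goes to holders.
payouts = distribute_revenue(revenue_cents=100_000, holder_share_bps=2_000, balances=balances)
print(payouts)
```

Here the holders share $200, split 30/50/20 in line with their balances, while the remaining 80% of revenue stays with whoever operates the model.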

Other OSS models will attempt to monetize through usage. Note that if this becomes a reality, OSS models may increasingly resemble their profit-generating Web2 counterparts. Realistically, though, the market will be split in two, with some models remaining completely free.

Real Benefit #8: Decentralized Data Sources

One of the biggest challenges AI faces is finding the right data to train models. We previously mentioned that decentralized AI training has its challenges. But what about using decentralized networks to obtain data (which can then be used for training elsewhere, even in traditional Web2 venues)?

This is precisely what startups like Grass are doing. Grass is a decentralized network of "data scrapers": individuals who contribute their machines' idle processing power to sourcing data for training AI models. The hypothesis is that, at scale, this data sourcing can outperform any single company's in-house efforts thanks to the sheer power of a large network of incentivized nodes. This includes not only obtaining more data but also obtaining it more frequently to keep it relevant and up-to-date. In fact, it is virtually impossible to stop a decentralized army of data scrapers, as they are inherently decentralized and do not reside at a single IP address. Grass also has a network that cleans and standardizes the data so that it is useful once scraped.

Once you have the data, you also need a place to store it on-chain, along with the LLMs generated from that data.

Note that the role of data in Web3 AI may change in the future. The status quo for LLMs today is to pre-train a model on data and refine it over time with more data. However, since data on the internet is constantly changing, these models are always somewhat outdated, so the responses from LLM inference are slightly inaccurate.

The future direction may be a new paradigm: "real-time" data. The idea is that when a large language model (LLM) is asked an inference question, data freshly collected from the internet can be injected into the LLM through the prompt. This way, the LLM uses the most up-to-date data. Grass is researching this as well.
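
In practice, injecting fresh data through the prompt could look something like the sketch below. It shows only the prompt-assembly step; the scraping itself is omitted, and the function name, prompt wording, and sample snippets are hypothetical placeholders rather than anything Grass has published.

```python
from datetime import datetime, timezone

def build_prompt(question: str, snippets: list[str], fetched_at: datetime) -> str:
    """Prepend freshly collected text to the user's question."""
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return (
        f"Answer using the context below, collected at {fetched_at.isoformat()}.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Placeholder snippets standing in for data scraped moments before the request.
fresh_snippets = [
    "Example headline published this morning.",
    "Example forum post from the last hour.",
]

prompt = build_prompt(
    question="What changed today?",
    snippets=fresh_snippets,
    fetched_at=datetime(2024, 6, 24, 12, 0, tzinfo=timezone.utc),
)
print(prompt)
```

The model's weights stay exactly as they were; only the prompt changes, which is why this approach can keep answers current without retraining.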

Special thanks to the following individuals for their feedback and assistance with this article: Albert Castellana, Jasper Zhang, Vassilis Tziokas, Bidhan Roy, Rezo, Vincent Weisser, Shashank Yadav, Ali Husain, Nukri Basharuli, Emad Mostaque, David Minarsch, Tommy Shaughnessy, Michael Heinrich, Keccak Wong, Marc Weinstein, Phillip Bonello, Jeff Amico, Ejaaz Ahamadeen, Evan Feng, JW Wang.
