Coinbase in-depth analysis: Is Crypto x AI a mirage?

DAOSquare
2024-03-08 18:43:38

Crypto's AI Mirage

Author: David Han, Coinbase Institutional Research Analyst

Compiled by: DAOSquare

Published on: March 6, 2024

Abstract: AI tokens have drawn broad support across the Crypto and AI markets, but many may lack sustainable demand drivers in the short to medium term.

Overview

Decentralized crypto AI (Crypto-AI) applications face numerous headwinds in the short to medium term that may hinder their adoption. However, the constructive backdrop across both Crypto and AI may sustain the trading narrative for some time.

Key Points

  • The intersection of AI and Crypto is broad, yet few have a deep understanding of it. We believe that different subfields at this intersection have distinctly different opportunities and development cycles.
  • We generally believe that decentralization alone is not a sufficient competitive advantage for AI products; they must also maintain functional parity with centralized competitors in other key areas.
  • Our contrarian view is that widespread attention on the AI industry has led to the potential overvaluation of many AI tokens, and that many AI tokens may lack sustainable demand drivers in the short to medium term.

In recent years, ongoing breakthroughs in AI (especially generative AI) have drawn significant attention to the AI industry and created opportunities for crypto projects at the intersection of the two fields. In a report we published in June 2023, we highlighted some possibilities in the space and noted that the AI sector appeared undervalued in Crypto's overall capital allocation. Since then, the crypto AI space has developed rapidly. At this juncture, we believe it is important to highlight certain practical challenges that may hinder its widespread adoption.

The rapid changes in AI make us cautious about bold claims from some Crypto platforms that their unique positioning will disrupt the entire industry, rendering the long-term and sustainable value accumulation of most AI tokens uncertain, especially for projects with fixed token models. Conversely, we believe that certain emerging trends in the AI sector may actually make the adoption of Crypto-based innovations more difficult, given broader market competition and regulatory factors.

That said, we believe the intersection of AI and Crypto is extensive and presents different opportunities. The adoption rate in certain subfields may be faster, although many such areas lack tradable tokens. However, this does not seem to deter investors' appetite. We find that the performance of AI-related crypto tokens has been driven by the AI market frenzy, supporting their positive price movements even on days when Bitcoin trading declines. Therefore, we believe many AI-related tokens may continue to be traded as representatives of AI advancements.


## Major Trends in AI

In our view, one of the most important trends in the AI field (related to crypto AI products) is the continuation of the culture surrounding open-source models. There are over 530,000 models publicly available on Hugging Face (a collaborative platform for the AI community) for researchers and users to run and fine-tune. The role of Hugging Face in AI collaboration is no different from relying on GitHub for code hosting or Discord for community management (both widely used in Crypto). We believe this situation is unlikely to change in the near future unless there is severe mismanagement.

The models available on Hugging Face range from large language models (LLMs) to generative image and video models, coming from major industry players like OpenAI, Meta, and Google, as well as independent developers. Some open-source language models even outperform the leading closed-source models in terms of throughput (while maintaining comparable output quality), ensuring a degree of competition between open-source models and commercial models (see Figure 1). Importantly, we believe this vibrant open-source ecosystem, combined with a competitive commercial sector, has driven an industry where poorly performing models will be weeded out by competition.

The second trend is the increasing quality and cost-effectiveness of smaller models (emphasized in LLM research as early as 2020 and recently highlighted in a paper by Microsoft), which, combined with the open-source culture, points toward a future of high-performance AI models that run locally. Under certain benchmarks, some fine-tuned open-source models can even outperform leading closed-source models. In such a world, some AI models can run locally, maximizing decentralization. Of course, incumbent tech companies will continue to train and run larger models in the cloud, but the design space will involve trade-offs between the two.
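
To make the locally-run model point concrete, below is a minimal sketch, assuming the Hugging Face `transformers` library and using `microsoft/phi-2` purely as an example of a small open model; any comparably sized open checkpoint would work the same way.

```python
# Minimal sketch: querying a small open-source model on local hardware.
# Assumes the Hugging Face `transformers` library is installed;
# microsoft/phi-2 is used only as an example of a small open model.
from transformers import pipeline

generator = pipeline("text-generation", model="microsoft/phi-2")

prompt = "In one sentence, why are smaller language models attractive for local inference?"
result = generator(prompt, max_new_tokens=60, do_sample=False)

print(result[0]["generated_text"])
```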

Additionally, given the increasing complexity of AI model benchmarking (including issues like data contamination and shifting test scopes), we believe the output of generative models may ultimately be best evaluated by end-users in a free market. In fact, tools already exist for end-users to compare model outputs side by side, and some benchmarking companies provide similar services. The difficulty of benchmarking AI models can be seen in the growing variety of open LLM benchmarks, including MMLU, HellaSwag, TriviaQA, and BoolQ, each testing different use cases such as common-sense reasoning, academic topics, and various question formats.
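
As a rough illustration of how a multiple-choice benchmark like MMLU is scored (and why results are hard to compare across harnesses), the toy sketch below computes accuracy as the fraction of questions where a model's chosen letter matches the answer key. The two sample questions and the `ask_model` callable are hypothetical placeholders; real harnesses differ in prompt format, few-shot setup, and answer extraction, which is one reason reported scores diverge.

```python
# Toy MMLU-style scorer: accuracy over multiple-choice questions.
# `questions` and `ask_model` are hypothetical placeholders; real
# benchmark harnesses differ in prompting and answer extraction.
from typing import Callable

questions = [
    {"prompt": "Which planet is known as the Red Planet?\nA. Venus\nB. Mars\nC. Jupiter\nD. Mercury",
     "answer": "B"},
    {"prompt": "What is 7 * 8?\nA. 54\nB. 64\nC. 56\nD. 48",
     "answer": "C"},
]

def score(ask_model: Callable[[str], str]) -> float:
    """Return accuracy, assuming the model replies with a single letter A-D."""
    correct = sum(
        1 for q in questions
        if ask_model(q["prompt"]).strip().upper().startswith(q["answer"])
    )
    return correct / len(questions)

# A trivial stand-in "model" that always answers "B" scores 0.5 on this toy set.
print(score(lambda prompt: "B"))
```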

The third trend we observe in the AI field is that existing platforms with strong user lock-in or specific business problems can disproportionately benefit from AI integration. For example, the integration of GitHub Copilot with code editors enhances an already robust developer environment. Embedding AI interfaces into other tools such as email clients, spreadsheets, and customer relationship management software is also a natural use case for AI (e.g., Klarna's AI assistant can do the work of 700 full-time agents).

However, it is important to note that in many such scenarios, AI models do not spawn new platforms but merely enhance existing ones. Other AI models that improve traditional business processes (e.g., Meta's Lattice restored its advertising performance after Apple introduced App Tracking Transparency) often rely on proprietary data and closed systems. Since these types of AI models are vertically integrated into their core products and utilize proprietary data, they may always remain closed-source.

In the realm of AI hardware and computing, we see two additional related trends. The first is the shift in compute usage from training to inference. When AI models are first developed, significant computing resources are used to "train" the model on large datasets; now, the bulk of compute usage is shifting toward deploying and querying models, i.e., inference.

Nvidia disclosed in its February 2024 earnings call that about 40% of its business is inference, and Satya Nadella made similar comments in Microsoft's January earnings call, noting that "most" of their Azure AI usage is for inference. As this trend continues, we believe entities seeking to monetize their models will prioritize platforms that can reliably run models in a secure and production-ready manner.

The second major trend we see is the competitive landscape surrounding hardware architecture. Nvidia's H200 processor is set to launch in the second quarter of 2024, and the next-generation B100 is expected to further double performance. Additionally, Google's continued support for its proprietary tensor processing units (TPUs) and Groq's new language processing units (LPUs) may also enhance their market share in this field in the coming years (see Figure 2). These developments could change the cost dynamics of the AI industry and potentially benefit cloud service providers that can quickly adapt, procure hardware at scale, and set up any related physical networks and development tools.

Overall, the AI field is an emerging and rapidly developing domain. Since ChatGPT was first launched in November 2022 (although its underlying GPT-3 model has existed since June 2020), the pace of development in this field has been astonishing. Despite concerns about bias in generative AI models, we are already beginning to see the effects of market competition, with poorly performing models abandoned in favor of better alternatives. The industry's rapid development and upcoming regulations mean that as new solutions continuously flood the market, the industry's problem space will keep shifting.

The oft-touted refrain that "decentralization solves [insert problem]" seems to have become a consensus; in our view, however, it is premature for such a rapidly innovating field, and it preemptively addresses a centralization problem that may not actually exist. The reality is that through competition among many different companies and open-source projects, the AI industry already exhibits a good deal of decentralization across both technical and business verticals. Furthermore, on both technical and social levels, truly decentralized protocols are much slower at decision-making and consensus than centralized ones. This may be a barrier to balancing decentralization with competitive products at the current stage of AI development. That said, we do believe there are meaningful synergies between Crypto and AI, but they are more likely to manifest over a longer time frame.


## Defining the Scope of Opportunities

Broadly speaking, we divide the intersection of AI and Crypto into two main categories. The first comprises use cases where AI products improve the crypto industry itself. This includes creating human-readable transactions, improving blockchain data analytics, and using model outputs in permissionless protocols. The second comprises use cases that aim to disrupt traditional AI pipelines through decentralized computation, validation, and identity enabled by Crypto.

In our view, the use cases in the former category that align with clear business objectives are straightforward, and despite significant technical challenges, we believe more complex on-chain reasoning scenarios also hold long-term promise. Centralized AI models can improve Crypto much as they improve any other tech-centric industry, from developer tools and code audits to translating human language into on-chain actions. However, investments in this area are typically made into private companies via venture capital, and are therefore often overlooked by public markets.

What is less certain to us is the value proposition of the second category (i.e., that Crypto will disrupt the existing AI landscape). The challenges here are less about technical hurdles (which we believe are generally solvable in the long run) and more an uphill struggle against broader market and regulatory forces. Nevertheless, much of the recent attention on AI x Crypto has focused on this category, because these use cases are better suited to creating liquid tokens. This is the focus of the next section, since there are relatively few liquid tokens tied to centralized AI tools in Crypto (at least for now).


## The Role of Crypto in AI

To simplify, we analyze the potential impact of Crypto on AI across four main stages of the AI pipeline: (1) data collection, storage, and processing, (2) model training and inference, (3) validation of model outputs, and (4) tracking AI model outputs. A plethora of new crypto AI projects has emerged in these areas, although we believe many will face significant demand-side challenges and fierce competition from centralized companies and open-source solutions in the short to medium term.

Proprietary Data

Data is the foundation of all AI models and perhaps the key differentiator in the performance of specialized AI models. Historical blockchain data itself is a new rich data source for models, and certain projects (like Grass) aim to leverage Crypto incentives to acquire new datasets from the open internet. In this regard, Crypto has the opportunity to provide industry-specific datasets and incentivize the creation of new valuable datasets. (The recent $60 million annual data licensing agreement between Reddit and Google hints at a growing trend in the monetization of datasets in the future.)

Many early models (like GPT-3) mixed open datasets such as CommonCrawl, WebText2, books, and Wikipedia, and similar datasets are freely available on Hugging Face (which currently hosts over 110,000 datasets). However, many recently released closed-source models have not disclosed their final training dataset compositions, likely to protect their commercial interests. We believe the trend toward proprietary datasets, especially among commercial models, will continue and will make data licensing increasingly important.
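
As an illustration of how freely these open datasets circulate today, the sketch below pulls one with the Hugging Face `datasets` library; the `wikipedia` dataset and its `20220301.en` configuration are chosen only as a familiar example, and streaming is used to avoid downloading the full corpus.

```python
# Minimal sketch: loading an open dataset from Hugging Face.
# The `datasets` library and the public `wikipedia` corpus
# (20220301.en config) are used purely as a familiar example.
from datasets import load_dataset

ds = load_dataset("wikipedia", "20220301.en", split="train", streaming=True)

first_article = next(iter(ds))
print(first_article["title"])
print(first_article["text"][:200])  # first 200 characters of the article body
```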

Existing centralized data markets have already been helping to bridge the gap between data providers and consumers, and we believe this will create an emerging opportunity space for decentralized data market solutions between open-source data directories and enterprise competitors. In the absence of legal structures to support it, a purely decentralized data market would also need to build standardized data interfaces and channels, verify data integrity and configurations, and address the cold start problem of its products. Additionally, it would need to balance token incentives among market participants.

Moreover, decentralized storage solutions may ultimately find a niche in the AI industry, although we believe significant challenges remain. On one hand, channels for distributing open datasets already exist and are widely used. On the other hand, many proprietary dataset owners have strict security and compliance requirements. Currently, there are no regulatory pathways governing the hosting of sensitive data on decentralized storage platforms like Filecoin and Arweave. In fact, many enterprises are still transitioning from on-premises servers to centralized cloud storage providers. On a technical level, the decentralized nature of these networks is also currently incompatible with the data residency and physical data-isolation requirements that apply to sensitive data storage.

While price comparisons between decentralized storage solutions and mature cloud providers indicate that decentralized solutions may be cheaper on a per-storage-unit basis, we believe this overlooks the larger issues. First, in addition to ongoing operational costs, the upfront costs required to migrate systems between vendors must also be considered. Second, Crypto-based decentralized storage platforms need to match the better tools and integrations offered by mature cloud systems developed over the past two decades. From a business operations perspective, cloud solutions offer more predictable costs, contractual obligations, dedicated support teams, and a large pool of developer talent.

It is also worth noting that a comparison against only the "big three" cloud providers (AWS, Google Cloud Platform, and Microsoft Azure) is incomplete. Dozens of low-cost cloud companies also compete for market share by offering cheaper, bare-bones server services, and in our view they are the real near-term competitors for cost-sensitive consumers. That said, recent innovations, such as Filecoin's compute-over-data capabilities and Arweave's ao computing environment, may find a role with upcoming innovative projects that use less sensitive datasets, or with cost-sensitive (often smaller) companies that have not yet locked in a vendor.

Therefore, while there is certainly room for new Crypto products in the data space, we believe that short-term breakthroughs will occur in cases where they can generate unique value propositions. In our view, decentralized products will take longer to make substantial progress in areas where they directly compete with traditional and open-source competitors.

Training and Inference Models

The decentralized computing (DeComp) space in Crypto also positions itself as an alternative to centralized cloud computing, partly in response to the current GPU supply crunch. One proposed solution to this shortage, adopted by protocols like Akash and Render, is to repurpose idle computing resources into a decentralized network, thereby undercutting the prices of centralized cloud providers. Preliminary indicators suggest such projects are gaining traction in both user and supplier adoption. For instance, Akash's active leases (i.e., number of users) have tripled since the beginning of this year (see Figure 3), driven primarily by increased usage of its storage and computing resources.

However, since peaking in December 2023, the fees paid to the network have actually declined, as the supply of available GPUs has outpaced the growth in demand for those resources: even as more suppliers have joined the network, the share of GPUs being leased (which appears to be the largest fee driver proportionally) has fallen (see Figure 4). For a network where compute pricing can fluctuate with supply and demand, it is unclear to us where sustained, usage-driven native token demand will ultimately come from if supply-side growth keeps outpacing demand-side growth. We believe this token model may need to be re-evaluated in the future to optimize for market changes, although the long-term implications of such changes are currently unclear.

On a technical level, decentralized computing solutions also face challenges around network bandwidth. For large models that require multi-node training, the physical network infrastructure layer plays a crucial role. Data transfer speeds, synchronization overhead, and support for certain distributed training algorithms mean that specific network configurations and custom interconnects (such as InfiniBand) are needed for efficient execution, making decentralization difficult once cluster sizes exceed a certain scale.

Overall, we believe the long-term success of decentralized computing (and storage) faces fierce competition from centralized cloud providers. In our view, any adoption will play out over the long term, likely on a timescale at least as long as the cloud adoption cycle itself. Given the growing technical complexity of decentralized network development, coupled with the lack of comparably scaled development and sales teams, we believe fully realizing the vision of decentralized computing will be a challenging journey.

Validation and Trust Models

As AI models become increasingly important in our lives, concerns about their output quality and biases are growing. Certain crypto projects aim to find a decentralized, market-based solution to this issue by leveraging a set of algorithms to evaluate outputs across different categories. However, the aforementioned challenges surrounding model benchmarking, along with apparent costs, throughput, and quality trade-offs, make direct competition somewhat challenging. BitTensor is one of the largest AI-focused cryptocurrencies in this category, aiming to address this issue, although it still faces some technical challenges that may hinder its widespread application (see Appendix 1).

Additionally, trustless model inference (i.e., proving that model outputs are indeed generated by the claimed model) is another promising research area in Crypto x AI. However, we believe these solutions may face demand challenges as open-source models become smaller and easier to run locally. In a world where models can be downloaded and run locally, and where file integrity can be verified with established hashes and checksums, the value of trustless inference becomes less clear. Admittedly, many LLMs still cannot be trained and run on lightweight devices like smartphones, but powerful desktops (such as those used for high-end gaming) can already run many high-performance models.
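
That integrity check is straightforward in practice. Below is a minimal sketch that computes the SHA-256 digest of a downloaded model file and compares it to a published checksum; the file path and expected digest are hypothetical placeholders.

```python
# Sketch: verifying a locally downloaded model file against a published
# checksum. The model path and expected digest below are hypothetical.
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large model weights never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

model_path = Path("models/my-model.safetensors")     # hypothetical local file
published_checksum = "<expected sha256 hex digest>"  # hypothetical published value

if model_path.exists():
    if sha256_of(model_path) == published_checksum:
        print("Model file matches the published checksum.")
    else:
        print("Checksum mismatch: do not trust this file.")
```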

Data Provenance and Identity

As the outputs of generative AI become increasingly indistinguishable from human outputs, tracking AI-generated content has become a focal point. GPT-4 passes the Turing test roughly three times as often as GPT-3.5, and we can be almost certain that in the not-so-distant future we will be unable to tell whether an online persona is a machine or a real human. In such a world, establishing the humanity of online users and watermarking AI-generated content will become key capabilities.

Decentralized identifiers and proof-of-personhood mechanisms like Worldcoin aim to address the former issue of identifying humans on-chain. Similarly, publishing data hashes to the blockchain can help verify the age and provenance of content, thus aiding data provenance. However, similar to some of the previous sections, we believe the feasibility of Crypto-based solutions must be weighed against centralized alternatives.

Some countries, like China, link online personas to government-controlled databases. While much of the world is not as centralized, KYC provider alliances can also offer proof-of-personhood solutions independent of blockchain technology (potentially in a manner similar to trusted certificate authorities that form the bedrock of today's internet security). Research on AI watermarks is also underway to embed hidden signals in text and image outputs to allow algorithms to detect whether content is AI-generated. Many leading AI companies, including Microsoft, Anthropic, and Amazon, have publicly committed to adding such watermarks to their generated content.

Moreover, for compliance reasons, many existing content providers are trusted to maintain strict records of content metadata. Therefore, users often trust the metadata associated with social media posts (though not their screenshots), even if they are stored centrally. It is important to note that any Crypto-based data provenance and identity solutions will need to integrate with user platforms to be widely effective. Thus, while Crypto-based solutions are technically feasible in proving identity and data provenance, we also believe their adoption is not a foregone conclusion and will ultimately depend on business, compliance, and regulatory requirements.


## Trading AI Narratives

Despite the issues outlined above, many AI tokens have outperformed Bitcoin and Ethereum, as well as major AI stocks like Nvidia and Microsoft, since Q4 2023. We believe this is because AI tokens benefit both from the broader Crypto market and from the surrounding AI frenzy (see Appendix 2). As a result, AI-focused tokens have rallied even when Bitcoin prices decline, creating upward volatility during Bitcoin downturns. Figure 5 illustrates the performance of AI tokens on days when Bitcoin trades lower.

Overall, we still believe the AI narrative trade lacks sustainable short-term demand drivers. The absence of clear adoption forecasts and metrics has left a wide space occupied by meme-like speculation, which we believe is unlikely to be sustainable over the long term. Ultimately, price and utility will converge; the open question is how long that takes, and whether utility rises to meet price or vice versa. That said, we do believe a sustained constructive Crypto market, paired with continued strength in the AI industry, may keep the Crypto AI narrative strong for some time.


## Conclusion

The role of Crypto in AI does not exist in a vacuum; any decentralized platform competes with existing centralized alternatives and must be analyzed in the context of broader business and regulatory requirements. Therefore, we believe that merely replacing centralized providers for the sake of "decentralization" is insufficient to drive meaningful market adoption. Generative AI models have existed for several years and have maintained a degree of decentralization due to market competition and open-source software.

A recurring theme in this report is that while Crypto-based solutions are often technically feasible, they still require significant work to achieve functional parity with more centralized platforms, and that is assuming those platforms stand still in the meantime. In fact, because decentralized development requires reaching consensus, it often moves more slowly than centralized development, which may pose challenges in a rapidly evolving field like AI.

In light of this, we believe the overlap between AI and Crypto is still in its infancy and may change rapidly in the coming years as the broader AI field develops. The decentralized AI future envisioned by many Crypto insiders is not guaranteed to materialize; indeed, the future of the AI industry itself remains largely uncertain. Therefore, we believe it is wise to proceed cautiously in such a market, and to examine more deeply how Crypto-based solutions can genuinely offer a meaningfully better alternative, or at least to understand the underlying trading narratives.


## Appendix 1: BitTensor

BitTensor incentivizes distinct intelligence markets across its 32 subnets. This aims to address some of the benchmarking issues above by letting subnet owners create game-like constraints to extract intelligence from information providers. For example, its flagship subnet 1 is centered on text prompting and incentivizes miners who "generate the best responses based on prompts sent by validators in that subnet", i.e., it rewards the miners whose text responses to a given prompt are judged best by the subnet's validators. This allows network participants to build a marketplace of model intelligence across a variety of domains.

However, this validation and reward mechanism is still in its early stages and is susceptible to adversarial attacks, especially if models are evaluated using other biased models (although progress has been made in this area with the use of new synthetic data for evaluating certain subnets). This is particularly true for "fuzzy" outputs like language and art, where evaluation metrics may be subjective, leading to the emergence of multiple benchmarks for model performance.

For instance, in practice the validation mechanism of BitTensor's subnet 1 works as follows:

Validators generate one or more reference answers, and all miners' responses are compared. Those whose responses are most similar to the reference answers will receive the highest rewards and ultimately the greatest incentives.

Current similarity algorithms use a combination of literal string matching and semantic matching as the basis for rewards, but a limited set of reference answers makes it difficult to capture the full range of stylistic preferences.
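
As an illustration of the scoring approach described above (and not BitTensor's actual implementation), the sketch below blends a literal string-overlap score with an embedding-based semantic score; the 30/70 weighting and the `all-MiniLM-L6-v2` embedding model are arbitrary illustrative choices.

```python
# Illustrative sketch (not BitTensor's code): score miner responses against
# a validator's reference answer by blending literal string overlap with
# embedding-based semantic similarity.
from difflib import SequenceMatcher

import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # arbitrary small embedding model

def reward(reference: str, response: str,
           w_literal: float = 0.3, w_semantic: float = 0.7) -> float:
    # Literal overlap: ratio of matching character blocks.
    literal = SequenceMatcher(None, reference, response).ratio()
    # Semantic similarity: cosine similarity between sentence embeddings.
    ref_vec, resp_vec = embedder.encode([reference, response])
    semantic = float(np.dot(ref_vec, resp_vec) /
                     (np.linalg.norm(ref_vec) * np.linalg.norm(resp_vec)))
    return w_literal * literal + w_semantic * semantic

reference_answer = "The capital of France is Paris."
miner_responses = ["Paris is the capital of France.", "France's capital is Lyon."]
scores = {r: reward(reference_answer, r) for r in miner_responses}
print(scores)  # higher-scoring responses would receive larger rewards
```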

It remains unclear whether models produced by BitTensor's incentive structure will ultimately outperform centralized models (or whether the best-performing models will pivot to BitTensor), or how they will adapt to other trade-offs, such as model scale and underlying computational costs. A user may freely choose a market for models that suit their preferences, potentially achieving similar resource allocation through an "invisible hand." That said, BitTensor indeed attempts to tackle a highly challenging problem within an ever-expanding problem space.


## Appendix 2: Worldcoin

Perhaps the most obvious example of an AI token riding the AI market frenzy is Worldcoin. It released the World ID 2.0 upgrade on December 13, 2023, to little attention, but surged roughly 50% after Sam Altman promoted Worldcoin on December 15. (Speculation about Worldcoin's future remains highly polarized, partly because Sam Altman is a co-founder of Tools for Humanity, the developer behind Worldcoin.) Similarly, OpenAI's release of Sora on February 15, 2024 caused the price to nearly triple, despite no related announcements on Worldcoin's Twitter or blog (see Figure 6). As of the time of writing, Worldcoin's valuation stands at approximately $80 billion, close to OpenAI's $86 billion valuation as of February 16 (a company with roughly $2 billion in annual revenue).


