The Era of Intelligent Agents: The Clash and Symbiosis of AI and Crypto

YBB Capital
2024-11-27 17:27:11
AI agents use chain-of-thought techniques to make decisions, reason, and act with greater autonomy and transparency, gradually breaking through the limitations of traditional AI and opening new opportunities for the crypto space.

Author: YBB Capital Researcher Zeke

I. The Novelty of Attention

Over the past year, as application-layer narratives failed to keep pace with the explosion of infrastructure, the crypto space has gradually turned into a contest for attention. From Silly Dragon to Goat, from Pump.fun to Clanker, the hunt for novelty has made this contest ever more intense. It began with the most cliché attention-for-money plays, quickly evolved into unified platforms that match attention seekers with attention providers, and now silicon-based entities have become the newest content providers. Among the bizarre carriers of meme coins, retail investors and VCs have finally reached a consensus: the AI Agent.

Attention is ultimately a zero-sum game, but speculation can indeed drive wild growth. In our article on UNI, we looked back at the start of blockchain's last golden age: the rapid growth of DeFi began with the liquidity-mining era kicked off by Compound Finance. The primitive on-chain game of that period was hopping in and out of pools offering APYs in the thousands or even tens of thousands of percent, and although it ended with those pools collapsing one after another, the frenzied influx of yield farmers left blockchain with unprecedented liquidity. DeFi eventually moved beyond pure speculation into a mature sector, meeting real financial needs in payments, trading, arbitrage, staking, and more. AI Agents are now going through the same wild phase, and what we are exploring is how Crypto can better integrate with AI and ultimately push the application layer to new heights.

II. How Agents Operate Autonomously

In our previous article, we briefly introduced the origin of AI Memes: Truth Terminal, and our outlook on the future of AI Agents. This article focuses primarily on the AI Agent itself.

Let's start with the definition of AI Agent. In the AI field, "Agent" is an old but vaguely defined term; its core emphasis is autonomy, meaning any AI that can perceive its environment and react to it can be called an agent. In today's usage, an AI Agent is closer to an intelligent agent: a system built around a large model that imitates human decision-making. In academia, this kind of system is seen as the most promising path to AGI (Artificial General Intelligence).

In early versions of GPT, we could clearly sense that large models resembled humans, but when answering many complex questions, they could only provide some seemingly plausible answers. The essential reason is that the large models at that time were based on probability rather than causality, and they lacked the abilities humans possess, such as using tools, memory, and planning. AI Agents can fill these gaps. So, to summarize with a formula: AI Agent = LLM (Large Language Model) + Planning + Memory + Tools.

A prompt-driven large model resembles a static person: it only comes to life when we feed it input. The goal of an agent is to be a more lifelike person. The agents in the field today are mainly fine-tuned models based on Meta's open-source Llama (70B or 405B parameter versions), equipped with memory and the ability to call API tools; for everything else they may still need human assistance or input (including interaction and collaboration with other agents). That is why the main agents in the space still exist as KOLs on social networks. To make agents more human-like, planning and action capabilities must be added, and within planning, the chain of thought is particularly critical.
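To make the formula above concrete, the sketch below shows a minimal agent loop in Python: the model plans the next step, memory carries prior steps forward, and tools supply abilities the model lacks. Everything here is an assumption for illustration; `call_llm`, the JSON step format, and the toy tool registry are placeholders, not the API of any particular framework.

```python
import json

# Hypothetical LLM call; in practice this would hit a hosted model endpoint
# (for example a fine-tuned Llama). The signature is an assumption.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model endpoint here")

# Tools: external capabilities the model cannot perform on its own.
TOOLS = {
    "search": lambda q: f"(search results for {q!r})",
    "calculator": lambda expr: str(eval(expr)),  # demo only; never eval untrusted input
}

def run_agent(task: str, max_steps: int = 5) -> str:
    memory: list[str] = []  # short-term memory: prior steps fed back into the prompt
    for _ in range(max_steps):
        history = "\n".join(memory)
        prompt = (
            f"Task: {task}\n"
            f"Previous steps:\n{history}\n"
            'Respond with JSON: {"thought": ..., "tool": ..., "input": ...} '
            'or {"thought": ..., "answer": ...} when you are done.'
        )
        step = json.loads(call_llm(prompt))  # planning: the model decides the next step
        if "answer" in step:
            return step["answer"]
        observation = TOOLS[step["tool"]](step["input"])  # action: invoke the chosen tool
        memory.append(f"thought={step['thought']} tool={step['tool']} -> {observation}")
    return "stopped: step limit reached"
```

A real deployment would add long-term memory (for example a vector store), a richer tool set, and safeguards around what the agent is allowed to execute; this sketch only shows where planning, memory, and tools sit in the loop.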

III. Chain of Thought (CoT)

The concept of Chain of Thought (CoT) first appeared in the 2022 Google paper "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models," which pointed out that generating a series of intermediate reasoning steps can enhance a model's reasoning ability, helping it better understand and solve complex problems.

A typical CoT Prompt consists of three parts: clear task description, logical basis supporting the task solution, and specific solution demonstration. This structured approach helps the model understand task requirements and gradually approach the answer through logical reasoning, thereby improving the efficiency and accuracy of problem-solving. CoT is particularly suitable for tasks requiring in-depth analysis and multi-step reasoning. For simpler tasks, CoT may not provide significant advantages, but for complex tasks, it can significantly enhance model performance by reducing error rates through step-by-step solving strategies.
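For illustration, here is what such a prompt might look like, written as a plain Python string with the parts described above: a task description, a worked demonstration that supplies the reasoning basis, and a new question the model should solve step by step. The arithmetic examples are the classic ones popularized by the CoT literature, included purely as a sketch.

```python
# A hypothetical chain-of-thought prompt: task description, a worked
# demonstration supplying the reasoning basis, and the question to solve.
cot_prompt = """\
Task: Answer arithmetic word problems, showing your reasoning step by step.

Demonstration:
Q: A cafeteria had 23 apples. They used 20 to make lunch and bought 6 more.
   How many apples do they have?
A: Let's think step by step.
   The cafeteria started with 23 apples.
   They used 20, leaving 23 - 20 = 3.
   They bought 6 more, so 3 + 6 = 9.
   The answer is 9.

Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. How many balls
   does he have now?
A: Let's think step by step.
"""

# The model is expected to continue with the intermediate steps
# (5 + 2 * 3 = 11) before stating the final answer.
```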

CoT plays a crucial role in building AI Agents. An agent needs to understand the information it receives and make reasonable decisions based on it. CoT provides an orderly way of thinking that helps the agent process and analyze inputs effectively and turn the results of that analysis into concrete action guidelines. By breaking a task down into smaller steps, it helps the agent weigh each decision point carefully, reducing mistakes caused by information overload, while also making the decision process more transparent, so that users can understand the basis for an agent's decisions and its behavior becomes more predictable and traceable. In its interactions with the environment, CoT also lets the agent keep absorbing new information and adjusting its behavior strategy.

As a strategy, CoT therefore not only strengthens the reasoning ability of large language models but also underpins smarter and more reliable agents. Its value shows most clearly on complex tasks: the step-by-step approach improves the accuracy of the solution, makes the model more interpretable and controllable, and keeps the whole solution path traceable and verifiable, which is what allows researchers and developers to build systems that adapt to complex environments with a high degree of autonomy.

The core function of CoT lies in integrating planning, action, and observation, bridging the gap between reasoning and action. This thinking mode allows AI Agents to formulate effective countermeasures when predicting potential anomalies and to accumulate new information while interacting with the external environment, validating pre-set predictions and providing new reasoning bases. CoT acts like a powerful engine of precision and stability, helping AI Agents maintain high efficiency in complex environments.
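The interplay of planning, action, and observation described here can be sketched in code. The fragment below is a hypothetical illustration, reusing the placeholder `call_llm` and toy tool set from the earlier sketch, in which each cycle records an explicit prediction and checks it against the observation so that any mismatch feeds back into the next round of reasoning; it is not the implementation of any specific agent.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical model call, same placeholder as in the earlier sketch."""
    raise NotImplementedError

TOOLS = {"search": lambda q: f"(search results for {q!r})"}

def react_step(task: str, history: list[dict]) -> dict:
    """One reason-act-observe cycle with an explicit prediction check."""
    prompt = (
        f"Task: {task}\n"
        f"History: {json.dumps(history)}\n"
        'Reply with JSON: {"thought": ..., "prediction": ..., "tool": ..., "input": ...}'
    )
    step = json.loads(call_llm(prompt))                       # reasoning plus a predicted outcome
    step["observation"] = TOOLS[step["tool"]](step["input"])  # action against the environment
    # If the observation contradicts the prediction, the mismatch stays in the
    # history and becomes new reasoning material for the next cycle.
    step["confirmed"] = step["prediction"] in step["observation"]
    history.append(step)
    return step
```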

IV. The Right Pseudo-Demand

Which parts of the AI technology stack should Crypto integrate with? In last year's article, I argued that decentralizing computing power and data is the key step in helping small companies and individual developers cut costs. This year, in Coinbase's breakdown of the Crypto x AI sector, we saw a more detailed classification:

  1. Computing Layer (referring to networks focused on providing GPU resources for AI developers);
  2. Data Layer (referring to networks supporting decentralized access, orchestration, and validation of AI data pipelines);
  3. Middleware Layer (referring to platforms or networks supporting the development, deployment, and hosting of AI models or agents);
  4. Application Layer (referring to user-facing products utilizing on-chain AI mechanisms, whether B2B or B2C).

Each of these four layers carries a grand vision, and their shared goal is to resist the dominance of the Silicon Valley giants in the next era of the internet. As I asked last year: should we really accept that computing power and data are controlled exclusively by Silicon Valley giants? Under their monopoly, closed-source large models are black boxes. Science is the most revered faith of humanity today, and a significant share of people will treat every answer from future large models as truth; but how is that truth to be verified? In the giants' vision, the permissions eventually granted to agents will be staggering, such as the right to pay from your wallet and the right to operate your terminal. How can we be sure there is no ill intent behind them?

Decentralization is the only answer, but sometimes we have to soberly ask how many paying users these grand visions actually have. In the past we could ignore the missing commercial closed loop and let tokens paper over the errors of idealism; the current situation is far harsher, and Crypto x AI has to be designed with reality in mind. For example, how does the computing layer balance supply on both sides, given performance loss and instability, to stay competitive with centralized clouds? How many real users will data-layer projects have, how do they verify the authenticity and validity of the data provided, and which customers actually need that data? The same questions apply to the other two layers. In this era, we do not need so many seemingly correct pseudo-demands.

V. The SocialFi That Grew Out of Meme

As I mentioned in the first section, Meme has rapidly evolved into a SocialFi form that fits Web3. Friend.tech fired the first shot of this round of social applications, but unfortunately faltered because of its hasty token design. Pump.fun then validated the viability of pure platformization: it issues no token of its own and imposes almost no rules, simply unifying the needs of attention seekers and attention providers. You can post memes, livestream, launch tokens, leave comments, and trade on the platform, all freely, with Pump.fun charging only a service fee. This is essentially the same attention economy model as today's social media such as YouTube and Instagram, except that who gets charged differs, and in its mechanics Pump.fun is more Web3.

Base's Clanker is the all-round player, benefiting from the integrated ecosystem Base itself has built, with Base's own social Dapp as a complement, forming a complete internal closed loop. The agent Meme is Meme Coin 2.0: people always chase novelty, and Pump.fun happens to sit right on that trend. Judging by the trend, it is only a matter of time before silicon-based entities displace the vulgar memes of carbon-based ones.

I have mentioned Base countless times, with different content each time. From a timeline perspective, Base has never been a first mover but is always a winner.

VI. What Else Can Agents Be?

From a pragmatic standpoint, agents are unlikely to be decentralized for quite some time. Looking at how the traditional AI field builds agents, this is not a problem that decentralization and open source alone can solve with a simple reasoning process: agents need access to all kinds of APIs to reach Web2 content, their operating costs are high, and designing the chain of thought and coordinating multiple agents usually still require a human as mediator. We will go through a long transition period until a suitable fused form emerges, perhaps something like UNI. But as in the previous article, I still believe agents will have a major impact on our industry, much as CEXs exist in our industry: not correct, but very important.

The "AI Agent Overview" released by Stanford & Microsoft last month described numerous applications of agents in the medical industry, intelligent machines, and virtual worlds. In the appendix of this article, there are already many experimental cases of GPT-4V participating as agents in top-tier AAA game development.

There is no need to demand that agents fuse with decentralization quickly; I would rather the first piece of the puzzle they fill in be bottom-up capability and speed. We have so many narrative ruins and empty metaverses waiting to be filled, and at the right stage we can consider how to turn them into the next UNI.

