Deconstructing the AI Framework: From Intelligent Agents to Decentralized Exploration

YBB Capital
2025-01-09 15:31:55
AI frameworks are transitioning from traditional centralized architectures to decentralized models, with intelligent agents as the core driving force of this shift, providing more efficient solutions for various applications. In this process, blockchain technology offers lower-cost, more secure infrastructure, promoting the on-chain development of agents. By integrating DeFi and intelligent agents, future AI systems will possess stronger adaptability and interactivity.

Author: YBB Capital Researcher Zeke

Introduction

In previous articles, we explored the current state of AI memes and the future development of AI agents from multiple perspectives. Still, the rapid narrative shifts and dramatic evolution of the AI agent space can be overwhelming. In the two months since "Truth Terminal" kicked off the Agent Summer, the narratives combining AI and crypto have changed almost weekly. Recently, market attention has shifted back to "framework" projects driven by technical narratives, and several dark horses have emerged in this niche, each surpassing a market cap of hundreds of millions or even a billion dollars. These projects have also given rise to a new asset-issuance paradigm: projects issue tokens based on GitHub repositories, and agents built on those frameworks can issue tokens in turn. With frameworks as the foundation and agents on top, this resembles an asset-issuance platform, yet it is actually a distinctive infrastructure model emerging in the AI era. How should we view this new trend? This article starts with an introduction to frameworks and, combined with my own reflections, interprets what AI frameworks mean for crypto.

1. What is a Framework?

By definition, an AI framework is a foundational development tool or platform that integrates a set of pre-built modules, libraries, and tools, simplifying the process of building complex AI models. These frameworks typically also include functionalities for data processing, model training, and prediction. In simple terms, you can think of a framework as an operating system in the AI era, similar to Windows or Linux in desktop operating systems, or iOS and Android in mobile environments. Each framework has its own advantages and disadvantages, allowing developers to choose freely based on specific needs.

Although the term "AI framework" is still an emerging concept in the crypto space, its history outside crypto stretches back nearly 14 years, to the inception of Theano in 2010. In the traditional AI community, both academia and industry have mature frameworks available, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate, each with its own strengths tailored to different scenarios.

The framework projects emerging in crypto are built upon the surge in demand for agents driven by this wave of AI enthusiasm, and they have subsequently branched out into other crypto sectors, ultimately forming AI frameworks in various niche domains. Let's take a look at a few mainstream frameworks in the current space to elaborate on this statement.

1.1 Eliza

First, let's consider Eliza from ai16z, a multi-agent simulation framework specifically designed for creating, deploying, and managing autonomous AI agents. Developed in TypeScript, its advantages are broad compatibility and easy API integration.

According to the official documentation, Eliza primarily targets social media scenarios, such as multi-platform integration support. The framework provides a fully functional Discord integration, supports automated accounts on X/Twitter, integrates with Telegram, and offers direct API access. In terms of media content processing, it supports reading and analyzing PDF documents, extracting and summarizing link content, audio transcription, video content processing, image analysis and description, and dialogue summarization.

The current use cases supported by Eliza mainly fall into four categories:

  1. AI assistant applications: customer support agents, community managers, personal assistants;

  2. Social media roles: automated content creators, interactive bots, brand representatives;

  3. Knowledge workers: research assistants, content analysts, document processors;

  4. Interactive roles: role-playing characters, educational tutors, entertainment bots.

The models currently supported by Eliza include:

  1. Open-source model local inference: such as Llama3, Qwen1.5, BERT;

  2. Cloud inference using OpenAI's API;

  3. A default configuration of Nous Hermes Llama 3.1B;

  4. Integration with Claude for complex queries.
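In practice, an Eliza agent is typically defined by a character configuration file. The sketch below approximates that idea in Python for illustration; the field names here are illustrative simplifications and may not match Eliza's actual JSON schema.

```python
import json

# An illustrative approximation of an Eliza-style character definition.
# Eliza itself is TypeScript and configured via JSON character files;
# the exact field names below are assumptions, not the real schema.
character = {
    "name": "HelpfulBot",
    "clients": ["discord", "twitter", "telegram"],  # platforms listed above
    "modelProvider": "openai",                      # or a local Llama/Qwen model
    "bio": ["A community-manager agent for a crypto project."],
    "style": {"chat": ["concise", "friendly"]},
}

def validate(char: dict) -> str:
    """Minimal sanity check before deploying; returns the serialized config."""
    required = ["name", "clients", "modelProvider"]
    missing = [k for k in required if k not in char]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return json.dumps(char)
```

The point of a character file is that swapping the agent's persona, platforms, or model provider requires editing data, not code.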

1.2 G.A.M.E

G.A.M.E (Generative Autonomous Multimodal Entities Framework) is a multimodal AI framework for generating and managing agents, launched by Virtual and primarily designed for intelligent NPCs in games. A distinctive feature of this framework is that it can be used by low-code or even no-code users: through its trial interface, users only need to adjust parameters to participate in agent design.

In terms of project architecture, G.A.M.E's core is a modular design in which multiple subsystems work in coordination, as detailed in the diagram below.

  1. Agent Prompting Interface: The interface through which developers interact with the AI framework. Through this interface, developers can initialize a session and specify parameters such as session ID, agent ID, and user ID;

  2. Perception Subsystem: The perception subsystem is responsible for receiving input information, synthesizing it, and sending it to the strategic planning engine. It also handles responses from the dialogue processing module;

  3. Strategic Planning Engine: The strategic planning engine is the core part of the entire framework, divided into a High Level Planner and a Low Level Policy. The High Level Planner is responsible for formulating long-term goals and plans, while the Low Level Policy translates these plans into specific action steps;

  4. World Context: The world context contains environmental information, world state, and game state data, which help the agent understand the current situation;

  5. Dialogue Processing Module: The dialogue processing module is responsible for handling messages and responses, generating dialogues or reactions as output;

  6. On Chain Wallet Operator: The on-chain wallet operator may involve applications of blockchain technology, with specific functions not clearly defined;

  7. Learning Module: The learning module learns from feedback and updates the agent's knowledge base;

  8. Working Memory: The working memory stores the agent's recent actions, results, and current plans, among other short-term information;

  9. Long Term Memory Processor: The long-term memory processor is responsible for extracting important information about the agent and its working memory, ranking it based on importance, recency, and relevance;

  10. Agent Repository: The agent repository stores the agent's goals, reflections, experiences, and personality attributes;

  11. Action Planner: The action planner generates specific action plans based on the low-level policy;

  12. Plan Executor: The plan executor is responsible for executing the action plans generated by the action planner.

Workflow: Developers initialize the agent through the Agent Prompting Interface. The perception subsystem receives input and passes it to the strategic planning engine, which draws on the memory system, world context, and agent repository to formulate and execute action plans. The learning module continuously monitors the outcomes of the agent's actions and adjusts its behavior accordingly.
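The subsystems above compose into a single agent "tick". The following is a minimal Python sketch under assumed, simplified interfaces; the class and method names are illustrative, not G.A.M.E's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class WorkingMemory:
    recent: list = field(default_factory=list)  # recent actions and results

class Perception:
    def sense(self, raw_input: str) -> dict:
        # Normalize raw input into an observation for the planner
        return {"observation": raw_input}

class HighLevelPlanner:
    def plan(self, observation: dict, context: dict) -> str:
        # Long-term goal selection (stubbed for illustration)
        return "greet_player" if "player" in observation["observation"] else "idle"

class LowLevelPolicy:
    def to_actions(self, goal: str) -> list:
        # Translate a high-level goal into concrete action steps
        return ["say:hello"] if goal == "greet_player" else []

class PlanExecutor:
    def execute(self, actions: list) -> list:
        return [{"action": a, "ok": True} for a in actions]

class LearningModule:
    def update(self, memory: WorkingMemory, results: list) -> None:
        memory.recent.extend(results)  # feed outcomes back into working memory

def agent_tick(raw_input: str, memory: WorkingMemory, world_context: dict) -> list:
    obs = Perception().sense(raw_input)
    goal = HighLevelPlanner().plan(obs, world_context)
    actions = LowLevelPolicy().to_actions(goal)
    results = PlanExecutor().execute(actions)
    LearningModule().update(memory, results)
    return results
```

Each tick runs perception, planning, execution, and learning in sequence, which is the loop the workflow paragraph describes.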

Application Scenarios: From the overall technical architecture, this framework primarily focuses on the decision-making, feedback, perception, and personality of agents in virtual environments. Besides gaming, it is also applicable to the Metaverse, and a large number of projects have already adopted this framework for construction, as seen in the list below from Virtual.

1.3 Rig

Rig is an open-source tool written in Rust, designed to simplify the development of applications using large language models (LLMs). It provides a unified operating interface, allowing developers to easily interact with multiple LLM service providers (such as OpenAI and Anthropic) and various vector databases (like MongoDB and Neo4j).

Core Features:

  • Unified Interface: Regardless of which LLM provider or vector storage is used, Rig offers a consistent access method, greatly reducing the complexity of integration work;

  • Modular Architecture: The framework employs a modular design, including key components such as "Provider Abstraction Layer," "Vector Storage Interface," and "Intelligent Agent System," ensuring system flexibility and scalability;

  • Type Safety: Utilizing Rust's features, it achieves type-safe embedding operations, ensuring code quality and runtime safety;

  • Efficient Performance: Supports asynchronous programming patterns, optimizing concurrent processing capabilities; built-in logging and monitoring features aid in maintenance and troubleshooting.

Workflow: When a user request enters the Rig system, it first passes through the "Provider Abstraction Layer," which standardizes the differences between providers and ensures consistent error handling. Next, in the core layer, intelligent agents can call various tools or query vector storage to obtain the required information. Finally, through advanced mechanisms like Retrieval-Augmented Generation (RAG), the system combines document retrieval and contextual understanding to generate precise, meaningful responses, which are then returned to the user.
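That flow can be illustrated with a toy sketch (in Python for consistency with the other examples; Rig itself is Rust, and the embedding and provider below are stand-ins, not Rig's real API): a vector store ranks documents by similarity to the query, and a RAG prompt is assembled for the provider.

```python
import math

def embed(text: str) -> list:
    # Toy "embedding": character-frequency vector (stand-in for a real model)
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self, docs: list):
        self.docs = [(d, embed(d)) for d in docs]

    def top_k(self, query: str, k: int = 1) -> list:
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]), reverse=True)
        return [d for d, _ in ranked[:k]]

class Provider:
    # Stand-in for the provider abstraction layer: one uniform interface
    # regardless of which LLM backend sits behind it.
    def complete(self, prompt: str) -> str:
        return f"[answer based on]: {prompt}"

def rag_answer(question: str, store: VectorStore, provider: Provider) -> str:
    context = "\n".join(store.top_k(question, k=1))
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return provider.complete(prompt)
```

Swapping MongoDB for Neo4j, or OpenAI for Anthropic, would only change the `VectorStore` and `Provider` internals; the `rag_answer` flow stays the same, which is the unification Rig advertises.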

Application Scenarios: Rig is suitable not only for building question-answering systems that require quick and accurate responses but also for creating efficient document search tools, context-aware chatbots or virtual assistants, and even supporting content creation by automatically generating text or other forms of content based on existing data patterns.

1.4 ZerePy

ZerePy is an open-source framework based on Python, aimed at simplifying the deployment and management of AI agents on the X (formerly Twitter) platform. It evolved from the Zerebro project, inheriting its core functionalities but designed in a more modular and extensible manner. Its goal is to enable developers to easily create personalized AI agents and automate various tasks and content creation on X.

ZerePy provides a command-line interface (CLI) that facilitates users in managing and controlling their deployed AI agents. Its core architecture is based on a modular design, allowing developers to flexibly integrate different functional modules, such as:

  • LLM Integration: ZerePy supports large language models (LLMs) from OpenAI and Anthropic, allowing developers to choose the model that best fits their application scenario. This enables agents to generate high-quality text content;

  • X Platform Integration: The framework directly integrates with the X platform's API, allowing agents to post, reply, like, and retweet;

  • Modular Connection System: This system allows developers to easily add support for other social platforms or services, expanding the framework's functionality;

  • Memory System (Future Planning): Although the current version may not have fully implemented this, ZerePy's design goals include integrating a memory system that enables agents to remember previous interactions and contextual information, thus generating more coherent and personalized content.
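The modular connection idea above can be sketched as follows; the class and method names are hypothetical, not ZerePy's real API.

```python
class Connection:
    """Base class for anything an agent can talk to (X, Discord, ...)."""
    def perform(self, action: str, payload: str) -> dict:
        raise NotImplementedError

class FakeXConnection(Connection):
    """Illustrative stand-in for an X platform connection."""
    def __init__(self):
        self.posts = []

    def perform(self, action: str, payload: str) -> dict:
        if action == "post":
            self.posts.append(payload)
            return {"ok": True, "id": len(self.posts)}
        return {"ok": False, "error": f"unsupported action: {action}"}

class Agent:
    def __init__(self, name: str):
        self.name = name
        self.connections = {}

    def register(self, key: str, connection: Connection) -> None:
        self.connections[key] = connection  # plug in a new platform module

    def act(self, key: str, action: str, payload: str) -> dict:
        return self.connections[key].perform(action, payload)
```

Adding Telegram or Discord support would mean writing one more `Connection` subclass and registering it, without touching the agent's core logic.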

While both ZerePy and ai16z's Eliza project aim to build and manage AI agents, they differ slightly in architecture and objectives. Eliza focuses more on multi-agent simulation and broader AI research, while ZerePy is dedicated to simplifying the deployment of AI agents on a specific social platform (X), leaning more towards practical applications.

2. A Reflection of the BTC Ecosystem

In terms of development path, AI agents share many similarities with the BTC ecosystem of late 2023 and early 2024. The BTC ecosystem's path can be roughly summarized as: BRC-20 → multi-protocol competition among Atomicals, Runes, and others → BTC L2 → BTCFi centered on Babylon. Built on a mature traditional AI technology stack, AI agents have developed even faster, but their path is broadly similar, which I summarize as: GOAT/ACT competition → social and analytical agents → AI agent frameworks. From a trend perspective, infrastructure projects focused on decentralization and agent security are likely to inherit this wave of framework enthusiasm and become the main theme of the next phase.

Will this track, like the BTC ecosystem, lead to homogenization and a bubble? I believe not. Firstly, the AI agent narrative is not aimed at replaying the history of smart contract chains. Secondly, the existing AI framework projects, whether technically solid, stalled at the slide-deck (PPT) stage, or simply copy-pasted, at least provide a new idea for infrastructure development. Many articles compare AI frameworks to asset-issuance platforms and agents to assets. In my opinion, compared to Memecoin launchpads and inscription protocols, AI frameworks resemble future public chains, while agents resemble future DApps.

In today's crypto landscape, we have thousands of public chains and tens of thousands of DApps. Among general-purpose chains we have BTC, Ethereum, and various heterogeneous chains, while application chains take more diverse forms, such as gaming chains, storage chains, and DEX chains. Public chains correspond to AI frameworks, and DApps correspond well to agents.

Crypto in the AI era is likely to evolve toward this form, and future debates will shift from EVM versus heterogeneous chains to competition among frameworks. The more pressing questions now are how to decentralize, or "chainify," these frameworks, and what significance doing this on a blockchain actually holds. I believe future AI infrastructure projects will expand on this basis.

3. The Significance of On-Chain?

Whatever blockchain combines with, it ultimately faces a fundamental question: is the combination meaningful? In last year's articles, I criticized GameFi's inverted priorities and the premature build-out of infrastructure, and in previous pieces on AI I expressed skepticism about practical combinations of AI and crypto at this stage. After all, the narrative pull of traditional projects has been weakening, and the few that performed well last year generally needed fundamentals strong enough to match or exceed their token prices. What can AI offer crypto? Earlier ideas, such as agents acting on user intents, the Metaverse, and agents as employees, are mundane yet genuinely in demand. However, these needs do not require a fully on-chain solution, and from a business-logic perspective they cannot form a closed loop. The agent browser mentioned in the last issue can indeed generate demand for data labeling and inference power, but the coupling between the two is still loose, and computing power remains dominated by centralized providers.

Reconsidering the success of DeFi, the reason DeFi could carve out a niche from traditional finance is due to its higher accessibility, better efficiency, lower costs, and the absence of a trust-based centralized security model. Following this line of thought, I believe there may be several reasons to support the chainification of agents.

  1. Can the chainification of agents achieve lower usage costs, and thereby greater accessibility and choice, ultimately letting ordinary users share in the AI capabilities currently "rented out" by Web2 giants?

  2. Security: By the simplest definition, an AI that can be called an agent should be able to interact with the virtual or real world. If an agent can intervene in the real world or access my virtual wallet, then a blockchain-based security solution becomes a necessity;

  3. Can agents create a unique financial gameplay exclusive to blockchain? For example, in AMM, LPs allow ordinary people to participate in automated market-making, while agents may require computational power, data labeling, etc., and users can invest in the protocol in the form of USDT when they are optimistic. Alternatively, agents in different application scenarios could form new financial gameplay;

  4. DeFi currently lacks perfect interoperability. If blockchain-based agents can achieve transparent and traceable reasoning, they may be more attractive than the agent browsers provided by traditional internet giants mentioned in the previous article.

4. Creativity?

Framework projects in the future will also provide an entrepreneurial opportunity similar to the GPT Store. Although currently, launching an agent through a framework is still quite complex for ordinary users, I believe that simplifying the agent construction process and providing some complex functional combinations will give frameworks an advantage in the future, leading to a more interesting Web3 creative economy than the GPT Store.

The current GPT Store still leans toward practicality in traditional fields; most popular apps are built by traditional Web2 companies, and monetization remains largely out of reach for ordinary creators. According to OpenAI's official explanation, its current strategy provides funding support, with certain subsidies, only to a select group of outstanding U.S. developers.

From a demand perspective, Web3 still has many areas that need to be filled, and in terms of the economic system, it can make the unfair policies of Web2 giants fairer. Additionally, we can naturally introduce community economics to make agents more complete. The creative economy of agents will be an opportunity for ordinary people to participate, and the future AI memes will be far more intelligent and interesting than the agents issued on GOAT or Clanker.

References:

  1. History and Trends of AI Framework Evolution

  2. Bybit: AI Rig Complex (ARC): AI Agent Framework

  3. Deep Value Memetics: Horizontal Comparison of Four Major Crypto×AI Frameworks: Adoption Status, Advantages and Disadvantages, Growth Potential

  4. Eliza Official Documentation

  5. Virtual Official Documentation

ChainCatcher reminds readers to view blockchain rationally, enhance risk awareness, and be cautious of various virtual token issuances and speculation. All content on this site is solely market information or related parties' opinions, and does not constitute any form of investment advice.