In-depth Analysis of Multi-Agent: Will Web3 and AI Ultimately Achieve Mutual Success?

Meteorite Labs
2024-09-18 20:48:36
How can Web3 help the AI Agent track that top VCs are rushing to bet on?

If AIGC opened the era of intelligent content generation, AI Agents now have the opportunity to turn AIGC's capabilities into real products.

An AI Agent is like a more tangible all-purpose employee. Often described as an early form of artificial intelligence robot, it can observe its surroundings, make decisions, and take action automatically, much as a human would.

Bill Gates once stated, "Mastering AI Agent is the real achievement. At that point, you will no longer need to personally search for information online." Experts in the AI field also have high hopes for the prospects of AI Agent. Microsoft CEO Satya Nadella predicted that AI Agent will become the main way of human-computer interaction, able to understand user needs and proactively provide services. Professor Andrew Ng also predicted that in the future work environment, humans and AI Agents will collaborate more closely, forming an efficient working model and improving productivity.

AI Agent is not only a product of technology but also the core of future lifestyles and working methods.

This inevitably recalls the time when Web3 and blockchain first sparked widespread discussion and people often described the technology as "disruptive." Over the past few years, Web3 has gradually evolved from early building blocks such as ERC-20 and zero-knowledge proofs to DeFi, DePIN, GameFi, and other integrations with adjacent fields.

If we combine Web3 and AI, two of today's hottest digital technologies, will the result be a 1+1>2 effect? Will Web3 AI projects, which keep growing in scale, bring new use-case paradigms to the industry and create genuine new demand?

AI Agent: The Ideal Intelligent Assistant for Humans

Where exactly does the potential of AI Agents lie? One answer circulating online puts it this way: "A large language model can only code a simple snake game, while an AI Agent can create an entire Honor of Kings." It sounds like hyperbole, but it is not an overstatement.

Agent, commonly translated as "intelligent entity" in China, is a concept proposed by "the father of artificial intelligence" Minsky in his 1986 book "The Society of Mind." Minsky believed that certain individuals in society could arrive at a solution to a problem after negotiation, and these individuals are Agents. For many years, Agents have been the cornerstone of human-computer interaction, from Microsoft's Clippy to Google Docs' auto-suggestions. These early forms of Agents demonstrated the potential for personalized interaction but still had limited capabilities in handling more complex tasks. It was not until the emergence of large language models (LLMs) that the true potential of Agents was realized.

In May of this year, Professor Andrew Ng, a leading scholar in the AI field, gave a talk on AI Agents at a Sequoia AI event in the United States, where he presented a series of experiments conducted by his team:

They had AI write some code and run it, comparing the results from different LLMs and workflows. The results were as follows:

  • GPT-3.5 model: 48% accuracy

  • GPT-4 model: 67% accuracy

  • GPT-3.5 + Agent: Performance exceeding that of the GPT-4 model

  • GPT-4 + Agent: Far exceeding the GPT-4 model, very impressive

Most people who use an LLM such as ChatGPT simply enter a prompt, and the model generates an answer in a single pass, without automatically spotting and correcting its own errors or rewriting its output.

In contrast, the workflow of AI Agent is as follows:

First, the LLM writes an outline for the article. If necessary, it searches the internet to research and analyze the topic, then produces a draft. It then reads the draft, considers how to improve it, and repeats this cycle several times until it produces a high-quality article that is logically rigorous and has a low error rate.
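
The outline, draft, review, and revise cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the functions `write_outline`, `write_draft`, `critique`, and `revise` are hypothetical stand-ins for LLM calls (here they are deterministic stubs), and the stopping rule (no remaining issues, or a round limit) is our own choice rather than anything from Ng's talk.

```python
from typing import List, Tuple


def write_outline(topic: str) -> List[str]:
    """Stand-in for an LLM call that drafts an outline (hypothetical)."""
    return [f"Background on {topic}", f"How {topic} works", "Open problems"]


def write_draft(outline: List[str]) -> str:
    """Stand-in for an LLM call that expands the outline into a first draft."""
    return "\n".join(f"{heading}: TODO" for heading in outline)


def critique(draft: str) -> List[str]:
    """Stand-in for self-review: report the sections that are still unfinished."""
    return [line.split(":")[0] for line in draft.splitlines() if line.endswith("TODO")]


def revise(draft: str, issues: List[str]) -> str:
    """Stand-in for a revision pass: fix only the first reported issue."""
    target = issues[0]
    return draft.replace(f"{target}: TODO", f"{target}: written", 1)


def agentic_write(topic: str, max_rounds: int = 5) -> Tuple[str, int]:
    """Draft once, then alternate critique and revision until no issues remain."""
    draft = write_draft(write_outline(topic))
    rounds = 0
    while rounds < max_rounds:
        issues = critique(draft)
        if not issues:
            break  # the reviewer found nothing left to fix
        draft = revise(draft, issues)
        rounds += 1
    return draft, rounds
```

Calling `agentic_write("AI Agents")` runs the loop until the stub reviewer has no complaints. In a real system each stub would be replaced by a model call, and `max_rounds` would cap cost.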

The difference between an AI Agent and an LLM is that an LLM's interaction with humans is driven by prompts, whereas an AI Agent only needs to be given a goal and can then think and act independently toward it. It breaks the task down into detailed planning steps and relies on external feedback and its own reasoning to create prompts for itself until the goal is achieved.

Therefore, OpenAI defines AI Agent as: a system driven by LLM as its brain, capable of autonomous understanding, perception, planning, memory, and tool usage, able to automate the execution of complex tasks.
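
To make the planning, memory, and tool-usage parts of this definition concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Agent` class, the `search` and `summarize` tools, and the fixed two-step plan are hypothetical, and a real agent would ask an LLM to produce the plan instead of hard-coding it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


def search(query: str) -> str:
    """Hypothetical tool: pretend to look something up."""
    return f"notes about {query}"


def summarize(text: str) -> str:
    """Hypothetical tool: pretend to condense text (here it just upper-cases it)."""
    return text.upper()


@dataclass
class Agent:
    tools: Dict[str, Callable[[str], str]]
    memory: List[str] = field(default_factory=list)

    def plan(self, goal: str) -> List[Tuple[str, str]]:
        """Stand-in for LLM planning: a fixed list of (tool name, input) steps."""
        return [("search", goal), ("summarize", "<previous>")]

    def run(self, goal: str) -> str:
        result = ""
        for tool_name, arg in self.plan(goal):
            if arg == "<previous>":
                arg = result  # feed the previous step's output forward
            result = self.tools[tool_name](arg)
            self.memory.append(f"{tool_name} -> {result}")  # remember each step
        return result
```

The user supplies only the goal, for example `Agent(tools={"search": search, "summarize": summarize}).run("zkML")`; the agent decides which tools to call and records each step in `memory`.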

When AI transforms from a tool being used to a subject that can use tools, it becomes AI Agent. This is precisely why AI Agent can become the most ideal intelligent assistant for humans. For example, AI Agent can understand and remember user interests, preferences, and daily habits based on historical online interactions, recognize user intentions, proactively make suggestions, and coordinate multiple applications to complete tasks.

Just as in Gates' vision, in the future, we will no longer need to switch to different applications for different tasks; we will simply tell computers and phones what we want to do in everyday language. Based on the data users are willing to share, AI Agent will provide personalized responses.

Single-Person Unicorn Companies Are Becoming a Reality

AI Agent can also help businesses create a new intelligent operation model centered on "human-machine collaboration." More and more business activities will be handled by AI, while humans will only need to focus on the company's vision, strategy, and key path decisions.

As OpenAI CEO Sam Altman mentioned in an interview, with the development of AI, we are about to enter the "single-person unicorn" era, where companies are founded by a single person and reach a valuation of $1 billion.

It sounds like a fantasy, but with the assistance of AI Agent, this idea is becoming a reality.

Consider a hypothetical: I want to found a tech startup. Done the traditional way, I would need to hire software engineers, product managers, designers, marketers, salespeople, and finance staff, each performing their own role but all coordinated by me.

But what if I use AI Agent? I might not even need to hire employees.

  • Devin --- Automated Programming

To replace software engineers, I might use Devin, this year's popular AI software engineer, which can handle all of my front-end and back-end work.

Devin, developed by Cognition Labs, is known as "the world's first AI software engineer." It can independently complete the entire software development process, analyze problems, make decisions, write code, and fix errors autonomously, greatly reducing the workload of developers. Devin secured $196 million in funding within just six months, rapidly increasing its valuation to several billion dollars, with investors including Founders Fund, Khosla Ventures, and other well-known venture capital firms.

Although Devin has not yet launched a public version, we can glimpse its potential from another recently popular Web2 product, Cursor. It can complete almost all of your work, turning a simple idea into functional code within minutes; you just give commands and "sit back and enjoy." Reportedly, an eight-year-old child with no programming experience used Cursor to write code and build a website.

  • Hebbia --- Document Processing

To replace product managers or finance personnel, I might choose Hebbia, which can organize and analyze all of my documents.

Unlike Glean, which focuses on internal document search, Hebbia Matrix is an enterprise-level AI Agent platform that uses multiple AI models to help users efficiently extract, structure, and analyze data and documents, thereby enhancing productivity. Impressively, Matrix can process millions of documents at once.

Hebbia completed a $130 million Series B round in July this year, led by a16z, with participation from well-known investors like Google Ventures and Peter Thiel.

  • Jasper AI --- Content Generation

To replace social media operators and designers, I might choose Jasper AI, which can generate content for me.

Jasper AI is an AI Agent writing assistant designed to help creators, marketers, and businesses streamline the content generation process, improving productivity and creative efficiency. Jasper AI can generate various types of content in the style requested by users, including blog posts, social media posts, advertising copy, and product descriptions. It can also generate images based on user descriptions to provide visual support for text content.

Jasper AI has secured $125 million in funding and reached a valuation of $1.5 billion in 2022. According to statistics, Jasper AI has helped users generate over 500 million words, making it one of the most widely used AI writing tools.

  • MultiOn --- Web Automation

To replace assistants, I might choose MultiOn to manage my daily tasks, schedule appointments, set reminders, and even plan business trips, automatically booking hotels and arranging rides.

MultiOn is an AI agent for automating web tasks. It can autonomously carry out tasks in any digital environment, such as online shopping, booking appointments, and other personal errands, improving personal efficiency and streamlining daily affairs at work.

  • Perplexity --- Search and Research

To replace researchers, I might choose Perplexity, which NVIDIA's CEO uses daily.

Perplexity is an AI search engine that can understand user questions, break down problems, and search and integrate content to generate reports, providing clear answers to users.

Perplexity is suitable for various user groups; for example, students and researchers can simplify the information retrieval process during writing, improving efficiency; marketers can obtain reliable data to support marketing strategies.

The scenario above is only a thought experiment; the actual capabilities of today's AI Agents are still not sufficient to replace elite talent in any industry. As Logenic AI co-founder Li Bojie put it, the capabilities of current LLMs are still at an entry level, far from expert level, and today's AI Agents are more like employees who work fast but are unreliable.

However, these AI Agents, leveraging their respective strengths, are helping existing users improve efficiency and convenience across diverse scenarios.

Not limited to tech companies, various industries can benefit from the wave of AI Agents. In education, AI Agents can provide personalized learning resources and tutoring based on students' learning progress, interests, and abilities; in finance, AI Agents can help users manage personal finances, provide investment advice, and even predict stock trends; in healthcare, AI Agents can assist doctors in diagnosing diseases and formulating treatment plans; in e-commerce, AI Agents can serve as intelligent customer service representatives, automatically answering user inquiries, handling order issues, and processing return requests through natural language processing and machine learning technologies, thereby improving customer service efficiency.

Multi-Agent: The Next Step for AI Agents

As the single-person unicorn scenario in the previous section suggests, a single AI Agent faces limits when handling complex tasks and struggles to meet real needs. Simply running several AI Agents side by side does not solve this: because they rely on heterogeneous LLMs, collective decision-making is difficult and their capabilities remain limited, so a human has to act as dispatcher among these independent AI Agents, coordinating their work across different application scenarios. This gave rise to the "Multi-Agent framework."

Complex problems often require the integration of knowledge and skills from multiple areas, and the limited capabilities of a single AI Agent make it difficult to handle such tasks. By organically combining AI Agents with different capabilities, a Multi-Agent system can allow AI Agents to leverage their strengths and complement each other, thus solving complex problems more effectively.

This is very similar to our actual workflows or organizational structures: a leader assigns tasks to individuals with different abilities, each responsible for different tasks, with the results of each process passed on to the next, ultimately achieving the final task outcome.

In the implementation process, lower-level AI Agents execute their respective tasks, while higher-level AI Agents assign tasks and supervise their completion.

Multi-Agent can also simulate our human decision-making process; just as we consult others when faced with problems, multiple AI Agents can also simulate collective decision-making behavior, providing us with better information support. For instance, Microsoft's AutoGen meets this need:

  • It can create AI Agents with different roles. These AI Agents have basic conversational abilities and can generate responses based on the messages they receive.

  • It creates a group chat environment with multiple AI Agents through GroupChat, in which an AI Agent acting as administrator manages the chat history, the speaking order of the other AI Agents, and when the conversation ends.
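
The group-chat pattern in the two points above can be illustrated without any framework. The sketch below is plain Python and is not AutoGen's API: the `ChatAgent` and `Administrator` classes, the round-robin speaking order, and the `DONE` stop word are assumptions chosen to keep the example small.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (speaker name, text)


class ChatAgent:
    """An agent with a role name and a reply function (a stand-in for an LLM)."""

    def __init__(self, name: str, reply_fn: Callable[[List[Message]], str]) -> None:
        self.name = name
        self.reply_fn = reply_fn

    def reply(self, history: List[Message]) -> str:
        return self.reply_fn(history)


class Administrator:
    """Keeps the chat record, picks speakers round-robin, and ends the chat."""

    def __init__(self, agents: List[ChatAgent], max_turns: int = 6,
                 stop_word: str = "DONE") -> None:
        self.agents = agents
        self.max_turns = max_turns
        self.stop_word = stop_word

    def run(self, task: str) -> List[Message]:
        history: List[Message] = [("user", task)]
        for turn in range(self.max_turns):
            speaker = self.agents[turn % len(self.agents)]  # speaking order
            text = speaker.reply(history)
            history.append((speaker.name, text))            # chat record
            if self.stop_word in text:                      # termination
                break
        return history


planner = ChatAgent("planner", lambda history: "Plan: split the task into steps")
reviewer = ChatAgent(
    "reviewer",
    lambda history: "Looks good. DONE"
    if any(speaker == "planner" for speaker, _ in history)
    else "Waiting for a plan",
)
```

Running `Administrator([planner, reviewer]).run("build a site")` yields a short transcript that ends as soon as the reviewer signs off. A real administrator agent could choose the next speaker with an LLM instead of round-robin.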

If applied to the concept of single-person unicorn companies, we can create several AI Agents with different roles through the Multi-Agent architecture, such as project managers, programmers, or supervisors. We just need to tell them our goals and let them think of ways to achieve them; we can simply listen to their reports and request changes if we have any opinions or if they do something incorrectly, until we are satisfied.

Compared to a single AI Agent, Multi-Agent can achieve:

  • Scalability: Larger problems can be handled by adding more AI Agents, each processing part of the task, so the system scales as demand grows.

  • Parallelism: Naturally supports parallel processing, with multiple AI Agents working simultaneously on different parts of the problem, thus accelerating problem-solving.

  • Decision Improvement: Enhances decision-making by aggregating insights from multiple AI Agents, as each AI Agent has its own perspective and expertise.
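
Parallelism and decision improvement can be shown together in a small sketch: several agents answer the same question concurrently, and their answers are aggregated by majority vote. The three agents below are hypothetical stubs, and majority voting is just one simple aggregation rule that we assume here; the article does not prescribe one.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

AgentFn = Callable[[str], str]


def optimist(question: str) -> str:
    """Stub agent that always approves."""
    return "approve"


def skeptic(question: str) -> str:
    """Stub agent that always rejects."""
    return "reject"


def analyst(question: str) -> str:
    """Stub agent that approves only questions mentioning a budget."""
    return "approve" if "budget" in question else "reject"


def decide(question: str, agents: List[AgentFn]) -> Tuple[str, List[str]]:
    """Ask every agent in parallel, then return the majority answer and all votes."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        # pool.map preserves the order of `agents` in the returned votes
        votes = list(pool.map(lambda agent: agent(question), agents))
    winner, _count = Counter(votes).most_common(1)[0]
    return winner, votes
```

Using an odd number of agents with two possible answers avoids ties. Adding agents to the list scales the panel without changing `decide`.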

With the continuous advancement of AI technology, it is conceivable that the Multi-Agent framework will play a larger role across more industries and drive the development of various new AI-driven solutions.

The Wind of AI Agents Blows Towards Web3

Outside the laboratory, the road ahead for AI Agents and Multi-Agent systems is long and challenging.

Setting aside Multi-Agent, even the most advanced single AI Agents currently have clear physical limits on the computational resources and capabilities they require, making infinite scaling impossible. When faced with extremely complex and computation-intensive tasks, AI Agents will undoubtedly encounter computational bottlenecks, significantly degrading performance.

Moreover, AI Agents and Multi-Agent systems are essentially a centralized architectural model, which means they carry a high risk of single points of failure. More importantly, the monopolistic business model based on closed-source large models by companies like OpenAI, Microsoft, and Google severely threatens the survival environment of independent, single AI Agent startups, making it difficult for AI Agents to effectively utilize vast enterprise private data to become smarter and more efficient. There is an urgent need for a democratic collaborative environment among AI Agents, allowing truly valuable AI Agents to serve a broader range of needs and create greater value for society.

Finally, although AI Agents are closer to the industry compared to LLMs, their development is based on LLMs, and the characteristics of the current large model track include high technical barriers, significant capital investment, and immature business models, making it generally difficult for AI Agents to secure funding for continuous updates and iterations.

The Multi-Agent paradigm is an excellent angle for Web3 to empower AI, and many Web3 development teams are already investing in research and development to provide solutions in this area.

AI Agents and Multi-Agent systems typically require substantial computational resources to make complex decisions and process tasks. Web3, through blockchain and decentralized technologies, can build decentralized computing power markets, allowing computational resources to be distributed and utilized more fairly and efficiently on a global scale. Web3 projects like Akash, Nosana, Aethir, and IO.net can provide computational power for AI Agent decision-making and reasoning.

Traditional AI systems are often centrally managed, leading to single points of failure and data privacy issues for AI Agents. The decentralized nature of Web3 can make Multi-Agent systems more distributed and autonomous, with each AI Agent operating independently on different nodes, autonomously executing user requests, enhancing robustness and security. By establishing incentive and penalty mechanisms for stakers and delegators through PoS, DPoS, and other mechanisms, the democratization of single AI Agents or Multi-Agent systems can be promoted.
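
As a rough illustration of what staking-based incentives and penalties around an agent node could look like, consider the toy model below. It does not describe any specific project's mechanism: the `AgentNode` class, the pro-rata reward split, and the `reward` and `slash_rate` values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AgentNode:
    """An AI Agent node backed by stakers (toy model, not a real protocol)."""

    name: str
    stakes: Dict[str, float] = field(default_factory=dict)  # staker -> amount

    def total_stake(self) -> float:
        return sum(self.stakes.values())

    def settle(self, task_ok: bool, reward: float = 10.0,
               slash_rate: float = 0.1) -> None:
        """Reward stakers pro rata on success; slash every stake on failure."""
        total = self.total_stake()
        if total == 0:
            return  # nobody has staked, so there is nothing to settle
        if task_ok:
            self.stakes = {
                staker: amount + reward * amount / total
                for staker, amount in self.stakes.items()
            }
        else:
            self.stakes = {
                staker: amount * (1 - slash_rate)
                for staker, amount in self.stakes.items()
            }
```

With stakes of 60 and 40, a successful task splits the reward 6 to 4, while a failed task cuts both stakes by the slash rate. Stakers therefore have a financial reason to back agents that perform reliably.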

In this regard, GaiaNet, Theoriq, PIN AI, and HajimeAI are all making very cutting-edge attempts.

  • Theoriq is a project serving "AI for Web3," aiming to establish a calling and economic system for AI Agents through the Agentic Protocol, popularizing Web3 development and many functional scenarios, and providing verifiable model reasoning capabilities for Web3 dApps.

  • GaiaNet is a node-based AI Agent creation and deployment environment, starting from the protection of experts' and users' intellectual property and data privacy, countering the centralized OpenAI GPT Store.

  • HajimeAI focuses on building AI Agent workflows around real needs and on making user intents intelligent and automated, echoing the "personalized AI intelligence" described by PIN AI.

  • At the same time, Modulus Labs and ORA Protocol have made progress in the algorithmic directions of zkML and opML for AI Agents.

Finally, the development and iteration of AI Agents and Multi-Agent systems often require substantial financial support, and Web3 can help promising AI Agent projects gain valuable early support through front-loaded liquidity, one of its defining characteristics.

Both Spectral and HajimeAI have proposed product concepts to support the issuance of on-chain AI Agent assets: issuing tokens through IAO (Initial Agent Offering), allowing AI Agents to directly obtain funding from investors while becoming part of DAO governance, providing investors with opportunities to participate in project development and share future profits. Among them, HajimeAI's Benchmark DAO hopes to organically combine decentralized AI Agent scoring and AI Agent asset issuance through crowdfunding and token incentives, creating a closed loop for AI Agents relying on Web3 for financing and cold start, which is also a relatively novel attempt.

The Pandora's box of AI has been opened, and everyone involved is both excited and confused; amid the fervor, whether opportunities or hidden reefs lie ahead remains unknown. No industry can still raise money on slide decks alone; however cutting-edge the technology, only real-world implementation can realize its value. The future of AI Agents is destined to be a long marathon, and Web3 is helping to ensure that they do not drop out of the race.
