In Conversation with 0G Labs: The Endgame of DA and the New Era of On-Chain AI

BlockBeats
2024-06-13 22:59:32
The ambition and dream of the first modular AI chain.

Compiled by: Thinking Weird, BlockBeats

Translator's Note: In March this year, 0G Labs closed a $35 million Pre-Seed round led by Hack VC. 0G Labs aims to build the first modular AI chain, helping developers launch AI dApps on a high-performance, programmable data availability layer. Through innovative system design, 0G Labs is targeting on-chain data throughput of gigabytes per second, enough to support high-performance scenarios such as AI model training.

In the fourth episode of the DealFlow podcast, BSCN editor Jonny Huang, MH Ventures general partner Kamran Iqbal, and Animoca Brands head of investment and strategic partnerships Mehdi Farooq jointly interviewed 0G Labs co-founder and CEO Michael Heinrich. Michael shared his personal background, from being a software engineer at Microsoft and SAP Labs to founding the Web2 company Garten, which is valued at over $1 billion, and now dedicating himself full-time to 0G, committed to building a modular AI tech stack on the blockchain. The discussion covered the current state and vision of DA, the advantages of modularity, team management, and the bidirectional dependency between Web3 and AI. Looking ahead, he emphasized that AI will become mainstream, bringing about significant social change, and Web3 needs to keep pace with this trend.

The following is the text of the interview:

Web2 Unicorn Leader Starts Again

Jonny: Today we are going to delve into an important topic—Data Availability (DA), especially in the realm of crypto AI. Michael, your company has a significant voice in this area. Before we dive into the details, could you briefly introduce your career background and how you entered this niche field?

Michael: I initially worked as a software engineer and technical product manager at Microsoft and SAP Labs, engaging in cutting-edge technology work with the Visual Studio team. Later, I shifted to the business side, working at Bain & Company for a few years, and then moved to Connecticut to work for Bridgewater Associates, focusing on portfolio construction. I reviewed about $60 billion in trades daily and learned a lot about risk metrics. For instance, we looked at CDS rates to assess counterparty risk. This experience gave me deep insights into traditional finance.

After that, I returned to Stanford for graduate studies and founded my first Web2 company, Garten. At its peak, the company grew to 650 employees, with annual revenue reaching $100 million and total funding of about $130 million, becoming a unicorn valued at over $1 billion and a star project incubated by Y Combinator.

At the end of 2022, my Stanford classmate Thomas reached out to me. He mentioned that he had invested in Conflux five years earlier and believed that Ming Wu and Fan Long were the best engineers he had ever funded. He suggested that the four of us get together to see if we could spark something. After six months of working together, I came to the same conclusion. I thought, "Wow, Ming and Fan are the best engineers and computer scientists I have ever worked with. We have to start a company together." I then transitioned into the role of chairman at Garten and dedicated myself full-time to 0G.

The four co-founders of 0G Labs, from left to right: Fan Long, Thomas Yao, Michael Heinrich, Ming Wu

The Current State, Challenges, and Ultimate Goals of DA

Jonny: This is one of the best founder introductions I have ever heard, and I guess your VC fundraising process must have gone smoothly. Before we dive into the topic of data availability, I want to discuss the current state of DA. While there are some well-known players, how do you assess the current landscape of DA?

Michael: DA now has multiple sources, depending on the blockchain. For example, before the Ethereum Danksharding upgrade, Ethereum's DA was about 0.08 MB per second. Later, Celestia, EigenDA, and Avail entered the market, with their throughput typically ranging from 1.2 to 10 MB per second. The problem is that this throughput is far from sufficient for AI applications or any on-chain gaming applications. What we need to discuss is GB-level DA per second, not MB-level. For instance, if you want to train an AI model on-chain, you actually need a data transmission volume of 50 to 100 GB per second to achieve that. This is an order of magnitude difference. We see this opportunity and think about how to create this breakthrough, enabling large Web2 applications to be built on-chain with the same performance and cost. This is a huge gap we see in the field. Additionally, there are some issues that have not been fully considered. For example, we believe that data availability is a combination of data publishing and data storage. Our core insight is to divide data into these two channels to avoid broadcast bottlenecks in the system, thus achieving breakthrough performance improvements.
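To put the orders of magnitude Michael mentions into perspective, here is a rough, illustrative calculation (a Python sketch; the 1 TB dataset size is an assumption chosen purely for illustration, while the throughput levels are the ones quoted above):

```python
# Rough comparison of the DA throughput levels quoted in the interview.
# The 1 TB dataset size is a hypothetical figure, used only for illustration.

DATASET_GB = 1_000  # assumed size of an on-chain AI training dataset, in GB

throughput_mb_per_s = {
    "Ethereum pre-Danksharding (quoted)": 0.08,
    "Celestia / EigenDA / Avail (upper quote)": 10,
    "GB-level target (50 GB/s)": 50_000,
}

for name, rate in throughput_mb_per_s.items():
    hours = DATASET_GB * 1_000 / rate / 3_600  # GB -> MB, then MB/s -> hours
    print(f"{name:>40}: ~{hours:,.2f} hours to move {DATASET_GB:,} GB")
```

At 0.08 MB per second the hypothetical dataset would take months to move, at 10 MB per second roughly a day, and at 50 GB per second well under a minute, which is the gap being described here.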

An additional storage network enables many things, such as model storage, training data storage for specific use cases, and even programmability. You can manage the entire state, decide where to store data, how long to store it, and how much security is needed. As a result, the real use cases that various fields actually need become possible.

Currently, the state of DA is that we have made significant progress, increasing from 0.08 MB per second to 1.4 MB per second, which has indeed reduced transaction costs, in some cases by as much as 99%. But this is still not enough for the real demands of the future world. High-performance AI applications, on-chain gaming, high-frequency DeFi: all of these require higher throughput.

Mehdi: I have two fundamental questions. The first is about storage. You mentioned the transaction history of L2 and even the history of AI models. How long do we need to store data? That’s my first question. The second question is, with decentralized storage networks like Arweave and Filecoin already available, do you think they can help improve throughput? I’m referring not to data publishing, but to storage.

Michael: The duration of data storage depends on its purpose. If considering disaster recovery, data should be stored permanently to reconstruct the state. For optimistic rollups with a fraud proof window, at least 7 days of storage is needed to reconstruct the state when necessary. For other types of rollups, the storage time may be shorter. It varies, but that’s the general idea.

As for other storage platforms, we chose to build an internal storage system because Arweave and Filecoin are designed more for log-type storage, that is, long-term cold storage. They are therefore not designed for very fast data writes and reads, which are crucial for AI applications and for structured-data applications that require key-value or transactional storage. That kind of speed is needed for fast processing, and even for building something like a decentralized Google Docs.

Jonny: You have articulated very clearly why DA is needed and why existing decentralized storage solutions are not suitable for this specific scenario. Can we discuss the ultimate goal of data availability?

Michael: The ultimate goal is easy to define; we want to achieve performance and cost comparable to Web2, making it possible to build anything on-chain, especially AI applications. It’s straightforward, just like AWS has computing and storage, with S3 being a key component. Data availability, while having different characteristics, is also a key component. Our ultimate goal is to build a modular AI tech stack, where the data availability part includes not only data publishing but also storage components integrated by a consensus network. We let the consensus network handle data availability sampling, and once consensus is reached, we can prove it on the underlying Layer 1 (like Ethereum). Our ultimate goal is to build an on-chain system that can run any high-performance application, even supporting on-chain training of AI models.
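Michael mentions that the consensus network handles data availability sampling. As a general illustration of why sampling gives strong guarantees (this is the standard DAS argument in generic form, not a description of 0G's specific protocol), a light client that draws k random chunk samples detects a withheld fraction f of erasure-coded data with probability 1 - (1 - f)^k:

```python
# Generic data availability sampling (DAS) intuition, not 0G-specific:
# if a fraction `withheld` of the erasure-coded chunks is unavailable,
# each uniformly random sample misses it with probability (1 - withheld),
# so the chance of catching the withholding rises quickly with more samples.

def detection_probability(withheld: float, samples: int) -> float:
    """Probability that at least one of `samples` random chunk queries
    hits a withheld chunk (sampling with replacement)."""
    return 1.0 - (1.0 - withheld) ** samples

# With 2x erasure coding, an adversary must withhold about half the chunks
# to block reconstruction, so 0.5 is the interesting threshold.
for k in (5, 10, 20, 30):
    print(f"{k:2d} samples -> detection probability {detection_probability(0.5, k):.6f}")
```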

Kamran: Can you elaborate on your target market? Besides artificial intelligence and those building AI applications on the blockchain, what other projects do you hope will use 0G?

Michael: You have already mentioned one application area. We are working to build the largest decentralized AI community and hope to have many projects built on top of us. Whether it's Pond building large graph models, or Fraction AI or PublicAI doing decentralized data labeling or cleaning, or even execution layer projects like Allora, Talus Network, or Ritual, we are striving to create the largest community for AI builders. This is a fundamental requirement for us.

But in reality, any high-performance application can be built on top of us. For example, in on-chain gaming, 5,000 users with uncompressed state require 16 MB of data availability per second to maintain a complete on-chain game state. Currently, no DA layer can achieve this; maybe Solana can, but that is a different ecosystem from Ethereum with limited support. So such applications are also very interesting to us, especially if they incorporate on-chain AI agents (like NPCs). There is a lot of potential for crossover between these applications.
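As a quick sanity check of the gaming figure above (the per-user framing is an illustration added here, not something stated in the interview), 16 MB per second spread across 5,000 concurrent users works out to roughly 3 KB of state per user per second:

```python
# Per-user bandwidth implied by the on-chain gaming example above.
total_da_mb_per_s = 16   # DA throughput quoted for a fully on-chain game
users = 5_000            # concurrent users with uncompressed state

per_user_kb_per_s = total_da_mb_per_s * 1_024 / users
print(f"~{per_user_kb_per_s:.1f} KB of game state per user per second")
```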

High-frequency DeFi is another example. In the future, fully homomorphic encryption (FHE), data markets, and other high-frequency applications will all require very large data throughput and a DA layer that can truly support high performance. So any high-performance dApp or Layer 2 can be built on top of us.

The Advantages of Modularity: Flexible Choices

Mehdi: You are working to improve scalability and throughput and to address the state bloat caused by the storage component. Why not launch a complete Layer 1 directly? If you have the technical capability to achieve these breakthroughs, why take a modular approach instead of creating a Layer 1 with its own virtual machine? What is the logic behind adopting a modular stack?

Michael: Fundamentally, we are a Layer 1, but we firmly believe that modularity is the way to build applications in the future. And being modular does not exclude the possibility of providing a dedicated execution environment optimized for AI applications in the future. We haven’t fully defined the roadmap in this regard, but it is possible.

The core of modularity is choice. You can choose the settlement layer, execution environment, and DA layer. Depending on different use cases, developers can select the best solution. Just like in Web2, TCP/IP succeeded because it is inherently modular, allowing developers to freely choose its different aspects. Therefore, we want to give developers more choices, enabling them to build the most suitable environment based on their application types.

Mehdi: If you had to choose a virtual machine right now, which one on the market would be the best fit for the applications you are considering or striving to achieve?

Michael: I have a very pragmatic view on this. If the goal is to attract more Web2 developers into Web3, it should be some type of WASM virtual machine that allows applications to be built using the most common programming languages like JavaScript or Python. These languages may not necessarily be the best choices for on-chain development.

The Move VM is excellent in terms of object and throughput design. If high performance is the goal, it is a choice worth considering. If we consider a battle-tested virtual machine, then it would be the EVM, as there are a large number of Solidity developers. So the choice depends on the specific use case.

Prioritization and Community Building

Jonny: I'd like to hear about the biggest obstacles you face, or has everything gone smoothly? With a venture this ambitious, I can't imagine it's always smooth sailing.

Michael: Yes, I believe no startup has a smooth journey; there will always be some challenges. From my perspective, the biggest challenge is ensuring that we can keep up with the pace because we must execute multiple tasks exceptionally well and make some trade-offs to enter the market quickly.

For example, we initially wanted to launch with a custom consensus mechanism, but that would extend the launch time by four to five months. So we decided to use an off-the-shelf consensus mechanism in the first phase to create a strong proof of concept, achieving part of the end goal, such as 50 GB per second for each consensus layer. Then, in the second phase, we will introduce a horizontally scalable consensus layer to achieve infinite DA throughput. Just like flipping a switch to start another AWS server, we can add additional consensus layers to increase overall DA throughput.
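The horizontal-scaling idea in phase two reduces to simple arithmetic; a minimal sketch, assuming the roughly 50 GB per second per consensus layer that Michael cites (the layer counts are arbitrary examples):

```python
# Aggregate DA throughput under horizontally added consensus layers,
# using the ~50 GB/s per-layer figure from the interview.
PER_LAYER_GB_PER_S = 50

for layers in (1, 2, 4, 8):
    total = layers * PER_LAYER_GB_PER_S
    print(f"{layers} consensus layer(s) -> ~{total} GB/s aggregate DA throughput")
```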

Another challenge is ensuring that we can attract top talent to join the company. Our team is strong, including gold medalists from the International Olympiad in Informatics and top computer science PhDs, so we need our marketing team and new developers to match that caliber.

Jonny: It sounds like the biggest obstacle you currently face is prioritization, right? Accepting that you can’t do everything in a short time and must make some trade-offs. What about competition? How do you view it? I guess Celestia or EigenDA won’t pose a serious threat to your specific use cases.

Michael: In Web3, competition largely depends on the community. We have built a strong community around high performance and AI builders, while Celestia and EigenDA may have more general-purpose communities. EigenDA may be more concerned with how to bring economic security and build AVSs on EigenLayer, while Celestia is more focused on Layer 2s that want to reduce their transaction costs, and it does not have many high-throughput applications. For example, building high-frequency DeFi on Celestia is very challenging because you need multiple megabytes of throughput per second, which would completely clog the Celestia network.

From this perspective, we do not feel threatened. We are building a very strong community, and even if other players emerge, we will already have developer network effects and market share, which we expect will attract more funding. So our best defense is our network effect.

The Bidirectional Dependency Between Web3 and AI

Jonny: You have chosen artificial intelligence as your main focus, but why does Web3 need to host AI within its ecosystem? Conversely, why does AI need Web3? This is a bidirectional question, and the answers to both may not necessarily be affirmative.

Michael: Of course, a Web3 without AI is possible. But I believe that in the next 5 to 10 years, every company will become an AI company because AI will bring about tremendous change, just like the internet did. Do we really want to miss this opportunity in Web3? I don’t think so. According to McKinsey, AI will unlock trillions of dollars in economic value, and 70% of jobs can be automated by AI. So why not leverage it? A Web3 without AI can exist, but with AI, the future will be even better. We believe that in the next 5 to 10 years, most participants on the blockchain will be AI agents executing tasks and transactions for you. It will be a very exciting world, and we will have a plethora of automated services driven by AI, tailored for users.

Conversely, I believe AI absolutely needs Web3 as well. Our mission is to make AI a public good. This is fundamentally a question of incentives. How do you ensure that AI models do not cheat, and how do you ensure they make decisions that are most beneficial to humanity? Alignment can be broken down into incentive, verification, and security components, each of which is very suitable for implementation in a blockchain environment. The blockchain can help achieve monetization and incentives through tokens, creating an environment where AI has no economic incentive to cheat. All transaction histories are also on the blockchain. Here’s a bold statement: I believe that fundamentally, everything from training data to data cleaning components, to data ingestion and collection components should be on-chain, allowing for complete traceability of who provided the data and what decisions the AI model made.

Looking ahead to the next 5 to 10 years, if AI systems are managing logistics, administration, and manufacturing systems, I would want to know the model's version and its decisions, and to supervise models that exceed human intelligence to ensure they align with human interests. If AI is instead put into a black box that may cheat and may not make decisions in humanity's best interest, I'm not sure we can trust a few companies to consistently ensure the safety and integrity of such systems, especially considering the superpowers AI models may have in the next 5 to 10 years.

Kamran: We all know that the crypto space is filled with various narratives, and your focus on AI could become a barrier in the long run. As you mentioned, your tech stack will far surpass what we see now. Do you think the narrative and naming around AI itself will hinder your development in the future?

Michael: We don’t think so. We firmly believe that in the future, every company will become an AI company. Almost no company will avoid using AI in some form in its applications or platforms. From this perspective, every time a new version of GPT is released, say one with trillions of parameters that unlocks capabilities previously unavailable and reaches a higher level of performance, I believe the hype will continue, because this is a whole new paradigm. This is the first time we can tell computers what to do using human language. In some cases, you can achieve capabilities that surpass those of ordinary people, automating processes that were previously impossible to automate. For example, some companies have almost completely automated their sales development and customer support. With the release of GPT-5, GPT-6, and so on, AI models will become smarter. We need to ensure that Web3 keeps up with this trend and builds its own open-source versions.

AI agents will run certain parts of society in the future, and it is crucial to ensure they are governed appropriately by the blockchain. In 10 to 20 years, AI will definitely be mainstream, bringing about significant social change. Just look at Tesla's full self-driving mode; the future is becoming a reality day by day. Robots will also enter our lives, providing us with substantial support. We are essentially living in a science fiction movie.
