Interview with the founders of Sahara, which raised $6 million: Serving Microsoft and Amazon, how does Sahara facilitate AI assetization?
Interviewer: Grapefruit, ChainCatcher
Guests: Sean Ren and Tyler Zhou, Co-founders of Sahara
Editor: Marco, ChainCatcher
Since OpenAI released its text-to-video model Sora, AI has once again become the hottest sector in the market, with waves of investment pouring in, and projects that combine AI and Web3 are springing up rapidly. According to the crypto data platform Rootdata, nearly 240 projects are listed under "AI and Web3," which has clearly become a sector in its own right. The decentralized AI network Sahara is one of its star projects.
Sahara was founded by Sean Ren and Tyler Zhou in May last year. It is a decentralized AI network infrastructure that facilitates the assetization of AI, aiming to help users deploy or build customized and personalized AI products.
Sean Ren is a tenured professor in the Computer Science Department at the University of Southern California, with 15 years of industry research experience in the AI field; Tyler Zhou previously served as the investment director at Binance Labs and participated in multiple project investments and incubations.
In March of this year, Sahara announced a $6 million financing round led by Polychain Capital, which had closed in August of last year, with participation from investment institutions such as Sequoia Capital, Samsung Next, and Nomad Capital.
The two founders told ChainCatcher that Sahara has provided data services to over 30 enterprise clients, including well-known organizations such as Microsoft, Amazon, MIT, Snapchat, and Character AI, and has already generated millions of dollars in revenue.
In an exclusive interview with ChainCatcher, Tyler Zhou revealed that Sahara will launch consumer products in April-May; the Sahara Testnet will go live in Q3, with plans to launch the mainnet in Q4.
On April 4, Sahara launched its first points activity, Sahara Social, on the task platform Galxe for early users, allowing users to earn early points rewards by connecting to the Sahara network, registering for the waiting list, and completing other tasks.
The Story Behind the Creation of Sahara
1. ChainCatcher: What is Sean Ren's personal background, and what led you to Web3? What is your role at Sahara?
Sean Ren: My education and work experience are both rooted in engineering.
Before creating Sahara, I had been a tenured professor in the Computer Science Department at USC for seven years, focusing on academic research in AI and NLP.
While pursuing my PhD in Computer Science at the University of Illinois at Urbana-Champaign, I created my first startup, StylePuzzle (a fashion recommendation e-commerce platform), which received investment from Plug & Play Ventures and progressed from angel round to Series C.
Tyler and I have been friends for six years, and our entrepreneurial opportunity arose in 2022 when we discussed many of the shortcomings of Web2 AI products, particularly the economic model issues.
Currently, the AI economic model benefits only a small number of professionals, while other participants in the AI ecosystem, such as data owners, collectors, providers, and model feedback providers, do not receive reasonable economic compensation, and user data privacy issues remain unresolved, which is not conducive to long-term development.
The first principle of Sahara's product is to address the pain points of the traditional AI industry, ensuring that all participants in the AI ecosystem can receive appropriate or reasonable returns based on their contributions, rather than being limited to the computational capabilities and application scenarios of large models.
Currently, I am mainly responsible for product development and business development at Sahara.
ChainCatcher: Tyler Zhou previously served as the investment director at Binance Labs. Why did you leave Binance to pursue entrepreneurship in the AI sector?
Tyler Zhou: After graduating from UC Berkeley, I worked in investment banking and private equity, primarily investing in infrastructure, information technology, and real estate.
I joined Binance Labs in early 2022, where I was responsible for investments in the U.S. market, focusing on incubation and investment projects, and I led the launch of the first MVB.
I chose to leave Binance at the beginning of 2023 to establish Sahara with Professor Ren for several reasons. First, the economic model of the entire AI ecosystem has many issues, and blockchain's cryptography and token economics might help solve them.
I personally believe that Professor Ren's background and expertise make him the most suitable candidate for developing related products; no other team in the market understands the entire closed loop of AI systems, technology, and economics like Professor Ren.
Moreover, Professor Ren is not just a traditional researcher; he also has strong business acumen.
2. ChainCatcher: What is Sahara's product positioning and goals? Besides the commercialization of AI, what other issues do you aim to address?
Sean Ren: Currently, Sahara's main product is a decentralized network infrastructure that supports anyone in building or deploying their personalized AI products.
Sahara can be viewed as a decentralized network composed of an Execution Layer, Transaction Layer, and Application Layer.
At the application layer, Sahara provides a built-in decentralized data marketplace (Sahara Data) along with a toolkit for data-related tasks such as collection, labeling, and QA, which users can draw on to train their AI models.
Users come to Sahara mainly to build their AI products, and Sahara Data can help solve the problems of data collection, labeling, and transformation.
Additionally, as a data marketplace, Sahara is an important platform for connecting data suppliers with data buyers. It not only provides high-value data services for AI model training but also helps users with data needs discover more data providers, facilitating the construction of autonomous AI.
The decentralized data marketplace Sahara Data is a significant advantage of Sahara's product and is key to distinguishing it from other Web3 AI projects in the market. Launched in October last year, it has been operational for about six months, initially serving enterprise clients such as Microsoft, Snapchat, MIT, Motherson Group, and Amazon, providing them with relevant data services and addressing some of the industry's most challenging data needs.
The Sahara Execution Layer supports data encryption and attribution, that is, proof of ownership, achieved through digital watermarking technology and public key infrastructure. When a user creates a data point, dataset, or model, they can embed their DID into it, generating a watermark that proves ownership. This watermark persists as the user's data and model circulate, so the data and model can always be attributed to their owner.
By using the proof of ownership mechanism, if users need to rely on a foundational model built by someone else while training their AI or making inferences, the income generated by that AI product can be shared with the holders of the underlying model.
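The revenue-sharing idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Sahara's actual protocol: the `Asset` class, the example DIDs, and the 20% upstream rate are all hypothetical, and a SHA-256 tag merely stands in for the watermark, which in a real system would be embedded in the data or model itself.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Asset:
    """A dataset or model with an owner DID and an optional upstream asset."""
    name: str
    owner_did: str
    content: bytes
    parent: Optional["Asset"] = None

    def ownership_tag(self) -> str:
        # Stand-in for a watermark (assumption): binds the owner DID to the content.
        return hashlib.sha256(self.owner_did.encode() + b"|" + self.content).hexdigest()


def split_revenue(asset: Asset, amount: float, upstream_rate: float = 0.2) -> dict:
    """Walk up the attribution chain, passing a fixed share to each upstream owner."""
    payouts = {}
    current, remaining = asset, amount
    while current is not None:
        if current.parent is None:
            # The root asset keeps whatever is left.
            share = remaining
        else:
            # Keep (1 - upstream_rate) and pass the rest further up the chain.
            share = remaining * (1 - upstream_rate)
        payouts[current.owner_did] = payouts.get(current.owner_did, 0.0) + share
        remaining -= share
        current = current.parent
    return payouts


# Hypothetical example: Bob's agent is built on Alice's base model.
base = Asset("base-model", "did:example:alice", b"base weights")
agent = Asset("fine-tuned-agent", "did:example:bob", b"agent weights", parent=base)
payouts = split_revenue(agent, 100.0)
```

Here Bob, who built the agent, keeps 80 of every 100 units earned, and Alice, who holds the underlying model, receives 20. The point of the sketch is only that attribution has to travel with the asset for such a split to be computable at all.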
3. ChainCatcher: On March 5, Sahara announced the completion of a $6 million seed round led by Polychain Capital, with participation from Sequoia Capital and others. How did Sahara connect with these investors? Why do you think they are optimistic about Sahara? What support did the investment institutions provide?
Tyler Zhou: In fact, the seed round is not recent; it was completed back in August of last year. Unlike other teams, Sahara did not announce the financing immediately but waited until after the product launch for the right moment to publicize it.
The reason for the current announcement is that Sahara's product development has entered a new phase, and we will be launching a series of new products, including ecosystem-related applications.
Sahara's seed round was oversubscribed, so we had many options, but we chose investors of strategic significance. These investors can help Sahara understand global AI development and economic trends, what AI startups at different stages are doing, how AI economies differ across countries, and what leading AI companies are working on.
Sahara's Advantages and Development Progress
4. ChainCatcher: How do you think Sahara differs from traditional AI products like ChatGPT?
Sean Ren: This can be viewed from two perspectives.
First, Sahara is not an application; the final delivered product is not a specific GPT product. Sahara is a provider of decentralized network infrastructure, and its application layer does not define what AI Agent products developers should build. Instead, it provides relevant APIs and SDKs, allowing anyone to easily build their AI Agents.
Second, ChatGPT is a question-and-answer format chatbot, while the Sahara Knowledge Agents (KA) on Sahara are customizable AI programs, which are very different from traditional chatbots. They autonomously analyze data and make reliable decisions based on specific needs, acting or executing tasks according to certain instructions to achieve a specific goal.
For example, a KOL's KA may aim to help filter advertising invitation messages in their Twitter DMs, generating a concise report daily and responding to others' DMs. The KA can automatically execute these commands at any time.
Sahara is an infrastructure builder and provides tools and platforms for constructing custom Knowledge Agents (KAs).
Tyler Zhou: Compared to ChatGPT and other AI projects on the market, Sahara focuses on Personalized Agents, which are customizable AI programs with capabilities that extend beyond chatting and can assist users in executing many tasks.
To build a "Personalized Agent," two prerequisites are needed: first, a personal database is required, and the AI Agent must be trained based on that data to achieve desired capabilities; second, relevant infrastructure and tools for building the Agent are necessary.
Sahara not only provides data markets and related processing tools but also ensures that users can train their AI while protecting their data privacy and offers relevant infrastructure tools to help users better build KAs.
The Sahara network supports users in customizing their AI Agents without sacrificing their data privacy and also provides a no-code platform for building Agents.
5. ChainCatcher: How has Sahara performed since its establishment? What are the upcoming work priorities?
Sean Ren: First, regarding the product, Sahara has been building the decentralized data marketplace Sahara Data, which was launched in October last year.
As of now, Sahara Data has collaborated with 31 enterprise clients, refining technology while increasing revenue.
By the end of Q1 this year, Sahara Data had accumulated 200,000 users.
In the near term, Sahara will focus on three areas:
First, to open the decentralized data marketplace to the public, allowing both individuals and businesses to use it;
Second, to launch consumer-facing products related to the execution layer and built on Sahara Data, such as Knowledge Vault and Knowledge Marketplace;
Third, the Sahara Testnet will be launched in Q3.
6. ChainCatcher: What is the biggest challenge the company currently faces?
Sean Ren: The company's team has expanded too quickly, growing from a dozen people to over 40. Under current plans, the team may add another 30-40 members in the next month or two.
Tyler Zhou: From a market and ecosystem perspective, I think a significant challenge is that Sahara has positioned itself from the early stages as a "Crypto for AI" product (i.e., using crypto to empower AI as a whole), which addresses a larger market and ecosystem than "AI for Crypto."
As for the difference between the two, "Crypto for AI" refers to integrating crypto and blockchain technology with AI to improve AI products and address their problems, which is a larger global market. "AI for Crypto" refers to using AI to improve crypto products, for example in smart contracts or certain blockchain use cases, which is currently a relatively small market and more narrative-driven.
However, the mainstream hype in the current market is around "AI for Crypto" products, overlooking the economic aspects of the entire AI ecosystem and the trends of the global economy, resulting in a lot of noise in the market, especially in the past six months.
How Sahara can persist in doing what it wants to do and what it should do amidst the noise is a significant challenge.
Sahara Has Served Over 30 Enterprises Including Microsoft and Amazon
7. ChainCatcher: Since its establishment, Sahara has attracted over 30 enterprise clients, including Microsoft, Snapchat, and MIT. What services does Sahara provide to them, and why do these well-known companies choose to collaborate with you?
Sean Ren: The first product Sahara launched was the decentralized data marketplace Sahara Data, which has Web2 competitors in centralized data service providers such as Scale AI. Compared to these centralized providers, Sahara Data has several advantages.
First, Sahara can reach a large number of AI data collection and labeling workers globally through various rewards and economic incentive mechanisms. Currently, there are about 200,000 AI-related workers on the Sahara network, most of whom are natives of the Web2 AI industry. They are attracted to Sahara because it allows them to earn from their data contributions, such as receiving Crypto as payment.
The enterprises we work with have varied data needs: Snapchat needs conversational data, Microsoft collects multimodal data, and MIT requires various kinds of video data.
As a data supplier, Sahara has a significant advantage in data diversity, offering clients a very diverse pool of data to choose from that can adapt to different needs.
By collaborating with over 30 enterprises, Sahara continuously refines its products, making them more mature to better meet the data needs of various global enterprises and businesses, forming a positive cycle.
8. ChainCatcher: Currently, users need to join the waiting list on the official website to participate in Sahara. How is the product development progress? What ways can users participate in Sahara? What rewards will be given to early participants?
Tyler Zhou: In April-May, Sahara will launch its first consumer-facing product, allowing individual users to contribute their knowledge and skills on the platform.
Sahara will also offer different reward mechanisms for early users. More information will be released when the product launches publicly in April or May. The Sahara Testnet will go live in Q3, with the mainnet planned for Q4.
9. ChainCatcher: What are your views on popular decentralized GPU, Agent, and other AI crypto projects in the market? How do you assess the reliability of an AI crypto project?
Sean Ren: Currently, the crypto AI products on the market can be divided into two camps: "AI for Crypto" and "Crypto for AI."
"Crypto for AI" is a larger market. For us, as natives of the AI industry, we are more focused on how to leverage blockchain and Web3 technology to address some of the criticisms of Web2 AI products, especially regarding economic models and data ownership.
Many projects currently use blockchain's economic models to incentivize certain behaviors in AI, but I believe they are superficial, only focusing on the economic model without considering the entire AI ecosystem behind it, such as the privacy and encryption of training data.
From the perspective of the entire AI ecosystem, the upstream should be data and data processing. There are data-related projects in the market, but most offer only an eye-catching crypto economic model (like Label to earn) without addressing the fundamental issues of data ownership and model attribution; they are merely building an application.
Regarding training large models on distributed GPUs, I personally think this direction is very challenging, and much depends on how decentralized the project really is. If it merely ties together a few data centers in the same facility or nearby, that is decentralization in name only. If the goal is to pool all idle GPUs worldwide for decentralized training, implementation is very difficult because network speeds differ so widely from one connection to another.
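A back-of-the-envelope calculation shows why network speed matters so much here. The figures below are illustrative assumptions, not measurements from any project: a 7-billion-parameter model with 2-byte gradients, a 100 Gbit/s data-center link, and a 100 Mbit/s home connection. The sketch also naively assumes each worker sends the full gradient once per training step, ignoring latency, compression, and smarter communication schemes.

```python
def sync_seconds(num_params: int, bytes_per_param: int, link_bits_per_sec: float) -> float:
    """Time to send one full set of gradients over a link (latency and compression ignored)."""
    total_bits = num_params * bytes_per_param * 8
    return total_bits / link_bits_per_sec


# Illustrative assumptions, not measured values.
PARAMS = 7_000_000_000   # 7B-parameter model
BYTES_PER_PARAM = 2      # 16-bit gradients

datacenter = sync_seconds(PARAMS, BYTES_PER_PARAM, 100e9)  # 100 Gbit/s link
home = sync_seconds(PARAMS, BYTES_PER_PARAM, 100e6)        # 100 Mbit/s link

print(f"data-center link: {datacenter:.2f} s per step")
print(f"home broadband:   {home:.0f} s per step")
```

Under these assumptions the data-center link needs about 1.12 seconds per exchange while the home connection needs about 1,120 seconds, a thousandfold gap. In synchronous training the slowest participant sets the pace, which is one way to read the difficulty described above.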
Additionally, machine learning combined with ZK-related technologies is still a relatively long-term direction. Therefore, when assessing a project, it is crucial to distinguish which projects are feasible and can be commercialized in the short term and which are research-oriented projects that require long-term exploration.