Vana, the Project Behind the First Data DAO of the AI Era: A Mini Bittensor Defending User Data Rights

BlockBeats
2024-06-20 22:48:03
Vana, backed by Paradigm and Polychain, has completed a $20 million financing round and is attempting to build an efficient data liquidity network by borrowing Bittensor's mechanism almost wholesale.

Author: Siwei Guai Guai, BlockBeats

The First Data DAO, The First User-Owned Data Network

In February this year, Reddit revealed in its IPO prospectus that it had signed data licensing agreements with AI companies worth a total of $203 million. AI companies are willing to spend this heavily because data, like computing power, is an essential resource for developing AI models.

Sadly, none of this revenue flows to Reddit's users, even though they are the actual creators of over 1 billion posts and more than 16 billion comments on the platform.

In response to this injustice, the first data DAO, "r/datadao," was born on April 4. It encourages users to export their data from Reddit and upload it to a community database, then vote together on whether to license that data to AI companies and share the profits. Users also earn the governance token RDAT in proportion to their data contributions.

Subsequently, media reports revealed that the driving force behind r/datadao is a startup called Vana, which raised $20 million from VCs such as Paradigm and Polychain Capital. The news invigorated the market, and RDAT soared more than 50 times from its opening price of $0.011 to a peak of $0.67 within five days. However, the token later moved to an uncapped issuance model, and the price fell back to its starting point and has not recovered.

Normally, the story would end here. When a crypto project reaches this point, it is effectively game over.

While our attention was fixed on r/datadao, we overlooked what Vana, the much larger project behind it, actually wants to do. r/datadao is merely Vana's first test of reclaiming user data rights from the tech giants; what it truly aims to build is the first blockchain network for user-owned data. On this open internet, users own and manage their data, as well as the intelligent products created from that data. Users gain ownership of AI models through their data contributions, and value flows to users and independent model developers rather than to centralized platforms.

Yes, Vana wants to create a new public chain of its own. On June 11, this chain launched its first testnet, the "Satori Testnet." It turns out that the real show, taking away the giants' meal ticket, has only just begun.

Is the New Public Chain Actually a Small Bittensor?

In fact, Vana's public chain is not entirely new. In many ways, it resembles Bittensor.

Bittensor is renowned for establishing a market mechanism where "multiple subnets compete to improve the quality of digital goods production" for AI development.

It can be said that Vana has mimicked the entire mechanism of Bittensor to serve its need to "establish an efficient data liquidity network."

Thus, Vana proposed the concept of DLP, which is akin to a subnet in Bittensor. To understand Bittensor, it is crucial to grasp the concept of "subnet." Similarly, to understand Vana, one must understand "DLP."

DLP

DLP stands for "Data Liquidity Pool."

At the blockchain level, Vana is an EVM-compatible chain based on proof-of-stake consensus, and DLP is essentially a smart contract on the Vana network.

Data DAOs like r/datadao are specific manifestations of DLP. Builders on the Satori testnet are currently developing various data DAOs, such as ChatGPT Data DAO, LinkedIn Data DAO, Twitter Data DAO, and Github Data DAO.

In the future, Vana's mainnet will launch with 16 DLP slots. The DLPs that fill them will be chosen through a vote by holders of DAT, Vana's native gas token, based on metrics such as total transaction count, transaction fees, verified data upload volume, and the number of unique wallet interactions.

Just like subnets can earn Bittensor's TAO emission rewards, DLPs can also earn Vana's DAT emission rewards. Of course, DLPs that are not selected can still receive emission rewards, just not as much as the top 16 DLPs. New DLPs must also operate for a period without emission rewards to prove themselves.

Users who submit data to a DLP are called "data contributors." They receive token rewards from the DLP based on the quality of their contributions, much as miners earn TAO rewards for completing tasks on Bittensor subnets. Each DLP implements its own proof-of-contribution function tailored to its dataset. For example, r/datadao values contributed data by measuring a user's karma and requires users to post a code in their Reddit profile to confirm ownership.
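
To make the idea of a dataset-specific proof-of-contribution function concrete, here is a minimal sketch in the spirit of the r/datadao example. It is not r/datadao's actual code: the `RedditExport` type, the `challenge_code` parameter, and the `karma_cap` normalization are all hypothetical names and choices introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class RedditExport:
    """Hypothetical summary of a user's exported Reddit data."""
    karma: int
    profile_text: str  # public profile contents, used for the ownership check


def proof_of_contribution(export: RedditExport, challenge_code: str,
                          karma_cap: int = 10_000) -> float:
    """Return a contribution score in [0, 1].

    Ownership is confirmed by finding the challenge code in the profile;
    value is approximated by karma, clipped to an assumed cap.
    """
    if challenge_code not in export.profile_text:
        return 0.0  # ownership not confirmed, so the contribution is worth nothing
    clipped = min(max(export.karma, 0), karma_cap)
    return clipped / karma_cap


owner = RedditExport(karma=2_500, profile_text="hello, my code is vana-1234")
impostor = RedditExport(karma=9_000, profile_text="no code here")

owner_score = proof_of_contribution(owner, "vana-1234")
impostor_score = proof_of_contribution(impostor, "vana-1234")
```

A different DLP, say one for LinkedIn or GitHub data, would swap in its own ownership check and its own measure of value behind the same kind of function.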

Here is a seemingly mundane but significant detail that readers may have noticed: the rewards a DLP gives to users are not Vana's native gas token DAT, but a governance token issued by the DLP itself. This means Vana allows each DLP to create its own dataset-specific token, giving DLPs full control over the token economics of their pools. r/datadao, for instance, issues its own governance token RDAT to users and fully controls its supply. This is the most significant difference between Vana and Bittensor, and it is also the key issue I focus on at the end of the article.

Nagoya Consensus

Data submitted by users must first be scored by validation nodes, which check it against the standards set by the DLP creator. For this step, Vana employs a fuzzy consensus mechanism called Nagoya consensus, similar to Bittensor's Yuma consensus: a group of validation nodes collectively assesses the quality of the submitted data, and the final score is determined by a weighted average.
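
The weighted-average step can be illustrated with a short sketch. The article does not say what the weights are; the example below assumes each validation node is weighted by its stake, and the node names and numbers are made up.

```python
def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted average of the validators' scores for a single data file."""
    total_weight = sum(weights[v] for v in scores)
    if total_weight == 0:
        raise ValueError("no validator weight behind these scores")
    return sum(scores[v] * weights[v] for v in scores) / total_weight


# Hypothetical stake weights and quality scores in [0, 1].
weights = {"val_a": 50.0, "val_b": 30.0, "val_c": 20.0}
scores = {"val_a": 0.9, "val_b": 0.8, "val_c": 0.1}

final_score = weighted_score(scores, weights)
```

Here the outlier `val_c` pulls the result down only in proportion to its weight, which is the sense in which the consensus is "fuzzy": no single node decides the score.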

In addition, validation nodes rate one another's scoring behavior. If a validation node gives a high score to a very poor file, the other validation nodes will give that node a low rating.

Every 1800 blocks (approximately 3 hours) constitutes an epoch. At the end of each epoch, the DLP contract distributes the emission rewards to validation nodes based on the final scores. This mechanism both suppresses behaviors that deviate from the consensus majority and incentivizes validation nodes to honestly assess data contributions.
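
As a rough illustration of the epoch arithmetic and the end-of-epoch payout, the sketch below assumes rewards are split in direct proportion to each node's final score. The article only says rewards are distributed "based on the final scores," so the proportional rule, the emission amount, and the node names are assumptions.

```python
BLOCKS_PER_EPOCH = 1800
EPOCH_SECONDS = 3 * 60 * 60  # "approximately 3 hours" per the article

# Implied average block time: 10800 s / 1800 blocks = 6 s per block.
block_time_seconds = EPOCH_SECONDS / BLOCKS_PER_EPOCH


def distribute(emission: float, final_scores: dict) -> dict:
    """Split one epoch's emission in proportion to final scores (assumed rule)."""
    total = sum(final_scores.values())
    if total == 0:
        return {v: 0.0 for v in final_scores}
    return {v: emission * s / total for v, s in final_scores.items()}


# val_c was rated low by its peers for scoring a poor file highly.
final_scores = {"val_a": 0.9, "val_b": 0.9, "val_c": 0.2}
rewards = distribute(100.0, final_scores)
```

Under this rule, a node that drifts from the majority sees its share shrink while honest nodes absorb the difference.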

All of the above transactions will be verified for validity by propagation nodes, which will add them to the Vana network's blocks for confirmation. Propagation nodes can earn transaction fees and emission rewards, which is no different from other EVM-compatible chains based on proof-of-stake consensus.

User Self-Custodied Data

It is worth noting that although users submit personal data to the data DAO, this data is not actually stored on-chain.

Data is not like tokens; it is non-exclusive and can be copied at will once made public on-chain. To ensure data liquidity, it is essential to guarantee users' control over their private data, ensuring that data is not reused without the owner's consent, thus addressing the "double-spending problem" of data.

In this regard, Vana employs a clever and rigorous design that makes the flow of user data resemble a meticulously choreographed dance.

First, data contributors encrypt their data using symmetric keys and store the encrypted data in personal cloud storage accounts like Google Drive. After obtaining the data's URL and unique identifier (ETAG), this information, along with the encryption key, is recorded on the Vana blockchain. Next, a root validation node is selected to coordinate other validation nodes to download, decrypt, and verify the data file. Through the fuzzy consensus mechanism, validation nodes confirm the data's validity and record the results on the blockchain, forming an index of valid files.

When a data requester initiates an access request, the root validation node again organizes validation nodes to download, decrypt, and compile the data, ultimately delivering the securely verified data to the requester. Throughout this process, only authorized validation nodes can decrypt and access the data through blockchain permission controls, preventing unauthorized downloads and decryption operations.
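
The bookkeeping side of this flow can be sketched with a toy in-memory registry standing in for the chain. This is not Vana's contract interface: the `Registry` class, its method names, and the use of a SHA-256 digest as a stand-in for the storage provider's ETAG are all assumptions, and encryption and decryption are deliberately left out.

```python
import hashlib
from dataclasses import dataclass, field


def etag_of(blob: bytes) -> str:
    # Stand-in only: real storage providers compute ETAGs in their own way.
    return hashlib.sha256(blob).hexdigest()


@dataclass
class Registry:
    """Toy stand-in for the on-chain file index and permission check."""
    authorized: set                                # validator ids allowed to access files
    records: dict = field(default_factory=dict)    # file_id -> (url, etag)
    valid: set = field(default_factory=set)        # file_ids that passed validation

    def register(self, file_id: str, url: str, etag: str) -> None:
        self.records[file_id] = (url, etag)

    def verify(self, validator: str, file_id: str, downloaded: bytes) -> bool:
        if validator not in self.authorized:
            raise PermissionError(f"{validator} is not an authorized validator")
        _, etag = self.records[file_id]
        ok = etag_of(downloaded) == etag  # file unchanged since registration
        if ok:
            self.valid.add(file_id)
        return ok


blob = b"<encrypted user data>"
reg = Registry(authorized={"val_a"})
reg.register("file-1", "https://example.com/file-1", etag_of(blob))
accepted = reg.verify("val_a", "file-1", blob)
```

The point of the sketch is that the chain holds only a pointer and a fingerprint, while the data itself stays in the contributor's own storage.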

On the solid foundation of data liquidity and blockchain layers, Vana has created an open application layer for collaboration between data contributors and developers. Developers can build applications using the data liquidity accumulated by DLP, while the contributor community can create real economic value from their data.

A Step Ahead in Implementing the dTAO Mechanism?

As mentioned earlier, the biggest difference between Vana and Bittensor is that Vana allows DLPs to have their own token economies.

I believe most people initially share my confusion: why allow each DLP to create its own token? What if they mishandle it, as the wild swings in r/datadao's governance token RDAT suggest? It does not seem necessary.

After consulting with the Vana team, I realized that Vana does not simply want to adopt Bittensor's mechanism as is; it also aims to respond more proactively to the challenges Bittensor currently faces. Allowing each subnet to have its own token economy is the core of Dynamic TAO (BIT001), the network upgrade that Bittensor has been pushing for but making slow progress on.

Bittensor's Dynamic TAO upgrade aims to decentralize the TAO emission allocation rights, originally determined by a few validation nodes in the root network, to all TAO holders through a market-driven dynamic pricing mechanism. For this mechanism to take effect, each subnet needs to issue its own token (dTAO token) and establish a liquidity pool composed of subnet tokens and TAO. TAO holders can choose to stake TAO into the liquidity pools corresponding to different subnets to obtain specific dTAO tokens for that subnet.

Each subnet's dTAO token has its own independent supply, and subnet validation nodes use dTAO to participate in consensus and earn rewards. The price of each pool is determined by the ratio of TAO and dTAO reserves within it, reflecting market demand for that subnet. Bittensor will inject newly issued TAO into each pool according to the price ratio of the dTAO pools.
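
The pricing and allocation rule described above can be expressed in a few lines. The sketch takes the price of a pool as its TAO reserve divided by its dTAO reserve and splits new TAO in proportion to those prices; the subnet names and reserve figures are invented for illustration.

```python
def pool_price(tao_reserve: float, dtao_reserve: float) -> float:
    """Price of one dTAO in TAO, taken as the ratio of the pool's reserves."""
    return tao_reserve / dtao_reserve


def emission_shares(pools: dict) -> dict:
    """Fraction of newly issued TAO each pool receives, proportional to price."""
    prices = {name: pool_price(t, d) for name, (t, d) in pools.items()}
    total = sum(prices.values())
    return {name: p / total for name, p in prices.items()}


# Hypothetical pools: name -> (TAO reserve, dTAO reserve).
pools = {
    "subnet_1": (3000.0, 10000.0),  # price 0.3 TAO per dTAO
    "subnet_2": (1000.0, 10000.0),  # price 0.1 TAO per dTAO
}
shares = emission_shares(pools)
```

With these numbers, `subnet_1` is priced three times higher than `subnet_2` and therefore receives three times as much of the new emission.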

This changes the original TAO allocation method based on root network voting to one based on the price ratios of dynamic TAO pools, allowing all TAO holders to "vote with their feet" through staking to determine which subnets should receive more TAO rewards.

Although Vana's current official documentation still states that the "root network" module is responsible for managing DLP and token reward distribution, it is clear that Vana aims to break free from this centralized governance mechanism before launching its mainnet, taking a bolder step forward than Bittensor.

Why do I say "a bolder step forward than Bittensor"? Because under the original method, where root network validation nodes vote on the allocation, the process is relatively centralized, but those nodes are motivated to protect their reputation and therefore act more cautiously. Opening participation to all token holders may bring an influx of speculators and undermine the stability of the ecosystem. Since the price of each dTAO pool is determined entirely by market supply and demand, a sudden large-scale stake or unstake by a large holder could trigger severe price fluctuations, which in turn would cause significant volatility in the TAO allocations based on those price ratios and create systemic risk.
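
A small numerical sketch shows how strongly one large stake can move a price-based allocation. It assumes each pool follows a constant-product rule (TAO reserve times dTAO reserve stays fixed when someone stakes), which the article does not specify; the pool sizes are invented.

```python
def stake(tao_reserve: float, dtao_reserve: float, tao_in: float):
    """Stake TAO into a pool under an assumed constant-product rule."""
    k = tao_reserve * dtao_reserve
    new_tao = tao_reserve + tao_in
    new_dtao = k / new_tao  # dTAO leaves the pool and goes to the staker
    return new_tao, new_dtao


def share_of(price: float, other_price: float) -> float:
    """Emission share of a pool when allocation is proportional to price."""
    return price / (price + other_price)


# Two hypothetical pools that start out identical: 1000 TAO and 10000 dTAO each.
tao, dtao = 1000.0, 10000.0
other_price = tao / dtao  # 0.1 for the untouched pool

price_before = tao / dtao
share_before = share_of(price_before, other_price)

# A large holder stakes 1000 TAO, doubling the pool's TAO reserve.
tao_after, dtao_after = stake(tao, dtao, 1000.0)
price_after = tao_after / dtao_after
share_after = share_of(price_after, other_price)
```

Under these assumptions, doubling the TAO in one pool quadruples its price and lifts its emission share from one half to four fifths, and an equally large unstake would swing it back just as sharply.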

If Vana does plan to adopt an approach similar to Bittensor's dTAO mechanism once its mainnet launches, I believe it must prepare in advance for the issues above. Bittensor, its one-time teacher, is still moving slowly behind it, and there is no one ahead to test the path and pave the way.
