Understanding Subsquid: Scalability of Data Modularity
Original Title: “A Deep Dive into Subsquid”
Source: CoinList
Compiled by: Elvin, ChainCatcher
We recently announced that the Subsquid Community Public Offering will take place on January 18, 2024, at 18:00 UTC.
Subsquid Network is an innovative decentralized data lake and query engine designed to provide developers with high-performance, permissionless data access and contribute to the creation of a neutral and open internet based on Web3 principles.
In conversations with the Subsquid team, we explored the real problems they are solving, their cost-effective methods for handling blockchain data, the utility of the SQD token, their growth strategies, and emerging trends in Web3 data.
Let’s dive in.
1. What is Subsquid, and what problems does it solve?
Subsquid Network is a decentralized data lake and query engine that gives developers high-performance, permissionless data access, with the aim of contributing to a neutral and open internet based on Web3 principles. The network is secured by zero-knowledge (ZK) proofs and uses a modular architecture built for scalability and developer convenience, optimized specifically for blockchain indexing, dApp development, and analytics.
Subsquid is a response to the rigid, hard-to-scale monolithic indexing frameworks (such as The Graph) that were previously popular among Web3 developers and drew market attention. Today, these frameworks are struggling to adapt to the rapidly evolving blockchain environment. The Subsquid network also serves as an efficient, decentralized alternative to centralized infrastructure companies, including large RPC and API providers.
2. How does Subsquid make blockchain data more affordable?
Subsquid currently provides historical data access at a significantly lower cost than RPC or API providers. Over time, the reduction in network data costs will also extend to real-time data (unfinalized "hot blocks"). Here are some details on how Subsquid's cost-reduction mechanisms work:
- Unlimited horizontal scalability: Subsquid is designed to scale horizontally as new nodes join the network, so a growing network can handle more data without a proportional increase in cost. In other words, the data lake provides a "shared cost infrastructure," in which the cost of managing the network's data is shared by an ever-increasing number of data consumers (dApps, analysts, and others).
- Efficient data storage and retrieval powered by DuckDB: Data is compressed and distributed across network nodes, and each node queries its local data efficiently through DuckDB. By leveraging this new database technology, Subsquid has developed a storage and retrieval mechanism that significantly reduces the overall cost of managing and accessing large amounts of data.
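The storage pattern described above (compressed chunks spread across nodes, with each node answering queries from its own local data) can be sketched in a few lines. This is only an illustration under stated assumptions, not Subsquid's implementation: the `WorkerNode` class, the round-robin placement in `distribute`, and the `scatter_gather` helper are all hypothetical names, and a plain Python filter stands in for the local DuckDB query.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class WorkerNode:
    """Hypothetical worker that holds compressed chunks of block data."""
    name: str
    chunks: list = field(default_factory=list)

    def store(self, rows):
        # Compress each chunk before keeping it locally.
        self.chunks.append(zlib.compress(json.dumps(rows).encode()))

    def query(self, min_value):
        # Stand-in for a local DuckDB query: decompress and filter locally.
        hits = []
        for chunk in self.chunks:
            rows = json.loads(zlib.decompress(chunk))
            hits.extend(r for r in rows if r["value"] >= min_value)
        return hits


def distribute(rows, nodes, chunk_size):
    """Split rows into chunks and assign them to nodes round-robin (an assumption)."""
    for start in range(0, len(rows), chunk_size):
        node = nodes[(start // chunk_size) % len(nodes)]
        node.store(rows[start:start + chunk_size])


def scatter_gather(nodes, min_value):
    """Send the same query to every node and merge the partial results."""
    results = []
    for node in nodes:
        results.extend(node.query(min_value))
    return sorted(results, key=lambda r: r["block"])


# Made-up sample data: eight blocks spread over two nodes.
rows = [{"block": b, "value": b * 10} for b in range(1, 9)]
nodes = [WorkerNode("a"), WorkerNode("b")]
distribute(rows, nodes, chunk_size=2)
matches = scatter_gather(nodes, min_value=60)
print([r["block"] for r in matches])
```

Adding a third `WorkerNode` to the list spreads the same chunks over more machines without changing the query code, which is the sense in which this kind of design scales horizontally.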
3. What are the inherent use cases of the SQD token?
The SQD token is a vital component of the Subsquid ecosystem. The use cases of the SQD token focus on simplifying and securing network operations in a permissionless manner:
- Incentives for coordinating infrastructure providers: SQD is used to reward node operators who contribute computing and storage resources to the network.
- Governance for network participants: The SQD token design builds in node governance through delegation. Holders delegate to operators they trust, which helps select reliable operators for rewards in a permissionless manner.
- Fair resource consumption: By locking SQD tokens, consumers of data from the decentralized data lake can raise their rate limits.
- Network decision-making: SQD token holders can participate in governance and vote on protocol changes and other proposals.
4. How does Subsquid plan to build and attract a healthy community around the SQD token?
As evidenced by its very successful testnet (which has deployed over 58,000 decentralized indexers to date), Subsquid has developed various incentive mechanisms for both technical and non-technical community members.
Technical community members, including developers and data analysts, derive inherent value from the network itself and the tools built on top of it. In addition, Subsquid collaborates with a vast ecosystem of enterprises and Web3-native tool projects to conduct joint integrations and large DevRel initiatives and events.
For non-technical community members, Subsquid has undertaken extensive efforts to build awareness of the network's value, drawing on the large ecosystem of projects that use the network. Regular cryptocurrency users can start to understand Subsquid's "deep technology" by learning how it helps them access and use their favorite consumer applications.
Moreover, any SQD holder can delegate tokens in a permissionless manner. Delegation is a crucial component of the network: it is how the community signals which worker nodes perform well. This is an important governance function within the network and provides a way for non-technical individuals to create value for the network itself.
5. What exactly is the "modular approach" to data? How does Subsquid execute this strategy?
Subsquid adopts a modular approach to data to provide flexibility, efficiency, and scalability when handling various types of data in the Web3 ecosystem. This approach is designed to meet the diverse needs of decentralized applications (dApps) and to adapt to different types of data sources. Here is why and how Subsquid implements it:
- Data-agnostic and flexible data ingestion: Web3 applications require access to a wide range of data sources, including on-chain data from blockchains, off-chain data from external APIs, and decentralized storage solutions such as IPFS and Arweave. By adopting a modular approach, Subsquid can handle data from almost any source.
- Efficient data processing: Different types of data require different processing workflows, such as storage, retrieval, and querying. By modularizing its data processing capabilities, Subsquid can optimize its processes for specific data types, ensuring efficient and scalable operations tailored to the needs of each data source.
- Scalability and extensibility: The modular architecture allows Subsquid to scale and evolve its capabilities more effectively. New modules can be added to support new data sources or functionalities without significant changes to the existing system, making it easier to adapt to the ever-changing demands and technologies in the Web3 space.
- Customization for specific use cases: Different dApps have varying requirements for data processing based on their use cases. By providing a modular framework, Subsquid enables developers to customize and configure data processing workflows to fit their specific use cases, ensuring that the platform can meet the diverse needs of the Web3 ecosystem.
- Interoperability for builders: The modular architecture promotes interoperability by allowing different modules to work together seamlessly. This interoperability is crucial in decentralized environments, where applications often need to interact with multiple data sources and other components to function effectively.
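To make the modular idea above concrete, here is a minimal sketch of a pipeline in which data sources and processing steps are interchangeable modules. It assumes nothing about Subsquid's actual SDK: `DataSource`, `OnChainSource`, `OffChainSource`, `run_pipeline`, and the sample records are hypothetical and exist only for illustration.

```python
from typing import Callable, Iterable, Optional, Protocol


class DataSource(Protocol):
    """Any module that can yield records (hypothetical interface)."""
    def records(self) -> Iterable[dict]: ...


class OnChainSource:
    """Hypothetical source wrapping decoded event logs."""
    def __init__(self, logs):
        self.logs = logs

    def records(self):
        for log in self.logs:
            yield {"origin": "chain", **log}


class OffChainSource:
    """Hypothetical source wrapping values fetched from an external API."""
    def __init__(self, prices):
        self.prices = prices

    def records(self):
        for symbol, usd in self.prices.items():
            yield {"origin": "api", "symbol": symbol, "usd": usd}


# A processing step maps a record to a new record, or to None to drop it.
Step = Callable[[dict], Optional[dict]]


def drop_zero_amount(record):
    return None if record.get("amount") == 0 else record


def mark_processed(record):
    return {**record, "processed": True}


def run_pipeline(sources, steps):
    """Feed every record from every source through the steps in order."""
    out = []
    for source in sources:
        for record in source.records():
            for step in steps:
                record = step(record)
                if record is None:
                    break  # a step dropped the record
            else:
                out.append(record)
    return out


# Made-up sample inputs.
sources = [
    OnChainSource([{"event": "Transfer", "amount": 5},
                   {"event": "Transfer", "amount": 0}]),
    OffChainSource({"TOKEN": 1.5}),
]
result = run_pipeline(sources, [drop_zero_amount, mark_processed])
print(result)
```

Supporting a new kind of source (for example, a reader for files pinned on IPFS) would mean adding one more class with a `records` method; `run_pipeline` and the existing steps stay unchanged.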
6. What is the best way to get involved in the Subsquid community?
First, the Subsquid incentive testnet is about to enter its second phase. Like the community public offering, the Subsquid testnet can be found on the CoinList platform. The testnet includes participation opportunities for both technical and non-technical community members.
We encourage developers to take some time to read the official Subsquid documentation, which links to the appropriate chats for technical discussions. Subsquid's Twitter and Discord are also good places to get involved.
7. 2023 was an important year for Web3 data. What data trends is the Subsquid team most excited about for 2024?
One of the most exciting data trends for 2024 is the increasing popularity and adoption of open-source analytical database management systems like DuckDB. These systems are gaining attention for their outstanding performance and versatility, particularly in handling complex analytical workloads on large datasets.
Additionally, the trend of embedding these database systems directly into applications is changing how developers conduct data analysis, allowing analytical capabilities to be integrated without separate database servers. This represents a shift towards more efficient and flexible data analysis solutions, which is especially important in today’s fast-paced, data-driven environment. According to Subsquid, it is the first Web3 project to implement DuckDB at scale, and the team does not expect it to be the last.
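The embedded pattern mentioned above can be illustrated with a short sketch in which the database engine runs inside the application process, with no separate server to deploy. To keep the example runnable with only the Python standard library, it uses `sqlite3` as a stand-in; this is a substitution, not what Subsquid uses. DuckDB is likewise an in-process engine, and its Python package offers a comparable connect, execute, and fetch workflow. The table and values are made up for illustration.

```python
import sqlite3

# The engine runs in-process; ":memory:" keeps the database in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfers (block INTEGER, token TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transfers VALUES (?, ?, ?)",
    [(1, "A", 10.0), (1, "B", 4.0), (2, "A", 6.0), (3, "B", 1.0)],
)

# An analytical aggregate computed directly inside the application.
totals = conn.execute(
    "SELECT token, SUM(amount), COUNT(*) "
    "FROM transfers GROUP BY token ORDER BY token"
).fetchall()
print(totals)
conn.close()
```

The design point is that the query runs in the same process as the rest of the application, so there is no network hop or server administration between the code and its data.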