Comprehensive Overview of Seven New Generation Web3 Data Tools
Author: Louis Wang, Biteye
Editor: Biteye Core Contributor Crush
With the development of blockchain technology and the growth of on-chain ecosystems, rich user interactions generate massive amounts of data, making data analysis an important component of blockchain applications.
This on-chain data reflects the flow of on-chain value, so analyzing it, and the insights derived from that analysis, has become extremely valuable.
Emerging Web3 data products provide rich data visualization capabilities and powerful analytical tools, helping users better utilize the data within the blockchain through real-time monitoring, data mining, and analysis.
Data Product Classification
From the perspective of the data stack, blockchain data products can be divided into three categories: data sources, data development tools, and data apps.
Among these, data development tools and data apps have the broadest audience. Mature data apps like Nansen and Messari package finished analysis for users, such as tracking hot projects and on-chain capital flows.
Such products are easy to understand, have a gentle learning curve, and can be used immediately. However, they cannot meet customized needs, leave the user in a passive research role, and often require payment for detailed functionality.
Highly customizable data development tools, such as Dune Analytics, Flipside, and Footprint, are means of active research.
These tools offer a high degree of freedom and a high ceiling, allowing users to explore freely. However, they require a solid understanding of blockchain data structures, a basic grasp of database query languages such as SQL, and some knowledge of smart contracts, so the learning curve is relatively steep: the basics are approachable, but the tools are difficult to master.
Although there are already mature products in the segmented data space, as people continue to explore blockchain technology, on-chain data is experiencing explosive growth, and user demand for data analysis is increasing day by day. A batch of emerging data tools and products has come into the users' sight.
The following content introduces seven emerging data products (in alphabetical order):
01 0xScope
Product Background
Blockchain data is real and transparent, but decentralized production has led to information fragmentation. The data is scattered: each piece is valid and usable but disconnected from the rest, making it difficult to produce deep, connected analyses.
At the same time, because one person can easily create and operate many Ethereum EOA accounts, analyzing individual addresses alone is one-sided. A user profile should instead portray the operating entity behind the accounts.
0xScope addresses these pain points in Web3 data by using an invisible hand to pick up the fragmented blockchain data, making it possible to piece together a complete user profile. The team proposed an innovative solution: entity analysis.
0xScope has established a weight aggregation algorithm based on graph computing, identifying other addresses of users by assigning different weights to dozens of different types of rules. These rules are continuously tested and refined through deep learning to improve the accuracy of address aggregation and reconstruct information units from the perspective of entities.
As shown in the figure below, these four seemingly unrelated on-chain addresses are identified by 0xScope's aggregation algorithm as belonging to the same entity, indicating a very high likelihood that these accounts are controlled by the same person! Moreover, through the traces left by their different interactions, it is even possible to infer the time zone of the person behind the screen and other Web2 information.
The emergence of entities means that the dimensions of on-chain analysis are no longer singular; the image of each on-chain entity becomes fuller and richer, and the analysis perspective is no longer limited to thin EOA addresses. This also means that once someone commits wrongdoing, they can no longer escape by simply switching to another EOA account; the "Tianyancha" will track them to the ends of the earth, leaving no place to hide…
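The weight-aggregation idea described above can be illustrated with a small sketch. It is not 0xScope's actual algorithm: the rule names, integer weights, and merge threshold below are hypothetical and chosen only for illustration. Each piece of evidence linking two addresses adds its rule weight to that pair's score, and pairs whose score reaches the threshold are merged into one entity with a union-find structure.

```python
from collections import defaultdict

# Hypothetical rule weights (integer points); the real rules are not described in detail here.
RULE_WEIGHTS = {
    "shared_funding_source": 5,
    "same_cex_deposit_address": 6,
    "synchronized_activity": 3,
}
MERGE_THRESHOLD = 8  # assumed cut-off for treating two addresses as one entity


def cluster_entities(evidence, threshold=MERGE_THRESHOLD):
    """Group addresses into entities.

    evidence: iterable of (address_a, address_b, rule_name) tuples.
    Returns a list of sets, one set of addresses per entity.
    """
    # Sum the rule weights for every unordered address pair.
    scores = defaultdict(int)
    addresses = set()
    for a, b, rule in evidence:
        addresses.update((a, b))
        scores[frozenset((a, b))] += RULE_WEIGHTS.get(rule, 0)

    # Union-find with path compression.
    parent = {addr: addr for addr in addresses}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge every pair whose accumulated score reaches the threshold.
    for pair, score in scores.items():
        if score >= threshold and len(pair) == 2:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for addr in addresses:
        groups[find(addr)].add(addr)
    return list(groups.values())


evidence = [
    ("0xA", "0xB", "shared_funding_source"),
    ("0xA", "0xB", "synchronized_activity"),     # 5 + 3 = 8, merged
    ("0xB", "0xC", "same_cex_deposit_address"),
    ("0xB", "0xC", "synchronized_activity"),     # 6 + 3 = 9, merged
    ("0xC", "0xD", "synchronized_activity"),     # 3, below the threshold
]
entities = cluster_entities(evidence)
```

Because merging is transitive, 0xA and 0xC end up in the same entity even though no rule links them directly, which is the behavior that lets seemingly unrelated addresses be grouped.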
Product Features
Watcher, 0xScope's product, has just been updated to version 2.0, with the page logic adjusted accordingly. There are three main functional modules:
1. Discovery Module
This is the free-exploration area, where users can browse without a specific target. Users can view the distribution of large holders for various tokens; rank tokens by market capitalization to see price and holder changes; rank projects by TVL and daily user data to discover hot projects on-chain; or check the latest blue-chip rankings.
2. DD Module
Watcher is well suited to due diligence. In VC Watch, users can view VC holdings of specific tokens and how they change, clearly seeing who has exited and who remains. Users can also track the activity of each VC, narrowing the information gap between small and large investors.
When using whale tracking or wallet tracking, users can set alerts to stay informed about on-chain activity in near real time. To avoid reporting data that is later rolled back, alerts are delayed by about 12 Ethereum blocks, which keeps the delay under five minutes.
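The size of that delay is simple arithmetic. The sketch below assumes the 12-second slot time of post-Merge Ethereum and that no slots are missed; the function names are illustrative and are not part of Watcher.

```python
SECONDS_PER_BLOCK = 12  # post-Merge Ethereum slot time (assumes no missed slots)
CONFIRMATIONS = 12      # confirmation depth quoted in the text


def alert_delay_seconds(confirmations=CONFIRMATIONS, block_time=SECONDS_PER_BLOCK):
    """Minimum time to wait before a transaction is 'confirmations' blocks deep."""
    return confirmations * block_time


def is_deep_enough(tx_block, head_block, confirmations=CONFIRMATIONS):
    """True once the chain head is at least 'confirmations' blocks past the transaction's block."""
    return head_block - tx_block >= confirmations


delay = alert_delay_seconds()
print(f"{delay} s (about {delay / 60:.1f} min)")  # 144 s (about 2.4 min)
```

Under these assumptions the wait is about two and a half minutes, comfortably inside the five-minute bound mentioned above.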
Notable Entity Tracker provides tracking for well-known institutions/market makers/celebrities, summarizing multiple accounts controlled by one entity, making their movements clear and allowing for one-click following.
3. Research Module
This part should best represent 0xScope's technical strength, focusing on address aggregation and visualizing capital flows.
Through the address aggregation function, users can surface other addresses that are highly associated with a specific address. For example, once a phishing or hacker account has been identified, users can check whether that address is associated with any exchange and ask the exchange for help in tracing the hacker, bringing new hope for recovering lost funds.
The capital flow function clearly presents on-chain dynamics through visual charts. The following figure shows how the "FTX hacker" operated on-chain after stealing assets, converting them into BNB and other assets, and where they were transferred afterward.
By observing transfers of MASK tokens and combining them with 0xScope's tagging system, it can be inferred that Jump Trading is the market maker for MASK, which makes tracking these two red addresses significant for investment decisions.
When assisting investment decisions, combining whale tracking and alert functions can be particularly effective. In the whale watch function, users can view the whale ratio for each token and enter the corresponding whale analysis panel for that token through the Entity Dashboard to see the latest dynamics of the whales.
Previously, someone used this method to observe a large number of whale withdrawals of YFII, identified the whale that had accumulated the most and whose behavior tracked the price most closely, and set alerts to follow its operations, ultimately achieving a 700% return through this whale-following tactic. Specific operations can be seen in this tweet.
In Summary
0xScope's Watcher is built on powerful technological innovations. Through entity analysis and address aggregation, it helps users untangle relationships in the complex web of on-chain addresses and behaviors, which is particularly useful for due diligence.
It can also visualize the capital flows of tokens and lets users set alerts on tracking targets: watch to earn, turning smart asses into smart money.
02 Arkham
Product Background
Arkham is a data analysis app that focuses on address aggregation and tracking. The product has been active in tracking and analyzing hot on-chain events, such as the FTX collapse and the CRV short in November, and has gained widespread attention.
The product is currently in a closed beta phase, and users can apply for beta slots on the official website.
Product Features
1. Asset Information
Once connected to a wallet, users can see their asset information, which is essentially the simplest instantiation of Arkham's dashboard, including: asset table, historical asset levels, asset composition, and transaction records. The real-time 24-hour changes in held assets will also be displayed.
2. Data Tracking Dashboard
The Dashboard is Arkham's main feature, and it is very simple to create; no programming is required. Users just need to input and check conditions to output results quickly. This can be oriented towards entity analysis or token analysis.
Arkham collects the addresses of mainstream market makers in the market and aggregates them into entities for easy user queries. For example, in entity analysis, I created a dashboard comparing market makers. The dashboard includes the assets, asset composition, historical asset curves, and recent operations of four market makers: Wintermute, Jump Trading, Alameda, and Jane Street.
Users can see what tokens powerful market makers hold and what tokens they are operating, providing guidance on what to follow or avoid.
In token analysis, take the November event in which $CRV was borrowed on AAVE in order to short it: filtering for large CRV deposits into OKX shows that all of them came from Eisenberg (the same person who borrowed CRV to short this time and caused Mango to lose $100 million last time).
Subsequently, $CRV skyrocketed, forcing a squeeze, and the AAVE position was liquidated. The dashboard shows that due to insufficient liquidity, the pending liquidation position was too large, and the entire liquidation process lasted 40 minutes, ultimately leaving AAVE with a seven-figure bad debt.
3. Visualizer
The visualizer is a tool that visualizes the relationships between on-chain addresses. Entering an address or entity name automatically generates a graph. For example, if I want to track the initial distribution of ApeCoin, the upper part shows the airdrop addresses, the middle part is the foundation wallet managed by Coinbase, and the lower part shows some unmarked addresses.
In Summary
Arkham is a simple and easy-to-use on-chain data tracking app. Users can create information-rich dashboards by flexibly filtering conditions, whether for entity-oriented analysis or token and protocol analysis.
The product is still in beta, with more features like large transfer alerts under development.
03 Chaineye
Product Background
As the Web3 ecosystem continues to grow, various projects with distinct characteristics emerge in every track. A rich ecosystem is certainly good for users, but it can also cause some confusion about which is better or what options are available for selection.
After all, not every ordinary user has the time and energy to research various projects. Chaineye is a data product aimed at assisting users in their daily choices. It aims to solve problems such as selecting cross-chain solutions with minimal friction from a practical perspective, functioning as an aggregator.
Product Features
1. Asset Cross-Chain Aggregation
The multi-chain landscape brings diversity to the blockchain ecosystem but also causes liquidity fragmentation.
Users often do not know which bridges support transfers between specific chains or how much each one costs in fees and slippage, which can cause confusion.
Chaineye aggregates information on various asset cross-chain options, listing real-time cross-chain choices based on user filters, including the time and fees involved in cross-chain transfers. Users can select the most suitable option based on their needs and directly link to the cross-chain bridge for operations, making it very convenient.
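The filtering and ranking step described above can be sketched as follows. This is only an illustration of the idea, not Chaineye's implementation: the bridge names, fees, and times are made-up sample data, and the `BridgeQuote` type and `best_routes` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeQuote:
    bridge: str      # hypothetical bridge name
    src_chain: str
    dst_chain: str
    fee_usd: float   # total cost of the transfer in USD
    minutes: float   # estimated transfer time


def best_routes(quotes, src, dst, max_minutes=None):
    """Return quotes for src -> dst, cheapest first (ties broken by speed)."""
    matches = [
        q for q in quotes
        if q.src_chain == src
        and q.dst_chain == dst
        and (max_minutes is None or q.minutes <= max_minutes)
    ]
    return sorted(matches, key=lambda q: (q.fee_usd, q.minutes))


# Made-up sample quotes, for illustration only.
quotes = [
    BridgeQuote("BridgeA", "Ethereum", "Arbitrum", fee_usd=4.2, minutes=15),
    BridgeQuote("BridgeB", "Ethereum", "Arbitrum", fee_usd=6.0, minutes=3),
    BridgeQuote("BridgeC", "Ethereum", "Polygon", fee_usd=2.5, minutes=20),
]
cheapest = best_routes(quotes, "Ethereum", "Arbitrum")[0]
fast_enough = best_routes(quotes, "Ethereum", "Arbitrum", max_minutes=5)[0]
```

Here the cheapest route and the route that satisfies a five-minute limit are different bridges, which is exactly the trade-off between time and fees that the comparison is meant to expose.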
2. Centralized Exchange Token Deposits and Withdrawals
If users are concerned about the safety of decentralized cross-chain bridges and want to use centralized exchanges for asset cross-chain transfers, the CEX Transfer Fee function supports hundreds of commonly used tokens and major exchanges. It details the deposit and withdrawal costs across various chains, coins, and exchanges, allowing users to choose the most suitable method based on their needs.
3. Stablecoin Dashboard
For mainstream stablecoins on the market, Chaineye provides one of the most comprehensive comparison dashboards. The content includes the total market capitalization of stablecoins, issuers of various stablecoins, deployed chains, whether there are de-pegging situations, the composition of stablecoin collateral, whether they have been audited, and detailed audit reports.
It also distinguishes between native and non-native stablecoins (e.g., Ethereum's BUSD and Binance-Peg BUSD on BNB chain), helping users gain a deeper understanding of the associated risks of stablecoins, avoiding daily noise and FUD.
4. ETH2.0 Staking Yield Comparison
After Ethereum officially transitioned to PoS, ETH2.0 staking has become a financial choice for many, especially with Vitalik Buterin recently estimating that the Ethereum Shanghai upgrade will occur in early 2023, at which point staked ETH can be redeemed, so that 2.0 staking is no longer a "dead investment." How can retail investors choose staking providers to maximize returns?
Chaineye aggregates and compares information on the staking yields, scales, and withdrawal redemption standards of mainstream staking service providers, allowing users to input their pre-invested ETH amount to get a rough return result, making it very convenient.
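A rough return estimate of this kind can be sketched as below. The provider names, APRs, and fee rates are hypothetical sample values, and the simple-interest formula is an assumption for illustration; actual staking yields vary over time and Chaineye's own calculation may differ.

```python
def estimate_staking_return(eth_amount, apr_percent, days, provider_fee_percent=0.0):
    """Rough simple-interest estimate of ETH earned, net of the provider's cut of rewards."""
    gross = eth_amount * (apr_percent / 100) * (days / 365)
    return gross * (1 - provider_fee_percent / 100)


# Hypothetical providers: name -> (APR in percent, fee on rewards in percent).
providers = {
    "ProviderA": (5.0, 10.0),
    "ProviderB": (4.0, 0.0),
}

stake_eth = 10
estimates = {
    name: estimate_staking_return(stake_eth, apr, days=365, provider_fee_percent=fee)
    for name, (apr, fee) in providers.items()
}
best_provider = max(estimates, key=estimates.get)
```

With these sample numbers the provider with the higher headline APR still comes out ahead after its fee, but the ranking can flip once fees are large enough, which is why comparing net figures matters.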
5. RPC List
If users want to avoid being monitored by Infura, Chaineye also provides an RPC List, allowing users to choose alternative RPCs based on latency and providers.
In Summary
In simple terms, Chaineye is an information aggregator focusing on hot topics such as stablecoins and cross-chain solutions, providing users with the most comprehensive information comparisons, saving users time, and enhancing user experience.
04 EigenPhi
Product Background
As research on MEV gradually becomes a prominent field, more and more people are starting to pay attention to this corner of the dark forest.
EigenPhi is a research platform focused on MEV data in DeFi. Its feature is the ability to identify and track on-chain MEV captures, such as arbitrage, sandwich attacks, and liquidations, while presenting detailed information on each MEV transaction for reference.
Product Features
On the Overview page, EigenPhi can inform users of the actual trading volume and actual profits (gross profit - miner fees) of various MEV types, their respective proportions, and growth trends, providing a clear overview of the current MEV market.
By observing the trading volume and profits of the various MEV types, it can be found that arbitrage profit is about one percent of arbitrage trading volume, while sandwich-attack profit is only about one-thousandth of sandwich trading volume.
Currently, the most traded type by volume is sandwich attacks, accounting for over eighty percent of total MEV trading volume, while by profit, arbitrage profits account for more than half of the actual MEV market profits.
In other words, sandwich attacks, which we consider to be bad MEV that harms user experience, pervade the entire MEV market. However, the profits they bring are actually less than the good MEV profits that promote market price stability through arbitrage.
MEV Live-Stream is a very interesting feature that monitors and captures the latest MEV-related transactions in real-time, displaying them on a scrolling panel, similar to watching stock market fluctuations.
Users can clearly see the type of MEV involved in each transaction, the contract address initiating the MEV transaction, actual profits, and expenses.
Interestingly, one can often see transactions with almost zero or negative profits, such as failed arbitrages or very low-profit sandwich attacks.
For those unfamiliar with MEV or who find the technical threshold too high, it might be worth checking out this panel, as it can help avoid one way of losing money.
Of course, there are also many skilled MEV players. From the leaderboard, it is interesting to note that MEV income on BSC is significantly higher than on Ethereum. The top player on BSC earned $567k at a cost of $587, an astonishing return on investment of more than 90,000%, which is hard to match.
In contrast, the top profit on ETH is $41k with a cost of $103k, yielding only a 39.7% return; half of the top earners on Ethereum gained profits through sandwich attacks, while the other half through arbitrage. Arbitrage typically has a higher return on investment, while sandwich attack profits are smaller and costs are higher.
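The return figures above follow from dividing profit by cost. The sketch below recomputes them from the rounded dollar amounts quoted in the text, so the results differ slightly from the leaderboard's own percentages; the function names are illustrative.

```python
def net_profit(gross_profit, fees):
    """Actual profit as defined on the Overview page: gross profit minus miner fees."""
    return gross_profit - fees


def roi_percent(profit, cost):
    """Return on investment in percent: profit divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return profit / cost * 100


# Rounded leaderboard figures quoted in the text (USD).
bsc_top_roi = roi_percent(567_000, 587)
eth_top_roi = roi_percent(41_000, 103_000)

print(f"BSC top player: {bsc_top_roi:,.0f}%")
print(f"ETH top player: {eth_top_roi:.1f}%")
```

The gap of more than three orders of magnitude comes almost entirely from the cost side: the two profits differ by a factor of about 14, while the costs differ by a factor of about 175.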
Users can view details based on MEV profit methods, such as examining the transaction distribution of sandwich attacks. UniswapV3 is a hotspot, accounting for over half, and users can also see when recent attacks occurred, the tokens involved, and the profit situation.
In EigenPhi, each MEV transaction clearly shows the entire operational behavior and logic, simplifying and revealing the complexities of interaction processes, transaction costs, profits, and other details for user understanding.
Sharing an interesting arbitrage: an arbitrage contract (0x8) borrowed 90 ETH through a dYdX flash loan, purchased BAYC #1633, and at the same time obtained the 10,000 APE staked under that NFT. After exchanging the APE for 32.68 ETH, it sold the NFT for 65 ETH, yielding a profit of 4 ETH, as recorded in EigenPhi as shown below:
Currently, EigenPhi's MEV section supports the Ethereum and BSC chains. In addition to MEV, EigenPhi also monitors liquidity pool distributions, showing which protocol's pool has the highest trading volume.
It also features a malicious token identification function, which flags tokens that charge transfer or transaction fees without notifying users, helping users avoid risks.
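One simple way to spot such a token is to compare the amount sent with the amount that actually arrives. The sketch below illustrates that check on sample numbers; it is an assumption about how such detection could work, not EigenPhi's actual method, and the function names are hypothetical.

```python
def transfer_fee_ratio(amount_sent, amount_received):
    """Fraction of a transfer that never reached the recipient."""
    if amount_sent <= 0:
        raise ValueError("amount_sent must be positive")
    return (amount_sent - amount_received) / amount_sent


def looks_like_fee_token(transfers, tolerance=0.0):
    """Flag a token if any observed transfer lost more than 'tolerance' of its amount.

    transfers: iterable of (amount_sent, amount_received) pairs in the token's base units,
    e.g. the sender's balance decrease and the recipient's balance increase.
    """
    return any(transfer_fee_ratio(sent, received) > tolerance for sent, received in transfers)


# Sample observations: the second transfer silently lost 5% on the way.
observed = [(1000, 1000), (1000, 950)]
flagged = looks_like_fee_token(observed)
```

In practice the two amounts would come from on-chain balance changes around a transfer, and a small tolerance can absorb rounding in tokens that rebase.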
Additionally, EigenPhi has a Research section that provides their latest research reports, which are worth reading.
In Summary
EigenPhi is a data analysis platform focused on the MEV market, capable of identifying and tracking various MEVs, using visual charts to analyze the MEV market and dissect each transaction.
It provides data for researchers studying MEV and offers insights and references for MEV enthusiasts.
05 GeniiData
Product Background
GeniiData is positioned as a data analysis platform focused on cross-chain data analysis and API factory, emphasizing high performance, high data reliability, and broad coverage.
It provides professional tools for individual data studios to discover, verify, and build in Web3 through blockchain analysis. For enterprises, GeniiData serves as an incubator for decentralized applications, providing access to comprehensive on-chain data APIs.
GeniiData is a community-driven open platform that allows users to share complex data insights and collaborate on applications within the Web3 ecosystem.
Product Features
1. High Granularity Multi-Chain Parsing + Powerful Cross-Chain Analysis Functionality + Excellent User Experience
GeniiData currently parses over a dozen chains, including BTC (most data platforms do not have BTC data) and some new public chains and layer 2s, such as Aptos and StarkNet. More parsed chains bring richer analysis content.
With the development of blockchain, the new pattern of multi-chain parallelism will be the trend. Many major protocols, such as Uniswap and Aave, will deploy cross-chain, and even Aave V3 has new cross-chain access features, making analysis of protocols on a single chain increasingly one-sided.
GeniiData provides powerful cross-chain analysis capabilities, allowing users to aggregate and compare data across chains for more comprehensive and higher-dimensional insights.
In terms of user experience for data query dashboards, GeniiData is among the best on the market, with a clean interface, clear logical structure, and many commonly used abstract tables, such as balance tables for various standard tokens, greatly facilitating queries.
2. Smart Contract Automatic Decoding Function
On most data analysis platforms, parsing a project's smart contract often requires manpower, needing to submit applications, and staff manually reviewing and parsing, which can take anywhere from 2-3 days to one or two weeks.
GeniiData offers an automatic smart contract decoding function without verification, completing the decoding within a few hours after submitting the contract. This is very helpful for focused project research, as it provides complete event data for easier analysis.
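"Decoding" here means turning a raw event log (opaque topics and data) into named fields. The sketch below does this by hand for a single well-known event, the standard ERC-20 Transfer. It is not GeniiData's implementation: an automatic decoder would derive the layout from each contract's ABI rather than hard-coding one event, and the sample addresses are made up.

```python
# keccak256("Transfer(address,address,uint256)"), the ERC-20 Transfer event signature hash.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_transfer_log(topics, data):
    """Decode a raw ERC-20 Transfer log into named fields.

    topics: list of 0x-prefixed 32-byte hex strings (signature hash, from, to).
    data:   0x-prefixed hex string holding the uint256 value.
    """
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        raise ValueError("not an ERC-20 Transfer log")
    # Indexed addresses are left-padded to 32 bytes; the address is the last 20 bytes.
    sender = "0x" + topics[1][-40:]
    recipient = "0x" + topics[2][-40:]
    value = int(data, 16)  # the single non-indexed parameter
    return {"from": sender, "to": recipient, "value": value}


# Made-up sample log: 1 token (18 decimals) sent from 0x11...11 to 0x22...22.
raw_topics = [
    TRANSFER_TOPIC,
    "0x" + "00" * 12 + "11" * 20,
    "0x" + "00" * 12 + "22" * 20,
]
raw_data = "0x" + format(10**18, "064x")
decoded = decode_transfer_log(raw_topics, raw_data)
```

Once logs are decoded into fields like these, they can be stored as event tables and queried directly, which is what makes complete event data useful for project research.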
3. Data Product Driver
The API market for data platforms is becoming increasingly competitive. GeniiData plans to create an API factory that balances historical integrity and timeliness of data queries, becoming a driver for Web3 data products, providing stable and efficient services through easily constructed APIs, supporting consumer-facing business access.
At the same time, GeniiData also supports machine learning to train on-chain data, using artificial intelligence to explore new patterns in Web3.
In Summary
GeniiData leads the industry in chain diversity, historical data integrity, and user experience. Its innovative smart contract automatic parsing function improves research efficiency, while the API factory supports data product development.
06 MetaDock
Product Background
MetaDock differs from other data platform products in that it operates as a browser extension, aiming to optimize users' blockchain browser experience.
Product Features
- It aggregates various EVM blockchain explorers, eliminating the step of searching for blockchain explorers, allowing users to query directly within the extension, although it only supports some EVM chains.
- As a web plugin, it adds some tagging features when linking to blockchain explorers, such as account risk indicators (low/high risk) and hyperlinks to commonly used tools like DeBank.
- Fund Flow. By clicking on the Fund Flow tab, users can visualize the associated accounts and fund flow charts of the queried account. The filter in the upper right corner allows users to choose to display specific transfer addresses/entities or tokens.
In Summary
MetaDock is a Chrome extension designed to enhance the user experience of blockchain explorers. It aggregates multiple EVM-compatible blockchain explorers, adds address tagging, and provides fund flow tracking, making it a practical tool.
07 ZettaBlock
Product Background
ZettaBlock is an enterprise-level full-stack Web3 infrastructure for indexing and analyzing, connecting on-chain and off-chain data.
ZettaBlock primarily targets enterprise teams and developers, allowing them to quickly build real-time, public, and reliable GraphQL APIs on the platform.
Applications built on ZettaBlock only need to interact with a few general and user-friendly APIs, which provide efficient access to a large number of isolated data sources, allowing developers to easily customize APIs to fit their own business logic.
ZettaBlock Features
Full-stack. Provides a unified data platform with streaming processing, OLTP indexing, OLAP analysis, and visualization architecture, along with hundreds of readily available blockchain datasets;
Flexibility. Build your own GraphQL API and SQL based on custom transformation logic requirements;
Real-time. Customized APIs with 10 milliseconds response time and sub-second data freshness with high throughput;
Data unification. Seamlessly access any decoded on-chain data, combining it with your own off-chain data;
Scalability. Easily connect PB-level data using SQL;
Reliability. Ensures near-perfect system uptime (99.99%) to support real-time, data-intensive applications.
ZettaBlock currently supports Ethereum, Polygon, Arbitrum, Aptos, Solana, Ripple XRP, and IoTeX. The technical team has strong engineering capabilities, having built a powerful data platform from scratch in just a few months, and it is now available for public testing.
Product Features
1. ZettaBlock provides a fast, efficient, and highly customizable on-chain data analysis platform
In terms of usage flow, it is almost indistinguishable from most customized data query platforms (like Dune and Flipside), allowing users to perform SQL queries, visualize data, integrate, create dashboards, and query quickly.
The difference lies in the ability to use the automatic API generation feature after completing a query, packaging it into an API that can be flexibly called, greatly expanding the application scenarios for data analysis and providing a breeding ground for entrepreneurial endeavors in data analysis products.
2. Rich and Flexible API Services are ZettaBlock's Features, Currently Offering Three Types of APIs
- Custom API
Custom APIs are GraphQL API endpoints. Users can easily build them through ZettaBlock's API builder after writing SQL queries. The customized GraphQL APIs are real-time (100 milliseconds latency), cost-effective (1000 times cheaper than analytical APIs), and can handle high QPS, making them ideal for providing API support for consumer-facing dApps.
- Data Lake API
Data Lake APIs are analytical APIs for interacting with ZettaBlock's data lake. Users can use them to submit any SQL queries to the data lake and retrieve results accordingly. For large results, streaming mode can also be used to access and write files.
- Pre-built API
Pre-built APIs are a collection of GraphQL APIs co-created by the ZettaBlock team and community. These pre-built APIs are designed to meet common query and metric needs, somewhat similar to Dune's Spellbook, avoiding the need to reinvent the wheel.
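To make the GraphQL side concrete, the sketch below builds (but does not send) an HTTP request for a GraphQL endpoint using only the Python standard library. Everything specific in it is a placeholder: the URL, the `X-API-KEY` header name, and the `holders` query schema are hypothetical and are not taken from ZettaBlock's documentation.

```python
import json
import urllib.request

API_URL = "https://example.invalid/graphql"  # placeholder, not a real ZettaBlock endpoint


def build_graphql_request(query, variables=None, api_key="YOUR_API_KEY"):
    """Build a POST request carrying a GraphQL query and its variables as JSON."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-API-KEY": api_key,  # hypothetical header name
        },
    )


# Hypothetical schema: a custom API that exposes the result of a SQL query as 'holders'.
QUERY = """
query TopHolders($limit: Int!) {
  holders(limit: $limit) { address balance }
}
"""

request = build_graphql_request(QUERY, {"limit": 10})
payload = json.loads(request.data)
# Sending it would be: urllib.request.urlopen(request), given a real endpoint and key.
```

The point of the design is that a consumer-facing dApp only needs this one small request shape, while the SQL behind the endpoint stays on the platform side.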
3. On-Chain Data + Off-Chain Data
When conducting data analysis, a common issue is the fragmentation of on-chain and off-chain data, making it difficult to integrate them for comprehensive analysis. ZettaBlock has made breakthroughs in this area, supporting the aggregation of off-chain data, partly through community contributions (for example, the protocol_mappings table from the Polygon team) and partly from recognized data sources like CoinGecko (e.g., the prices.usd table).
At the same time, various teams and analysts can also upload data to their own independent data lakes within ZettaBlock, facilitating the combination of on-chain and off-chain data for more comprehensive and complete analyses while maintaining data privacy.
In the future, ZettaBlock will also launch an API marketplace, allowing developers to benefit from it. The basic process will be: developers use the datasets provided by ZettaBlock to perform SQL queries, such as "how many addresses holding BAYC for over a year have actively purchased this NFT," and after completing the query, automatically generate an API through the API generator, placing it in ZettaBlock's API marketplace.
Other users interested in this query can directly use the API and pay for it, with creators automatically receiving commissions, making it possible for individual developers' good ideas and knowledge skills to be monetized.
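The "SQL query wrapped as a reusable call" idea can be sketched locally with SQLite. The table name, columns, and sample rows below are hypothetical and are not ZettaBlock's dataset schema; the sketch assumes one row per NFT currently held, with the date and method of acquisition.

```python
import sqlite3


def count_long_term_buyers(conn, collection, as_of, min_days=365):
    """Count distinct addresses that bought into a collection and have held for over min_days."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT address)
        FROM nft_holdings
        WHERE collection = ?
          AND acquired_via = 'purchase'
          AND julianday(?) - julianday(acquired_at) > ?
        """,
        (collection, as_of, min_days),
    ).fetchone()
    return row[0]


# Hypothetical schema and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nft_holdings "
    "(address TEXT, collection TEXT, acquired_at TEXT, acquired_via TEXT)"
)
conn.executemany(
    "INSERT INTO nft_holdings VALUES (?, ?, ?, ?)",
    [
        ("0xA", "BAYC", "2021-05-01", "purchase"),   # held over a year, bought
        ("0xB", "BAYC", "2021-05-01", "mint"),       # held over a year, but minted
        ("0xC", "BAYC", "2022-10-01", "purchase"),   # bought, but held only two months
        ("0xD", "OTHER", "2021-01-01", "purchase"),  # different collection
    ],
)
result = count_long_term_buyers(conn, "BAYC", as_of="2022-12-01")
```

A marketplace API would expose a function like this behind an endpoint, with the collection and holding period as parameters, so that other users can reuse the query without rewriting the SQL.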
In Summary
ZettaBlock is a full-stack real-time Web3 data infrastructure for indexing, analyzing, and connecting on-chain and off-chain data. Developers can easily build real-time, reliable, data-intensive applications using ZettaBlock, serving as a cornerstone for data products.
08 Summary of Seven Data Products
Discussion
From the above introduction, it is not difficult to see that the development of emerging data products mainly revolves around three functions:
Entity Identification
Capital Flow
API
Entities abstract multiple associated addresses together, significantly enhancing the quality of data analysis, providing much more value than analyzing a single address.
In addition to publicly disclosed addresses such as institutions and exchanges, more hidden addresses rely on data platforms to dig them out. This aspect heavily depends on each platform's capabilities, with the depth of algorithmic digging and the breadth of entity identification being points of competition.
It is evident that there will be some discrepancies in address aggregation and entity identification among different platforms, which may confuse users. Without the ability to compare and verify data, users seem to have no choice but to "blindly" trust one platform.
In fact, it is indeed the case that users can only obtain a vague correctness. Borrowing Kofi's words, it is difficult to say which platform's data is correct.
The visualization of capital flows facilitates investment analysis for investors, especially when combined with entity identification and address aggregation functions, allowing for the analysis of addresses worth tracking and setting alerts.
The API services of data analysis platforms are another major point of competition, with every platform vying to become an API factory. Growing user demand for on-chain data will lead to more data DApps and small tools, and flexible APIs are particularly well suited to building these DApps on top of data analysis platforms. This lets the platforms play the role of infrastructure while also absorbing the traffic that data products bring.
The relationship between emerging data products and existing products is one of complementarity and competition. On one hand, due to the expanding market, existing products may not cover enough ground, and new products fill the gaps in new segmented tracks; on the other hand, if new products and existing products are in the same track, their competition mainly revolves around the following aspects:
Technology. Whether new products possess more advanced technology, such as faster data processing capabilities, more powerful analytical functions, and simpler development interfaces. These advantages can make new products more competitive and attract user adoption;
Market Share. Existing products may already hold a certain market share, having a considerable number of users and customers, forming a stable profit model; how new products can open up the market through unique features, better user experiences, and more aggressive marketing is key to their success;
Brand Effect. Existing products may have already established a good brand reputation, enjoying high recognition and word-of-mouth. These advantages can make it easier for existing products to gain user favor and attention, leading to network effects.
The importance of on-chain data stems from the gradual maturity of blockchain technology and the vigorous development of applications. Flexibly utilizing and combining various data products will provide us with a new perspective in the crypto world.
By analyzing on-chain data, users can obtain more comprehensive and accurate information to assist in research or investment; through on-chain transparency and truthfulness, data can also serve as a guiding light in navigating the dark forest, illuminating the path ahead and protecting oneself.