How to use abstract thinking to interpret on-chain data for business purposes

TL;DR

Over roughly 14 years, the blockchain industry has gradually shifted from speculation toward practical applications. Blockchain data can be analyzed at three levels: on-chain macro, project protocols, and addresses. On-chain macro analysis compares metrics across different chains. Protocol analysis requires a deep understanding of each project's business logic. Address analysis relies on multi-dimensional tagging. Directions worth watching include Bitcoin Layer 2 scaling solutions, Ethereum staking data, and account abstraction and multi-signature addresses. Overall, the blockchain data market has substantial room to grow.

Introduction

If we take the launch of the Bitcoin network as the birth of the industry, blockchain has evolved over 14 years from pure speculation and trading into a technology with practical application scenarios. Especially since Decentralized Finance (DeFi) was recognized and accepted by users, value has returned to the chain, and on-chain data has gradually become a focus of attention for investors and developers.

Front page article title from The Times, January 3, 2009 - Chancellor on the brink of second bailout for banks

The volume of blockchain data is still small compared with the data volumes of today's internet, and the raw data looks relatively simple. In practice, however, analysts and developers spend a significant amount of time parsing it, because inputs are largely free-form and much of the data is hard-to-read bytecode. Based on my work experience, I believe blockchain data is easier to understand when categorized from a business perspective:

On-chain macro
Project protocols
Address analysis

The blockchain network can be divided into three levels from macro to micro, with the network layer consisting of multiple protocols, each of which is composed of activities from multiple addresses. Currently, most blockchain data analysis products targeting consumers focus on specific scenarios within these three levels. Next, I will elaborate on the business logic and application forms corresponding to each level.

On-chain Macro

At the network level, public chains can be further subdivided into:

Bitcoin (UTXO model)
Ethereum and Ethereum Virtual Machine (EVM)
Other non-EVM public chains (e.g., Solana, developed in Rust; the modular Cosmos ecosystem; and the Move-language chains descended from Libra).

Typically, for comparison, we can examine four metrics: number of users, number of transactions, transaction value, and transaction fees, and conduct secondary analysis based on these metrics. Here are a few simple examples:

Assess the activity level of developers on the network based on the number of users deploying contracts and the number of transactions;
Calculate transactions per second (TPS) based on the time intervals of transactions to evaluate the network's performance in processing transactions;
Calculate the ratio of transaction amounts to the number of transactions to obtain the average amount per transaction; an excessive number of low-value transactions can burden the network;
Observe the total transaction fees over a period to assess the network's popularity; unlike the number of transactions, a low point in transaction fees indicates that users have a lower urgency to transact.
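The secondary metrics in the examples above can be sketched in a few lines of Python. This is only a minimal illustration on invented records: the `Tx` fields and the `network_metrics` function are assumptions made for this example, not the schema of any real chain or data platform.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    # Hypothetical, simplified transaction record (not a real chain schema).
    timestamp: int          # Unix seconds
    sender: str
    value: float            # amount transferred, in the chain's native unit
    fee: float              # fee paid, in the native unit
    creates_contract: bool  # True if the transaction deploys a contract


def network_metrics(txs: list[Tx]) -> dict:
    """Derive simple secondary metrics from one window of transactions."""
    if not txs:
        raise ValueError("need at least one transaction")
    span = max(t.timestamp for t in txs) - min(t.timestamp for t in txs)
    deployers = {t.sender for t in txs if t.creates_contract}
    return {
        "active_senders": len({t.sender for t in txs}),
        "contract_deployers": len(deployers),
        # TPS is undefined for a zero-length window, so report None.
        "tps": len(txs) / span if span > 0 else None,
        "avg_value_per_tx": sum(t.value for t in txs) / len(txs),
        "total_fees": sum(t.fee for t in txs),
    }


# Invented sample data, for illustration only.
sample = [
    Tx(1_700_000_000, "0xaaa", 1.0, 0.002, False),
    Tx(1_700_000_004, "0xbbb", 0.0, 0.010, True),
    Tx(1_700_000_006, "0xaaa", 3.0, 0.003, False),
    Tx(1_700_000_010, "0xccc", 0.5, 0.001, False),
]
metrics = network_metrics(sample)
print(metrics)
```

Running the same function over comparable time windows on different chains gives the side-by-side view described above; in practice the window length and the definition of an "active" sender would need to be fixed before comparing.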

Figure: Ethereum stablecoins on Layer 2

Data Source: Dune

For data users, network-level data helps when choosing among public chains: it lets them pick the chain best suited to their own circumstances for development or use, and time their participation well.

Project Protocols

Project protocols span a wide range of categories, including DeFi, gaming, Non-Fungible Tokens (NFT), and Decentralized Identity (DID), with new categories continuously emerging. I will therefore not go deep into any single category here, but rather share a few lessons from analyzing protocol data:

A complete protocol typically consists of multiple business contracts. Understanding them usually means reading the documentation in depth (clear, up-to-date documentation is crucial) and combining it with hands-on use of the product.
Products in the same domain tend to have similar business logic; for example, the core business of all DEXs revolves around trading and liquidity. Understanding leading products makes it relatively easier to analyze other projects in the same field. From the perspective of the project team, they are familiar with their own data but often wish to understand more about competitors and the industry landscape, making vertical domain data very valuable.
Currently, most projects contain a lot of off-chain data, such as team and financing information, social media data, user website operation data, internal order information, etc. Some of this data is public, while some is not, which can limit analysis. However, as the industry develops, more business data will gradually move on-chain, as one of the purposes of using blockchain is to achieve greater transparency.

Figure: Number of ecosystem projects on Layer 1 and Layer 2

Data Source: RootData

A typical example comes from DeFi Summer, when SushiSwap challenged Uniswap and the on-chain trading volumes and transaction counts of the two were briefly similar. A deeper analysis, however, showed that Uniswap had significantly more unique users than SushiSwap, meaning that most of SushiSwap's trading and liquidity came from a small number of users. The Sushi token's issuance mechanism had stimulated capital inflow, but because the economic model was unsustainable, the capital later flowed back to Uniswap. A similar pattern currently shows up in the data for OpenSea and Blur: the former is dominated by retail trades, the latter by professional traders. (Note: this is not a value judgment on the projects; it merely illustrates that differences in user behavior are reflected in the data.)

Figure: NFT volume and NFT trades

Data Source: Dune
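The SushiSwap and Uniswap comparison comes down to looking past headline volume at how that volume is distributed across users. The sketch below shows one possible way to express this; the function name `concentration`, the `dex_a` and `dex_b` datasets, and all numbers are invented for illustration and are not real protocol data.

```python
from collections import defaultdict


def concentration(trades: list[tuple[str, float]], top_n: int = 1) -> dict:
    """Summarise how concentrated trading volume is among traders.

    `trades` is a list of (trader_address, volume) pairs.
    """
    per_trader: dict[str, float] = defaultdict(float)
    for trader, volume in trades:
        per_trader[trader] += volume
    total = sum(per_trader.values())
    # Volume of the top_n largest traders, as a share of total volume.
    top = sorted(per_trader.values(), reverse=True)[:top_n]
    return {
        "trades": len(trades),
        "volume": total,
        "unique_traders": len(per_trader),
        "top_share": sum(top) / total if total else 0.0,
    }


# Invented numbers: both protocols show the same trade count and volume,
# but very different user bases.
dex_a = [("0xa1", 100.0), ("0xa2", 100.0), ("0xa3", 100.0), ("0xa4", 100.0)]
dex_b = [("0xb1", 150.0), ("0xb1", 150.0), ("0xb1", 50.0), ("0xb2", 50.0)]

stats_a = concentration(dex_a)
stats_b = concentration(dex_b)
print(stats_a)
print(stats_b)
```

Two protocols with identical volume and trade counts can therefore look very different once unique traders and the share of the largest trader are placed next to them.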

Address Analysis

From the perspective of popular EVM-based public chains, addresses are currently divided into two types: Externally Owned Accounts (EOA) and Contract Accounts (CA). Regarding the existing business forms of data products for addresses, I believe there are mainly:

Asset dashboards (commonly used to display wallet asset status)
Transaction records (often used to show badges and reward proofs, such as airdrops or DIDs)
Tagging systems (multi-dimensional tags for recommendations or risk control)


Data Source: DeBank

Here, I will mainly discuss tags, which are crucial in consumer-facing data products. For example, the address 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045 means nothing to most users at first glance, but when displayed as vitalik.eth (the founder of Ethereum), it is immediately recognizable. This is just one of many tagging dimensions. I have summarized address tags into several dimensions:

Entity tags (indicating who)
Behavior tags (what has been done)
Status tags (current or past status)
Prediction tags (what might be done in the future)
Other tags (user-defined and hard-to-classify tags)

Figure: address labels

Data Source: OKLink
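The five tag dimensions listed above map naturally onto a small data model. The following is a minimal sketch of one possible representation; `TagKind` and `AddressProfile` are hypothetical names introduced for this example, not the structure used by any particular product.

```python
from dataclasses import dataclass, field
from enum import Enum


class TagKind(Enum):
    ENTITY = "entity"          # who the address is
    BEHAVIOR = "behavior"      # what it has done
    STATUS = "status"          # current or past state
    PREDICTION = "prediction"  # what it might do in the future
    OTHER = "other"            # user-defined or hard to classify


@dataclass
class AddressProfile:
    address: str
    tags: dict[TagKind, set[str]] = field(default_factory=dict)

    def add(self, kind: TagKind, label: str) -> None:
        self.tags.setdefault(kind, set()).add(label)

    def display_name(self) -> str:
        """Prefer an entity tag over the raw hex address."""
        entity = self.tags.get(TagKind.ENTITY)
        return sorted(entity)[0] if entity else self.address


profile = AddressProfile("0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045")
raw_name = profile.display_name()          # no tags yet: falls back to hex
profile.add(TagKind.ENTITY, "vitalik.eth")
print(raw_name, "->", profile.display_name())
```

Keeping the dimensions separate makes it possible to show only entity tags in a wallet interface while reserving behavior, status, and prediction tags for recommendation or risk-control logic.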

Currently, most data products simply display entity tags and then show fund flows through behavior and status tags, without mining the data much further. For example, when a user initiates a transaction, the product could display the counterparty address's age, assets, and number of past counterparties to alert the user to potential risks. Or, based on past trading behavior, it could recommend similar projects, such as showing users who have taken part in several NFT mints which NFTs are currently being minted by the most addresses, saving them search time. Rich data gives products the basis for more powerful algorithmic services.
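As a rough sketch of the risk-alert idea, the function below turns a counterparty's age, assets, and number of counterparties into plain-language warnings. The `Counterparty` fields and the threshold values are arbitrary placeholders chosen for illustration; a real product would have to calibrate them against its own data.

```python
from dataclasses import dataclass


@dataclass
class Counterparty:
    # Hypothetical summary of the address on the other side of a transfer.
    age_days: int        # days since its first on-chain activity
    balance_usd: float   # current asset value
    counterparties: int  # distinct addresses it has interacted with


def risk_hints(cp: Counterparty,
               min_age_days: int = 7,
               min_balance_usd: float = 10.0,
               min_counterparties: int = 3) -> list[str]:
    """Return human-readable warnings; thresholds are placeholders."""
    hints = []
    if cp.age_days < min_age_days:
        hints.append(f"address first seen {cp.age_days} day(s) ago")
    if cp.balance_usd < min_balance_usd:
        hints.append(f"address holds only ${cp.balance_usd:.2f}")
    if cp.counterparties < min_counterparties:
        hints.append(
            f"address has interacted with {cp.counterparties} other address(es)"
        )
    return hints


fresh = Counterparty(age_days=1, balance_usd=0.0, counterparties=0)
established = Counterparty(age_days=900, balance_usd=5_000.0, counterparties=120)
fresh_hints = risk_hints(fresh)
established_hints = risk_hints(established)
print(fresh_hints)
print(established_hints)
```

An empty list means nothing crossed a threshold, not that the counterparty is safe; the hints are meant as context shown to the user before signing.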

Personal Views

Finally, I would like to discuss three directions in business data that I am particularly interested in over the next 1-2 years:

Bitcoin Layer 2 (including data generated by other scaling solutions)
Ethereum Staking (Beacon Chain data)
Account Abstraction (data on account abstraction and multi-signature addresses based on the ERC-4337 proposal)

Bitcoin Layer 2

Opinions within the Bitcoin community vary on solutions like Ordinals, which assign numbers to the "sat", the smallest unit of the Bitcoin network, but their popularity has brought new room for imagination and additional miner income (transaction fees) to the Bitcoin ecosystem. In terms of block space and transaction volume, Ordinals at one point pushed transaction fees above the block reward, yet the Bitcoin network clearly cannot accommodate many more users trading assets. Even though Bitcoin's peer-to-peer payment narrative has given way to the digital gold consensus, the network's hash rate will face significant challenges as block rewards halve. Reduced income and increased competition will inevitably push out some hash power. When block rewards become negligible, transaction fees will be miners' primary source of income. If network transaction volume and fees do not rise steadily, miner income becomes unstable, which can affect the diversity and robustness of the network. In this context, credible scaling becomes particularly important, and the Lightning Network is the solution that has gained the broadest consensus in the community so far.
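The relationship between halving and fee dependence can be made concrete. The sketch below relies on Bitcoin's well-known issuance schedule (a 50 BTC initial subsidy that halves every 210,000 blocks); the function names and the fee amounts passed in are hypothetical and chosen only to illustrate the calculation.

```python
HALVING_INTERVAL = 210_000               # blocks between subsidy halvings
INITIAL_SUBSIDY_SATS = 50 * 100_000_000  # 50 BTC expressed in satoshis


def block_subsidy_sats(height: int) -> int:
    """Newly issued satoshis for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_SUBSIDY_SATS >> halvings


def fee_share(height: int, fees_sats: int) -> float:
    """Fraction of total miner revenue in a block that comes from fees."""
    total = block_subsidy_sats(height) + fees_sats
    return fees_sats / total if total else 0.0


# The same hypothetical fee total weighs more after each halving.
example_fees = 100_000_000  # 1 BTC of fees, an invented figure
for height in (630_000, 840_000, 1_050_000):
    print(height, block_subsidy_sats(height), round(fee_share(height, example_fees), 3))
```

Tracking `fee_share` per block over time is one simple way to see how dependent miner income has become on transaction demand.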

Ethereum Staking

As the foundational store of value for the entire Ethereum ecosystem, the Beacon Chain is one of the data domains carrying the most funds. However, because the consensus layer and the execution layer are structured differently, existing data platforms have not yet effectively presented how funds flow between the two. Ethereum's staking rate is currently around 20%, which is relatively low for a PoS network, and since the Shanghai upgrade enabled staking withdrawals, net staking inflows have been gradually increasing. I therefore believe this market segment can absorb idle funds over the long term and has enormous growth potential.

Figure: Ethereum staking

Data Source: beaconcha.in

Account Abstraction

In current data analysis, most project protocols treat only EOA addresses as user accounts. However, as asset security and ease of use become more important, account abstraction has been proposed to make accounts programmable. From a business perspective, treating a CA as a user account changes the analysis logic. A CA cannot initiate transactions on its own in the EVM, so an EOA must call the CA, which can then call other CAs. That EOA may be an unrelated address and need not be one of the CA's multi-signature signers, so the transaction sender no longer identifies the user. ERC-4337 is still in draft form, and most developers have only heard of it in articles and at conferences and have not yet started using it. In the on-chain data business, this is still a relatively early vertical.
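The change in analysis logic can be illustrated with a small attribution sketch. It assumes the analyst already has a set of addresses known to be user-controlled contract accounts (for example multi-signature or ERC-4337-style wallets); how that set is built is out of scope here. `CallRecord` and `effective_user` are hypothetical names, and the flattened record is a simplification of real transaction traces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallRecord:
    # Hypothetical, flattened view of one transaction.
    tx_sender: str               # EOA that signed and paid for the transaction
    target: str                  # protocol contract that was ultimately called
    via_contract: Optional[str]  # contract account that relayed the call, if any


def effective_user(call: CallRecord, user_contracts: set[str]) -> str:
    """Attribute the call to the contract account when one is known to act
    as a user wallet; otherwise fall back to the signing EOA."""
    if call.via_contract is not None and call.via_contract in user_contracts:
        return call.via_contract
    return call.tx_sender


calls = [
    CallRecord("0xalice", "0xdex", None),
    CallRecord("0xrelayer1", "0xdex", "0xwallet"),
    CallRecord("0xrelayer2", "0xdex", "0xwallet"),
]
known_wallets = {"0xwallet"}

naive_users = {c.tx_sender for c in calls}
attributed_users = {effective_user(c, known_wallets) for c in calls}
print(len(naive_users), len(attributed_users))
```

Counting by transaction sender reports three users here, while attributing relayed calls to the contract account reports two, which is the kind of discrepancy that user metrics would need to account for as contract accounts become more common.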

Data Source: Dune

Finally, I would like to offer a somewhat imprecise analogy: if an industry's data market ultimately accounts for 8% of the industry's total scale, then the current $1 trillion crypto market cap (the industry grew tenfold, from a low of $200 billion to $2 trillion, between early 2020 and the end of 2021) could support a data market of about $80 billion. There is still significant room for user and capital growth. So far the data sector has only decentralized data storage, while data computation, validation, processing, and many other stages still call for more creativity.


