Why is data availability so important for Layer 2?

Uncle Jian
2024-01-10 18:18:09
【Abstract】What is data availability? What data availability issues does L2 face? Why is there so much controversy surrounding L2 data availability layers? This article focuses on these questions in an attempt to demystify data availability.

Dankrad Feist, a researcher at the Ethereum Foundation, once stated in a tweet that a chain that does not use Ethereum for data availability does not qualify as an L2. By that reasoning, many chains would be excluded from the L2 category, such as Arbitrum Nova, Polygon, and Mantle.

So, what exactly is data availability? What data availability issues does L2 face? Why is there so much controversy surrounding data availability layers in L2? This article will focus on these questions and attempt to unveil the mystery of data availability.

What is Data Availability

In simple terms, data availability concerns the stage at which a block producer publishes all of a block's transaction data to the network so that validators can download it.

If a block producer publishes complete data and allows validators to download it, we say the data is available; if it conceals some data, preventing validators from downloading the complete data, we say the data is unavailable.
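The distinction can be illustrated with a minimal Python sketch. It assumes a toy model in which a block is split into a known number of chunks and the producer serves some subset of them; the names `PublishedBlock`, `try_download`, and `is_available` are hypothetical and not taken from any real client.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PublishedBlock:
    # Number of chunks the producer committed to (assumed known to validators).
    expected_chunks: int
    # Chunks the producer actually serves, keyed by chunk index.
    served: Dict[int, bytes]


def try_download(block: PublishedBlock) -> Optional[List[bytes]]:
    """Return all chunks in order, or None if any chunk is withheld."""
    chunks = []
    for index in range(block.expected_chunks):
        if index not in block.served:
            return None  # at least one chunk is missing: data is unavailable
        chunks.append(block.served[index])
    return chunks


def is_available(block: PublishedBlock) -> bool:
    """A block counts as available only if every chunk can be downloaded."""
    return try_download(block) is not None


honest = PublishedBlock(3, {0: b"tx-a", 1: b"tx-b", 2: b"tx-c"})
withholding = PublishedBlock(3, {0: b"tx-a", 2: b"tx-c"})  # chunk 1 is concealed
```

In this toy model `is_available(honest)` is true and `is_available(withholding)` is false, even though the withholding producer published most of the block.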

The Difference Between Data Availability and Data Retrievability

Data availability is easily confused with data retrievability, but the two are quite different.

  • Data availability concerns the stage when a block has been produced but not yet added to the blockchain through consensus. It is therefore not about historical data, but about whether newly published data can be confirmed through consensus.
  • Data retrievability concerns the stage after data has been confirmed through consensus and permanently stored on the blockchain. It is the ability to retrieve historical data. In Ethereum, nodes that store all historical data are called archive nodes.

This is why the co-founder of L2BEAT once stated in a lengthy tweet that full nodes are not obligated to provide us with historical data; the only reason we can access it is that full nodes are kind enough to serve it.

He also mentioned that the term "Data Availability" can lead to misunderstandings about its role and should be replaced with "Data Publishing," a sentiment that has been echoed by the founder of Celestia.

Data Availability Issues in L2

Although the concept of data availability originated from Ethereum, our current focus is on data availability at the L2 level.

In L2, sequencers are the block producers who need to publish sufficient transaction data for validators to verify the validity of transactions. (For more information about sequencers, please read the previous article in the Insight Weekly titled “Research Report | Principles, Current Status, and Future of Sequencers”)

However, this process faces two issues: ensuring the security of the validation mechanism and reducing the cost of publishing data. Each is discussed below.

Ensuring the Security of the Validation Mechanism

Optimistic Rollups (OP Rollups) use fraud proofs to verify the validity of transactions, while ZK Rollups use validity proofs.

  • For OP Rollup: If the sequencer does not publish complete retrievable block data, challengers in the fraud proof will be unable to initiate valid challenges.
  • For ZK Rollup: Although validity proofs themselves do not require data availability, ZK Rollup as a whole still needs it. Without retrievable block data, users cannot know their balances and may lose their assets.
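The fraud-proof case can be sketched in Python. This is a toy model under loud assumptions: the "state root" is just a SHA-256 hash of sorted balances rather than a Merkle root, transactions are simple transfers, and the names `state_root`, `apply_batch`, and `check_claim` are hypothetical. It only shows why a challenger needs the batch data in order to dispute anything.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Transfer = Tuple[str, str, int]  # (sender, recipient, amount)


def state_root(balances: Dict[str, int]) -> str:
    """Toy stand-in for a state root: a hash over the sorted balances."""
    encoded = repr(sorted(balances.items())).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_batch(balances: Dict[str, int], batch: List[Transfer]) -> Dict[str, int]:
    """Re-execute a batch of transfers, skipping any the sender cannot afford."""
    new = dict(balances)
    for sender, recipient, amount in batch:
        if new.get(sender, 0) < amount:
            continue
        new[sender] = new.get(sender, 0) - amount
        new[recipient] = new.get(recipient, 0) + amount
    return new


def check_claim(
    pre_state: Dict[str, int],
    batch: Optional[List[Transfer]],
    claimed_root: str,
) -> Optional[bool]:
    """True if fraud is detected, False if the claim holds,
    None if the batch was withheld and nothing can be proven."""
    if batch is None:
        return None
    return state_root(apply_batch(pre_state, batch)) != claimed_root


pre = {"alice": 10, "bob": 0}
batch = [("alice", "bob", 4)]
honest_root = state_root(apply_batch(pre, batch))
forged_root = state_root({"alice": 0, "bob": 10})
```

With the batch in hand, `check_claim(pre, batch, forged_root)` detects the fraud. If the sequencer withholds the batch, `check_claim(pre, None, forged_root)` returns `None`: the forged root may be wrong, but no challenger can demonstrate it.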

To ensure secure validation, current L2 sequencers generally publish both the state data and transaction data of L2 on Ethereum, relying on Ethereum for settlement and data availability.

Therefore, the data availability layer essentially serves as the place where L2 publishes transaction data, and currently, mainstream L2s treat Ethereum as their data availability layer.

Reducing the Cost of Publishing Data

Today's L2s simply rely on Ethereum for both data availability and settlement. While this provides sufficient security, it also incurs significant costs. This is the second issue L2 faces: how to reduce the cost of publishing data.

The total gas fee a user pays on L2 consists mainly of two parts: the gas for executing the transaction on L2 and the gas for submitting data to L1. The former is negligible, while the latter makes up the bulk of the fee. Of the data that L2 submits to L1, most is the transaction data published to ensure data availability; the proof data that verifies transaction validity accounts for only a small portion.
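A rough worked example makes the split concrete. The sketch below uses Ethereum's calldata pricing of 16 gas per non-zero byte and 4 gas per zero byte; every other number (transaction size, gas prices, L2 gas used) is an illustrative assumption, and real rollups also compress and batch data and apply their own fee formulas, which this ignores.

```python
def calldata_gas(data: bytes) -> int:
    """L1 gas charged for calldata: 4 per zero byte, 16 per non-zero byte."""
    zero_bytes = data.count(0)
    return 4 * zero_bytes + 16 * (len(data) - zero_bytes)


# Hypothetical transaction: 100 non-zero bytes and 20 zero bytes.
tx_data = bytes([1]) * 100 + bytes(20)

# Hypothetical prices, expressed in wei per unit of gas.
L1_GAS_PRICE = 30 * 10**9   # 30 gwei on L1 (assumed)
L2_GAS_PRICE = 10**8        # 0.1 gwei on L2 (assumed)
L2_GAS_USED = 21_000        # execution gas on L2 (assumed)

l1_fee = calldata_gas(tx_data) * L1_GAS_PRICE  # cost of publishing data to L1
l2_fee = L2_GAS_USED * L2_GAS_PRICE            # cost of executing on L2
l1_share = l1_fee / (l1_fee + l2_fee)

print(f"L1 data gas: {calldata_gas(tx_data)}")
print(f"L1 share of total fee: {l1_share:.0%}")
```

Under these assumed numbers the transaction costs 1,680 gas of L1 calldata, and the L1 data fee is 96% of the total, which is the pattern the paragraph above describes.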

Thus, to make L2 overall cheaper, it is essential to lower the cost of publishing data. So, how can costs be reduced? There are mainly two methods:

  • Lower the cost of publishing data on L1, such as the upcoming EIP-4844 upgrade on Ethereum. For those interested in EIP-4844, you can read the previous article in Insight Weekly titled “Web3 Science Popularization | Easily Understand the Major Benefits of Layer 2: EIP-4844”;
  • Move data availability off L1. Just as Rollups separate transaction execution from L1, data availability can also be separated from L1 to reduce costs, meaning Ethereum is not used as the data availability layer.
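The second method can be sketched as follows. This is a simplified illustration, not any project's actual protocol: it assumes the L2 posts only a 32-byte commitment to L1 while the batch itself lives with an external data availability provider, and it uses SHA-256 as a stand-in for whatever commitment scheme a real system would use. The function names are hypothetical.

```python
import hashlib


def commitment(batch: bytes) -> bytes:
    """Toy commitment to a batch: its SHA-256 digest (32 bytes)."""
    return hashlib.sha256(batch).digest()


def l1_payload(batch: bytes, use_external_da: bool) -> bytes:
    """What the L2 publishes on L1 under each approach."""
    if use_external_da:
        return commitment(batch)  # only the commitment goes on-chain
    return batch                  # the full transaction data goes on-chain


def matches_commitment(data: bytes, posted: bytes) -> bool:
    """Check data fetched from the external provider against the L1 commitment."""
    return commitment(data) == posted


batch = b"\x01" * 1000  # hypothetical 1,000-byte batch of transactions
on_chain = l1_payload(batch, use_external_da=False)
off_chain = l1_payload(batch, use_external_da=True)
print(len(on_chain), len(off_chain))
```

The L1 footprint shrinks from the full batch to a fixed-size commitment, which is where the savings come from. The commitment lets anyone verify data they do receive, but by itself it cannot force the external provider to hand the data over.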

Controversies Surrounding L2's Data Availability Layer

To discuss the controversies surrounding L2's data availability layer, we must start with modular blockchains. A modular blockchain decouples the core functions of the overall blockchain into relatively independent parts and expands the performance of a single blockchain through various combinations of specialized networks.

Although there is some debate about the layering of modular blockchains, it is generally accepted to divide modular blockchains into four layers: Execution, Settlement, Consensus, and Data Availability. The functions of each module are illustrated in the following image.

Modular blockchains are akin to LEGO blocks: builders can pick the best block for each layer and combine them into a customized chain, alleviating the blockchain trilemma (the "impossible triangle").

At present, however, only the execution layer has been separated from Ethereum; the functions of the other three layers still take place on Ethereum. Due to cost considerations, many L2s are now also preparing to separate the data availability layer from Ethereum, using Ethereum only as the settlement and consensus layer.

Interestingly, Ethereum seems reluctant to allow L2s to obtain data availability from elsewhere. Ethereum Foundation researcher Dankrad Feist once stated in a tweet that a chain that does not use Ethereum as its data availability layer does not qualify as a Rollup, and therefore not as an L2.

Additionally, the latest definition of L2 by L2BEAT also points out that scaling solutions that do not publish data on L1 are not L2, as using off-chain data availability solutions cannot guarantee that operators will provide the published data.
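The strict definition described above can be expressed as a small check over the four modular layers. The sketch is only an illustration: `ModularStack`, the two example stacks, and the provider name "ExternalDA" are hypothetical and do not describe any real chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModularStack:
    # Which network provides each of the four layers.
    execution: str
    settlement: str
    consensus: str
    data_availability: str


def is_l2_by_strict_definition(stack: ModularStack, l1: str = "Ethereum") -> bool:
    """Under the strict view, a chain is an L2 only if it publishes its data on L1."""
    return stack.data_availability == l1


# Hypothetical stack that keeps data availability on Ethereum.
example_rollup = ModularStack("ExampleRollup", "Ethereum", "Ethereum", "Ethereum")
# Hypothetical stack that moves data availability to an external provider.
example_offchain_da = ModularStack("ExampleChain", "Ethereum", "Ethereum", "ExternalDA")

for stack in (example_rollup, example_offchain_da):
    print(stack.execution, is_l2_by_strict_definition(stack))
```

Both example stacks settle on Ethereum, yet only the first passes the check, which is exactly why chains with off-chain data availability would fall outside the L2 category under this definition.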

Of course, there is still no definitive conclusion about what constitutes L2. The insistence of members from the Ethereum Foundation and L2BEAT that L2 should keep the data availability layer on Ethereum seems to stem from security considerations, but is there an underlying fear of undermining Ethereum's status?

Ethereum's vision is to become a supercomputer platform, but later, to enhance network performance, it had to develop Rollups, leading many ecosystems to migrate to cheaper L2s. However, since security is provided by Ethereum, its status has not been significantly affected. But if L2 also separates the data availability layer related to data publishing from Ethereum, it essentially weakens the reliance on Ethereum's security, gradually distancing itself from Ethereum, which poses a threat to Ethereum's status.

Nonetheless, this does not hinder the vigorous development of projects related to data availability layers. In the next article about data availability, the author will provide a detailed introduction to the major data availability solutions currently available in the market and their specific related projects. Stay tuned.



