EIP-7594 - PeerDAS - Peer Data Availability Sampling

Created 2024-01-12
Status Review
Category Networking
Type Standards Track
Authors
Requires

Abstract

PeerDAS (Peer Data Availability Sampling) is a networking protocol that allows beacon nodes to perform data availability sampling (DAS) to ensure that blob data has been made available while downloading only a subset of the data. PeerDAS utilizes gossip for distribution, discovery for finding peers of particular data custody, and peer requests for sampling.

Motivation

DAS is a method of scaling data availability beyond the levels of EIP-4844 by not requiring all nodes to download all data while still ensuring that all of the data has been made available.

Providing additional data availability helps bring scale to Ethereum users in the context of layer 2 systems called "roll-ups" whose dominant bottleneck is layer 1 data availability.

Specification

We extend the blobs introduced in EIP-4844 using a one-dimensional erasure coding extension. Each row consists of the blob data combined with its erasure code. It is subdivided into cells, which are the smallest units that can be authenticated with their respective blob's KZG commitments. Each column, associated with a specific gossip subnet, consists of the cells from all rows for a specific index. Each node is responsible for maintaining and custodying a deterministic set of column subnets and data as a function of their node ID.

Nodes find and maintain a diverse peer set and sample columns from their peers to perform DAS every slot.

A node can reconstruct the entire data matrix if it acquires at least 50% of all the columns. If a node has less than 50%, it can request the necessary columns from the peer nodes.

The detailed specifications are on ethereum/consensus-specs.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 and RFC 8174.

Networking

This EIP introduces cell KZG proofs, which are used to prove that a KZG commitment opens to a cell at the given index. This allows downloading only specific cells from a blob, while still ensuring data integrity with respect to the corresponding KZG commitment, and is therefore a key component of data availability sampling. However, computing the cell proofs for a blob is an expensive operation, which a block producer would have to repeat for many blobs. Since proof verification is much cheaper than proof computation, and the proof size is negligible compared to cell size, we instead require blob transaction senders to compute the proofs themselves and include them in the EIP-4844 transaction pool wrapper for blob transactions.

To this end, during transaction gossip responses (PooledTransactions), the wrapper is modified to:

rlp([tx_payload_body, wrapper_version, blobs, commitments, cell_proofs])

cell_proofs = [cell_proof_0, cell_proof_1, ...]

The tx_payload_body, blobs and commitments are as in EIP-4844, while the proofs field is replaced by cell_proofs, and a wrapper_version is added. These are defined as follows:

Note that, while cell_proofs contain the proofs for all cells, including the extension cells, the blobs themselves are sent without being extended (CELLS_PER_EXT_BLOB / 2 cells per blob). This is to avoid sending redundant data, which can quickly be computed by the receiving node. In other words, cell_proofs[i * CELLS_PER_EXT_BLOB + j] is the proof for cell j of compute_cells(blobs[i]), where compute_cells(blob) outputs all cells of blob, including the extension cells.

The node MUST validate tx_payload_body and verify the wrapped data against it. To do so, ensure that:

Rationale

TBD

Backwards Compatibility

Test Cases

Reference Implementation

Security Considerations

Needs discussion.

Copyright

Copyright and related rights waived via CC0.