EIP-8136 - Cell-Level Deltas for Data Column Broadcast

Created 2025-01-23
Status Draft
Category Networking
Type Standards Track
Authors
  • Daniel Knopik (@dknopik) <daniel at dknopik.de>
  • Marco Munizaga (@MarcoPolo) <git at marcopolo.io>
  • Sukun Tarachandani (@sukunrt) <sukunrt at gmail.com>

Requires

Abstract

Cell-Level Deltas for Data Column Broadcast optimizes PeerDAS (EIP-7594) by allowing more efficient transfers of blob data columns across the network. Instead of having to exchange full data columns, peers exchange only the cells they need within a column. This becomes especially useful when the majority of cells within a column are already present from the local mempool. This optimization is backwards compatible and can be progressively deployed. It does not require a hard fork. Nodes that do not implement this optimization still receive and transmit full data columns.

Motivation

In the vast majority of cases, all or nearly all blobs referenced in a block are available in the mempool. By leveraging the blob data it already holds locally, a node avoids spending bandwidth on cells it already has. In the current design, if even a single blob is not present in the local mempool, the Consensus Layer has to wait to receive the full data column from the network before it can pass its data availability checks. With this optimization, the client need only wait to receive the missing cells.

Specification

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 and RFC 8174.

Cell-Level Deltas uses Gossipsub's Partial Messages Extension to exchange cell bitmaps and request/provide cells.
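
As an illustration of how cell bitmaps let peers work out what to transfer, here is a minimal sketch. It assumes a bitmap is modelled as a Python integer in which bit i is set when the node holds cell i of the column. The function names are hypothetical and this is not the wire encoding, which is defined by the linked specifications.

```python
def cells_to_request(have: int, peer_has: int) -> int:
    """Bitmap of cells the peer holds that we are still missing."""
    return peer_has & ~have


def cells_to_provide(have: int, peer_has: int) -> int:
    """Bitmap of cells we hold that the peer is still missing."""
    return have & ~peer_has


def bit_indices(bitmap: int) -> list[int]:
    """Expand a bitmap into the sorted list of set bit positions."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical column for a block with 6 blobs; blob 3 never reached our mempool.
ours = 0b110111
peer = 0b111111
print(bit_indices(cells_to_request(ours, peer)))  # [3]
print(bit_indices(cells_to_provide(ours, peer)))  # []
```

In this example only one cell crosses the wire instead of the full six-cell column.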

Editor's Note: Update the libp2p spec link to a proper commit once the PR is merged.

In an effort to maintain a single source of truth, the specification is defined in the ethereum/consensus-specs repo.

Editor's Note: Update the Consensus Spec link to a proper commit once the PR is merged.

Rationale

The design is guided by two main factors:

  1. Minimal semantic changes.
  2. Backwards compatibility (no hard fork required).

These two factors lead to a design that extends the existing gossipsub behavior rather than introducing a new protocol. The mesh and gossip properties of gossipsub are unchanged. If a node and its gossipsub peer both support this extension, they can make use of it. Otherwise, the behavior falls back to traditional gossipsub.

The biggest difference the extension introduces is that it defaults to a pull-based dissemination strategy for cells rather than a push-based one. Eager pushing of data is still possible, but it is not the default behavior.

Alternative Designs

  1. Smaller gossipsub messages.

This design would make the unit of a gossipsub message a single cell rather than a full column. While it would transfer data at the cell level, without bitmaps as a first-class concept it would incur per-message overhead for every cell. Furthermore, informing peers about cells a node has (via IDONTWANT) would always race against the peers pushing the same cells to the node. The default-push strategy is too aggressive for the case when almost all blobs are public. Finally, the message ID is content-based (e.g. a hash) and lacks context as to what block the cell is part of, making it impossible for a node to request cells it doesn't have or to prioritize cells from a block it is processing.

Changing the semantics of the message ID may be possible as a workaround for some of the above issues, but this would still require a separate gossipsub topic that could not be used as a backwards compatible replacement for the existing gossipsub topic. Changing message ID semantics may also break gossipsub implementations that assume a 1:1 mapping between message IDs and messages.

  2. A new RPC method.

This seems like the simplest option at first glance, but it ignores the work already happening in gossipsub. If we merely add a new RPC method with no gossipsub changes, we risk increasing network and compute load rather than reducing it, because we would be doing an extra RPC on top of the existing gossipsub work. To fix this, gossipsub and the RPC method need to be cognizant of each other. Eventually this leads to the proposed design: a gossipsub extension of partial messages with application-defined semantics.

  3. A new protocol separate from gossipsub.

While this may be a long-term direction worth exploring, it is deemed too big a change to make just to leverage this optimization, especially if we'd like to deploy it without a hard fork.

Backwards Compatibility

No backward compatibility issues.

Nodes form gossipsub meshes and gossip with the same rules as before. This optimization only takes effect when both nodes in a gossipsub exchange support this extension. If either side does not support this extension, both nodes behave the same as before, exchanging full gossipsub messages with each other.
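
The fallback rule can be sketched as a per-peer decision. This is only an illustration under assumptions: the `Peer` type, the `supports_partial` flag, and `exchange_mode` are hypothetical names, and how support is actually signalled between peers is left to the gossipsub extension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    peer_id: str
    supports_partial: bool  # hypothetical flag: peer advertised the extension


def exchange_mode(local_supports_partial: bool, peer: Peer) -> str:
    """Use cell-level deltas only when both sides support the extension."""
    if local_supports_partial and peer.supports_partial:
        return "partial"
    # Either side lacks support: behave exactly as before.
    return "full"


peers = [Peer("peer-a", True), Peer("peer-b", False)]
for p in peers:
    print(p.peer_id, exchange_mode(True, p))
```

The decision is made per peer, so a single node can serve cell-level deltas to some mesh peers and full columns to others on the same topic.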

Gossipsub scores peers well if they provide timely messages, and penalizes peers if they provide invalid messages. This extension does not change that; peer scores should behave the same with and without this extension.

Security Considerations

In the default case, this adds minor latency to cell dissemination compared to standard gossipsub's default push behavior, in exchange for more efficient bandwidth usage. However, over time the eager cell push policy of nodes can be refined to match and even improve dissemination latency. This latency is a key metric that will be monitored during and after rollout.

There are also a couple of implementation-specific pitfalls that client implementers and gossipsub implementers should be aware of.

  1. While it is more efficient for a node to mesh with peers that support this EIP, preferring those peers risks dividing the network. Implementations SHOULD NOT discriminate against peers that do not support this extension.
  2. Related to the above, a peer's score should be roughly equivalent whether they support partial messages or not. A peer should not score higher if they provide cells one at a time versus all at once.
  3. Implementations should be resilient to peers spamming messages with different Group IDs. Implementations SHOULD reserve space for mesh peer messages in case of adversarial non-mesh peers.
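
One way to approach the third point is to bound the number of tracked groups and keep part of that budget out of reach of non-mesh peers. The sketch below is an assumption-laden illustration, not a prescribed mechanism: `GroupTable`, its parameters, and the admission policy are all hypothetical.

```python
class GroupTable:
    """Bounded table of tracked Group IDs with slots reserved for mesh peers."""

    def __init__(self, capacity: int, reserved_for_mesh: int) -> None:
        if not 0 <= reserved_for_mesh <= capacity:
            raise ValueError("reserved_for_mesh must be between 0 and capacity")
        self.capacity = capacity
        # Groups known only from non-mesh peers may never exceed this limit.
        self.non_mesh_limit = capacity - reserved_for_mesh
        # group_id -> True if at least one mesh peer has referenced the group
        self.groups: dict[bytes, bool] = {}

    def admit(self, group_id: bytes, from_mesh_peer: bool) -> bool:
        """Return True if state for this group is (or already was) tracked."""
        if group_id in self.groups:
            if from_mesh_peer:
                self.groups[group_id] = True
            return True
        if len(self.groups) >= self.capacity:
            return False
        if not from_mesh_peer:
            non_mesh = sum(1 for is_mesh in self.groups.values() if not is_mesh)
            if non_mesh >= self.non_mesh_limit:
                return False
        self.groups[group_id] = from_mesh_peer
        return True


table = GroupTable(capacity=4, reserved_for_mesh=2)
# A non-mesh peer spams ten distinct Group IDs; only two are admitted.
spam = [table.admit(bytes([i]), from_mesh_peer=False) for i in range(10)]
print(spam.count(True))                                # 2
print(table.admit(b"block-root", from_mesh_peer=True))  # True
```

A real implementation would also need an eviction policy, for example dropping groups for blocks that are no longer relevant, which this sketch omits.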

This optimization is expected to roll out gradually with the ability to roll back if needed.

Copyright

Copyright and related rights waived via CC0.