This EIP defines a standard format for transmitting the deposit contract Merkle tree in a compressed form during weak subjectivity sync. This allows newly syncing consensus clients to reconstruct the deposit tree much faster than downloading all historical deposits. The format proposed also allows clients to prune deposits that are no longer needed to participate fully in consensus (see Deposit Finalization Flow).
To reconstruct the deposit Merkle tree, most client implementations require beacon nodes to download and store every deposit log since the launch of the deposit contract. However, this approach requires beacon nodes to store far more deposits than necessary to participate in consensus. Additionally, this leads to increased sync times for new nodes, which is particularly evident during weak subjectivity sync. This simplistic approach also prevents historical contract logs from being pruned from full nodes, a prospect frequently discussed in the context of limiting state growth.
Consensus clients MAY continue to implement the deposit Merkle tree however they choose. However, when transmitting the tree to newly syncing nodes, clients MUST use the following format:
class DepositTreeSnapshot:
finalized: List[Hash32, DEPOSIT_CONTRACT_DEPTH]
deposit_root: Hash32
deposit_count: uint64
execution_block_hash: Hash32
execution_block_height: uint64
Where finalized
is a variable-length list (of maximum size DEPOSIT_CONTRACT_DEPTH
) containing the hashes defined in the Deposit Finalization Flow section below. The fields deposit_root
, deposit_count
, and execution_block_hash
store the same information as the Eth1Data
object that corresponds to the snapshot, and execution_block_height
is the height of the execution block with hash execution_block_hash
. Consensus clients MUST make this structure available via the Beacon Node API endpoint:
/eth/v1/beacon/deposit_snapshot
During deposit processing, the beacon chain requires deposits to be submitted along with a Merkle path to the deposit root. This is required exactly once for each deposit. When a deposit has been processed by the beacon chain and the deposit finalization conditions have been met, many of the hashes along the path to the deposit root will never be required again to construct Merkle proofs on chain. These unnecessary hashes MAY be pruned to save space. The image below illustrates the evolution of the deposit Merkle tree under this process alongside the corresponding DepositTreeSnapshot
as new deposits are added and older deposits become finalized:
The format in this specification was chosen to achieve several goals simultaneously:
The proposed DepositTreeSnapshot
structure includes both execution_block_hash
and execution_block_height
for convenience to consensus node implementors. While only one of these fields is strictly necessary, different clients may have already designed their block cache logic around one or the other. Sending only one of these would force some consensus clients to query the execution engine for the other information, but as this is happening in the context of a newly syncing consensus node, it is very likely that the execution engine will not be synced, especially post-merge. The deposit_root
field is also not strictly necessary, but by including it, newly syncing consensus nodes can cheaply validate any received snapshot against itself (see the calculate_root()
method in the Reference Implementation).
The deposit contract can only provide the tree at the head of the chain. Because the beacon chain's view of the deposit contract lags behind the execution chain by ETH1_FOLLOW_DISTANCE
, there are almost always deposits which haven't yet been included in the chain that need proofs constructed from an earlier version of the tree than exists at the head.
In principle, a node could scan backwards through the chain starting from the weak subjectivity checkpoint to locate a suitable Deposit
, and then extract the rightmost branch of the tree from that. The node would also need to extract the execution_block_hash
from which to start syncing new deposits from the Eth1Data
in the corresponding BeaconState
. This approach is less desirable for a few reasons:
This proposal is fully backwards compatible.
Test cases are included in test_cases.yaml. Each case is structured as follows:
class DepositTestCase:
deposit_data: DepositData # These are all the inputs to the deposit contract's deposit() function
deposit_data_root: Hash32 # The tree hash root of this deposit (calculated for convenience)
eth1_data: Eth1Data # An Eth1Data object that can be used to finalize the tree after pushing this deposit
block_height: uint64 # The height of the execution block with this Eth1Data
snapshot: DepositTreeSnapshot # The resulting DepositTreeSnapshot object if the tree were finalized after this deposit
This EIP also includes other files for testing:
If these files are downloaded to the same directory, the test cases can be run by executing pytest
in that directory.
This implementation lacks full error checking and is optimized for readability over efficiency. If tree
is a DepositTree
, then the DepositTreeSnapshot
can be obtained by calling tree.get_snapshot()
and a new instance of the tree can be recovered from the snapshot by calling DepositTree.from_snapshot()
. See the Deposit Finalization Conditions section for discussion on when the tree can be pruned by calling tree.finalize()
.
Generating proofs for deposits against an earlier version of the tree is relatively fast in this implementation; just create a copy of the finalized tree with copy = DepositTree.from_snapshot(tree.get_snapshot())
and then append the remaining deposits to the desired count with copy.push_leaf(deposit)
. Proofs can then be obtained with copy.get_proof(index)
.
from __future__ import annotations
from typing import List, Optional, Tuple
from dataclasses import dataclass
from abc import ABC,abstractmethod
from eip_4881 import DEPOSIT_CONTRACT_DEPTH,Hash32,sha256,to_le_bytes,zerohashes
@dataclass
class DepositTreeSnapshot:
finalized: List[Hash32, DEPOSIT_CONTRACT_DEPTH]
deposit_root: Hash32
deposit_count: uint64
execution_block_hash: Hash32
execution_block_height: uint64
def calculate_root(self) -> Hash32:
size = self.deposit_count
index = len(self.finalized)
root = zerohashes[0]
for level in range(0, DEPOSIT_CONTRACT_DEPTH):
if (size & 1) == 1:
index -= 1
root = sha256(self.finalized[index] + root)
else:
root = sha256(root + zerohashes[level])
size >>= 1
return sha256(root + to_le_bytes(self.deposit_count))
def from_tree_parts(finalized: List[Hash32],
deposit_count: uint64,
execution_block: Tuple[Hash32, uint64]) -> DepositTreeSnapshot:
snapshot = DepositTreeSnapshot(
finalized, zerohashes[0], deposit_count, execution_block[0], execution_block[1])
# A real implementation should store the deposit_root from the eth1_data passed to
# DepositTree.finalize() instead of relying on calculate_root() here. This allows
# the snapshot to be validated using calculate_root().
snapshot.deposit_root = snapshot.calculate_root()
return snapshot
@dataclass
class DepositTree:
tree: MerkleTree
mix_in_length: uint
finalized_execution_block: Optional[Tuple[Hash32, uint64]]
def new() -> DepositTree:
merkle = MerkleTree.create([], DEPOSIT_CONTRACT_DEPTH)
return DepositTree(merkle, 0, None)
def get_snapshot(self) -> DepositTreeSnapshot:
assert(self.finalized_execution_block is not None)
finalized = []
deposit_count = self.tree.get_finalized(finalized)
return DepositTreeSnapshot.from_tree_parts(
finalized, deposit_count, self.finalized_execution_block)
def from_snapshot(snapshot: DepositTreeSnapshot) -> DepositTree:
# decent validation check on the snapshot
assert(snapshot.deposit_root == snapshot.calculate_root())
finalized_execution_block = (snapshot.execution_block_hash, snapshot.execution_block_height)
tree = MerkleTree.from_snapshot_parts(
snapshot.finalized, snapshot.deposit_count, DEPOSIT_CONTRACT_DEPTH)
return DepositTree(tree, snapshot.deposit_count, finalized_execution_block)
def finalize(self, eth1_data: Eth1Data, execution_block_height: uint64):
self.finalized_execution_block = (eth1_data.block_hash, execution_block_height)
self.tree.finalize(eth1_data.deposit_count, DEPOSIT_CONTRACT_DEPTH)
def get_proof(self, index: uint) -> Tuple[Hash32, List[Hash32]]:
assert(self.mix_in_length > 0)
# ensure index > finalized deposit index
assert(index > self.tree.get_finalized([]) - 1)
leaf, proof = self.tree.generate_proof(index, DEPOSIT_CONTRACT_DEPTH)
proof.append(to_le_bytes(self.mix_in_length))
return leaf, proof
def get_root(self) -> Hash32:
return sha256(self.tree.get_root() + to_le_bytes(self.mix_in_length))
def push_leaf(self, leaf: Hash32):
self.mix_in_length += 1
self.tree = self.tree.push_leaf(leaf, DEPOSIT_CONTRACT_DEPTH)
class MerkleTree():
@abstractmethod
def get_root(self) -> Hash32:
pass
@abstractmethod
def is_full(self) -> bool:
pass
@abstractmethod
def push_leaf(self, leaf: Hash32, level: uint) -> MerkleTree:
pass
@abstractmethod
def finalize(self, deposits_to_finalize: uint, level: uint) -> MerkleTree:
pass
@abstractmethod
def get_finalized(self, result: List[Hash32]) -> uint:
# returns the number of finalized deposits in the tree
# while populating result with the finalized hashes
pass
def create(leaves: List[Hash32], depth: uint) -> MerkleTree:
if not(leaves):
return Zero(depth)
if not(depth):
return Leaf(leaves[0])
split = min(2**(depth - 1), len(leaves))
left = MerkleTree.create(leaves[0:split], depth - 1)
right = MerkleTree.create(leaves[split:], depth - 1)
return Node(left, right)
def from_snapshot_parts(finalized: List[Hash32], deposits: uint, level: uint) -> MerkleTree:
if not(finalized) or not(deposits):
# empty tree
return Zero(level)
if deposits == 2**level:
return Finalized(deposits, finalized[0])
left_subtree = 2**(level - 1)
if deposits <= left_subtree:
left = MerkleTree.from_snapshot_parts(finalized, deposits, level - 1)
right = Zero(level - 1)
return Node(left, right)
else:
left = Finalized(left_subtree, finalized[0])
right = MerkleTree.from_snapshot_parts(finalized[1:], deposits - left_subtree, level - 1)
return Node(left, right)
def generate_proof(self, index: uint, depth: uint) -> Tuple[Hash32, List[Hash32]]:
proof = []
node = self
while depth > 0:
ith_bit = (index >> (depth - 1)) & 0x1
if ith_bit == 1:
proof.append(node.left.get_root())
node = node.right
else:
proof.append(node.right.get_root())
node = node.left
depth -= 1
proof.reverse()
return node.get_root(), proof
@dataclass
class Finalized(MerkleTree):
deposit_count: uint
hash: Hash32
def get_root(self) -> Hash32:
return self.hash
def is_full(self) -> bool:
return True
def finalize(self, deposits_to_finalize: uint, level: uint) -> MerkleTree:
return self
def get_finalized(self, result: List[Hash32]) -> uint:
result.append(self.hash)
return self.deposit_count
@dataclass
class Leaf(MerkleTree):
hash: Hash32
def get_root(self) -> Hash32:
return self.hash
def is_full(self) -> bool:
return True
def finalize(self, deposits_to_finalize: uint, level: uint) -> MerkleTree:
return Finalized(1, self.hash)
def get_finalized(self, result: List[Hash32]) -> uint:
return 0
@dataclass
class Node(MerkleTree):
left: MerkleTree
right: MerkleTree
def get_root(self) -> Hash32:
return sha256(self.left.get_root() + self.right.get_root())
def is_full(self) -> bool:
return self.right.is_full()
def push_leaf(self, leaf: Hash32, level: uint) -> MerkleTree:
if not(self.left.is_full()):
self.left = self.left.push_leaf(leaf, level - 1)
else:
self.right = self.right.push_leaf(leaf, level - 1)
return self
def finalize(self, deposits_to_finalize: uint, level: uint) -> MerkleTree:
deposits = 2**level
if deposits <= deposits_to_finalize:
return Finalized(deposits, self.get_root())
self.left = self.left.finalize(deposits_to_finalize, level - 1)
if deposits_to_finalize > deposits / 2:
remaining = deposits_to_finalize - deposits / 2
self.right = self.right.finalize(remaining, level - 1)
return self
def get_finalized(self, result: List[Hash32]) -> uint:
return self.left.get_finalized(result) + self.right.get_finalized(result)
@dataclass
class Zero(MerkleTree):
n: uint64
def get_root(self) -> Hash32:
if self.n == DEPOSIT_CONTRACT_DEPTH:
# Handle the entirely empty tree case. This is included for
# consistency/clarity as the zerohashes array is typically
# only defined from 0 to DEPOSIT_CONTRACT_DEPTH - 1.
return sha256(zerohashes[self.n - 1] + zerohashes[self.n - 1])
return zerohashes[self.n]
def is_full(self) -> bool:
return False
def push_leaf(self, leaf: Hash32, level: uint) -> MerkleTree:
return MerkleTree.create([leaf], level)
def get_finalized(self, result: List[Hash32]) -> uint:
return 0
The upcoming switch to PoS will require newly synced nodes to rely on valid weak subjectivity checkpoints because of long-range attacks. This proposal relies on the weak subjectivity assumption that clients will not bootstrap with an invalid WS checkpoint.
Care must be taken not to send a snapshot which includes deposits that haven't been fully included in the finalized checkpoint. Let state
be the BeaconState
at a given block in the chain. Under normal operation, the Eth1Data
stored in state.eth1_data
is replaced every EPOCHS_PER_ETH1_VOTING_PERIOD
epochs. Thus, finalization of the deposit tree proceeds with increments of state.eth1_data
. Let eth1data
be some Eth1Data
. Both of the following conditions MUST be met to consider eth1data
finalized:
state
has state.eth1_data == eth1data
state
has state.eth1_deposit_index >= eth1data.deposit_count
When these conditions are met, the tree can be pruned in the reference implementation by calling tree.finalize(eth1data, execution_block_height)
Copyright and related rights waived via CC0.