EIP-2584 - Trie format transition with overlay trees

Created 2020-04-03
Status Stagnant
Category Core
Type Standards Track
Authors

Simple Summary

This EIP proposes a method to convert the state trie format from hexary to binary: new values are directly stored in a binary trie “laid over” the hexary trie. Meanwhile, the hexary trie is converted to a binary trie in the background. When the process is finished, both layers are merged.

Abstract

This EIP describes a four phase process to complete the conversion.

Motivation

There is a long running interest in switching the state trie from a hexary format to a binary format, for reasons pertaining to proof and storage sizes. The conversion process poses a catch-up issue, caused by the sheer size of the full state: it can not be translated in a reasonable time (i.e. on the same order of magnitude as the block time).

Specification

This specification follows the notation introduced by the Yellow Paper. Prior to reading it is advisable to be familiar with the Yellow Paper.

Binary tries

This EIP assumes that a binary trie is defined like the MPT, except that:

Let ß be the function that, given a hexary trie, computes the equivalent representation of that trie in the aforementioned binary trie format.

Phase 1

Let h₁ be the previously agreed-upon block height at which phase 1 starts, and h₂ the block at which phase 2 starts. For each block of height h₁ ≤ h < h₂:

  1. A conversion process is started in the background, to turn the hexary trie into its binary equivalent. The end goal of this process is the calculation of the root hash of the converted binary trie, denoted Hᵣ². The root of the hexary trie is hereafter called Hᵣ¹⁶. Formally, this process is written as Hᵣ² ≡ ß(Hᵣ¹⁶).
  2. Block headers contain a new Hₒ field, which is the root of the overlay binary trie;
  3. Hᵣ ≡ P(H)ᵣ¹⁶, i.e. as long as the conversion from hexary to binary is not complete, the hexary trie root is the same as that of its parent block.

The following is changed in the execution environment:

Phase 1 ends at block height h₂, which is set far enough from h₁ to offer miners enough time to perform the conversion.

Phase 2

This phase differs from the previous one on the following points:

Account deletion is performed according to the following rules:

When the overlay trie is empty, phase 2 ends and phase 3 begins.

Phase 3

Phase 3 is the same as phase 2, except for the following change:

Rationale

Methods that have been discussed until now include a "stop the world" approach, in which the chain is stopped for the significant amount of time that is required by the conversion, and a "copy on write" approach, in which branches are converted upon being accessed. The approach suggested here has the advantage that the chain continues to operate normally during the conversion process, and that the tree is fully converted to a binary format, in a predictable time.

Backwards Compatibility

This requires a fork and will break backwards compatibility, as the hashes and block formats will necessarily be different. This will cause a fork in clients that don't implement the overlay tree, and those that do not accept the new binary root. No mitigation is proposed, as this is a hard fork.

Test Cases

Implementation

A prototype version of the conversion process (phase 1) is available for geth in this PR.

Security Considerations

There are three attack vectors that I can foresee:

Community feedback

Copyright

Copyright and related rights waived via CC0.