ERC-5202 - Blueprint contract format

Created 2022-06-23
Status Final
Category ERC
Type Standards Track
Authors
Requires

Abstract

Define a standard for "blueprint" contracts, or contracts which represent initcode that is stored on-chain.

Motivation

To decrease deployer contract size, a useful pattern is to store initcode on-chain as a "blueprint" contract, then use EXTCODECOPY to copy the initcode into memory, followed by a call to CREATE or CREATE2. However, this comes with the following problems:

Specification

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

A blueprint contract MUST use the preamble 0xFE71<version bits><length encoding bits>. 6 bits are allocated to the version, and 2 bits to the length encoding. The first version begins at 0 (0b000000), and versions increment by 1. The value 0b11 for <length encoding bits> is reserved. In the case that the length bits are 0b11, the third byte is considered a continuation byte (that is, the version requires multiple bytes to encode). The exact encoding of a multi-byte version is left to a future ERC.
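
As a rough illustration of the bit layout described above, the following sketch packs and unpacks the third preamble byte. The helper names `pack_version_byte` and `unpack_version_byte` are hypothetical and not part of this ERC, and multi-byte versions are not handled because their encoding is left to a future ERC.

```python
from typing import Tuple


def pack_version_byte(version: int, n_length_bytes: int) -> int:
    """Combine a 6-bit version and the 2 length encoding bits into one byte."""
    if not 0 <= version < 64:
        raise ValueError("version must fit in 6 bits")
    if not 0 <= n_length_bytes <= 2:
        # 0b11 is reserved by the specification.
        raise ValueError("length encoding bits must be 0, 1 or 2")
    # Version occupies the upper 6 bits, length encoding the lower 2 bits.
    return (version << 2) | n_length_bytes


def unpack_version_byte(byte: int) -> Tuple[int, int]:
    """Split the third preamble byte into (version, length encoding bits)."""
    return byte >> 2, byte & 0b11


print(hex(pack_version_byte(0, 1)))   # 0x1
print(unpack_version_byte(0x02))      # (0, 2)
```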

A blueprint contract MUST contain at least one byte of initcode.

A blueprint contract MAY insert any bytes (data or code) between the version byte(s) and the initcode. If such variable-length data is used, the preamble MUST be 0xFE71<version bits><length encoding bits><length bytes><data>. The <length encoding bits> represent a number between 0 and 2 (inclusive) giving the number of bytes that <length bytes> occupies, and <length bytes> is the big-endian encoding of the number of bytes that <data> occupies.
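
The layout above could be assembled roughly as follows. This is a minimal sketch: `build_blueprint` is a hypothetical helper, and picking the smallest number of length bytes that fits the data is an assumption of the sketch, not a requirement of this ERC.

```python
from typing import Optional


def build_blueprint(initcode: bytes, data: Optional[bytes] = None, version: int = 0) -> bytes:
    """Assemble blueprint bytecode: magic, version byte, optional data section, initcode."""
    if len(initcode) == 0:
        raise ValueError("a blueprint must contain at least one byte of initcode")
    if not 0 <= version < 64:
        raise ValueError("version must fit in 6 bits")

    if data is None:
        # No data section: length encoding bits are 0 and no length bytes follow.
        n_length_bytes = 0
        length_bytes = b""
        data = b""
    else:
        if len(data) > 0xFFFF:
            raise ValueError("data section does not fit in 2 length bytes")
        # Assumption: use the smallest number of length bytes (1 or 2) that fits.
        n_length_bytes = 1 if len(data) < 256 else 2
        length_bytes = len(data).to_bytes(n_length_bytes, "big")

    version_byte = (version << 2) | n_length_bytes
    return b"\xFE\x71" + bytes([version_byte]) + length_bytes + data + initcode


print(build_blueprint(b"\x00").hex())                 # fe710000
print(build_blueprint(b"\x00", b"\xff" * 7).hex())    # fe710107ffffffffffffff00
```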

Rationale

Backwards Compatibility

No known issues.

Test Cases

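A minimal blueprint with no data section, whose initcode is the single byte 0x00 (the STOP instruction):

0xFE710000

A blueprint whose data section is the byte 0xFF repeated seven times, followed by the same initcode:

0xFE710107FFFFFFFFFFFFFF00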

Here, 0xFE71 is the magic header, 0x01 encodes version 0 with length encoding bits 0b01 (one length byte), and 0x07 is that length byte, giving the length in bytes of the data section. These are followed by the data section, and then the initcode. For illustration, the above code with delimiters would be 0xFE71|01|07|FFFFFFFFFFFFFF|00.

0xFE71020100FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF00

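Delimited, that would be 0xFE71|02|0100|FF...FF|00: version 0 with length encoding bits 0b10 (two length bytes), the length 0x0100 (256 bytes of data), the data section, and then the initcode 0x00.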
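
The delimited forms shown in these test cases can be reproduced with a small helper. This is only a sketch for illustration; `delimit_blueprint` is a hypothetical name, and empty fields (for example, a missing data section) are simply omitted from the output.

```python
def delimit_blueprint(bytecode: bytes) -> str:
    """Render blueprint bytecode as 0x<magic>|<version byte>|<length bytes>|<data>|<initcode>."""
    if len(bytecode) < 3 or bytecode[:2] != b"\xFE\x71":
        raise ValueError("not a blueprint")

    n_length_bytes = bytecode[2] & 0b11
    if n_length_bytes == 0b11:
        raise ValueError("reserved length encoding bits")

    length_bytes = bytecode[3:3 + n_length_bytes]
    data_length = int.from_bytes(length_bytes, "big")
    data_start = 3 + n_length_bytes
    data = bytecode[data_start:data_start + data_length]
    initcode = bytecode[data_start + data_length:]
    if len(initcode) == 0:
        raise ValueError("empty initcode")

    fields = [bytecode[:2], bytecode[2:3], length_bytes, data, initcode]
    # Skip empty fields so a blueprint without a data section stays compact.
    return "0x" + "|".join(f.hex().upper() for f in fields if f)


print(delimit_blueprint(bytes.fromhex("FE710107FFFFFFFFFFFFFF00")))
# 0xFE71|01|07|FFFFFFFFFFFFFF|00
```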

Reference Implementation

from typing import Optional, Tuple

def parse_blueprint_preamble(bytecode: bytes) -> Tuple[int, Optional[bytes], bytes]:
    """
    Given bytecode as a sequence of bytes, parse the blueprint preamble and
    deconstruct the bytecode into:
        the ERC version, preamble data and initcode.
    Raises an exception if the bytecode is not a valid blueprint contract
    according to this ERC.
    arguments:
        bytecode: a `bytes` object representing the bytecode
    returns:
        (version,
         None if <length encoding bits> is 0, otherwise the bytes of the data section,
         the bytes of the initcode,
        )
    """
    if bytecode[:2] != b"\xFE\x71":
        raise Exception("Not a blueprint!")

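    # The upper 6 bits of the third byte hold the ERC version.
    erc_version = (bytecode[2] & 0b11111100) >> 2

    # The lower 2 bits give the number of length bytes (0, 1 or 2); 0b11 is reserved.
    n_length_bytes = bytecode[2] & 0b11
    if n_length_bytes == 0b11:
        raise Exception("Reserved bits are set")

    # Big-endian length of the data section (0 when there are no length bytes).
    data_length = int.from_bytes(bytecode[3:3 + n_length_bytes], byteorder="big")

    if n_length_bytes == 0:
        preamble_data = None
    else:
        data_start = 3 + n_length_bytes
        preamble_data = bytecode[data_start:data_start + data_length]

    # Everything after the preamble and the data section is the initcode.
    initcode = bytecode[3 + n_length_bytes + data_length:]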

    if len(initcode) == 0:
        raise Exception("Empty initcode!")

    return erc_version, preamble_data, initcode

The following reference function takes the desired initcode for a blueprint as a parameter, and returns EVM code which will deploy a corresponding blueprint contract (with no data section):

def blueprint_deployer_bytecode(initcode: bytes) -> bytes:
    blueprint_preamble = b"\xFE\x71\x00"  # ERC5202 preamble
    blueprint_bytecode = blueprint_preamble + initcode

    # the length of the deployed code in bytes
    len_bytes = len(blueprint_bytecode).to_bytes(2, "big")

    # copy <blueprint_bytecode> to memory and `RETURN` it per EVM creation semantics
    # PUSH2 <len> RETURNDATASIZE DUP2 PUSH1 10 RETURNDATASIZE CODECOPY RETURN
    deploy_bytecode = b"\x61" + len_bytes + b"\x3d\x81\x60\x0a\x3d\x39\xf3"

    return deploy_bytecode + blueprint_bytecode
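
To see why the 10-byte deploy prefix returns exactly the blueprint bytecode, it can help to step through it. The sketch below is a toy interpreter that supports only the six opcodes used by the prefix; it is not a general EVM, and `run_deploy_code` is a hypothetical name introduced for illustration.

```python
def run_deploy_code(code: bytes) -> bytes:
    """Interpret the deploy prefix opcodes and return the data passed to RETURN."""
    stack = []            # top of stack is stack[-1]
    memory = bytearray()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == 0x60:    # PUSH1 <1 byte>
            stack.append(code[pc + 1])
            pc += 2
        elif op == 0x61:  # PUSH2 <2 bytes, big-endian>
            stack.append(int.from_bytes(code[pc + 1:pc + 3], "big"))
            pc += 3
        elif op == 0x3D:  # RETURNDATASIZE: 0 here, since no call has been made
            stack.append(0)
            pc += 1
        elif op == 0x81:  # DUP2: duplicate the second stack item
            stack.append(stack[-2])
            pc += 1
        elif op == 0x39:  # CODECOPY(destOffset, offset, size)
            dest, offset, size = stack.pop(), stack.pop(), stack.pop()
            chunk = code[offset:offset + size].ljust(size, b"\x00")
            if len(memory) < dest + size:
                memory.extend(bytes(dest + size - len(memory)))
            memory[dest:dest + size] = chunk
            pc += 1
        elif op == 0xF3:  # RETURN(offset, size)
            offset, size = stack.pop(), stack.pop()
            return bytes(memory[offset:offset + size])
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    raise ValueError("code ended without RETURN")


# Build the deploy code the same way as the reference function above.
blueprint = b"\xFE\x71\x00" + b"\x00"  # preamble followed by a STOP initcode
deploy = b"\x61" + len(blueprint).to_bytes(2, "big") + b"\x3d\x81\x60\x0a\x3d\x39\xf3" + blueprint
assert run_deploy_code(deploy) == blueprint
```

The prefix is 10 bytes long (3 for PUSH2 and its operand, 2 for PUSH1 and its operand, and 5 single-byte opcodes), which is why the code offset passed to CODECOPY is 10.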

Security Considerations

There could be contracts on-chain already which happen to start with the same prefix as proposed in this ERC. However, this is not considered a serious risk, because indexers are expected to verify source code by compiling it and prepending the preamble.
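
One way an indexer might carry out such a check is sketched below, assuming a version 0 blueprint with no data section. The function name `matches_blueprint` and its arguments are hypothetical; how the on-chain code and the compiled initcode are obtained is outside the scope of the sketch.

```python
def matches_blueprint(onchain_code: bytes, compiled_initcode: bytes) -> bool:
    """Check that on-chain code equals the ERC-5202 preamble plus the compiled initcode."""
    if len(compiled_initcode) == 0:
        # A blueprint must contain at least one byte of initcode.
        return False
    # Assumption: version 0, no data section, so the preamble is 0xFE7100.
    expected = b"\xFE\x71\x00" + compiled_initcode
    return onchain_code == expected
```

A contract that merely happens to start with 0xFE71 fails this check unless the rest of its code also matches the compiled initcode.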

As of 2022-07-08, no contracts deployed on the Ethereum mainnet have bytecode starting with 0xFE71.

Copyright

Copyright and related rights waived via CC0.