Four new instructions are introduced that allow reading the EOF container's data section: DATALOAD loads a 32-byte word onto the stack, DATALOADN loads a 32-byte word onto the stack where the word is addressed by a static immediate argument, DATASIZE pushes the data section size onto the stack, and DATACOPY copies a segment of the data section to memory.
Clear separation between code and data is one of the main features of EOF1. The data section may contain anything, e.g. compiler metadata, but to make it useful for smart contracts, the EVM needs instructions that can read from the data section. Previously existing instructions for bytecode inspection (CODECOPY, CODESIZE etc.) are deprecated in EOF1 and cannot be used for this purpose.
The DATALOAD, DATASIZE, DATACOPY instruction pattern follows the design of existing instructions for reading other kinds of data (i.e. returndata and calldata).
DATALOADN is an optimized version of DATALOAD, where the data offset to read is set at compilation time and therefore need not be validated at run-time, which makes the instruction cheaper.
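
To illustrate the difference between a run-time offset and a compile-time offset, the following minimal sketch encodes both forms as raw bytecode. It uses the opcode values and the 16-bit big-endian immediate given in the specification part of this document; PUSH2 (0x61) is the existing EVM opcode for pushing a 2-byte constant, and the helper names are hypothetical.

```python
# DATALOAD and DATALOADN opcode values come from this document;
# PUSH2 (0x61) is the existing EVM opcode that pushes a 2-byte constant.
PUSH2 = 0x61
DATALOAD = 0xD0
DATALOADN = 0xD1


def encode_dynamic_load(offset: int) -> bytes:
    """PUSH2 <offset>; DATALOAD: the offset reaches the instruction via the stack."""
    return bytes([PUSH2]) + offset.to_bytes(2, "big") + bytes([DATALOAD])


def encode_static_load(offset: int) -> bytes:
    """DATALOADN <offset>: the offset is a 16-bit unsigned big-endian immediate."""
    return bytes([DATALOADN]) + offset.to_bytes(2, "big")


print(encode_dynamic_load(0x20).hex())  # 610020d0
print(encode_static_load(0x20).hex())   # d10020
```

Because the static form carries its offset in the code itself, a validator can inspect it before deployment, which is what allows the run-time bounds check to be skipped.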
We introduce four new instructions on the same block number EIP-3540 is activated on:
- DATALOAD (0xd0)
- DATALOADN (0xd1)
- DATASIZE (0xd2)
- DATACOPY (0xd3)

If the code is legacy bytecode, all of these instructions result in an exceptional halt. (Note: This means no change to behaviour.)
If the code is valid EOF1, the following execution rules apply:
DATALOAD
1. Pops one value, offset, from the stack.
2. Reads the [offset:offset+32] segment from the data section and pushes it as a 32-byte value to the stack.
3. If offset + 32 is greater than the data section size, bytes after the end of the data section are set to 0.

DATALOADN
1. Has one immediate argument, offset, encoded as a 16-bit unsigned big-endian value.
2. Reads the [offset:offset+32] segment from the data section and pushes it as a 32-byte value to the stack.

[offset:offset+32] is guaranteed to be within data bounds by code validation.
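
The read semantics of DATALOAD and DATALOADN can be sketched as follows. This is a minimal illustration, not a client implementation: the function names are hypothetical, gas accounting is omitted, and interpreting the 32-byte value as a big-endian integer is an assumption that follows the usual EVM word convention.

```python
def dataload(data: bytes, offset: int) -> int:
    """DATALOAD: read data[offset:offset+32], zero-padded past the end of data."""
    word = data[offset:offset + 32]  # slicing past the end simply truncates
    return int.from_bytes(word.ljust(32, b"\x00"), "big")


def dataloadn_offset_is_valid(data_size: int, offset: int) -> bool:
    """Validation-time bound: offset + 32 must not exceed the data section size."""
    return offset + 32 <= data_size


def dataloadn(data: bytes, immediate: bytes) -> int:
    """DATALOADN: the offset is a 16-bit unsigned big-endian immediate."""
    offset = int.from_bytes(immediate, "big")
    # Code validation is assumed to have rejected out-of-bounds offsets,
    # so no padding logic is needed here.
    assert dataloadn_offset_is_valid(len(data), offset)
    return int.from_bytes(data[offset:offset + 32], "big")


data = bytes(range(1, 41))                # a 40-byte data section
print(hex(dataload(data, 32)))            # last 8 bytes followed by 24 zero bytes
print(dataload(data, 100))                # 0: the read is fully out of bounds
print(hex(dataloadn(data, b"\x00\x08")))  # bytes 8..39, fully in bounds
```

The bounds helper mirrors the guarantee stated above: once it has been checked at validation time, DATALOADN can read without any run-time padding or bounds handling.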
DATASIZE
1. Pushes the size of the data section to the stack.

DATACOPY
1. Pops three values from the stack: mem_offset, offset, size.
2. Performs memory expansion to mem_offset + size and deducts memory expansion cost.
3. Deducts 3 + 3 * ((size + 31) // 32) gas for copying.
4. Reads the [offset:offset+size] segment from the data section and writes it to memory starting at offset mem_offset.
5. If offset + size is greater than the data section size, 0 bytes will be copied for bytes after the end of the data section.

We extend code section validation rules (as defined in EIP-3670).
1. Code section is invalid in case the immediate argument offset of any DATALOADN is such that offset + 32 is greater than data section size, as indicated in the container header before deployment.
2. RJUMP, RJUMPI and RJUMPV immediate argument value (jump destination relative offset) validation: code section is invalid in case offset points to one of two bytes directly following DATALOADN instruction.

Existing instructions for reading other kinds of data implicitly pad with zeroes on out of bounds access, with the only exception of return data copying.
It is beneficial to avoid exceptional failures, because compilers can employ optimizations such as removing code that copies data but never accesses the copy afterwards. Such an optimization is possible only if the instruction never has other side effects like an exceptional abort.
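
The zero-padding behaviour can be made concrete with a small sketch of DATACOPY, using the copy rules and the gas formula from the specification. The function names are hypothetical, memory is modelled as a plain bytearray, and memory expansion is simplified: real clients expand in 32-byte words and charge an expansion cost that is not modelled here.

```python
def datacopy_gas(size: int) -> int:
    """Copy cost from the specification; memory expansion cost is not included."""
    return 3 + 3 * ((size + 31) // 32)


def datacopy(memory: bytearray, data: bytes,
             mem_offset: int, offset: int, size: int) -> None:
    """Copy data[offset:offset+size] into memory, zero-filling past the end of data."""
    end = mem_offset + size
    if size > 0 and len(memory) < end:
        memory.extend(bytes(end - len(memory)))  # simplified memory expansion
    chunk = data[offset:offset + size]           # truncated if out of bounds
    memory[mem_offset:end] = chunk.ljust(size, b"\x00")


memory = bytearray()
datacopy(memory, b"\xaa\xbb\xcc", mem_offset=0, offset=1, size=4)
print(memory.hex())      # bbcc0000
print(datacopy_gas(4))   # 6
print(datacopy_gas(33))  # 9
```

The out-of-bounds read in the example succeeds and yields zero bytes rather than aborting, so removing the copy would not change whether execution fails.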
EXTDATACOPY
The EXTCODECOPY instruction is deprecated and rejected in EOF contracts, and does not copy contract code when called from legacy code with an EOF contract as the target. A replacement instruction, EXTDATACOPY, has been considered, but was decided against in order to reduce the scope of changes.
Data-only contracts which previously relied on EXTCODECOPY are thereby discouraged, but if there is a strong need, support for them can be easily brought back by introducing EXTDATACOPY in a future upgrade.
This change poses no risk to backwards compatibility, as it is introduced only for EOF1 contracts, for which deploying undefined instructions is not allowed; therefore, there are no existing contracts using these instructions. The new instructions are not introduced for legacy bytecode (code which is not EOF formatted).
TBA
Copyright and related rights waived via CC0.