Three new EVM jump instructions are introduced (RJUMP, RJUMPI and RJUMPV) which encode destinations as signed immediate values. These can be useful in the majority of (but not all) use cases and offer a cost reduction.
A recurring discussion topic is that the EVM only has a mechanism for dynamic jumps. They provide a very flexible architecture with only 2 (!) instructions. This flexibility comes at a cost, however: it makes analysis of code more complicated and it also (partially) resulted in the need to have the JUMPDEST marker.
In a great many cases control flow is actually static and there is no need for any dynamic behaviour, though not every use case can be solved by static jumps.
There are various ways to reduce the need for dynamic jumps, some examples:
This change does not attempt to solve these, but instead introduces a minimal feature set to allow compilers to decide which is the most adequate option for a given use case. It is expected that compilers will use RJUMP/RJUMPI almost exclusively, the exception being returns to the caller, which will continue to use JUMP.
This functionality does not preclude the EVM from introducing other forms of control flow later on. RJUMP/RJUMPI can efficiently co-exist with a higher-level declaration of functions, where static relative jumps should be used for intra-function control flow.
The main benefits of these instructions are reduced gas cost (both at deploy and execution time) and better analysis properties.
We introduce three new instructions on the same block number EIP-3540 is activated on:
- RJUMP (0xe0) - relative jump
- RJUMPI (0xe1) - conditional relative jump
- RJUMPV (0xe2) - relative jump via jump table
If the code is legacy bytecode, all of these instructions result in an exceptional halt. (Note: This means no change to behaviour.)
If the code is valid EOF1:
- RJUMP relative_offset sets the PC to PC_post_instruction + relative_offset.
- RJUMPI relative_offset pops a value (condition) from the stack, and sets the PC to PC_post_instruction + ((condition == 0) ? 0 : relative_offset).
- RJUMPV max_index relative_offset+ pops a value (case) from the stack, and sets the PC to PC_post_instruction + ((case > max_index) ? 0 : relative_offset[case]).
The immediate argument relative_offset is encoded as a 16-bit signed (two's-complement) big-endian value. By PC_post_instruction we mean the PC position after the entire immediate value.
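The RJUMP and RJUMPI semantics above can be sketched in a few lines of Python. This is only an illustration: the helper names `read_int16` and `step_relative_jump` are hypothetical, the stack is modelled as a plain list, and the code is assumed to have already passed validation.

```python
RJUMP, RJUMPI = 0xE0, 0xE1  # opcode values as listed in the text


def read_int16(code: bytes, pos: int) -> int:
    """Decode a big-endian two's-complement 16-bit value at code[pos:pos+2]."""
    return int.from_bytes(code[pos:pos + 2], byteorder="big", signed=True)


def step_relative_jump(code: bytes, pc: int, stack: list) -> int:
    """Return the PC after executing the RJUMP or RJUMPI located at `pc`.

    Hypothetical helper: assumes `code` is already-validated EOF1 code.
    """
    opcode = code[pc]
    relative_offset = read_int16(code, pc + 1)
    pc_post_instruction = pc + 3  # opcode byte + 2-byte immediate
    if opcode == RJUMP:
        return pc_post_instruction + relative_offset
    if opcode == RJUMPI:
        condition = stack.pop()
        return pc_post_instruction + (relative_offset if condition != 0 else 0)
    raise ValueError("not an RJUMP/RJUMPI opcode")
```

For instance, the three bytes `e0 ff fd` encode RJUMP with an offset of -3, which jumps back to the RJUMP itself.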
The immediate encoding of RJUMPV is more involved: the unsigned 8-bit max_index value determines the maximum index in the jump table. The number of relative_offset values following is max_index+1. This allows table sizes up to 256. The encoding of RJUMPV must have at least one relative_offset and thus it will take at minimum 4 bytes. Furthermore, the case > max_index condition falling through means that in many use cases, one would place the default path following the RJUMPV instruction. An interesting feature is that RJUMPV 0 relative_offset is an inverted-RJUMPI, which can be used in many cases instead of ISZERO RJUMPI relative_offset.
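As a rough illustration of the RJUMPV encoding, the sketch below computes the instruction size from max_index and resolves the next PC from the jump table. The function names `rjumpv_size` and `step_rjumpv` are hypothetical, and the code is assumed to be already validated.

```python
RJUMPV = 0xE2  # opcode value as listed in the text


def rjumpv_size(max_index: int) -> int:
    """Total bytes: opcode + max_index byte + (max_index + 1) 16-bit offsets."""
    return 2 + 2 * (max_index + 1)


def step_rjumpv(code: bytes, pc: int, stack: list) -> int:
    """Return the PC after executing the RJUMPV located at `pc` (hypothetical helper)."""
    if code[pc] != RJUMPV:
        raise ValueError("not an RJUMPV opcode")
    max_index = code[pc + 1]
    pc_post_instruction = pc + rjumpv_size(max_index)
    case = stack.pop()
    if case > max_index:
        # Fallback: no table entry matches, continue with the next instruction.
        return pc_post_instruction
    entry = pc + 2 + 2 * case  # position of relative_offset[case]
    relative_offset = int.from_bytes(code[entry:entry + 2], "big", signed=True)
    return pc_post_instruction + relative_offset
```

With max_index = 0 the instruction occupies the minimum 4 bytes, and with max_index = 255 it occupies 514 bytes.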
We also extend the validation algorithm of EIP-3670 to verify that each RJUMP/RJUMPI/RJUMPV has a relative_offset pointing to an instruction. This means it cannot point to the immediate data of PUSHn/RJUMP/RJUMPI/RJUMPV. It cannot point outside of code bounds. It is allowed to point to a JUMPDEST, but is not required to.
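A minimal sketch of the jump-target check described above follows. It is not the EIP-3670 algorithm itself: it only covers truncated immediates and jump targets, it assumes PUSH1..PUSH32 occupy opcodes 0x60..0x7f, and the names `immediate_size` and `relative_jumps_valid` are hypothetical.

```python
RJUMP, RJUMPI, RJUMPV = 0xE0, 0xE1, 0xE2
PUSH1, PUSH32 = 0x60, 0x7F  # assumption: standard PUSHn opcode range


def immediate_size(code: bytes, pos: int) -> int:
    """Number of immediate bytes following the opcode at `pos`."""
    opcode = code[pos]
    if PUSH1 <= opcode <= PUSH32:
        return opcode - PUSH1 + 1
    if opcode in (RJUMP, RJUMPI):
        return 2
    if opcode == RJUMPV:
        if pos + 1 >= len(code):
            return 1  # max_index byte is missing; caller reports truncation
        return 1 + 2 * (code[pos + 1] + 1)
    return 0


def relative_jumps_valid(code: bytes) -> bool:
    """True if every relative jump targets an instruction inside `code`."""
    immediates = set()  # positions holding immediate data
    targets = []        # absolute destinations of all relative jumps
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        end = pos + 1 + immediate_size(code, pos)  # PC_post_instruction
        if end > len(code):
            return False  # truncated immediate
        immediates.update(range(pos + 1, end))
        if opcode in (RJUMP, RJUMPI):
            first_offset = pos + 1
        elif opcode == RJUMPV:
            first_offset = pos + 2  # skip the max_index byte
        else:
            first_offset = end      # no offsets to read
        for entry in range(first_offset, end, 2):
            offset = int.from_bytes(code[entry:entry + 2], "big", signed=True)
            targets.append(end + offset)
        pos = end
    return all(0 <= t < len(code) and t not in immediates for t in targets)
```

Targets are collected first and checked after the full scan because a forward jump may point into immediate data that has not been visited yet.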
Because the destinations are validated upfront, the cost of these instructions is lower than that of their dynamic counterparts: RJUMP should cost 2, and RJUMPI and RJUMPV should cost 4.
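To put these numbers in context, the sketch below compares a taken relative jump with the equivalent dynamic sequence. The legacy costs used here (PUSHn = 3, JUMP = 8, JUMPI = 10, JUMPDEST = 1) are assumptions taken from the existing gas schedule rather than from this text, and the function names are hypothetical.

```python
# Costs proposed in the text for the new instructions.
GAS_RJUMP = 2
GAS_RJUMPI = 4

# Assumed legacy costs (existing gas schedule, not stated in this section).
GAS_PUSH = 3
GAS_JUMP = 8
GAS_JUMPI = 10
GAS_JUMPDEST = 1


def legacy_jump_cost(conditional: bool) -> int:
    """Gas for PUSHn + JUMP/JUMPI plus the JUMPDEST executed at the target."""
    jump = GAS_JUMPI if conditional else GAS_JUMP
    return GAS_PUSH + jump + GAS_JUMPDEST


def relative_jump_cost(conditional: bool) -> int:
    """Gas for RJUMP/RJUMPI; no PUSHn and no JUMPDEST are needed."""
    return GAS_RJUMPI if conditional else GAS_RJUMP


def gas_saved(conditional: bool) -> int:
    """Execution gas saved per taken jump under the assumptions above."""
    return legacy_jump_cost(conditional) - relative_jump_cost(conditional)
```

Under these assumptions a taken jump costs 12 gas (unconditional) or 14 gas (conditional) in the dynamic form, against 2 and 4 for the relative form.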
We chose relative addressing in order to support code which is relocatable. This also means a code snippet can be injected. A technique used prior to this EIP to achieve the same goal was to inject code like PUSHn PC ADD JUMPI.
We do not see any significant downside to relative addressing and it allows us to also deprecate the PC instruction.
The signed 16-bit immediate means that the largest jump distance possible is 32767. If the bytecode at PC=0 starts with an RJUMP, it will be possible to jump as far as PC=32770, because the offset is applied after the 3-byte instruction.
Given MAX_CODE_SIZE = 24576 (in EIP-170) and MAX_INITCODE_SIZE = 49152 (in EIP-3860), we think the 16-bit immediate is large enough.
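The arithmetic behind the farthest reachable destination can be checked with a short sketch. The function name `farthest_forward_target` is hypothetical; the constants come from the text.

```python
INT16_MAX = 2**15 - 1   # largest value of a signed 16-bit immediate (32767)
RJUMP_SIZE = 3          # opcode byte + 2-byte immediate
MAX_CODE_SIZE = 24576   # from EIP-170, as quoted in the text


def farthest_forward_target(pc: int) -> int:
    """Largest PC reachable by an RJUMP located at `pc`."""
    pc_post_instruction = pc + RJUMP_SIZE
    return pc_post_instruction + INT16_MAX
```

An RJUMP at PC=0 therefore reaches PC=32770, which lies beyond MAX_CODE_SIZE.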
A version with an 8-bit immediate would only allow moving PC backward by 125 or forward by 127 bytes. While that seems to be a good enough distance for many for-loops, it is likely not good enough for cross-function jumps, and since the 16-bit immediate is the same size as what a dynamic jump would take in such cases (3 bytes: PUSH1 n JUMP), we think having fewer instructions is better.
Should there be a need to have immediate encodings of other sizes (such as 8-bit, 24-bit or 32-bit), it would be possible to introduce new opcodes, similarly to how multiple PUSH instructions exist.
PUSHn JUMP sequences
If we chose absolute addressing, then RJUMP could be viewed similar to the sequence PUSHn JUMP (and RJUMPI similar to PUSHn JUMPI). In that case one could argue that instead of introducing a new instruction, such sequences should get a discount, because EVMs could optimise them.
We think this is a bad direction to go:
Both of these are risky. Furthermore, we think that EVM implementations should be free to choose what optimisations they apply, and the savings do not need to be passed down at all cost.
Additionally it requires a potentially significant change to the current implementations which depend on a streaming one-by-one execution without a lookahead.
The goal was not to completely replace the current control flow system of the EVM, but to augment it. There are many cases where dynamic jumps are useful, such as returning to the caller.
It is possible to introduce a new mechanism for having a pre-defined table of valid jump destinations, and dynamically supplying the index within this table to accomplish some form of dynamic jumps. This is very useful for efficiently encoding a form of "switch-cases" statements. It could also be used for "return to caller" cases, however it is likely inefficient or awkward.
JUMPDEST
JUMPDEST serves two purposes:
JUMPDESTs), and for JIT/AOT translation.
This functionality is not needed for static jumps, as the analysers can easily tell destinations from the static jump immediates during jumpdest-analysis.
There are two benefits here:
JUMPDEST also means a saving of 200 gas during deployment, for each jump destination.
JUMPDEST itself costs 1 gas and is "executed" during jumping.
RJUMPV fallback case
If no match is found (i.e. the default case) in the RJUMPV instruction, execution will continue without branching. This allows for gaps in the arguments to be filled with 0s, and a choice of implementation by the programmer. Alternate options would include exceptional aborts in case of no match.
This change poses no risk to backwards compatibility, as it is introduced at the same time EIP-3540 is. The new instructions are not introduced for legacy bytecode (code which is not EOF formatted).
- RJUMP/RJUMPI/RJUMPV with JUMPDEST as target
  - relative_offset is positive/negative/0
- RJUMP/RJUMPI/RJUMPV with instruction other than JUMPDEST as target
  - relative_offset is positive/negative/0
- RJUMPV with various valid table sizes from 1 to 256
- RJUMP as a final instruction in code section
- RJUMP/RJUMPI/RJUMPV with truncated immediate
- RJUMPI/RJUMPV as a final instruction in code section
- RJUMP/RJUMPI/RJUMPV target outside of code section bounds
- RJUMP/RJUMPI/RJUMPV target push data
- RJUMP/RJUMPI/RJUMPV target another RJUMP/RJUMPI/RJUMPV immediate argument
- RJUMP/RJUMPI/RJUMPV in legacy code aborts execution
- RJUMP
  - relative_offset is positive/negative/0
- RJUMPI
  - relative_offset is positive/negative/0
    - condition equals 0
    - condition does not equal 0
- RJUMPV 0 relative_offset
  - case equals 0
  - case does not equal 0
- RJUMPV with table containing positive, negative, 0 offsets
  - case equals 0
  - case does not equal 0
  - case outside of table bounds (case > max_index, fallback case)
  - case > 255
TBA
Copyright and related rights waived via CC0.