Five new EVM jump instructions are introduced (RJUMP, RJUMPI, RJUMPV, RJUMPSUB, and RJUMPSUBV) which encode destinations as signed immediate values. These can be useful in almost all JUMP and JUMPI use cases and offer improvements in cost, performance, and static analysis.
A recurring discussion topic is that the EVM only has a mechanism for dynamic jumps. They provide a very flexible architecture with only 2 (!) instructions. This flexibility comes at a cost, however: it makes analysis of code more complicated, often intractable, and it also (partially) resulted in the need to have the JUMPDEST marker.
In a great many cases control flow is actually static and there is no need for any dynamic behaviour, though not every use case can be solved by static jumps.
There are various ways to reduce the need for dynamic jumps, some examples:
This proposal introduces a minimal feature set - including the above - to allow compilers to use RJUMP/RJUMPI/RJUMPSUB exclusively.
This functionality does not preclude the EVM from introducing other forms of control flow later on. RJUMP/RJUMPI can efficiently co-exist with a higher-level declaration of functions, where static relative jumps should be used for intra-function control flow. This functionality also does not preclude the use of legacy code.
The main benefits of these instructions are reduced gas cost (both at deploy and execution time), better performance, and better static analysis properties.
We introduce five new instructions, each taking either a two-byte relative offset or else a table of offsets as immediate arguments.
1) relative jump:
* RJUMP (0xe0) relative_offset
* sets the PC to PC_post_instruction + relative_offset, where
* relative_offset is encoded as a 16-bit signed (two's-complement) big-endian value.
2) conditional relative jump:
* RJUMPI (0xe1) relative_offset
* pops a value (condition) from the stack, and
* sets the PC to PC_post_instruction + ((condition == 0) ? 0 : relative_offset).
3) relative jump via jump table:
* RJUMPV (0xe2) max_index relative_offset+
* pops a value (case) from the stack, and
* sets the PC to PC_post_instruction + ((case > max_index) ? 0 : relative_offset[case]).
4) relative jump to subroutine:
* RJUMPSUB (0xe3) relative_offset
* pushes PC_post_instruction to the return stack, and
* sets the PC to PC_post_instruction + relative_offset, an ENTERSUB, as if with CALLSUB.
5) relative jump to subroutine via jump table:
* RJUMPSUBV (0xe4) max_index relative_offset+
* pops a value (case) from the stack,
* pushes PC_post_instruction to the return stack, and
* sets the PC to PC_post_instruction + ((case > max_index) ? 0 : relative_offset[case]), an ENTERSUB, as if with CALLSUB.
Note that the destination of the RJUMPSUB and RJUMPSUBV opcodes MUST be an ENTERSUB, and control is returned to the calling RJUMPSUB / RJUMPSUBV via RETURNSUB (see EIP-7979). These two forms of jump provide immediate access to the underlying "return-to-caller" mechanism of the CALLSUB instruction.
The immediate argument relative_offset is encoded as a 16-bit signed (two's-complement) big-endian value. By PC_post_instruction we mean the PC position after the entire immediate value.
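As a rough illustration of the three fixed-offset instructions, the following Python sketch decodes the immediate and computes the next PC. The names `read_offset` and `step`, and the use of plain Python lists for the stack and return stack, are assumptions made for this example and are not part of the proposal. Gas accounting and validation are omitted.

```python
RJUMP, RJUMPI, RJUMPSUB = 0xE0, 0xE1, 0xE3


def read_offset(code: bytes, pos: int) -> int:
    """Decode a 16-bit signed (two's-complement) big-endian value at pos."""
    return int.from_bytes(code[pos:pos + 2], "big", signed=True)


def step(code: bytes, pc: int, stack: list, return_stack: list) -> int:
    """Execute one fixed-offset relative jump at pc and return the new PC."""
    opcode = code[pc]
    relative_offset = read_offset(code, pc + 1)
    # PC_post_instruction: the position after the opcode and its 2-byte immediate.
    pc_post_instruction = pc + 3

    if opcode == RJUMP:
        return pc_post_instruction + relative_offset
    if opcode == RJUMPI:
        condition = stack.pop()
        return pc_post_instruction + (0 if condition == 0 else relative_offset)
    if opcode == RJUMPSUB:
        # The return address is the instruction following the RJUMPSUB.
        return_stack.append(pc_post_instruction)
        return pc_post_instruction + relative_offset
    raise ValueError(f"not a fixed-offset relative jump: {opcode:#x}")
```

For instance, the three bytes `e0 ff fd` encode an RJUMP with `relative_offset = -3`, which from `PC = 0` lands back on the instruction itself.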
The immediate encoding of RJUMPV and RJUMPSUBV is more involved: the unsigned 8-bit max_index value determines the maximum index in the jump table. The number of relative_offset values following is max_index+1. This allows table sizes up to 256. The encoding of RJUMPV must have at least one relative_offset and thus it will take at minimum 4 bytes. Furthermore, the case > max_index condition falling through means that in many use cases, one would place the default path following the RJUMPV instruction. An interesting feature is that RJUMPV 0 relative_offset is an inverted-RJUMPI, which can be used in many cases instead of ISZERO RJUMPI relative_offset.
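The table encoding can be sketched in Python as follows. The helper names `table_instruction_length` and `step_table` are hypothetical, and the stacks are modelled as plain lists. For RJUMPSUBV the sketch follows the listed steps literally, pushing the return address before the bounds check.

```python
RJUMPV, RJUMPSUBV = 0xE2, 0xE4


def table_instruction_length(max_index: int) -> int:
    """Opcode byte + max_index byte + (max_index + 1) two-byte offsets."""
    return 2 + 2 * (max_index + 1)


def step_table(code: bytes, pc: int, stack: list, return_stack: list) -> int:
    """Execute one RJUMPV / RJUMPSUBV at pc and return the new PC."""
    opcode = code[pc]
    if opcode not in (RJUMPV, RJUMPSUBV):
        raise ValueError(f"not a table jump: {opcode:#x}")
    max_index = code[pc + 1]
    pc_post_instruction = pc + table_instruction_length(max_index)

    case = stack.pop()
    if opcode == RJUMPSUBV:
        return_stack.append(pc_post_instruction)
    if case > max_index:
        # Fallback: no table entry matches, so execution falls through.
        return pc_post_instruction
    entry = pc + 2 + 2 * case
    relative_offset = int.from_bytes(code[entry:entry + 2], "big", signed=True)
    return pc_post_instruction + relative_offset
```

With `max_index = 0` the instruction is 4 bytes long and jumps only when the popped value is 0, which is the inverted-RJUMPI behaviour described above.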
We also extend the validation algorithm of EIP-7979 to verify that every RJUMP/RJUMPI/RJUMPV/RJUMPSUB/RJUMPSUBV has a relative_offset pointing to an instruction. This means it cannot point to the immediate data of PUSHn/RJUMP/RJUMPI/RJUMPV and cannot point outside of code bounds. Further, all and only RJUMPSUB and RJUMPSUBV MUST have a relative offset pointing to an ENTERSUB, whereas the others are allowed to point to a JUMPDEST, but are not required to.
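The target checks can be sketched as a single pass that records instruction boundaries, followed by a check of every collected target. This is an illustrative sketch, not the EIP-7979 algorithm itself: the function name is hypothetical, the ENTERSUB opcode value is passed in as a parameter because it is not given in this text, and only PUSHn and the five new opcodes are treated as carrying immediates.

```python
PUSH1, PUSH32 = 0x60, 0x7F
RJUMP, RJUMPI, RJUMPV, RJUMPSUB, RJUMPSUBV = 0xE0, 0xE1, 0xE2, 0xE3, 0xE4


def validate_relative_jumps(code: bytes, entersub_opcode: int) -> bool:
    """Return True if every relative jump target satisfies the rules above."""
    starts = set()  # positions where an instruction begins
    jumps = []      # (target, must_be_entersub) pairs
    pc = 0
    while pc < len(code):
        op = code[pc]
        starts.add(pc)
        table_start, count = pc + 1, 0
        if PUSH1 <= op <= PUSH32:
            size = 2 + (op - PUSH1)  # opcode + n bytes of push data
        elif op in (RJUMP, RJUMPI, RJUMPSUB):
            size, count = 3, 1
        elif op in (RJUMPV, RJUMPSUBV):
            if pc + 1 >= len(code):
                return False  # max_index byte is missing
            count = code[pc + 1] + 1
            table_start, size = pc + 2, 2 + 2 * count
        else:
            size = 1
        if count and pc + size > len(code):
            return False  # truncated immediate
        for i in range(count):
            entry = table_start + 2 * i
            offset = int.from_bytes(code[entry:entry + 2], "big", signed=True)
            jumps.append((pc + size + offset, op in (RJUMPSUB, RJUMPSUBV)))
        pc += size

    for target, must_be_entersub in jumps:
        # Rejects targets outside code bounds and targets inside immediate data.
        if target not in starts:
            return False
        # "All and only" RJUMPSUB / RJUMPSUBV may target an ENTERSUB.
        if (code[target] == entersub_opcode) != must_be_entersub:
            return False
    return True
```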
The immediate destinations are checked during jumpdest analysis and need not be checked again, so the cost of these instructions can be less than their dynamic counterparts. We recommend that the cost of RJUMP should be 2, RJUMPI and RJUMPV should be 4, and RJUMPSUB and RJUMPSUBV should be 5.
We chose relative addressing in order to support code which is relocatable. This also means a code snippet can be injected. A technique seen used prior to this EIP to achieve the same goal was to inject code like PUSHn PC ADD JUMPI.
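To illustrate the relocatability argument, the sketch below places the same byte sequence at two different positions and checks that the jump lands on the same byte of the snippet both times. The helper names and the snippet are made up for this example.

```python
RJUMP = 0xE0


def encode_rjump(relative_offset: int) -> bytes:
    """Encode RJUMP with a 16-bit signed big-endian immediate."""
    return bytes([RJUMP]) + relative_offset.to_bytes(2, "big", signed=True)


def rjump_target(code: bytes, pc: int) -> int:
    """Absolute destination of the RJUMP located at pc."""
    offset = int.from_bytes(code[pc + 1:pc + 3], "big", signed=True)
    return pc + 3 + offset


# Snippet: jump over two STOP (0x00) bytes to a JUMPDEST (0x5b).
snippet = encode_rjump(2) + bytes([0x00, 0x00, 0x5B])


def landing_within_snippet(base: int) -> int:
    """Inject the snippet after `base` zero bytes; return the landing position
    relative to the start of the snippet."""
    code = bytes(base) + snippet
    return rjump_target(code, base) - base
```

The landing position relative to the snippet is the same for any `base`, so the bytes need no patching when moved, unlike an absolute `PUSHn JUMP` destination.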
We do not see any significant downside to relative addressing and it allows us to also deprecate the PC instruction.
The signed 16-bit immediate means that the largest jump distance possible is 32767. In the case the bytecode at PC=0 starts with an RJUMP, it will be possible to jump as far as PC=32770.
Given MAX_CODE_SIZE = 24576 (in EIP-170) and MAX_INITCODE_SIZE = 49152 (in EIP-3860), we think the 16-bit immediate is large enough.
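The reach figures above can be checked with a few lines of arithmetic. In this sketch the helper names are hypothetical and only the 3-byte RJUMP form is considered.

```python
MAX_CODE_SIZE = 24576      # EIP-170
MAX_INITCODE_SIZE = 49152  # EIP-3860
INT16_MIN, INT16_MAX = -(2 ** 15), 2 ** 15 - 1
RJUMP_LENGTH = 3           # opcode + 16-bit immediate


def required_offset(pc: int, target: int) -> int:
    """relative_offset an RJUMP at pc needs in order to reach target."""
    return target - (pc + RJUMP_LENGTH)


def fits_in_immediate(pc: int, target: int) -> bool:
    """True if the required offset is representable as a signed 16-bit value."""
    return INT16_MIN <= required_offset(pc, target) <= INT16_MAX


def farthest_forward_target(pc: int) -> int:
    """Largest PC reachable by an RJUMP located at pc."""
    return pc + RJUMP_LENGTH + INT16_MAX
```

Under these assumptions `farthest_forward_target(0)` is 32770, and any two positions within MAX_CODE_SIZE are within reach of a single RJUMP. In initcode close to MAX_INITCODE_SIZE, a jump spanning more than 32767 bytes would not fit in one immediate.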
A version with an 8-bit immediate would only allow moving PC backward by 125 or forward by 127 bytes. While that seems to be a good enough distance for many for-loops, it is likely not good enough for cross-function jumps, and since the 16-bit immediate is the same size as what a dynamic jump would take in such cases (3 bytes: PUSH1 n JUMP), we think having fewer instructions is better.
Should there be a need to have immediate encodings of other sizes (such as 8-bits, 24-bits or 32-bits), it would be possible to introduce new opcodes, similarly to how multiple PUSH instructions exist.
PUSHn JUMP sequences
If we chose absolute addressing, then RJUMP could be viewed as similar to the sequence PUSHn JUMP (and RJUMPI as similar to PUSHn JUMPI). In that case one could argue that instead of introducing a new instruction, such sequences should get a discount, because EVMs could optimise them.
We think this is a bad direction to go, and we leave defining the semantics of existing PUSHn JUMP sequences to EIP-7979 and do not attempt to optimize their gas cost here.
Both of these are risky. Furthermore, we think that EVM implementations should be free to choose what optimisations they apply, and the savings do not need to be passed down at all cost.
Additionally, it requires a potentially significant change to the current implementations, which depend on a streaming one-by-one execution without a lookahead.
JUMPDEST
JUMPDEST serves two purposes:
JUMPDESTs), and for JIT/AOT translation.
This functionality is not needed for static jumps, as the analysers can easily tell destinations from the static jump immediates during jumpdest-analysis.
There are two benefits here:
* Not needing a JUMPDEST also means a saving of 200 gas during deployment, for each jump destination.
* JUMPDEST itself costs 1 gas and is "executed" during jumping.
RJUMPV and RJUMPSUBV fallback cases
If no match is found (i.e. the default case) in the RJUMPV or RJUMPSUBV instructions, execution will continue without branching. This allows for gaps in the arguments to be filled with 0s, and a choice of implementation by the programmer. Alternate options would include exceptional aborts in case of no match.
This change poses no risk to backwards compatibility.
* RJUMP/RJUMPI/RJUMPV with JUMPDEST as target
  * relative_offset is positive/negative/0
* RJUMP/RJUMPI/RJUMPV with instruction other than JUMPDEST as target
  * relative_offset is positive/negative/0
* RJUMPV/RJUMPSUBV with various valid table sizes from 1 to 256
* RJUMP as a final instruction in code section
* RJUMP/RJUMPI/RJUMPV with truncated immediate
* RJUMPI/RJUMPV as a final instruction in code section
* RJUMPSUB/RJUMPSUBV target not ENTERSUB
* RJUMP/RJUMPI/RJUMPV/RJUMPSUB/RJUMPSUBV target outside of code section bounds
* RJUMP/RJUMPI/RJUMPV/RJUMPSUB/RJUMPSUBV target push data
* RJUMP/RJUMPI/RJUMPV/RJUMPSUB/RJUMPSUBV target another RJUMP/RJUMPI/RJUMPV immediate argument
* RJUMP
  * relative_offset is positive/negative/0
* RJUMPI
  * relative_offset is positive/negative/0
    * condition equals 0
    * condition does not equal 0
* RJUMPV 0 relative_offset
  * case equals 0
  * case does not equal 0
* RJUMPV with table containing positive, negative, 0 offsets
  * case equals 0
  * case does not equal 0
  * case outside of table bounds (case > max_index, fallback case)
  * case > 255
* RJUMPSUB
  * relative_offset is positive/negative/0
* RJUMPSUBV 0 relative_offset
  * case equals 0
  * case does not equal 0
* RJUMPSUBV with table containing positive, negative, 0 offsets
  * case equals 0
  * case does not equal 0
  * case outside of table bounds (case > max_index, fallback case)
  * case > 255
Adding new instructions with immediate arguments should be carefully considered when implementing the validation algorithm.
Static relative jump execution does not require a runtime check of the jump destination, which greatly reduces execution cost. Therefore the gas cost of the new instructions can also be significantly reduced.
The RJUMPV and RJUMPSUBV relative offset tables can have up to 256 two-byte entries, so reading an offset cannot be a potential DoS attack surface.
Copyright and related rights waived via CC0.