Five new EVM jump instructions are introduced (RJUMP, RJUMPI, RJUMPV, RJUMPSUB, and RJUMPSUBV) which encode destinations as signed immediate values. These can be useful in almost all JUMP and JUMPI use cases and offer improvements in cost, performance, and static analysis.
A recurring discussion topic is that the EVM only has a mechanism for dynamic jumps. They provide a very flexible architecture with only 2 (!) instructions. This flexibility comes at a cost, however: it makes analysis of code more complicated, often intractable, and it also (partially) resulted in the need to have the JUMPDEST marker.
In a great many cases control flow is actually static and there is no need for any dynamic behaviour, though not every use case can be solved by static jumps.
There are various ways to reduce the need for dynamic jumps, some examples:
This proposal introduces a minimal feature set - including the above - to allow compilers to use RJUMP/RJUMPI/RJUMPSUB exclusively.
This functionality does not preclude the EVM from introducing other forms of control flow later on. RJUMP/RJUMPI can efficiently coexist with a higher-level declaration of functions, where static relative jumps should be used for intra-function control flow. This functionality also does not preclude the use of legacy code.
The main benefits of these instructions are reduced gas cost (both at deploy and execution time), better performance, and better static analysis properties.
We introduce five new instructions, each taking either a two-byte relative offset or else a table of offsets as immediate arguments.
1) relative jump:
* RJUMP (0xe0) relative_offset
* sets the PC to PC_post_instruction + relative_offset, where
* relative_offset is encoded as a 16-bit signed (two's-complement) big-endian value.
2) conditional relative jump:
* RJUMPI (0xe1) relative_offset
* pops a value (condition) from the stack, and
* sets the PC to PC_post_instruction + ((condition == 0) ? 0 : relative_offset).
3) relative jump via jump table:
* RJUMPV (0xe2) max_index relative_offset+
* pops a value (case) from the stack, and
* sets the PC to PC_post_instruction + ((case > max_index) ? 0 : relative_offset[case]).
4) relative jump to subroutine:
* RJUMPSUB (0xe3) relative_offset
* pushes PC_post_instruction to the return stack, and
* sets the PC to PC_post_instruction + relative_offset, an ENTERSUB, as if with CALLSUB.
5) relative jump to subroutine via jump table:
* RJUMPSUBV (0xe4) max_index relative_offset+
* pops a value (case) from the stack,
* pushes PC_post_instruction to the return stack, and
* sets the PC to PC_post_instruction + ((case > max_index) ? 0 : relative_offset[case]), an ENTERSUB, as if with CALLSUB.
Note that the destination of the RJUMPSUB and RJUMPSUBV opcodes MUST be an ENTERSUB and control is returned to a calling RJUMPSUB / RJUMPSUBV via RETURNSUB (See EIP-7979) -- these two forms of jump provide immediate access to the underlying "return-to-caller" mechanism of the CALLSUB instruction.
The immediate argument relative_offset is encoded as a 16-bit signed (two's-complement) big-endian value. Under PC_post_instruction we mean the PC position after the entire immediate value.
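The RJUMP and RJUMPI semantics above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the helper names `read_offset` and `step`, and the use of a plain list as the stack, are assumptions made for this example, and gas accounting is omitted.

```python
RJUMP = 0xE0
RJUMPI = 0xE1


def read_offset(code: bytes, pos: int) -> int:
    """Decode a 16-bit signed (two's-complement) big-endian value at `pos`."""
    return int.from_bytes(code[pos:pos + 2], "big", signed=True)


def step(code: bytes, pc: int, stack: list) -> int:
    """Execute one RJUMP or RJUMPI located at `pc` and return the new PC.

    Hypothetical helper: a real interpreter would also charge gas and
    dispatch every other opcode.
    """
    opcode = code[pc]
    offset = read_offset(code, pc + 1)
    pc_post = pc + 3  # position after the opcode and its 2-byte immediate
    if opcode == RJUMP:
        return pc_post + offset
    if opcode == RJUMPI:
        condition = stack.pop()
        return pc_post + (offset if condition != 0 else 0)
    raise ValueError(f"not a relative jump: {opcode:#x}")
```

For instance, under these assumptions `step(bytes([0xE0, 0xFF, 0xFC]), 0, [])` decodes the immediate `0xFFFC` as -4 and lands on PC = 3 - 4 = -1, which is exactly the kind of out-of-bounds target that validation has to reject before execution.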
The immediate encoding of RJUMPV and RJUMPSUBV is more involved: the unsigned 8-bit max_index value determines the maximum index in the jump table. The number of relative_offset values following is max_index+1. This allows table sizes up to 256. The encoding of RJUMPV must have at least one relative_offset and thus it will take at minimum 4 bytes. Furthermore, the case > max_index condition falling through means that in many use cases, one would place the default path following the RJUMPV instruction. An interesting feature is that RJUMPV 0 relative_offset is an inverted-RJUMPI, which can be used in many cases instead of ISZERO RJUMPI relative_offset.
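The table encoding and the fall-through rule can be illustrated with a short Python sketch. The function names `encode_rjumpv` and `exec_rjumpv` are hypothetical, and the sketch assumes `case` has already been popped from the stack.

```python
RJUMPV = 0xE2


def encode_rjumpv(offsets: list) -> bytes:
    """Encode RJUMPV followed by max_index and 1..256 signed 16-bit offsets."""
    if not 1 <= len(offsets) <= 256:
        raise ValueError("table must hold between 1 and 256 offsets")
    max_index = len(offsets) - 1  # fits in one unsigned byte
    table = b"".join(o.to_bytes(2, "big", signed=True) for o in offsets)
    return bytes([RJUMPV, max_index]) + table


def exec_rjumpv(code: bytes, pc: int, case: int) -> int:
    """Return the new PC for the RJUMPV at `pc`, given the popped `case`."""
    max_index = code[pc + 1]
    table_start = pc + 2
    pc_post = table_start + 2 * (max_index + 1)  # after the whole table
    if case > max_index:
        return pc_post  # fall through to the default path
    entry = table_start + 2 * case
    offset = int.from_bytes(code[entry:entry + 2], "big", signed=True)
    return pc_post + offset
```

With a single-entry table, `encode_rjumpv([5])` is 4 bytes long, `case == 0` jumps to PC 9, and any non-zero `case` falls through to PC 4, which is the inverted-RJUMPI behaviour described above.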
We also extend the validation algorithm of EIP-7979 to verify that every RJUMP/RJUMPI/RJUMPV/RJUMPSUB/RJUMPSUBV has a relative_offset pointing to an instruction. This means it cannot point to the immediate data of PUSHn/RJUMP/RJUMPI/RJUMPV and cannot point outside of code bounds. Further, all and only RJUMPSUB and RJUMPSUBV MUST have a relative offset pointing to an ENTERSUB, whereas the others are allowed to point to a JUMPDEST, but are not required to.
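The target checks in this paragraph might look roughly like the following Python sketch. It is not the EIP-7979 validation algorithm: the function name is hypothetical, the ENTERSUB opcode value is passed in as a parameter because this document does not state it, and other rules (such as what may appear as the final instruction) are left out.

```python
RJUMP, RJUMPI, RJUMPV, RJUMPSUB, RJUMPSUBV = 0xE0, 0xE1, 0xE2, 0xE3, 0xE4
PUSH1, PUSH32 = 0x60, 0x7F


def validate_relative_jumps(code: bytes, entersub_opcode: int) -> bool:
    """Return True if every relative jump targets a permitted instruction.

    `entersub_opcode` is an assumption of this sketch; the real value is
    defined by EIP-7979.
    """
    starts = set()  # positions where an instruction begins
    jumps = []      # (target, must_be_entersub) pairs
    pc = 0
    while pc < len(code):
        op = code[pc]
        starts.add(pc)
        if PUSH1 <= op <= PUSH32:
            pc += 1 + (op - PUSH1 + 1)  # skip the push data
        elif op in (RJUMP, RJUMPI, RJUMPSUB):
            if pc + 3 > len(code):
                return False  # truncated immediate
            offset = int.from_bytes(code[pc + 1:pc + 3], "big", signed=True)
            jumps.append((pc + 3 + offset, op == RJUMPSUB))
            pc += 3
        elif op in (RJUMPV, RJUMPSUBV):
            if pc + 2 > len(code):
                return False  # max_index byte missing
            count = code[pc + 1] + 1
            pc_post = pc + 2 + 2 * count
            if pc_post > len(code):
                return False  # truncated jump table
            for i in range(count):
                entry = pc + 2 + 2 * i
                offset = int.from_bytes(code[entry:entry + 2], "big", signed=True)
                jumps.append((pc_post + offset, op == RJUMPSUBV))
            pc = pc_post
        else:
            pc += 1
    for target, must_be_entersub in jumps:
        if target not in starts:
            return False  # out of bounds or inside immediate data
        if must_be_entersub != (code[target] == entersub_opcode):
            return False  # "all and only" subroutine jumps reach ENTERSUB
    return True
```

Collecting instruction starts in one pass and checking targets afterwards keeps the sketch linear in code size and lets forward jumps be checked without lookahead.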
The immediate destinations are checked during jumpdest analysis and need not be checked again, so the cost of these instructions can be less than their dynamic counterparts. We recommend that RJUMP should be 2, RJUMPI and RJUMPV should be 4, and RJUMPSUB and RJUMPSUBV should be 5.
We chose relative addressing in order to support code which is relocatable. This also means a code snippet can be injected. A technique seen used prior to this EIP to achieve the same goal was to inject code like PUSHn PC ADD JUMPI.
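To illustrate relocatability, the following Python sketch computes the target of the same RJUMP-based snippet at two different positions. The helper `rjump_target` and the example byte strings are assumptions made for this illustration only.

```python
def rjump_target(code: bytes, pc: int) -> int:
    """Return the destination of the RJUMP (0xe0) located at `pc`."""
    offset = int.from_bytes(code[pc + 1:pc + 3], "big", signed=True)
    return pc + 3 + offset


# RJUMP +1, STOP, JUMPDEST: the jump skips the STOP.
snippet = bytes([0xE0, 0x00, 0x01, 0x00, 0x5B])

# The same bytes placed after 10 bytes of unrelated code.
relocated = bytes([0x5B] * 10) + snippet

at_start = rjump_target(snippet, 0)        # target when snippet starts at PC 0
after_move = rjump_target(relocated, 10)   # target when snippet starts at PC 10
```

In both cases the target is 4 bytes past the start of the snippet (`at_start` is 4 and `after_move` is 14), so the snippet's bytes need no patching when it is moved or injected.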
We do not see any significant downside to relative addressing and it allows us to also deprecate the PC instruction.
The signed 16-bit immediate means that the largest jump distance possible is 32767. In the case the bytecode at PC=0 starts with an RJUMP, it will be possible to jump as far as PC=32770.
Given MAX_CODE_SIZE = 24576 (in EIP-170) and MAX_INITCODE_SIZE = 49152 (in EIP-3860), we think the 16-bit immediate is large enough.
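The range arithmetic above can be checked with a few lines of Python. The function name `reachable_range` is hypothetical; the constants are taken from the text.

```python
INT16_MIN = -(2 ** 15)     # -32768
INT16_MAX = 2 ** 15 - 1    # 32767
RJUMP_SIZE = 3             # one opcode byte plus a two-byte immediate
MAX_CODE_SIZE = 24576      # EIP-170


def reachable_range(pc: int) -> tuple:
    """Return the (lowest, highest) PC an RJUMP located at `pc` can target."""
    pc_post = pc + RJUMP_SIZE
    return pc_post + INT16_MIN, pc_post + INT16_MAX


lowest, highest = reachable_range(0)
print(highest)                   # 32770
print(highest >= MAX_CODE_SIZE)  # True
```

An RJUMP at PC=0 therefore reaches past the end of any deployed contract permitted by EIP-170.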
A version with an 8-bit immediate would only allow moving PC backward by 125 or forward by 127 bytes. While that seems to be a good enough distance for many for-loops, it is likely not good enough for cross-function jumps, and since the 16-bit immediate is the same size as what a dynamic jump would take in such cases (3 bytes: PUSH1 n JUMP), we think having fewer instructions is better.
Should there be a need to have immediate encodings of other sizes (such as 8 bits, 24 bits or 32 bits), it would be possible to introduce new opcodes, similarly to how multiple PUSH instructions exist.
PUSHn JUMP sequences
If we chose absolute addressing, then RJUMP could be viewed similar to the sequence PUSHn JUMP (and RJUMPI similar to PUSHn JUMPI). In that case one could argue that instead of introducing a new instruction, such sequences should get a discount, because EVMs could optimise them.
We think this is a bad direction to go, and we leave defining the semantics of existing PUSHn JUMP sequences to EIP-7979 and do not attempt to optimize their gas cost here.
Both of these are risky. Furthermore we think that EVM implementations should be free to choose what optimisations they apply, and the savings do not need to be passed down at all costs.
Additionally it requires a potentially significant change to the current implementations which depend on a streaming one-by-one execution without a lookahead.
JUMPDEST
JUMPDEST serves two purposes:
JUMPDESTs), and for JIT/AOT translation.
This functionality is not needed for static jumps, as the analysers can easily tell destinations from the static jump immediates during jumpdest-analysis.
There are two benefits here:
* JUMPDEST also means a saving of 200 gas during deployment, for each jump destination.
* JUMPDEST itself costs 1 gas and is "executed" during jumping.
RJUMPV and RJUMPSUBV fallback cases
If no match is found (i.e. the default case) in the RJUMPV or RJUMPSUBV instructions, execution will continue without branching. This allows for gaps in the arguments to be filled with 0s, and a choice of implementation by the programmer. Alternate options would include exceptional aborts in case of no match.
This change poses no risk to backwards compatibility.
* RJUMP/RJUMPI/RJUMPV with JUMPDEST as target
  * relative_offset is positive/negative/0
* RJUMP/RJUMPI/RJUMPV with instruction other than JUMPDEST as target
  * relative_offset is positive/negative/0
* RJUMPV / RJUMPSUBV with various valid table sizes from 1 to 256
* RJUMP as a final instruction in code section
* RJUMP/RJUMPI/RJUMPV with truncated immediate
* RJUMPI/RJUMPV as a final instruction in code section
* RJUMPSUB / RJUMPSUBV target not ENTERSUB
* RJUMP/RJUMPI/RJUMPV / RJUMPSUB / RJUMPSUBV target outside of code section bounds
* RJUMP/RJUMPI/RJUMPV / RJUMPSUB / RJUMPSUBV target push data
* RJUMP/RJUMPI/RJUMPV / RJUMPSUB / RJUMPSUBV target another RJUMP/RJUMPI/RJUMPV immediate argument
* RJUMP
  * relative_offset is positive/negative/0
* RJUMPI
  * relative_offset is positive/negative/0
    * condition equals 0
    * condition does not equal 0
* RJUMPV 0 relative_offset
  * case equals 0
  * case does not equal 0
* RJUMPV with table containing positive, negative, 0 offsets
  * case equals 0
  * case does not equal 0
  * case outside of table bounds (case > max_index, fallback case)
  * case > 255
* RJUMPSUB
  * relative_offset is positive/negative/0
* RJUMPSUBV 0 relative_offset
  * case equals 0
  * case does not equal 0
* RJUMPSUBV with table containing positive, negative, 0 offsets
  * case equals 0
  * case does not equal 0
  * case outside of table bounds (case > max_index, fallback case)
  * case > 255
Adding new instructions with immediate arguments should be carefully considered when implementing the validation algorithm.
Static relative jump execution does not require a runtime check of the jump destination, which greatly reduces execution cost. The gas cost of the new instructions can therefore be significantly lower than that of dynamic jumps.
The RJUMPV and RJUMPSUBV instruction relative offset tables can have up to 256 two-byte entries, so reading an offset cannot be a potential DoS attack surface.
Copyright and related rights waived via CC0.