This is the smallest possible change to the EVM to support calls and returns.
This proposal introduces three new control-flow instructions to the EVM:
- CALLSUB transfers control to the destination on the stack.
- ENTERSUB marks a CALLSUB destination.
- RETURNSUB returns to the PC after the most recent CALLSUB.

Code can also be prefixed with MAGIC bytes. MAGIC code is validated at CREATE time to ensure that it cannot execute invalid instructions, jump to invalid locations, underflow stack, or, in the absence of recursion, overflow stack.
The complete control flow of MAGIC code can be traversed in time and space linear in the size of the code, enabling tools for validation, automated proofs of correctness, and ahead-of-time and just-in-time compilers.
These changes are backwards-compatible: all instructions behave as specified whether or not they appear in MAGIC code.
Note: Significant assistance from Anthropic's Claude is acknowledged, primarily for Vyper code.
The EVM currently lacks explicit call and return instructions. Instead, calls and returns must be synthesized using the dynamic JUMP instruction, which takes its destination from the stack. This creates two fundamental problems:
For detailed historical context, technical foundations of static control flow, and their impact on Ethereum's scaling roadmap, see EIP-8173: "Foundations of EVM Control Flow."
The key words MUST and MUST NOT in this Specification are to be interpreted as described in RFC 2119 and RFC 8174.
CALLSUB (0x..)

Transfers control to a subsidiary operation.

1. Pop the destination on top of the stack.
2. Push PC + 1 to the return stack.
3. Set PC to destination.

The gas cost is mid (8).
ENTERSUB (0x..)

The destination of every CALLSUB MUST be an ENTERSUB.
The gas cost is jumpdest (1).
RETURNSUB (0x..)

Returns control to the caller of a subsidiary operation.

1. Pop the return stack to PC.

The gas cost is low (5).
MAGIC (0xEF....)

After this EIP has been activated, code beginning with the MAGIC bytes MUST be a valid program. Execution begins immediately after the MAGIC bytes.
Notes:

- Values on the return stack do not need to be validated, since they are alterable only by CALLSUB and RETURNSUB.
- These instructions are specified in terms of a return stack. But the actual state of the return stack is not observable by EVM code or consensus-critical to the protocol. (For example, a node implementer may code CALLSUB to unobservably push PC on the return stack rather than PC + 1, which is allowed so long as RETURNSUB observably returns control to the PC + 1 location.)
- MAGIC values are still to be determined, but the MAGIC bytes will begin with 0xEF.

A mid cost for CALLSUB is justified by it taking very little more work than the mid cost of JUMP: just pushing an integer to the return stack.
A jumpdest cost for ENTERSUB is justified by it being, like JUMPDEST, a mere label.
A low cost for RETURNSUB is justified by needing only to pop the return stack into the PC — less work than a jump.
Benchmarking will be needed to tell if the costs are well-balanced.
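The instruction semantics and gas costs above can be illustrated with a minimal interpreter sketch in Python. It is not a client implementation: it covers only the handful of opcodes used in this document's test cases, uses the placeholder opcode values 0xB0, 0xB1, and 0xB2 from those tests, and does not account for jump destinations that fall inside PUSH immediate data. The names `run` and `Halt` are illustrative only.

```python
# Placeholder opcode values (assumption: taken from this document's tests, not final).
STOP, JUMP, JUMPDEST, PUSH1 = 0x00, 0x56, 0x5B, 0x60
CALLSUB, ENTERSUB, RETURNSUB = 0xB0, 0xB1, 0xB2

# Gas costs as specified above: mid (8), jumpdest (1), low (5).
GAS = {STOP: 0, JUMP: 8, JUMPDEST: 1, PUSH1: 3,
       CALLSUB: 8, ENTERSUB: 1, RETURNSUB: 5}


class Halt(Exception):
    """Exceptional halting state."""


def run(code: bytes) -> int:
    """Execute `code`; return the gas consumed or raise Halt."""
    pc, gas = 0, 0
    stack: list[int] = []    # data stack
    rstack: list[int] = []   # return stack, invisible to the data stack

    def pop() -> int:
        if not stack:
            raise Halt(f"at pc={pc}: stack underflow")
        return stack.pop()

    while pc < len(code):    # running past the end is an implicit STOP
        op = code[pc]
        if op not in GAS:
            raise Halt(f"at pc={pc}: unsupported opcode 0x{op:02X}")
        gas += GAS[op]
        if op == STOP:
            break
        elif op == PUSH1:
            stack.append(code[pc + 1] if pc + 1 < len(code) else 0)
            pc += 2
        elif op == JUMP:
            dest = pop()
            if dest >= len(code) or code[dest] != JUMPDEST:
                raise Halt(f"at pc={pc}: invalid destination")
            pc = dest
        elif op == CALLSUB:
            dest = pop()
            if dest >= len(code) or code[dest] != ENTERSUB:
                raise Halt(f"at pc={pc}: invalid destination")
            if len(rstack) >= 1024:
                raise Halt(f"at pc={pc}: return stack overflow")
            rstack.append(pc + 1)   # push PC + 1 to the return stack
            pc = dest               # set PC to destination
        elif op == RETURNSUB:
            if not rstack:
                raise Halt(f"at pc={pc}: invalid return stack")
            pc = rstack.pop()       # pop the return stack to PC
        else:                       # JUMPDEST and ENTERSUB are mere labels
            pc += 1
    return gas
```

For example, `run(bytes.fromhex("6004B000B1B2"))` calls the subroutine at PC 4, returns to PC 3, and stops, consuming 17 gas.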
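Execution is defined in the Yellow Paper as a sequence of changes to the EVM state. The conditions on valid code are preserved by state changes. At runtime, if execution of an instruction would violate a condition, the execution is in an exceptional halting state and cannot continue. The Yellow Paper defines six such states:

1. State modification during a static call
2. Insufficient gas
3. More than 1024 stack items
4. Insufficient stack items
5. Invalid jump destination
6. Invalid instruction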
We would like to consider EVM code valid if and only if no execution of the program can lead to an exceptional halting state. In practice, we must test at runtime for the first three conditions — we don't know whether we will be called statically, how much gas there will be, or how deep a recursion may go. (However, we can validate that non-recursive programs do not overflow stack.) All of the remaining conditions MUST be validated statically.
To allow for efficient algorithms, our validation considers only the code's control flow and stack use, not its data and computations. This means we will reject programs with invalid code paths, even if those paths are not reachable.
Valid code

Code beginning with MAGIC MUST be valid. The constraints on valid code are as follows:
1. All reachable opcodes must be valid:
   - They MUST have been defined in the Yellow Paper or a deployed EIP, and they MUST NOT have been deprecated.
   - The INVALID opcode is valid.
2. The JUMP and JUMPI instructions
   - MUST address a JUMPDEST,
   - MUST be immediately preceded by a PUSH instruction.
3. The CALLSUB instruction
   - MUST address an ENTERSUB,
   - MUST be immediately preceded by a PUSH instruction.
4. The number of items on the data stack and on the return stack
   - MUST always be non-negative and less than or equal to 1024.
5. For all paths from a CALLSUB to a RETURNSUB
   - the absolute difference between
The guarantee of constant stack height prevents stack underflow, prevents stack overflow for non-recursive programs, breaks cycles in the control flow, and enables one-pass traversal of control-flow. The net stack effect of a subroutine — the number of items it consumes minus the number it produces — may be any fixed value, positive, negative, or zero, so long as it is consistent across all paths.
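To see why consistent stack heights break cycles and permit a one-pass traversal, consider the following Python sketch. It works on a toy control-flow graph rather than real bytecode: each block is described by a hypothetical triple of net stack effect, minimum entry height, and successor list, and the function name `heights_consistent` is illustrative. Because every block is recorded at a single height on first visit, each block is processed once, and a loop that changes the stack height on each iteration is rejected on its second arrival.

```python
from collections import deque


def heights_consistent(blocks, entry, limit=1024):
    """Check that every block is always entered at the same stack height.

    blocks: {name: (net_stack_delta, min_height_needed, [successor names])}
    Simplification: only the height at block exit is bounds-checked, not
    every intermediate height inside the block.
    """
    seen = {}                      # block name -> height on first visit
    work = deque([(entry, 0)])     # execution starts with an empty stack
    while work:
        name, height = work.popleft()
        if name in seen:
            if seen[name] != height:
                return False       # two paths disagree on the height here
            continue               # already explored: this breaks cycles
        seen[name] = height
        delta, need, successors = blocks[name]
        if height < need:
            return False           # would underflow
        if not 0 <= height + delta <= limit:
            return False           # would underflow or overflow
        work.extend((s, height + delta) for s in successors)
    return True


# A loop that leaves the stack as it found it is accepted ...
balanced = {"loop": (0, 0, ["loop"])}
# ... but a loop that pushes one item per iteration is rejected.
growing = {"loop": (1, 0, ["loop"])}
print(heights_consistent(balanced, "loop"), heights_consistent(growing, "loop"))
```

The same idea underlies the join-point check in the reference validator, which records the data-stack and return-stack heights on the first visit to each PC.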
Constraints on MAGIC code MUST be validated at CREATE time, in time and space linear in the size of the code. To this end a canonical EVM implementation of the validation algorithm MUST be placed on the blockchain and its address included in the MAGIC header. This validation code or its equivalent MUST be run on the output of the initialization code before it is executed by the interpreter, and failure of validation is an exceptional halting state. (Note that this mechanism could easily be extended to support user-defined validation contracts.)
Clients need not implement the validation algorithm — they can simply call the canonical contract on the blockchain. Clients are of course free to implement equivalent algorithms themselves.
Note: The JVM, Wasm, and .NET VMs enforce similar constraints for similar reasons.
The above is a purely semantic specification, placing no constraints on the syntax of bytecode beyond being an array of opcodes and immediate data. Subroutines here are not contiguous sequences of bytecode: they are subgraphs of the bytecode's full control-flow graph. The EVM is a simple state machine, and the control-flow graph for a program represents every possible change of state. Each instruction simply advances the machine one state, and the instructions and state have minimal syntactic structure. We only promise that valid code will not, as it were, jam up the gears of the machine.
Rather than enforce semantic constraints via syntax — as is done by higher-level languages — this proposal enforces them via validation: MAGIC code is proven valid at CREATE time.
With no syntactic constraints and minimal semantic constraints, we maximize opportunities for optimizations, including tail call elimination, multiple-entry calls, efficient register allocation, and inter-procedural optimizations. Since we want to support online compilation of EVM code to native code, it is crucial that the EVM code be as well optimized as possible by high-level-language compilers — upfront and offline.
By marking MAGIC contracts as valid we are promising that their control-flow is static, and the many tools that traverse the control flow can know this without inspecting the code for themselves. The intention is to mitigate the chicken-and-egg cycle of our tools working around the EVM's dynamic control flow to recover subroutines, while our compilers work around it to implement subroutines, with neither group much aware of the other's needs. All of which makes for a vicious cycle of needless implementation complexity and technical debt. With this proposal it will at least and at last become possible to write static EVM code and break the cycle.
Recovering a program's control-flow is a fundamental first step for many analyses. When all jumps are static, the number of analysis steps is linear in the number of instructions: a fixed number of paths must be explored for each jump. With dynamic jumps, every possible destination must be explored at every jump. Intuitively, the number of possible paths through code just explodes. At worst, the number of paths in the control flow can go up as the square of the number of instructions, and symbolic execution of the paths can go up exponentially. See EIP-8173 for a more detailed explanation and example exploit.
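The difference in growth can be made concrete with a small Python sketch that counts the jump edges an analysis must consider under each model. It assumes the pessimistic case for dynamic jumps, where any jump may reach any JUMPDEST, and counts only jump-target edges (not fall-through edges). The function names are illustrative.

```python
JUMP, JUMPI, JUMPDEST = 0x56, 0x57, 0x5B


def scan(code: bytes) -> tuple[int, int]:
    """Return (number of jumps, number of JUMPDESTs), skipping PUSH immediates."""
    jumps = dests = 0
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op in (JUMP, JUMPI):
            jumps += 1
        elif op == JUMPDEST:
            dests += 1
        # PUSH1..PUSH32 (0x60..0x7F) carry 1..32 bytes of immediate data.
        pc += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)
    return jumps, dests


def edge_bounds(code: bytes) -> dict[str, int]:
    """Worst-case jump-target edges with static versus dynamic jumps."""
    jumps, dests = scan(code)
    return {
        "static": jumps,           # one known destination per jump
        "dynamic": jumps * dests,  # any jump may reach any JUMPDEST
    }


# 100 repetitions of: JUMPDEST, PUSH1 0x00, JUMP
print(edge_bounds(bytes.fromhex("5b600056") * 100))
# {'static': 100, 'dynamic': 10000}
```

Doubling the code size doubles the static bound but quadruples the dynamic one.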
This behavior is not merely a theoretical concern. For Ethereum, it represents a denial-of-service vulnerability for many tools — including bytecode validation and AOT or JIT compilation at runtime. And even offline it renders many analyses impractical, intractable, or impossible.
This proposal aims to be the smallest possible change to the EVM. We introduce only two industry-standard call and return operations using three instructions — CALLSUB, ENTERSUB, and RETURNSUB — sufficient to eliminate the need for dynamic jumps. (Note: ENTERSUB and JUMPDEST are technically unnecessary, but their presence makes it easier to recover function and block boundaries in the control flow, so we leave deliberation on whether to remove them to a later EIP.)
Primarily backwards compatibility. Other reasons include:
Immediate arguments for jump destinations would improve performance but increase complexity. See EIP-8013: Static relative jumps and calls for the EVM for a complementary proposal using immediate arguments.
Code sections or other structural constraints would impose syntactic restrictions that inhibit optimization. See EIP-3540: EOF - EVM Object Format and EIP-4750: EOF - Functions for complementary proposals that provide more structure. Those proposals needed to introduce special-purpose opcodes to allow for important uses of cross-subroutine jumps.
Register machines like x86, ARM, and RISC-V typically use a single stack and dedicated registers for both data and return addresses. Stack machines from Turing's ACE through Forth, Java, Wasm, .NET and many others use separate data and return stacks. The EVM is a stack machine, and we adopt the same proven approach: a separate return stack isolated from the data stack. Another reason to maintain a separate stack is that the data stack items are 32 bytes, but jump destinations will not need more than one or two.
The return addresses, being on their own stack, are not accessible to EVM code. They cannot be read, modified, or moved by ordinary stack operations. This eliminates an entire class of vulnerabilities where code could corrupt its own control flow.
Because return addresses are controlled exclusively by CALLSUB and RETURNSUB, they are intrinsically safe to validate. Unlike data-stack values (which may depend on arbitrary computation), return-stack values are guaranteed to be valid PC values — we can validate all return addresses at compile time.
Why are JUMP and JUMPI restricted in MAGIC code?

Constraint 2 requires that JUMP and JUMPI in MAGIC code be immediately preceded by a PUSH instruction, making their destinations compile-time constants. This converts them from dynamic to static jumps, preserving their use for intra-subroutine branching (loops and conditionals) while ensuring that all control flow in MAGIC code is statically resolvable in a single linear pass.
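A simplified Python sketch of this check follows. Unlike the reference validator later in this document, it scans the bytecode linearly from offset 0 and so also examines unreachable bytes; it is meant only to show that a PUSH-preceded jump has a destination that can be read directly from the code. The function name `static_jumps_ok` is illustrative.

```python
JUMP, JUMPI, JUMPDEST = 0x56, 0x57, 0x5B


def is_push(op: int) -> bool:
    """PUSH1..PUSH32 occupy 0x60..0x7F."""
    return 0x60 <= op <= 0x7F


def static_jumps_ok(code: bytes) -> bool:
    """True if every JUMP/JUMPI follows a PUSH whose value addresses a JUMPDEST."""
    # Pass 1: find instruction boundaries, skipping PUSH immediates.
    order: list[int] = []
    pc = 0
    while pc < len(code):
        order.append(pc)
        pc += 1 + (code[pc] - 0x5F if is_push(code[pc]) else 0)
    starts = set(order)

    # Pass 2: check each jump against the instruction just before it.
    prev = None
    for pc in order:
        op = code[pc]
        if op in (JUMP, JUMPI):
            if prev is None or not is_push(code[prev]):
                return False                      # destination is not a constant
            dest = int.from_bytes(code[prev + 1:pc], "big")
            if dest not in starts or code[dest] != JUMPDEST:
                return False                      # not a JUMPDEST instruction
        prev = pc
    return True


print(static_jumps_ok(bytes.fromhex("600556B1B25B6003B0")))  # True
print(static_jumps_ok(bytes.fromhex("5B8056")))              # False: DUP1 before JUMP
```

Both passes visit each byte at most once, so the check runs in time linear in the size of the code.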
Why are CALLSUB, ENTERSUB, and RETURNSUB allowed in ordinary code?

Primarily backwards compatibility. We want MAGIC code to run unaltered in ordinary contracts. Also, these instructions have legitimate uses in code that will not pass validation. For example, Vyper could make good use of calls and returns, but would still need dynamic jumps to implement selector tables. A future proposal like EIP-8013 could provide them, but in the interim that code cannot be MAGIC.
The difference these instructions make can be seen in this very simple code for calling a routine that squares a number. The distinct opcodes make it easier for both people and tools to understand the code, and there are modest savings in code size and gas costs as well.
SQUARE:                      |  SQUARE:
    jumpdest       ; 1 gas   |      entersub       ; 1 gas
    dup            ; 3 gas   |      dup            ; 3 gas
    mul            ; 5 gas   |      mul            ; 5 gas
    swap1          ; 3 gas   |      returnsub      ; 5 gas
    jump           ; 8 gas   |
                             |
CALL_SQUARE:                 |  CALL_SQUARE:
    jumpdest       ; 1 gas   |      entersub       ; 1 gas
    push RTN_CALL  ; 3 gas   |      push 2         ; 3 gas
    push 2         ; 3 gas   |      push SQUARE    ; 3 gas
    push SQUARE    ; 3 gas   |      callsub        ; 8 gas
    jump           ; 8 gas   |      returnsub      ; 5 gas
RTN_CALL:                    |
    jumpdest       ; 1 gas   |
    swap1          ; 3 gas   |
    jump           ; 8 gas   |
                             |
Size in bytes: 17            |  Size in bytes: 12
Consumed gas:  50            |  Consumed gas:  34
That's 29% fewer bytes and 32% less gas using CALLSUB versus using JUMP. So we can see that these instructions provide a simpler, more efficient mechanism. As code becomes larger and better optimized the gains become smaller, but code using CALLSUB always takes less space and gas than equivalent code without it.
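The totals and percentages can be checked with a few lines of Python. The per-instruction gas costs are copied from the comparison above; the sketch assumes that RTN_CALL begins with a JUMPDEST (1 gas), as any jump target must, which is what brings the JUMP version to 50 gas. The byte sizes are taken as stated rather than recomputed.

```python
# Gas per instruction, in listing order.
jump_version = [
    1, 3, 5, 3, 8,   # SQUARE: jumpdest, dup, mul, swap1, jump
    1, 3, 3, 3, 8,   # CALL_SQUARE: jumpdest, push, push, push, jump
    1, 3, 8,         # RTN_CALL: jumpdest (assumed), swap1, jump
]
callsub_version = [
    1, 3, 5, 5,      # SQUARE: entersub, dup, mul, returnsub
    1, 3, 3, 8, 5,   # CALL_SQUARE: entersub, push, push, callsub, returnsub
]


def saving(old: int, new: int) -> int:
    """Percentage reduction from old to new, rounded to the nearest integer."""
    return round(100 * (old - new) / old)


gas_old, gas_new = sum(jump_version), sum(callsub_version)
print(gas_old, gas_new)          # 50 34
print(saving(gas_old, gas_new))  # 32 (percent less gas)
print(saving(17, 12))            # 29 (percent fewer bytes, sizes as stated)
```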
Some real-time interpreter performance gains are reflected in the lower gas costs. But larger gains come from AOT and JIT compilers. The constraint that stack depths be constant means that in MAGIC code, a JIT can traverse the control flow in one pass, generating machine code on the fly, and an AOT can emit better code in linear time. (The Wasm, JVM, and .NET VMs share this property.)
The EVM is a stack machine, but most real machines are register machines. Generating virtual register code for a faster interpreter yields significant gains (4X speedups are possible on JVM code), and generating good machine code yields orders of magnitude improvements. However, for most transactions, storage dominates execution time, and gas counting and other overhead always take their toll. So such gains would be most visible in contexts where overhead is minimal, such as L1 precompiles, some L2s, and some EVM-compatible chains.
Static control flow enables the construction of simpler, more efficient circuits for ZK verification. See EIP-8173 for details on how static control flow improves ZK-rollup and optimistic rollup efficiency, and enables future migrations to other execution environments like RISC-V.
These changes are backwards compatible. Opcode semantics are not affected by whether the contract is MAGIC, so valid code will execute identically in any contract. Validation of MAGIC code is done before the interpreter runs, such that the interpreter never sees MAGIC code that is not valid. There are no changes to the semantics of existing EVM code. (With the caveat that code with unspecified behavior might behave in different, unspecified ways. Such code was always broken.)
Therefore this proposal does not require a client to maintain two interpreters. Neither does this proposal require a client to implement validation code — it will already be on the blockchain. So the implementation can come down to a push and a jump to call, and a pop and another jump to return.
These changes do not preclude running the EVM in zero knowledge; neither do they foreclose EOF, RISC-V, or other changes. They can instead be a win.
Note: the bytecode strings in these tests use placeholder opcode values
0xB0=CALLSUB, 0xB1=ENTERSUB, 0xB2=RETURNSUB, which are to be
confirmed when final opcode assignments are made. The traces, gas totals,
and pass/fail outcomes are correct for the semantics defined in this EIP.
The Stack column shows the data stack before the instruction executes. The RStack column shows the return stack before the instruction executes.
This should call a subroutine, return from it, and stop.
Bytecode: 0x6004B000B1B2 (PUSH1 0x04, CALLSUB, STOP, ENTERSUB, RETURNSUB)
PC=0: PUSH1 imm=0x04 size=2
PC=2: CALLSUB size=1
PC=3: STOP size=1
PC=4: ENTERSUB size=1
PC=5: RETURNSUB size=1
| Pc | Op | Cost | Stack | RStack |
|---|---|---|---|---|
| 0 | PUSH1 | 3 | [] | [] |
| 2 | CALLSUB | 8 | [4] | [] |
| 4 | ENTERSUB | 1 | [] | [3] |
| 5 | RETURNSUB | 5 | [] | [3] |
| 3 | STOP | 0 | [] | [] |
Output: 0x
Consumed gas: 17
This should execute fine, going into two depths of subroutines.
Bytecode: 0x6004B000B16009B0B2B1B2 (PUSH1 0x04, CALLSUB, STOP, ENTERSUB, PUSH1 0x09, CALLSUB, RETURNSUB, ENTERSUB, RETURNSUB)
PC=0: PUSH1 imm=0x04 size=2
PC=2: CALLSUB size=1
PC=3: STOP size=1
PC=4: ENTERSUB size=1
PC=5: PUSH1 imm=0x09 size=2
PC=7: CALLSUB size=1
PC=8: RETURNSUB size=1
PC=9: ENTERSUB size=1
PC=10: RETURNSUB size=1
| Pc | Op | Cost | Stack | RStack |
|---|---|---|---|---|
| 0 | PUSH1 | 3 | [] | [] |
| 2 | CALLSUB | 8 | [4] | [] |
| 4 | ENTERSUB | 1 | [] | [3] |
| 5 | PUSH1 | 3 | [] | [3] |
| 7 | CALLSUB | 8 | [9] | [3] |
| 9 | ENTERSUB | 1 | [] | [3,8] |
| 10 | RETURNSUB | 5 | [] | [3,8] |
| 8 | RETURNSUB | 5 | [] | [3] |
| 3 | STOP | 0 | [] | [] |
Consumed gas: 34
This should fail because the destination is outside the code range.
Bytecode: 0x60FFB000B1B2 (PUSH1 0xFF, CALLSUB, STOP, ENTERSUB, RETURNSUB)
PC=0: PUSH1 imm=0xFF size=2 ← destination 255, code is only 6 bytes
PC=2: CALLSUB size=1
PC=3: STOP size=1
PC=4: ENTERSUB size=1
PC=5: RETURNSUB size=1
| Pc | Op | Cost | Stack | RStack |
|---|---|---|---|---|
| 0 | PUSH1 | 3 | [] | [] |
| 2 | CALLSUB | 8 | [0xFF] | [] |
Error: at pc=2, op=CALLSUB: invalid destination
This should fail at the first opcode because the return stack is empty.
Bytecode: 0xB2 (RETURNSUB)
| Pc | Op | Cost | Stack | RStack |
|---|---|---|---|---|
| 0 | RETURNSUB | 5 | [] | [] |
Error: at pc=0, op=RETURNSUB: invalid return stack
In this example, CALLSUB is the last byte of code. When the subroutine returns, it should hit the implicit STOP after the bytecode and not exit with error.
Bytecode: 0x600556B1B25B6003B0 (PUSH1 0x05, JUMP, ENTERSUB, RETURNSUB, JUMPDEST, PUSH1 0x03, CALLSUB)
PC=0: PUSH1 imm=0x05 size=2
PC=2: JUMP size=1
PC=3: ENTERSUB size=1
PC=4: RETURNSUB size=1
PC=5: JUMPDEST size=1
PC=6: PUSH1 imm=0x03 size=2
PC=8: CALLSUB size=1 ← last byte; returns to PC=9 (past end → implicit STOP)
| Pc | Op | Cost | Stack | RStack |
|---|---|---|---|---|
| 0 | PUSH1 | 3 | [] | [] |
| 2 | JUMP | 8 | [5] | [] |
| 5 | JUMPDEST | 1 | [] | [] |
| 6 | PUSH1 | 3 | [] | [] |
| 8 | CALLSUB | 8 | [3] | [] |
| 3 | ENTERSUB | 1 | [] | [9] |
| 4 | RETURNSUB | 5 | [] | [9] |
| 9 | (implicit STOP) | 0 | [] | [] |
Consumed gas: 29
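The per-instruction listings in the test cases above can be reproduced with a small disassembler. The Python sketch below knows only the opcodes used in these tests and relies on the same placeholder values for CALLSUB, ENTERSUB, and RETURNSUB; the names `NAMES` and `disassemble` are illustrative.

```python
# Opcode names used in the tests; 0xB0..0xB2 are placeholder assignments.
NAMES = {
    0x00: "STOP", 0x56: "JUMP", 0x5B: "JUMPDEST",
    0xB0: "CALLSUB", 0xB1: "ENTERSUB", 0xB2: "RETURNSUB",
}


def disassemble(code: bytes) -> list[str]:
    """Return one 'PC=n: OP [imm=0x..] size=k' line per instruction."""
    lines = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:               # PUSH1..PUSH32
            n = op - 0x5F                    # number of immediate bytes
            imm = code[pc + 1:pc + 1 + n]
            lines.append(f"PC={pc}: PUSH{n} imm=0x{imm.hex().upper()} size={1 + n}")
            pc += 1 + n
        else:
            name = NAMES.get(op, f"UNKNOWN(0x{op:02X})")
            lines.append(f"PC={pc}: {name} size=1")
            pc += 1
    return lines


for line in disassemble(bytes.fromhex("6004B000B1B2")):
    print(line)
# PC=0: PUSH1 imm=0x04 size=2
# PC=2: CALLSUB size=1
# PC=3: STOP size=1
# PC=4: ENTERSUB size=1
# PC=5: RETURNSUB size=1
```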
The following is a Vyper implementation of the validation algorithm for MAGIC code. It validates EVM bytecode against the five constraints defined above in time and space linear in the size of the code.
A canonical deployment of this contract MUST be placed on the blockchain and its address included in the MAGIC header. This contract or its equivalent MUST be called on the output of the initialization code before it is executed by the interpreter; failure of validation is an exceptional halting state.
Clients need not implement the validation algorithm — they can simply call the canonical contract on the blockchain.
Validation traces all reachable control flow from PC 0, carrying the data-stack depth, return-stack depth, and the data-stack depth at the most recent ENTERSUB entry. At each reachable instruction it checks:
- JUMP, JUMPI, and CALLSUB are immediately preceded by a PUSH, whose value is the destination; that destination must be a JUMPDEST or ENTERSUB respectively.
- RETURNSUB only appears inside a subroutine with a non-empty return stack.

When a destination has not yet been reached, its required type is recorded and checked when it is. A destination inside unreachable bytes will fail either Constraint 1 (invalid opcode) or the destination-type check when the traversal visits it. Unreachable bytes are never examined unless jumped to.
# pragma version ^0.4.0
# @title EIP-7979 MAGIC Code Validator
# @notice Validates that EVM bytecode satisfies the constraints for MAGIC code
# as defined in EIP-7979. Intended to be deployed on-chain and called
# at CREATE time before a MAGIC contract is executed.
# @dev Returns True if the bytecode is valid, False otherwise.
# Runs in one pass, in time and space linear in the size of the code.
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
# Maximum bytecode size (EIP-170).
MAX_CODE_SIZE: constant(uint256) = 24576
# Maximum EVM data-stack or return-stack depth.
MAX_STACK_DEPTH: constant(uint256) = 1024
# Worklist bound: each PC is enqueued at most once per distinct incoming
# height triple, bounded by code size.
MAX_WORKLIST: constant(uint256) = MAX_CODE_SIZE * 2
# Sentinel: slot not yet visited.
UNVISITED: constant(int256) = -1
# Required destination type tags, stored in required_at[].
REQUIRE_NONE: constant(uint8) = 0
REQUIRE_JUMPDEST: constant(uint8) = 1
REQUIRE_ENTERSUB: constant(uint8) = 2
# ---------------------------------------------------------------------------
# Opcode symbolic constants
# (actual byte values to be confirmed when opcodes are assigned)
# ---------------------------------------------------------------------------
OPCODE_JUMP: constant(uint8) = 0x56
OPCODE_JUMPI: constant(uint8) = 0x57
OPCODE_JUMPDEST: constant(uint8) = 0x5B
OPCODE_CALLSUB: constant(uint8) = 0xB0 # TBD
OPCODE_ENTERSUB: constant(uint8) = 0xB1 # TBD
OPCODE_RETURNSUB: constant(uint8) = 0xB2 # TBD
# ---------------------------------------------------------------------------
# Structs
# ---------------------------------------------------------------------------
# One entry on the CFG traversal worklist.
struct Continuation:
    pc: uint256              # program counter to process
    data_height: int256      # data-stack depth on entry
    ret_height: int256       # return-stack depth on entry
    entersub_height: int256  # data-stack depth at most recent ENTERSUB
                             # (-1 means not currently inside a subroutine)
    prev_push_pc: uint256    # PC of the immediately preceding PUSH, if any
    prev_push_size: uint256  # total size of that PUSH (0 if none)
# ---------------------------------------------------------------------------
# Internal helper stubs
# ---------------------------------------------------------------------------
@internal
@pure
def opcode_info(op: uint8) -> (uint256, int256, int256, bool):
    """
    Return (size, pops, pushes, is_terminator) for the given opcode.
        size:          total instruction length in bytes (1 + immediate data)
        pops:          items consumed from the data stack
        pushes:        items produced onto the data stack
        is_terminator: True if control cannot fall through to pc + size
                       (STOP, RETURN, REVERT, SELFDESTRUCT, JUMP, JUMPI,
                       CALLSUB, RETURNSUB, INVALID, ...)
    Returns (0, 0, 0, False) for an unknown or invalid opcode; the caller
    treats size == 0 as a Constraint 1 violation.
    """
    # --- STUB: replace with full opcode table ---
    return (0, 0, 0, False)

@internal
@pure
def is_push(op: uint8) -> bool:
    """
    Return True if the opcode is any PUSH variant (PUSH1 .. PUSH32, 0x60..0x7f).
    """
    return op >= 0x60 and op <= 0x7f

@internal
@pure
def push_size(op: uint8) -> uint256:
    """
    Return the number of immediate bytes for a PUSH instruction.
    PUSH1 (0x60) has 1 immediate byte; PUSH32 (0x7f) has 32.
    """
    return convert(op, uint256) - 0x5f
# ---------------------------------------------------------------------------
# Main validation entry point
# ---------------------------------------------------------------------------
@external
@pure
def validate(code: Bytes[MAX_CODE_SIZE]) -> bool:
    """
    Validate bytecode for MAGIC code constraints (EIP-7979).
    Returns True if and only if all five constraints are satisfied.

    Only reachable instructions are checked. Unreachable bytes
    (whether hidden data or dead code) are never visited and never
    checked. This is the correct policy: contracts legitimately hide
    data in unreachable bytes.
    """
    code_len: uint256 = len(code)
    if code_len == 0:
        return False

    # required_at[pc]: destination-type requirement set when a reachable
    # JUMP, JUMPI, or CALLSUB targets pc. Checked when the CFG traversal
    # first visits pc, including the case where the destination is inside
    # unreachable bytes, which will fail Constraint 1 (invalid opcode) or
    # the required_at type check below.
    required_at: uint8[MAX_CODE_SIZE] = empty(uint8[MAX_CODE_SIZE])

    # visited_data[pc]: data-stack height on first visit to pc, or
    # UNVISITED if pc has not yet been reached. Used both as a
    # reachability check and for join-point consistency (Constraint 4).
    visited_data: int256[MAX_CODE_SIZE] = empty(int256[MAX_CODE_SIZE])
    visited_ret: int256[MAX_CODE_SIZE] = empty(int256[MAX_CODE_SIZE])
    visited_entersub: int256[MAX_CODE_SIZE] = empty(int256[MAX_CODE_SIZE])
    for _i: uint256 in range(MAX_CODE_SIZE):
        visited_data[_i] = UNVISITED
        visited_ret[_i] = UNVISITED
        visited_entersub[_i] = UNVISITED

    worklist: Continuation[MAX_WORKLIST] = empty(Continuation[MAX_WORKLIST])
    wl_head: uint256 = 0
    wl_tail: uint256 = 0

    # Seed: execution begins at PC 0 with empty stacks and no preceding PUSH.
    worklist[0] = Continuation({
        pc: 0,
        data_height: 0,
        ret_height: 0,
        entersub_height: -1,
        prev_push_pc: 0,
        prev_push_size: 0,
    })
    wl_tail = 1

    for _step: uint256 in range(MAX_WORKLIST):
        if wl_head >= wl_tail:
            break
        cont: Continuation = worklist[wl_head]
        wl_head += 1

        pc: uint256 = cont.pc
        dh: int256 = cont.data_height
        rh: int256 = cont.ret_height
        eh: int256 = cont.entersub_height
        pp: uint256 = cont.prev_push_pc    # PC of immediately preceding PUSH
        ps: uint256 = cont.prev_push_size  # size of that PUSH (0 if none)

        # Implicit STOP past end of code: valid termination.
        if pc >= code_len:
            continue

        # Join-point check: consistent heights required on all paths.
        if visited_data[pc] != UNVISITED:
            if visited_data[pc] != dh or visited_ret[pc] != rh:
                return False  # Constraint 4: inconsistent stack heights
            if visited_entersub[pc] != eh:
                return False  # Constraint 5: inconsistent entersub height
            continue  # already fully explored from this state

        visited_data[pc] = dh
        visited_ret[pc] = rh
        visited_entersub[pc] = eh

        # Fetch and validate the opcode at this reachable PC.
        op: uint8 = convert(slice(code, pc, 1)[0], uint8)

        # Constraint 1: opcode must be valid and non-deprecated.
        size: uint256 = 0
        pops: int256 = 0
        pushes: int256 = 0
        is_term: bool = False
        size, pops, pushes, is_term = self.opcode_info(op)
        if size == 0:
            return False

        # Check any destination-type requirement recorded for this PC.
        # Also catches jumps into unreachable bytes: if the destination was
        # inside a PUSH immediate or other non-opcode byte, the opcode here
        # will either be invalid (caught above) or not the required type.
        if required_at[pc] == REQUIRE_JUMPDEST and op != OPCODE_JUMPDEST:
            return False
        if required_at[pc] == REQUIRE_ENTERSUB and op != OPCODE_ENTERSUB:
            return False

        # Constraint 4: stack bounds before this instruction.
        if dh < 0 or dh > convert(MAX_STACK_DEPTH, int256):
            return False
        if rh < 0 or rh > convert(MAX_STACK_DEPTH, int256):
            return False

        new_dh: int256 = dh - pops + pushes
        next_pc: uint256 = pc + size
        if new_dh < 0 or new_dh > convert(MAX_STACK_DEPTH, int256):
            return False

        # Read destination from the immediately preceding PUSH, if any.
        # pp is the PC of that PUSH, ps is its total size (1 + imm bytes).
        have_push: bool = ps > 0 and pp + ps == pc
        dest: uint256 = 0
        if have_push:
            imm_sz: uint256 = ps - 1
            for b: uint256 in range(32):
                if b >= imm_sz:
                    break
                dest = dest * 256 + convert(
                    convert(slice(code, pp + 1 + b, 1)[0], uint8),
                    uint256
                )

        # Instruction-specific handling.
        if op == OPCODE_ENTERSUB:
            # Reset entersub_height for this subroutine entry; fall through.
            if wl_tail >= MAX_WORKLIST:
                return False
            worklist[wl_tail] = Continuation({
                pc: next_pc,
                data_height: new_dh,
                ret_height: rh,
                entersub_height: new_dh,  # eh resets here
                prev_push_pc: 0,
                prev_push_size: 0,
            })
            wl_tail += 1

        elif op == OPCODE_CALLSUB:
            # Constraints 2/3: must be immediately preceded by a PUSH.
            if not have_push:
                return False
            if dest >= code_len:
                return False
            if required_at[dest] != REQUIRE_NONE and required_at[dest] != REQUIRE_ENTERSUB:
                return False
            required_at[dest] = REQUIRE_ENTERSUB
            new_rh: int256 = rh + 1
            if new_rh > convert(MAX_STACK_DEPTH, int256):
                return False
            if wl_tail >= MAX_WORKLIST:
                return False
            worklist[wl_tail] = Continuation({
                pc: dest,
                data_height: new_dh,
                ret_height: new_rh,
                entersub_height: eh,  # updated when ENTERSUB is processed
                prev_push_pc: 0,
                prev_push_size: 0,
            })
            wl_tail += 1

        elif op == OPCODE_RETURNSUB:
            # Constraint 5: must be inside a subroutine.
            if eh == -1:
                return False
            # Constraint 4: return stack must not underflow.
            if rh <= 0:
                return False
            # Terminator; no successor.

        elif op == OPCODE_JUMP:
            if not have_push:
                return False
            if dest >= code_len:
                return False
            if required_at[dest] != REQUIRE_NONE and required_at[dest] != REQUIRE_JUMPDEST:
                return False
            required_at[dest] = REQUIRE_JUMPDEST
            if wl_tail >= MAX_WORKLIST:
                return False
            worklist[wl_tail] = Continuation({
                pc: dest,
                data_height: new_dh,
                ret_height: rh,
                entersub_height: eh,
                prev_push_pc: 0,
                prev_push_size: 0,
            })
            wl_tail += 1

        elif op == OPCODE_JUMPI:
            if not have_push:
                return False
            if dest >= code_len:
                return False
            if required_at[dest] != REQUIRE_NONE and required_at[dest] != REQUIRE_JUMPDEST:
                return False
            required_at[dest] = REQUIRE_JUMPDEST
            # Branch target:
            if wl_tail >= MAX_WORKLIST:
                return False
            worklist[wl_tail] = Continuation({
                pc: dest,
                data_height: new_dh,
                ret_height: rh,
                entersub_height: eh,
                prev_push_pc: 0,
                prev_push_size: 0,
            })
            wl_tail += 1
            # Fall-through:
            if wl_tail >= MAX_WORKLIST:
                return False
            worklist[wl_tail] = Continuation({
                pc: next_pc,
                data_height: new_dh,
                ret_height: rh,
                entersub_height: eh,
                prev_push_pc: 0,
                prev_push_size: 0,
            })
            wl_tail += 1

        elif not is_term:
            # All other non-terminating instructions: fall through.
            # If this instruction is a PUSH, record it for the successor.
            next_pp: uint256 = 0
            next_ps: uint256 = 0
            if self.is_push(op):
                next_pp = pc
                next_ps = size
            if wl_tail >= MAX_WORKLIST:
                return False
            worklist[wl_tail] = Continuation({
                pc: next_pc,
                data_height: new_dh,
                ret_height: rh,
                entersub_height: eh,
                prev_push_pc: next_pp,
                prev_push_size: next_ps,
            })
            wl_tail += 1

        # Terminating instructions with no successors require no action.

    return True
This proposal introduces no new security considerations beyond those already present in the EVM. Validated contracts will be more secure: they cannot execute invalid instructions, jump to invalid locations, underflow stack, or, in the absence of recursion, overflow stack.
Copyright and related rights waived via CC0.