EIP-8173 - Foundations of EVM Control Flow

Created	2026-02-16
Status	Draft
Type	Informational
Authors

Abstract

This Informational EIP provides foundational context for understanding control flow in the EVM. It covers the historical development of control flow mechanisms in computing, the technical foundations of control flow analysis, and the impact of static control flow on Ethereum's scaling roadmap.

This document serves as background material for proposals for static control flow — past and current proposals for subroutines and static jumps, container-format functions, call and return opcodes, and static relative jumps and calls — for discussions around RISC-V migration and ZK verification infrastructure, and for discussions still to come.

Motivation

Historical Context

Babbage 1833: Jumps and Conditional Jumps

In 1833 Charles Babbage began the design of the Analytical Engine: a steam-powered, mechanical, Turing-complete computer. Programs were to be encoded on punched cards that controlled a system of rods, gears, and other machinery to implement arithmetic, storage (for 1,000 40-digit decimal numbers), and conditional jumps. Jumps were supported by cards that shuffled the card deck forwards or backwards a fixed number of cards.

Lovelace 1843: Computer Science and Machine Intelligence

The first published description of the Analytical Engine was in French, by L. F. Menabrea, 1842^1. The English translator, Ada Augusta, Countess of Lovelace, made extensive notes on the science of computer programming, and published her translation in 1843. The notes include her famous program for iteratively computing Bernoulli numbers — arguably the world's first complete computer program — which used conditional jumps to implement the required nested loops.

In Lady Lovelace's notes we also find her prescient recognition of the Analytical Engine's power — "In enabling mechanism to combine together general symbols in successions of unlimited variety and extent, a uniting link is established between the operations of matter and the abstract mental processes of the most abstract branch of mathematical science."^1 Here also we find what Alan Turing later called "Lady Lovelace's Objection"^2 to the possibility of machine intelligence — "It can do whatever we know how to order it to perform. It can follow analysis; but it has no power of anticipating any analytical relations or truths. Its province is to assist us in making available what we are already acquainted with."^1

Turing 1946: Calls and Returns

In 1945 Alan Turing began designing his Automatic Computing Engine^3 (ACE), completing the proposal in early 1946, in which he introduced the concept of calls and returns: "To start on a subsidiary operation we need only make a note of where we left off the major operation and then apply the first instruction of the subsidiary. When the subsidiary is over we look up the note and continue with the major operation."

The ACE used mercury delay-line memory, including a return stack holding return addresses. The smaller Pilot ACE was for a time the world's fastest computer.

Industry Practice: 1945 to present

Call and return facilities under various names (subroutines, procedures, functions, methods) and at various levels of complexity (link registers, return stacks, stack frames, register windows) have proven their worth across a long line of important machines over the last 80 years, including most of the machines I have programmed or implemented: physical machines including the Burroughs B5000, CDC 7600, IBM 360, PDP-11, VAX, Motorola 68000, and Intel x86, and virtual machines including those for Scheme, Forth, Pascal, Java, and WebAssembly.

Especially relevant to the EVM's design are the Java, WebAssembly (Wasm), and CLI (.NET) VMs. They share crucial common properties:

they are represented with portable bytecode
that can be directly interpreted
with static control flow
that can be validated before runtime
and can be compiled to machine code at runtime
in linear time.

The static control flow that supports linear-time compilers also supports any other code that needs to traverse the control flow of a program, traversing each path only once.

Control Flow

Among the most important reasons to traverse the control flow of a program is to extract the control flow graph. Because the EVM's control flow is dynamic, this extraction can take time and space quadratic in the number of blocks.

Control Flow Graphs

A control flow graph (CFG) is a directed graph representation of a program where:

Nodes represent blocks of instructions (sequences with exactly one entry and one exit)
Edges represent possible transfers of control between blocks

A complete and sound CFG represents all and only the possible paths of program execution and is a fundamental starting point for many downstream tasks, including many static analyses:

validating bytecode before execution
translating bytecode to other representations (virtual register code, machine code)
automated formal security analysis
constructing ZK proof systems
and many more

Dynamic Control Flow: The Problem

A dynamic jump takes its destination from run-time data rather than from the code itself. Dynamic jumps make dynamic control flow possible, permit CFGs whose size is quadratic in the number of blocks, and can make static analysis of control flow difficult or impossible. They are not a problem for machines that run on physical hardware, whose instructions are designed for speed. But they are uncommon in virtual machines whose code is the source for downstream tools, because, as we will illustrate below, building and traversing the control flow graph of a program with dynamic jumps can take quadratic space and time. For this reason Java and Wasm do not support dynamic jumps and CLI carefully restricts them.

EVM: Dynamic Control Flow

For an example, consider these EVM programs and the easily generated series of longer programs like them; the really long ones make for nice exploits. It does not matter that gas is not random at runtime, or that the jump will most often fail. What matters is that the jump destination is taken from the stack, so an analyzer cannot know a priori where the jumps go and must explore every path.

   jumpdest           jumpdest          jumpdest          ...
   gas                gas               gas               ...
   jump               jump              jump              ...
   jumpdest           jumpdest          jumpdest          ...
   gas                gas               gas               ...
   jump               jump              jump              ...
   jumpdest           jumpdest          jumpdest          ...
   stop               gas               gas               ...
                      jump              jump              ...
                      jumpdest          jumpdest          ...
                      stop              gas               ...
                                        jump              ...
                                        jumpdest          ...
                                        stop              ...
                                                          ...
                                                          ...
                                                          ...

The control flow graphs for these programs make the problem clear. Each block of instructions in a graph is a sequence from the above programs with one entry (a JUMPDEST) and one exit (a JUMP, or the final stop), and each arc is a transfer of control. Arcs on the left are backwards branches; arcs on the right are forwards branches. Note that the tangle of arcs grows faster than the programs get longer.

[Figure: control flow graphs for the programs above]

To be precise, the number of arcs in these graphs can be as large as the number of blocks times one less than the number of blocks — that is, O(N²) — since every block's dynamic jump might target any block. At each block's exiting instruction (except the stop) the analyzer cannot know a priori where control will transfer, so every possible destination must be considered. Building the complete CFG therefore requires O(N²) time and space, and this is not just a theoretical worst case, as shown above. Downstream analyses that must traverse all paths through the CFG — such as formal verification or ZK circuit construction — can be worse still, since the number of distinct paths through a fully connected graph grows factorially with the number of blocks.
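The arc count can be checked with a small Python sketch. It is an illustration, not any client's analyzer: the helper names (`program`, `split_blocks`, `conservative_arcs`) are made up for this example, and the analyzer is assumed to do no stack tracking at all, so every JUMP is treated as able to reach every JUMPDEST. Only the opcode byte values are the real EVM encodings.

```python
# Real EVM opcode bytes; everything else in this sketch is illustrative.
STOP, JUMP, JUMPI, GAS, JUMPDEST = 0x00, 0x56, 0x57, 0x5A, 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def program(jumping_blocks: int) -> bytes:
    """One member of the family: k (JUMPDEST GAS JUMP) blocks, then JUMPDEST STOP."""
    return bytes([JUMPDEST, GAS, JUMP]) * jumping_blocks + bytes([JUMPDEST, STOP])


def split_blocks(code: bytes) -> list[tuple[int, int]]:
    """Return (start_offset, exit_opcode) for each block, in one linear pass."""
    blocks, start, pc, last = [], 0, 0, None
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST and pc > start:
            blocks.append((start, last))  # fall through into a new entry point
            start = pc
        # PUSHn is followed by n bytes of data, which are not instructions.
        pc += 1 + (op - PUSH1 + 1 if PUSH1 <= op <= PUSH32 else 0)
        last = op
        if op in (JUMP, JUMPI, STOP):
            blocks.append((start, op))
            start = pc
    if start < len(code):
        blocks.append((start, last))
    return blocks


def conservative_arcs(code: bytes) -> set[tuple[int, int]]:
    """Arcs an analyzer must assume when jump destinations are unknown data."""
    blocks = split_blocks(code)
    entries = [start for start, _ in blocks if code[start] == JUMPDEST]
    arcs = set()
    for i, (start, exit_op) in enumerate(blocks):
        if exit_op in (JUMP, JUMPI):
            arcs.update((start, target) for target in entries)
        if exit_op not in (JUMP, STOP) and i + 1 < len(blocks):
            arcs.add((start, blocks[i + 1][0]))  # fall-through arc
    return arcs


for k in (2, 3, 4):  # the three programs shown above
    print(k + 1, "blocks,", len(conservative_arcs(program(k))), "arcs")
```

For the three programs shown above (N = 3, 4, and 5 blocks) the sketch reports 6, 12, and 20 arcs, that is N(N-1), while the programs themselves grow only linearly.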

In Java, Wasm, and CLI it is simply impossible to have programs like these.

EVM: Calls and Returns

The Ethereum Virtual Machine does not provide explicit facilities for calls and returns within a contract. Instead, they must be synthesized using the dynamic JUMP instruction, which takes its destination from the stack, with return addresses pushed onto that same stack as ordinary data. So control flow must be dynamic, which creates the quadratic CFG problems explained above.

The cost is easiest to see in the shape Solidity emits for internal function calls. The EVM has no instruction for calling a subroutine within a contract, so the compiler builds one: the return address is pushed as ordinary data, and the return is a jump. The helper is still worth having: deployed bytes cost gas, and contracts are capped at 24576 bytes, so shared code pays. Here a helper squares a number for two callers:

CALL_A: push RTN_A      ; return address, as ordinary data
        push 2
        push SQUARE
        jump
RTN_A:  jumpdest
        ...
CALL_B: push RTN_B
        push 3
        push SQUARE
        jump
RTN_B:  jumpdest
        ...
SQUARE: jumpdest
        dup1
        mul
        swap1
        jump            ; return — but to where?

At run time the final jump goes wherever the pushed address says. An analyzer must answer before the code runs, and the address is data — so it must follow both returns. After one call it is in two places at once; after two calls, four; every call doubles the count. Twenty calls is a million paths, one of them real. To rule out the rest, the analyzer must track what the code keeps on the stack — the very bookkeeping an explicit call would have done for it.

With explicit calls and returns there is nothing to answer: every return goes back to its call. Twenty calls, one path.
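The doubling can be made concrete with a toy model in Python. It is only a counting sketch under a stated assumption: the analyzer does not track stack contents, so at each return it must consider both return sites. The names `RETURN_SITES`, `naive_paths`, and `explicit_paths` are invented for this illustration.

```python
from itertools import product

RETURN_SITES = ("RTN_A", "RTN_B")  # the two labels a return jump could reach


def naive_paths(call_sequence):
    """Every return-site assignment to consider when the return address
    is treated as unknown data (assumption: no stack tracking)."""
    return product(RETURN_SITES, repeat=len(call_sequence))


def explicit_paths(call_sequence):
    """With explicit call/return, each return goes back to its own call."""
    yield tuple(call_sequence)


calls = ["RTN_A", "RTN_B"] * 5           # ten calls, alternating callers
candidates = list(naive_paths(calls))
print(len(candidates))                   # 1024 candidate paths (2**10)
print(candidates.count(tuple(calls)))    # 1: exactly one of them is real
print(len(list(explicit_paths(calls))))  # 1: nothing to search

print(len(RETURN_SITES) ** 20)           # twenty calls: 1048576
```

Ruling out the spurious candidates is exactly the stack bookkeeping described above; the model only shows how fast the candidate set grows when that bookkeeping is absent.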

For Ethereum, these behaviors are a denial-of-service vulnerability for any online static analysis, including bytecode validation and AOT compilation at contract creation time, and JIT compilation at runtime.

Even offline, dynamic jumps (and the lack of calls and returns) can cause static analyses of many contracts to become quadratically impractical, exponentially intractable, or even mathematically impossible. For example, consider these abstracts from a few recent papers on the problem; the last paper resorts to neural nets to disassemble (most) Solidity programs. There is an entire academic literature of complex, incomplete solutions to problems that static control flow renders trivial.

"Ethereum smart contracts are distributed programs running on top of the Ethereum blockchain. Since program flaws can cause significant monetary losses and can hardly be fixed due to the immutable nature of the blockchain, there is a strong need of automated analysis tools which provide formal security guarantees. Designing such analyzers, however, proved to be challenging and error-prone."^4

"The EVM language is a simple stack-based language ... with one significant difference between the EVM and other virtual machine languages (like Java bytecode or CLI for .Net programs): the use of the stack for saving the jump addresses instead of having it explicit in the code of the jumping instructions. Static analyzers need the complete control flow graph (CFG) of the EVM program in order to be able to represent all its execution paths."^5

"Static analysis approaches mostly face the challenge of analyzing compiled Ethereum bytecode... However, due to the intrinsic complexity of Ethereum bytecode (especially in jump resolution), static analysis encounters significant obstacles."^6

"Analyzing contract binaries is vital ... comprising function entry identification and detecting its boundaries... Unfortunately, it is challenging to identify functions ... due to the lack of internal function call statements."^7

In my experience, to avoid the problems of dynamic control flow, VMs use static jumps, calls, and returns.

Static Control Flow and Ethereum Scaling

As laid out above, static control flow means that the destination of every jump or call is determinable a priori, before execution. This has concrete implications for Ethereum's scaling roadmap, particularly around ZK verification, rollups, and future execution layer changes.

Static Control Flow and Rollups

ZK-Rollups

To understand why static control flow matters for ZK-Rollups^8, we need to briefly understand how ZK systems verify computation:

Execution Traces: When a transaction executes, it produces an execution trace: a concrete, step-by-step record of every opcode executed, with the values of the stack, memory, and storage at each step. The trace records what actually happened — including exactly where every jump went — so it is always linear in the number of steps executed.
Circuits: A circuit is a mathematical model of a computation. For a ZK system, the circuit encodes the rules of the EVM: what states can follow from what prior states, how gas is consumed, which memory accesses are valid, etc. The circuit is a set of polynomial constraints that must all be satisfied. Crucially, the circuit encodes the rules for each opcode type (including the rule for JUMP), not the specific paths taken by any particular execution.
Witnesses: The witness is the private data the prover supplies to demonstrate that a specific execution was correct. It includes the execution trace itself plus auxiliary data. The public inputs are the pre- and post-state roots — what the state was before and after the transaction.
Proofs: A ZK proof is a cryptographic proof that the witness satisfies the circuit's constraints, without revealing the witness itself.

A ZK-Rollup sequencer or prover batches many transactions, generates a ZK proof that all transactions executed correctly, and submits that proof to L1, where it is quickly verified. The prover works from the actual execution trace — it does not explore or enumerate possible paths. The jump destinations are already known because they were recorded when the transaction ran.
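To make the point about traces concrete, here is a minimal Python sketch of an interpreter for just the seven opcodes used by the SQUARE example earlier in this document. It is a toy under stated assumptions: the hand-assembled bytecode, the `run` function, and the trace layout are made up for illustration, the JUMPDEST check ignores the PUSH-data subtlety that real clients handle, and gas is not modeled. The opcode byte values are the real EVM encodings.

```python
STOP, MUL, JUMP, JUMPDEST = 0x00, 0x02, 0x56, 0x5B
PUSH1, DUP1, SWAP1 = 0x60, 0x80, 0x90

# Hand-assembled for this sketch: RTN_A = 7, RTN_B = 15, SQUARE = 17.
CODE = bytes([
    PUSH1, 7, PUSH1, 2, PUSH1, 17, JUMP,   # CALL_A
    JUMPDEST,                               # RTN_A  (offset 7)
    PUSH1, 15, PUSH1, 3, PUSH1, 17, JUMP,  # CALL_B
    JUMPDEST, STOP,                         # RTN_B  (offset 15)
    JUMPDEST, DUP1, MUL, SWAP1, JUMP,       # SQUARE (offset 17)
])


def run(code: bytes):
    """Execute the code and return (final_stack, trace).

    Each trace entry is (pc, opcode, jump_target_or_None)."""
    pc, stack, trace = 0, [], []
    while True:
        op, target, next_pc = code[pc], None, pc + 1
        if op == STOP:
            trace.append((pc, op, None))
            return stack, trace
        if op == PUSH1:
            stack.append(code[pc + 1])
            next_pc = pc + 2
        elif op == JUMP:
            target = stack.pop()          # the destination is ordinary stack data
            if code[target] != JUMPDEST:  # simplified check, see lead-in
                raise ValueError(f"bad jump destination {target}")
            next_pc = target
        elif op == DUP1:
            stack.append(stack[-1])
        elif op == MUL:
            stack.append(stack.pop() * stack.pop() % 2**256)
        elif op == SWAP1:
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op != JUMPDEST:
            raise ValueError(f"opcode {op:#x} not handled in this sketch")
        trace.append((pc, op, target))
        pc = next_pc


stack, trace = run(CODE)
print(stack)                                  # [4, 9]
print(len(trace))                             # 21 steps
print([t for _, _, t in trace if t is not None])  # [17, 7, 17, 15]
```

The trace has one entry per executed step and every jump destination appears in it as a concrete number, which is why trace-based proving never has to enumerate the destinations a jump might have had.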

The benefits of static control flow for ZK proving are therefore not about path exploration, but about the efficiency of the code being proven and the tractability of the analyses that surround it:

Leaner bytecode: Static jumps, explicit subroutine calls, and better stack discipline (eliminating SWAP/DUP workarounds) produce significantly smaller and simpler bytecode. Because the prover must process every opcode in the execution trace, fewer opcodes mean less work.
Deployment-time validation: With static control flow, bytecode can be fully validated once at deployment in linear time — including stack underflow, static stack heights, and reachability — rather than requiring expensive per-execution checks or quadratic offline analysis. This is where the O(N²) CFG problem described above directly impacts ZK infrastructure: building and analyzing the CFG of legacy EVM code before constructing a circuit or a formal model requires quadratic work; static control flow reduces this to linear.
Faster proof generation and lower hardware requirements follow directly from the above: fewer opcodes to prove and cheaper analysis reduce the compute and memory demands on rollup provers.
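The deployment-time validation described above can be sketched as two linear passes in Python. This is an illustration under loudly stated assumptions, not the rule set of any proposal: `SJUMP`, its byte value 0xE0, and its 2-byte absolute target are hypothetical, as are the function names, and the stack-height checks mentioned above are left out to keep the sketch short.

```python
# Real EVM opcode bytes.
STOP, JUMP, JUMPI, JUMPDEST = 0x00, 0x56, 0x57, 0x5B
PUSH1, PUSH32 = 0x60, 0x7F
# HYPOTHETICAL: a static jump with a 2-byte big-endian absolute target.
# The byte 0xE0 is chosen for illustration only.
SJUMP = 0xE0


def immediate_size(op: int) -> int:
    """Number of immediate bytes that follow the opcode."""
    if PUSH1 <= op <= PUSH32:
        return op - PUSH1 + 1
    return 2 if op == SJUMP else 0


def validate(code: bytes) -> list[tuple[int, int]]:
    """Return the (source, target) jump arcs, or raise ValueError if invalid."""
    jumpdests, arcs = set(), []
    pc = 0
    while pc < len(code):              # pass 1: one visit per instruction
        op = code[pc]
        size = immediate_size(op)
        if pc + 1 + size > len(code):
            raise ValueError(f"truncated immediate at offset {pc}")
        if op in (JUMP, JUMPI):
            raise ValueError(f"dynamic jump at offset {pc}")
        if op == JUMPDEST:
            jumpdests.add(pc)
        elif op == SJUMP:
            arcs.append((pc, int.from_bytes(code[pc + 1:pc + 3], "big")))
        pc += 1 + size                 # immediates are skipped, never decoded
    for source, target in arcs:        # pass 2: one check per jump
        if target not in jumpdests:
            raise ValueError(f"jump at {source} targets non-JUMPDEST {target}")
    return arcs


print(validate(bytes([SJUMP, 0, 4, STOP, JUMPDEST, STOP])))  # [(0, 4)]
```

Because every target is read from the code, the arcs returned are the complete jump edges of the CFG, obtained in time proportional to the code length.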

Optimistic Rollups

Optimistic Rollups assume transactions are valid but allow fraud proofs to dispute invalid state roots. A fraud proof re-executes the contested transaction and demonstrates that the submitted state root was wrong. Key implications of static control flow include:

Bytecode validation: Static control flow allows contract bytecode to be fully validated at deployment in linear time, establishing safety properties that can be relied on during dispute resolution.
Dispute resolution: When a fraud dispute arises, the dispute system must re-execute the contested transaction. The re-execution itself is deterministic regardless of control flow style, but static control flow makes the surrounding machinery — validators, formal checkers, and the interactive verification game — simpler to implement correctly and easier to reason about.
Interactive verification: Some optimistic rollup designs use multi-round interactive verification games that bisect execution to identify the disputed step. Clear, structured control flow makes the bisection protocol more tractable and reduces the risk of edge cases in the dispute game implementation.

Static Control Flow and Code Generation

As already discussed, static control flow enables contracts to be compiled to machine code before execution, just-in-time or ahead-of-time. This is an obvious win for non-ZK clients, whether on L1, L2, or EVM-compatible chains.

Static Control Flow and RISC-V Migration

There are ongoing discussions within the Ethereum research community about potentially replacing the EVM with a RISC-V execution environment. RISC-V is a standard instruction set architecture that is seeing increasing use in the ZK community. One current strategy for creating a ZK-EVM is to compile an EVM interpreter like evmone or revm to RISC-V for use in a ZK-VM. Supporting RISC-V directly eliminates the overhead of the EVM interpreter. An EVM with static control flow opens up another strategy: compile the EVM code itself to RISC-V code. One linear-time pass gives good RISC-V code, and a fixed number of further passes gives better code while the total remains linear time.

A missing piece in this puzzle is that RISC-V is a 32-bit or 64-bit architecture, while the current EVM is a 256-bit architecture. To narrow that gap there are current proposals for 64-bit EVM opcodes. The prover also has the actual trace and can tell how many bits are actually in use at each step of the computation, so the 256-bit words might not be that big a problem.

Specification

Several EIPs, past and current, specify new EVM opcodes and semantics for static control flow: subroutines and static jumps, container-format functions, call and return opcodes, and static relative jumps and calls. They are all implemented with the standard Turing return stack architecture, and are for the most part compatible with each other. In particular, the simplest of them — three call and return opcodes — can be used to implement all of the others.
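The return stack architecture mentioned above can be sketched in a few lines of Python. This is not the encoding of any of the proposals: the mnemonics (`callsub`, `returnsub`, and the rest), the tuple instruction format, and the functions `run` and `call_targets` are hypothetical, chosen only to show that return addresses live on a separate stack and that every call target is a constant in the code.

```python
# Hypothetical instruction format for this sketch: (mnemonic, optional immediate).
PROGRAM = [
    ("push", 2),
    ("callsub", 5),   # call SQUARE; the target is an immediate, not stack data
    ("push", 3),
    ("callsub", 5),
    ("stop",),
    ("dup",),         # SQUARE starts at index 5
    ("mul",),
    ("returnsub",),
]


def run(program):
    """Interpret the program with a data stack and a separate return stack."""
    pc, data, returns = 0, [], []
    while True:
        op, *imm = program[pc]
        pc += 1
        if op == "stop":
            return data
        elif op == "push":
            data.append(imm[0])
        elif op == "dup":
            data.append(data[-1])
        elif op == "mul":
            data.append(data.pop() * data.pop())
        elif op == "callsub":
            returns.append(pc)  # make a note of where we left off
            pc = imm[0]
        elif op == "returnsub":
            pc = returns.pop()  # look up the note and continue
        else:
            raise ValueError(f"unknown mnemonic {op!r}")


def call_targets(program):
    """All call destinations, read straight from the code without running it."""
    return sorted({t for op, *rest in program if op == "callsub" for t in rest})


print(run(PROGRAM))           # [4, 9]
print(call_targets(PROGRAM))  # [5]
```

The data stack never holds a code address, so an analyzer has nothing to track: the call graph is available from a single scan of the instructions.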

Rationale

Static control flow has been a cornerstone of efficient computation since Babbage and Turing. The EVM's reliance on dynamic jumps is an anomaly among virtual machines and a significant barrier to analysis, compilation, and scaling. Proposals to introduce explicit call/return opcodes and enforce static control flow bring the EVM in line with industry best practices and unlock a range of optimizations critical to Ethereum's scaling roadmap.

Static control flow is not a silver bullet. But it is a foundational piece that enables:

Better tooling and security analysis for language developers and auditors
Faster execution via compilation for non-ZK clients
Future migrations to other execution environments (RISC-V, etc.)
Efficient ZK proof generation for ZK-Rollups
Cleaner fraud proofs for Optimistic Rollups

By making control flow explicit and enforceable, the EVM becomes compatible with the full ecosystem of optimization and analysis techniques that other VMs and processor designs have leveraged for decades.

Security Considerations

This Informational proposal itself specifies no changes to the protocol. Therefore it has no direct security implications. It does not affect the security considerations of the proposals it describes; rather, it helps to motivate and contextualize them.

{
  "type": "article",
  "id": 1,
  "author": [
    {
      "family": "Menabrea",
      "given": "L.F."
    }
  ],
  "DOI": "10.1145/2809523.2809528",
  "title": "Sketch of the Analytical Engine Invented by Charles Babbage.",
  "original-date": {
    "date-parts": [
      [1842, 10, 1]
    ]
  },
  "URL": "https://doi.org/10.1145/2809523.2809528"
}

{
  "type": "article",
  "id": 2,
  "author": [
    {
      "family": "Turing",
      "given": "A.M."
    }
  ],
  "DOI": "10.1093/mind/LIX.236.433",
  "title": "Computing Machinery and Intelligence.",
  "original-date": {
    "date-parts": [
      [1950, 10, 1]
    ]
  },
  "URL": "https://doi.org/10.1093/mind/LIX.236.433"
}

{
  "type": "article",
  "id": 3,
  "author": [
    {
      "family": "Carpenter",
      "given": "B.E."
    }
  ],
  "DOI": "10.1093/comjnl/20.3.269",
  "title": "The other Turing machine.",
  "original-date": {
    "date-parts": [
      [1977, 1, 1]
    ]
  },
  "URL": "https://doi.org/10.1093/comjnl/20.3.269"
}

{
  "type": "article",
  "id": 4,
  "author": [
    {
      "family": "Schneidewind",
      "given": "Clara"
    }
  ],
  "DOI": "10.48550/arXiv.2101.05735",
  "title": "The Good, the Bad and the Ugly: Pitfalls and Best Practices in Automated Sound Static Analysis of Ethereum Smart Contracts.",
  "original-date": {
    "date-parts": [
      [2021, 1, 14]
    ]
  },
  "URL": "https://arxiv.org/abs/2101.05735"
}

{
  "type": "article",
  "id": 5,
  "author": [
    {
      "family": "Albert",
      "given": "Elvira"
    }
  ],
  "DOI": "10.48550/arXiv.2004.14437",
  "title": "Analyzing Smart Contracts: From EVM to a Sound Control Flow Graph.",
  "original-date": {
    "date-parts": [
      [2020, 4, 29]
    ]
  },
  "URL": "https://arxiv.org/abs/2004.14437"
}

{
  "type": "article",
  "id": 6,
  "author": [
    {
      "family": "Contro",
      "given": "Filippo"
    }
  ],
  "DOI": "10.48550/arXiv.2103.09113",
  "title": "EtherSolve: Computing an Accurate Control Flow Graph from Ethereum Bytecode.",
  "original-date": {
    "date-parts": [
      [2021, 3, 16]
    ]
  },
  "URL": "https://arxiv.org/abs/2103.09113"
}

{
  "type": "article",
  "id": 7,
  "author": [
    {
      "family": "He",
      "given": "Jiahao"
    }
  ],
  "DOI": "10.48550/arXiv.2301.12695",
  "title": "Neural-FEBI: Accurate Function Identification in Ethereum Virtual Machine Bytecode.",
  "original-date": {
    "date-parts": [
      [2023, 1, 30]
    ]
  },
  "URL": "https://arxiv.org/abs/2301.12695"
}

{
  "type": "article",
  "id": 8,
  "author": [
    {
      "family": "Jain",
      "given": "Akshita"
    }
  ],
  "DOI": "10.30696/JAC.XVIII.1.2024.297-315",
  "title": "Exploring the Efficacy of Rollups: A Comparative Study of Optimistic and ZK-Rollups and Their Popular Implementations.",
  "original-date": {
    "date-parts": [
      [2024, 1, 1]
    ]
  },
  "URL": "https://doi.org/10.30696/JAC.XVIII.1.2024.297-315"
}