The first number was 44,000. Events per second, running a PHOLD benchmark through a simulation engine that had been alive for about three hours. Three GenServers — a clock, a calendar, an entity manager — passing messages to each other for every simulated event. It was correct. It was clean. It was, by discrete-event simulation standards, absurdly slow.
The second number was 539,000. Same benchmark, same machine, same entities. One
change: remove the GenServers from the hot path. Run the event loop as a single
tail-recursive function operating on plain data structures. No mailboxes, no term
copying, no scheduling overhead. Just a :gb_trees priority queue, a
Map of entity states, and a function that pops, dispatches, pushes,
and recurs.
That gap — 6.5× at 100 entities, 2.8× at ten thousand — is the entire thesis of this project, compressed into a benchmark.
The Isomorphism Nobody Talks About
Averill Law's Simulation Modeling and Analysis has 23,700 citations. It is the textbook. And if you read it as an Erlang developer, a strange recognition settles in. Law's simulation architecture — entities with state, an event calendar, a clock that advances discretely, resources that queue — isn't analogous to OTP. It is OTP.
| Law's DES | OTP |
|---|---|
| Entity with state | GenServer |
| Event calendar | Priority queue |
| Simulation clock | Monotonic GenServer state |
| Entity failure → recovery | Supervisor restart |
| Parallel replications | Task.async_stream |
| Common random numbers | Functional :rand state |
| Hot model fix | Hot code reload |
Sim-Diasca at Électricité de France proved this in 2010, running millions of Erlang actors for energy grid simulation. InterSCSimulator did the same for urban traffic. The precedent exists. What neither had was a statistical inference layer.
The Engine
sim_ex is a discrete-event simulation engine. Zero dependencies. The core is eight modules:
```elixir
Sim.Entity        # @behaviour: init/1, handle_event/3, statistics/1
Sim.Clock         # next-event time advance
Sim.Calendar      # :gb_trees priority queue, FIFO tie-breaking
Sim.EntityManager # registry + dispatch
Sim.Resource      # capacity-limited server (M/M/c)
Sim.Source        # arrival generator
Sim.Topology      # ETS shared state
Sim.Statistics    # Welford streaming + batch means CI
```
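A toy entity gives the contract some shape. This is a hedged sketch: the callback names come from the module listing above, but the exact signatures and return shapes are assumed, not taken from sim_ex's documented API.

```elixir
# Hypothetical entity: callback names from Sim.Entity, signatures assumed.
defmodule Counter do
  # init/1 builds the entity's initial state from options.
  def init(opts), do: %{count: 0, step: Keyword.get(opts, :step, 1)}

  # handle_event/3 takes the payload, the current simulated time, and
  # the state; it returns {new_state, new_events_to_schedule}.
  def handle_event(:bump, _time, state) do
    {%{state | count: state.count + state.step}, []}
  end

  # statistics/1 reports whatever the entity wants to expose.
  def statistics(state), do: %{count: state.count}
end

state = Counter.init(step: 2)
{state, []} = Counter.handle_event(:bump, 0.0, state)
Counter.statistics(state)  # => %{count: 2}
```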
An M/M/1 queue — Law's Chapter 1, the hello-world of simulation — is a single Sim.run call:
```elixir
{:ok, result} = Sim.run(
  entities: [
    {:arrivals, Sim.Source, interarrival: {:exponential, 1.0}},
    {:server, Sim.Resource, service: {:exponential, 0.5}}
  ],
  initial_events: [{0.0, :arrivals, :generate}],
  stop_time: 50_000.0
)
```
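The theoretical targets can be checked by hand. With mean interarrival 1.0 and mean service 0.5, the standard M/M/1 formulas give:

```elixir
# Mean interarrival 1.0 → arrival rate λ = 1.0;
# mean service 0.5 → service rate μ = 2.0.
lambda = 1.0
mu = 2.0

rho = lambda / mu          # utilization ρ = λ/μ
wq  = rho / (mu - lambda)  # mean wait in queue, Wq = ρ/(μ−λ)
w   = 1.0 / (mu - lambda)  # mean time in system, W = 1/(μ−λ)
l   = lambda * w           # mean number in system, L = λW (Little's law)

{rho, wq, w, l}  # => {0.5, 0.5, 1.0, 1.0}
```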
Utilization converges to 0.5. Mean wait converges to the theoretical value.
Same seed, same trajectory. The simulation is deterministic because :rand
state is functional — no global mutable PRNG, no thread-local surprises.
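What functional PRNG state means in practice can be sketched in a few lines. The module and helpers here are illustrative, not part of sim_ex: the point is that :rand state is a plain value, threaded explicitly, so two runs from the same seed replay identically.

```elixir
# Illustrative only: DetDraws is not part of sim_ex.
defmodule DetDraws do
  # Draw one Exponential(rate) variate; return {draw, new_rand_state}.
  # Inverse-CDF method; 1.0 - u avoids :math.log(0.0).
  def exponential(rate, rand_state) do
    {u, new_state} = :rand.uniform_s(rand_state)
    {-:math.log(1.0 - u) / rate, new_state}
  end

  # Thread the PRNG state through n draws with map_reduce.
  def take(n, rate, rand_state) do
    Enum.map_reduce(1..n, rand_state, fn _, st -> exponential(rate, st) end)
  end
end

seed = :rand.seed_s(:exsss, {42, 42, 42})
{draws_a, _} = DetDraws.take(3, 1.0, seed)
{draws_b, _} = DetDraws.take(3, 1.0, seed)
draws_a == draws_b  # => true: same seed, same trajectory
```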
The Two Modes
The first implementation was clean OTP. Three GenServers, three processes, proper Elixir. Pop an event from the Calendar process. Send it to the EntityManager process. EntityManager dispatches, gets new events back, sends them to Calendar. Three mailbox round-trips per simulated event.
At 10 microseconds per GenServer.call, that's 30 microseconds of
overhead on events that take 5 microseconds of actual work. The infrastructure
was six times more expensive than the simulation.
The Engine fixes this by running the entire loop in one process. No GenServer.
No message passing. A tail-recursive function that pops from :gb_trees,
calls module.handle_event/3 directly, inserts new events, and recurs.
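The shape of that loop can be sketched under stated assumptions: a calendar keyed on {time, seq} tuples, entities stored as a map of {module, state} pairs, and handle_event returning {new_state, events}. This is an illustration of the structure, not sim_ex's actual implementation.

```elixir
# Sketch of a single-process event loop — assumed shapes, not sim_ex's code.
defmodule EngineLoop do
  # Pop the earliest event, dispatch it, insert produced events, recur.
  # Produced events are {time, target, payload} triples.
  def run(calendar, entities, stop_time, handled \\ 0) do
    case :gb_trees.is_empty(calendar) do
      true ->
        {entities, handled}

      false ->
        {{time, _seq}, {target, payload}, rest} = :gb_trees.take_smallest(calendar)

        if time > stop_time do
          {entities, handled}
        else
          {mod, state} = Map.fetch!(entities, target)
          # Direct function call: no mailbox, no term copying.
          {new_state, new_events} = mod.handle_event(payload, time, state)

          calendar =
            Enum.reduce(new_events, rest, fn {t, tgt, pl}, cal ->
              # Monotonic seq keeps insertion order within a timestamp.
              :gb_trees.insert({t, System.unique_integer([:monotonic])}, {tgt, pl}, cal)
            end)

          run(calendar, Map.put(entities, target, {mod, new_state}), stop_time, handled + 1)
        end
    end
  end
end
```

Everything lives on one process's heap; the only per-event costs are a tree pop, a function call, and a tree insert.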
The result:
| LPs | Engine (events/s) | GenServer (events/s) | Speedup |
|---|---|---|---|
| 100 | 539,076 | 82,553 | 6.5× |
| 1,000 | 157,598 | 64,928 | 2.4× |
| 10,000 | 123,967 | 43,842 | 2.8× |
Both modes remain. Engine for throughput. GenServer for interactive stepping,
distributed simulation, live dashboards, fault tolerance. The choice is one
keyword: mode: :engine or mode: :genserver.
The Lesson
This is the same insight we found building the NUTS sampler. The JIT boundary
matters. EXLA.jit returns its outputs on EXLA.Backend; copying them to
BinaryBackend gave a 3× speedup. The leapfrog integrator belongs inside
the JIT; the tree builder belongs outside.
Simulation has the same structure. The event dispatch loop belongs in a tight function. The entity lifecycle — creation, failure recovery, distribution across nodes, hot code reload — belongs in OTP processes.
The actor model is the right abstraction for the system. It is the wrong abstraction for the inner loop. Use processes for the outer structure. Use functions for the hot path. The BEAM's value is architectural, not computational. This is true for MCMC. It is true for DES. It may be true for everything.
There Is No Now
In 1978, Leslie Lamport published a paper that changed how we think about time in distributed systems. The title was plain: “Time, Clocks, and the Ordering of Events in a Distributed System.” The insight was not. Physical clocks cannot be trusted across machines. What matters is not when something happened, but what caused what. His logical clocks gave distributed systems a way to reason about causality without pretending that “now” means the same thing on two different computers.
Justin Sheehy — then CTO of Basho, the company behind Riak, one of the most important distributed databases built on the BEAM — drove this point home in his 2015 ACM Queue article “There is No Now.” He opened with Rear Admiral Grace Hopper handing each student a piece of wire 11.8 inches long: the maximum distance electricity can travel in one nanosecond. A physical argument against a comforting abstraction. Sheehy’s thesis: even Google’s Spanner, with GPS satellites and atomic clocks, does not give you “now.” TrueTime returns a range of uncertainty. The best Google can do is one to seven milliseconds of clock drift at any moment. If that is the best, the rest of us should stop pretending.
Simulation has the same problem. When two events happen at the same simulated
time, which one goes first? In Arena and SimPy, the answer is insertion order
— FIFO within a timestamp. This is arbitrary: insertion order is an accident
of execution, not a statement of causality. Run the same model with a
different insertion sequence, or process same-time events in parallel, and an
effect can be handled before its cause. Most simulation textbooks wave this
away. Lamport would not.
Sim-Diasca, the Erlang simulation engine built at Électricité
de France in 2010, solved this with a two-level timestamp:
{tick, diasca}. When entity A handles an event at tick T,
diasca D, any events it produces for other entities land at
(T, D+1). Cause at diasca 0, effect at diasca 1, reaction at
diasca 2. The tick advances only when no more diascas are pending —
quiescence. No Lamport clocks needed. No vector clocks. The causal ordering
is built into the timestamp structure itself.
sim_ex implements this. Three event forms:
```elixir
# Causal reaction — same tick, next diasca
{:same_tick, target, payload}          # → (T, D+1)

# Schedule at a future tick
{:tick, future_tick, target, payload}  # → (future_tick, 0)

# Relative delay
{:delay, delta, target, payload}       # → (T + delta, 0)
```
The calendar key is {tick, diasca, seq} — a three-element
tuple that sorts naturally in Erlang’s :gb_trees.
{5, 2, _} always comes before {5, 3, _}, which
always comes before {6, 0, _}. Quiescence detection is free:
just pop the smallest key. If it’s a new tick, the old tick is done.
No barrier protocol, no global synchronization, no coordination service. The
data structure is the synchronization.
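The ordering claim is easy to check in a few lines: Erlang's term ordering compares tuples element by element, so {tick, diasca, seq} keys sort causally with no extra machinery.

```elixir
# Insert tick-diasca keys out of order; :gb_trees returns them sorted.
cal =
  [{5, 3, 1}, {6, 0, 0}, {5, 2, 0}, {5, 2, 1}]
  |> Enum.reduce(:gb_trees.empty(), fn key, tree ->
    :gb_trees.insert(key, :event, tree)
  end)

:gb_trees.keys(cal)
# => [{5, 2, 0}, {5, 2, 1}, {5, 3, 1}, {6, 0, 0}]

# Quiescence detection: pop the smallest key; when its tick jumps
# from 5 to 6, every diasca of tick 5 has drained.
{{5, 2, 0}, :event, _rest} = :gb_trees.take_smallest(cal)
```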
The lineage runs through the BEAM community like a wire. Lamport’s logical clocks (1978). Chandy and Misra’s distributed simulation (1979). Sim-Diasca on Erlang (2010). Sheehy at Basho, building Riak on the same runtime, writing about the same impossibility (2015). And now sim_ex, where the simulation engine and the distributed database share not just a theoretical heritage but a virtual machine. The BEAM has always understood that “now” is a distributed lie. It was built for telephone switches, where a dropped call is worse than a slow one, and where two switches must never disagree about who is talking to whom. Causal ordering is not a feature. It is the reason the runtime exists.
Le Quatrième
sim_ex is the fourth library. The quartet:
| Library | Inference | For when you say |
|---|---|---|
| eXMC | NUTS/HMC | “I know the model” |
| smc_ex | O-SMC² | “The data never stops” |
| StochTree-Ex | BART | “I don't know the function” |
| sim_ex | DES | “I need to simulate it” |
The first three answer questions about data. The fourth generates data by simulating systems. But here's what happens when you put them together: a simulation that fits posteriors over its own input parameters (eXMC), calibrates online from sensor data (smc_ex), discovers which inputs matter (StochTree-Ex), and runs the simulation itself (sim_ex). All in one runtime. All supervised. All hot-reloadable.
No commercial simulation engine offers this. Not AnyLogic. Not Simio. Not Arena. They have better GUIs. They have decades of domain libraries. But they cannot build a simulation that learns, because their runtimes weren't designed for it. Ours was designed for telephone switches, which turns out to be close enough.
sim_ex is at
github.com/borodark/sim_ex.
Zero dependencies. Twenty-six tests. 539,000 events per second. Tick-diasca
causal ordering. GPSS-style DSL. The PHOLD benchmark is in benchmark/.