Two Functions I Didn’t Refactor

April 2026 sim_ex Elixir refactoring

The rule is one sentence: prefer pattern-matched defp clauses over internal case/if. Apply it to every private function in a 4,585-line simulation engine. Fourteen functions qualify. Twelve comply. Two refuse. The two that refuse teach you more about when pattern matching helps than the twelve that comply.

The Pass

sim_ex has four engine variants — Map-backed, ETS-backed, tick-diasca, and parallel — plus a DSL layer for resources and conveyors. Every engine has a tight loop that pops from a :gb_trees calendar, checks a stop condition, dispatches the event, and recurses. Every tight loop had the same shape: a case :gb_trees.is_empty(calendar) wrapping an if time > stop_time wrapping the dispatch. Four files, four copies.

These were mechanical. Lift the case into handle_pop(true, engine) and handle_pop(false, engine). Lift the if into a guard: step_forward({{time, _}, _, _}, %{stop_time: st} = engine) when time > st. Three clauses replace one nested block. The reader sees each scenario in isolation, in the function head, without tracing through branches. Twelve functions followed this template across ten files. The test suite — 112 tests, 23 property-based, including 700 adversarial command sequences from proper_statem — passed identically after every phase.
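A minimal sketch of that template, not the sim_ex source. It assumes an engine map with calendar, stop_time, clock, and log keys (only stop_time appears in the text above; the others, along with new/2, run/1, and continue/1, are hypothetical) and calendar keys of the form {time, seq}:

```elixir
defmodule LoopSketch do
  # Hypothetical engine shape:
  #   %{calendar: gb_tree, stop_time: number, clock: number, log: list}
  # Calendar keys are {time, seq}; values are opaque events.

  def new(stop_time, events) do
    calendar =
      events
      |> Enum.with_index()
      |> Enum.reduce(:gb_trees.empty(), fn {{time, event}, seq}, cal ->
        :gb_trees.insert({time, seq}, event, cal)
      end)

    %{calendar: calendar, stop_time: stop_time, clock: 0, log: []}
  end

  def run(engine), do: continue(advance_one(engine))

  defp continue({:cont, engine}), do: run(engine)
  defp continue({:halt, engine}), do: %{engine | log: Enum.reverse(engine.log)}

  defp advance_one(engine) do
    handle_pop(:gb_trees.is_empty(engine.calendar), engine)
  end

  # Scenario 1: the calendar is empty, nothing left to pop.
  defp handle_pop(true, engine), do: {:halt, engine}

  defp handle_pop(false, engine) do
    step_forward(:gb_trees.take_smallest(engine.calendar), engine)
  end

  # Scenario 2: the next event lies past the stop time; leave it on the calendar.
  defp step_forward({{time, _seq}, _event, _rest}, %{stop_time: st} = engine)
       when time > st do
    {:halt, engine}
  end

  # Scenario 3: dispatch the event (here: just record it) and advance the clock.
  defp step_forward({{time, _seq}, event, rest}, engine) do
    {:cont, %{engine | calendar: rest, clock: time, log: [event | engine.log]}}
  end
end
```

With a stop time of 5, LoopSketch.new(5, [{1, :a}, {3, :b}, {9, :c}]) |> LoopSketch.run() records :a and :b and leaves the event at time 9 on the calendar.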

The Trick

The best refactor in the pass was not a template application. The parallel engine drains all events at a given {tick, diasca} from the calendar before dispatching them as a batch. The original code:

{{t, d, _seq}, _val} = :gb_trees.smallest(calendar)

if t == tick and d == diasca do
  {_key, {target, event}, calendar} = :gb_trees.take_smallest(calendar)
  drain_diasca(calendar, tick, diasca, [{target, event} | acc])
else
  {acc, calendar}
end

The refactored code:

defp drain_match({{tick, diasca, _seq}, _val}, calendar, tick, diasca, acc) do
  {_key, {target, event}, calendar} = :gb_trees.take_smallest(calendar)
  drain_diasca(calendar, tick, diasca, [{target, event} | acc])
end

defp drain_match(_peeked, calendar, _tick, _diasca, acc), do: {acc, calendar}

No guard. No comparison operator. The equality check is the pattern itself. Because tick and diasca each appear twice in the clause head, Elixir matches the clause only when both occurrences bind to the same value: the destructured calendar key must carry the same tick and diasca that were passed as arguments. If the peeked event is at a different timestamp, the first clause does not match and the second clause, the fallthrough, terminates the drain.

This is not a reorganized if statement. An if statement asks a question and branches on the answer. A same-position binding does not ask a question. It is the answer. The clause exists only for the matching case. The non-matching case is the absence of the clause. That is a different relationship between the code and the condition, and it is the reason pattern matching is a design tool rather than control-flow syntax.
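To see the two drain_match clauses run, here is a self-contained sketch. The excerpt above does not show how drain_diasca peeks at the calendar, so the empty-tree check (drain_peek, a hypothetical helper) is an assumption, added because :gb_trees.smallest/1 fails on an empty tree:

```elixir
defmodule DrainSketch do
  # Collects every event stored under {tick, diasca, _seq}.
  # Returns {events, remaining_calendar}; events come back newest first.
  def drain_diasca(calendar, tick, diasca, acc \\ []) do
    drain_peek(:gb_trees.is_empty(calendar), calendar, tick, diasca, acc)
  end

  # Hypothetical helper: stop on an empty calendar, otherwise peek.
  defp drain_peek(true, calendar, _tick, _diasca, acc), do: {acc, calendar}

  defp drain_peek(false, calendar, tick, diasca, acc) do
    drain_match(:gb_trees.smallest(calendar), calendar, tick, diasca, acc)
  end

  # Matches only when the peeked key carries the same tick and diasca
  # that were passed in as arguments.
  defp drain_match({{tick, diasca, _seq}, _val}, calendar, tick, diasca, acc) do
    {_key, {target, event}, calendar} = :gb_trees.take_smallest(calendar)
    drain_diasca(calendar, tick, diasca, [{target, event} | acc])
  end

  # Any other key: the drain is over.
  defp drain_match(_peeked, calendar, _tick, _diasca, acc), do: {acc, calendar}
end
```

Draining {1, 0} from a calendar holding keys {1, 0, 0}, {1, 0, 1}, and {1, 1, 2} takes the first two events and leaves the third in place, because the third key agrees on the tick but not on the diasca.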

The same trick appeared in the resource scheduler: apply_capacity_change(cap, cap, state, _clock), where the first two arguments are the new and old capacity. If they are equal, both bind to cap and the state passes through unchanged. No case new_cap do ^old_cap -> .... No pin operator. The equality is structural.
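A sketch of what that head could look like in isolation. Only the first clause's signature comes from the text; the second clause, the changed_at field, and the module name are hypothetical stand-ins for whatever the scheduler actually does when the capacity changes:

```elixir
defmodule CapacitySketch do
  # Arguments: new capacity, old capacity, state, clock.
  # This clause matches only when both capacities are the same value.
  def apply_capacity_change(cap, cap, state, _clock), do: state

  # Hypothetical body for the changed case: record the new capacity
  # and when the change happened.
  def apply_capacity_change(new_cap, _old_cap, state, clock) do
    %{state | capacity: new_cap, changed_at: clock}
  end
end
```

One behavioral detail worth knowing: a repeated variable matches strictly, so apply_capacity_change(2.0, 2, state, clock) falls through to the second clause even though 2.0 == 2 is true in Elixir. With integer capacities the two forms behave identically.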

The First Function I Didn’t Refactor

handle_seize_preemptive/5 in Sim.DSL.Resource is fifty lines. It receives a seize request for a preemptive resource and makes a three-way decision: grant if capacity is available, preempt the worst holder if the incoming priority beats it, or enqueue.

I sketched two refactored versions. The first dispatched on a boolean (grant_or_queue(state.busy < state.capacity, ...)), then dispatched again on the find_worst_holder result (try_preempt_or_enqueue({worst_job, {_, worst_priority, _}}, ...) when priority < worst_priority). Four functions, five clauses, six parameters each.

It was worse. Each helper existed solely as a branch dispatcher, called from exactly one place, with no reuse value. The original fifty lines told a linear story: check capacity, check priority, act. The reader followed it top to bottom. The refactored version scattered the story across four function definitions, and the reader had to reconstruct the narrative by chasing call chains. The parameter list — job_id, requestor_id, priority, clock, state — was threaded through every helper identically, a ceremony that added width without adding meaning.
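For a sense of the linear shape being defended, here is a much-reduced sketch of a three-way seize decision. It is not the fifty-line original: the state fields (busy, capacity, holders, queue), the return tuples, and worst_holder/1 are all assumptions, requestor_id and clock are dropped, and a lower number is taken to mean a higher priority, consistent with the priority < worst_priority guard quoted above:

```elixir
defmodule PreemptSketch do
  # Hypothetical state:
  #   %{capacity: integer, busy: integer,
  #     holders: %{job_id => priority}, queue: [{job_id, priority}]}

  def handle_seize_preemptive(job_id, priority, state) do
    if state.busy < state.capacity do
      # 1. Capacity available: grant immediately.
      holders = Map.put(state.holders, job_id, priority)
      {:granted, %{state | busy: state.busy + 1, holders: holders}}
    else
      # 2. Full: preempt the worst holder if the newcomer beats it.
      case worst_holder(state.holders) do
        {worst_job, worst_priority} when priority < worst_priority ->
          holders = state.holders |> Map.delete(worst_job) |> Map.put(job_id, priority)
          queue = state.queue ++ [{worst_job, worst_priority}]
          {:preempted, worst_job, %{state | holders: holders, queue: queue}}

        _no_one_to_preempt ->
          # 3. Otherwise the request waits in line.
          {:queued, %{state | queue: state.queue ++ [{job_id, priority}]}}
      end
    end
  end

  # The holder with the numerically largest (worst) priority, or nil if none.
  defp worst_holder(holders) do
    Enum.max_by(holders, fn {_job, prio} -> prio end, fn -> nil end)
  end
end
```

Every branch reads and updates the same state, which is why splitting the branches into helpers means threading that state through each one.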

I wrote a comment in the code explaining why the function was exempt, and moved on. The comment is more useful than the refactor would have been. A future developer scanning the file will see: this was considered, this is why it stays. That is an architectural decision record in eleven lines, embedded where it matters, costing nothing.

The Second Function I Didn’t Refactor

find_run/6 in Sim.Warmup walks a list of relative changes, counting consecutive values below a threshold. Two nested ifs: is this change below the threshold? If so, have we seen enough consecutive sub-threshold changes? The function takes six arguments — the list, an index list, the threshold, the minimum steady-state length, a consecutive counter, and the run start index.

To lift the outer if change < threshold into a head-dispatched helper, I would need to pass all six arguments plus the boolean result of the comparison. Seven parameters through a function whose only purpose is to select between two branches of a counter update. The counter update itself is three lines.

This is iterative numerical logic. The if is not discriminating on a tagged result or a struct field — it is comparing a floating-point value to a threshold on every iteration of a list walk. Pattern matching has nothing to offer here that the if does not already provide.
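A sketch of what such a walk might look like, written from the description above rather than from Sim.Warmup. The argument order follows the text; the return value (the index where the first qualifying run starts, or nil) and the nil starting value for the run index are assumptions:

```elixir
defmodule WarmupSketch do
  # find_run(changes, indices, threshold, min_len, count, run_start)
  # Returns the index at which the first run of `min_len` consecutive
  # sub-threshold changes begins, or nil if there is no such run.
  # Assumes `changes` and `indices` have the same length.
  def find_run([], [], _threshold, _min_len, _count, _run_start), do: nil

  def find_run([change | changes], [idx | idxs], threshold, min_len, count, run_start) do
    if change < threshold do
      # Keep the existing run start, or open a new run at this index.
      start = run_start || idx

      if count + 1 >= min_len do
        start
      else
        find_run(changes, idxs, threshold, min_len, count + 1, start)
      end
    else
      # Threshold exceeded: reset the counter and forget the run start.
      find_run(changes, idxs, threshold, min_len, 0, nil)
    end
  end
end
```

The comparison and the counter update sit three lines apart and share every variable in scope, which is the coupling a head-dispatched helper would have to reproduce through its parameter list.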

Three Patterns That Work, Three That Don’t

| Works | Shape | Example |
| --- | --- | --- |
| Boolean dispatch | f(true, state) / f(false, state) | handle_pop(:gb_trees.is_empty(cal), engine) |
| Tagged-result dispatch | f({:ok, val}) / f(:not_found) | resolve_schedule_hit(find_in_schedule(...)) |
| Same-position binding | f(x, x, state) | drain_match({{tick, diasca, _}, _}, cal, tick, diasca, acc) |

| Doesn’t | Shape | Why |
| --- | --- | --- |
| Numeric threshold | if x < limit | Computed comparison on values, not structural discrimination |
| Linear decision tree | if A do ... case B do ... | Branches share state and form a narrative; fragmenting loses coherence |
| Inline ternary | if x > y, do: x, else: y | One-liner expression, already at maximum clarity |

The Surprise

The most consequential change in the pass was not a pattern-match refactor. It was a deduplication.

Sim.Engine had a public step/1 function (used by proper_statem for stateful testing) and a private loop/1 function (the production tight loop). Both contained identical event-dispatch logic: pop from calendar, check stop time, dispatch, update state. Two copies, maintained in parallel, diverging silently whenever one was patched and the other wasn’t.

Lifting the case/if into advance_one/1 forced the question: who else needs this? The answer was step/1, which became a one-liner: def step(engine), do: advance_one(engine). The production loop and the test harness now share the same code path. A bug fix to one is a bug fix to both. This was not planned. It fell out of the refactoring because naming the pieces made the duplication visible.
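The shape of that deduplication, reduced to a sketch. A plain list stands in for the :gb_trees calendar to keep it short, and the {:cont, engine} / {:halt, engine} return values, run/1, and continue/1 are assumptions rather than the Sim.Engine API:

```elixir
defmodule SharedStepSketch do
  # Hypothetical engine: %{queue: [event], handled: [event]}.

  # Public single-step entry point, e.g. for a stateful test harness.
  def step(engine), do: advance_one(engine)

  # Production loop: repeats the very same step until it halts.
  def run(engine), do: engine |> advance_one() |> continue()

  defp continue({:cont, engine}), do: run(engine)
  defp continue({:halt, engine}), do: engine

  # The one place where an event is actually popped and handled.
  defp advance_one(%{queue: []} = engine), do: {:halt, engine}

  defp advance_one(%{queue: [event | rest], handled: handled} = engine) do
    {:cont, %{engine | queue: rest, handled: [event | handled]}}
  end
end
```

Stepping twice by hand and calling run/1 once end in the same engine state, because both paths go through advance_one/1.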

That is, in the end, the argument for pattern-matched defp clauses. Not that they are shorter — they are longer. Not that they are faster — they are the same. The argument is that naming each scenario as its own function clause forces you to see the function’s structure from the outside, and from the outside, duplications and asymmetries that are invisible from within become obvious.

Left to Right

The pattern-match pass created a problem. Consider what the engine loop looked like after refactoring:

defp loop(engine), do: continue_loop(advance_one(engine))

defp advance_one(engine) do
  handle_pop(:gb_trees.is_empty(engine.calendar), engine)
end

defp handle_pop(false, engine) do
  step_forward(:gb_trees.take_smallest(engine.calendar), engine)
end

Every case and if is gone. Every clause has a clean head. But look at the call sites: continue_loop(advance_one(engine)), handle_pop(:gb_trees.is_empty(engine.calendar), engine), step_forward(:gb_trees.take_smallest(engine.calendar), engine). The data flows inside out. To read what happens, you start at the innermost parenthesis and work outward.

The pipe operator fixes this for the same reason the pattern-match refactor fixed the case/if: it makes the code’s structure match the reader’s direction of travel.

defp loop(engine), do: engine |> advance_one() |> continue_loop()

defp advance_one(engine) do
  engine.calendar |> :gb_trees.is_empty() |> handle_pop(engine)
end

defp handle_pop(false, engine) do
  engine.calendar |> :gb_trees.take_smallest() |> step_forward(engine)
end

Twenty-one call sites across eight files qualified. Each asks for the same transformation: f(g(x), y) becomes x |> g() |> f(y). The data enters on the left and exits on the right. You read it the way time moves.

One call site was skipped. Piping the conveyor’s board_if_capacity(map_size(state.in_transit) < state.capacity, clock, state) would mean piping a boolean comparison, (map_size(state.in_transit) < state.capacity) |> board_if_capacity(clock, state), and that is ugly in a way that matching true/false in the function head is not. The pipe operator works when the flowing value is a thing (a state, a calendar, a queue), not when it is a predicate.
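Both forms side by side, as a sketch. The function names ending in _nested and _piped are hypothetical, the handle_pop bodies are placeholders, and board_if_capacity is cut down to two arguments (the clock is dropped):

```elixir
defmodule PipeSketch do
  # f(g(x), y): read from the innermost call outward.
  def pop_nested(engine), do: handle_pop(:gb_trees.is_empty(engine.calendar), engine)

  # x |> g() |> f(y): the calendar enters on the left.
  def pop_piped(engine), do: engine.calendar |> :gb_trees.is_empty() |> handle_pop(engine)

  defp handle_pop(true, _engine), do: :empty

  # take_smallest/1 returns {key, value, rest}; elem(1) picks the value.
  defp handle_pop(false, engine) do
    engine.calendar |> :gb_trees.take_smallest() |> elem(1)
  end

  # The skipped case: the value at the head of the pipe would be a predicate.
  def board_nested(state) do
    board_if_capacity(map_size(state.in_transit) < state.capacity, state)
  end

  # Legal, but the parentheses are mandatory: |> binds tighter than <.
  def board_piped(state) do
    (map_size(state.in_transit) < state.capacity) |> board_if_capacity(state)
  end

  defp board_if_capacity(true, _state), do: :board
  defp board_if_capacity(false, _state), do: :wait
end
```

The nested and piped variants return the same results. The difference is only in how the line reads, and in the predicate case the piped line opens with a parenthesized comparison instead of a noun.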

The two passes are one idea applied twice: make the code literal. The first pass made each clause literal — one scenario per head, no branching in the body. The second pass made each call chain literal — data flows left to right, no nesting to unwind. Both add lines. Both add clarity. The pattern-match pass created the named helpers that made the pipe pass possible, and the pipe pass justified the helpers by making them compose visibly.

The Accounting

Fourteen functions met the pattern-match rule. Twelve complied. Two got comments explaining why they didn’t. One got deduplicated by accident. Twenty-one call sites met the pipe rule. Twenty complied. One got skipped. The line count went up by about fifteen percent. The test suite did not move — 112 tests, 23 properties, zero failures, four consecutive runs.

The codebase is not more clever. It is more literal. Every clause says exactly one thing, in its head, and the reader does not have to enter the body to know which scenario it handles. Every call chain says exactly one thing, left to right, and the reader does not have to find the innermost parenthesis to know where the data starts.

The two functions I didn’t refactor are the ones I’m most confident about. The one call site I didn’t pipe is the one that proved the rule has an edge.