May 29, 2026 · 7 min read · Deep Dive

Hot-reloadable YAML strategies: composing trading rules as data

Adding a strategy to Cerberus Markets means writing a YAML file, not shipping a deploy. The architecture that makes this possible rests on a Pydantic schema, a file watcher, and one load-bearing invariant: the rule evaluator is a pure function from configuration plus indicator values to signal matches.

By Igor Riera

In Cerberus Markets, strategies are data, not code. Adding a new strategy is editing a YAML file. Changing an existing strategy is editing a YAML file. Backtesting a tweak before going live is editing a YAML file and pressing a button. The application code never changes when the rules change.

This post covers the architecture that makes that work, the invariants it depends on, and why the tradeoffs are worth it for this specific class of system.

The configuration

A strategy is a YAML file. Here’s the reference one the system ships with, an MA crossover with an RSI filter and volume confirmation:

name: ma_cross_rsi
pair: "*"   # one instance per watchlist pair, expanded at load
interval: 1h
status: paper

indicators:
  - id: ma_fast
    type: sma
    period: 20
  - id: ma_slow
    type: sma
    period: 50
  - id: rsi
    type: rsi
    period: 14
  - id: volume_avg
    type: volume_sma
    period: 20

entry:
  side: buy
  conditions:
    - "ma_fast > ma_slow"
    - "ma_fast_prev <= ma_slow_prev"   # the golden-cross edge
    - "rsi > 30"
    - "rsi < 70"
    - "volume > volume_avg * 1.2"

exit:
  side: sell
  conditions:
    - "ma_fast < ma_slow"
    - "ma_fast_prev >= ma_slow_prev"   # the death-cross edge
    - "OR rsi > 80"

sizing:
  type: fixed_fraction
  fraction: 0.05

The file is the strategy. There’s no per-strategy Python code, no per-strategy class hierarchy, no plugin system. The engine reads the file, validates it, and uses it.

A couple of things in there earn explanation. Conditions are strings evaluated against a namespace of indicator values. The _prev suffix is how a crossover gets expressed without per-strategy state: ma_fast > ma_slow is true on every bar of an uptrend, but ma_fast > ma_slow AND ma_fast_prev <= ma_slow_prev is true only on the bar where the fast line crosses above the slow one. It tests the edge, not the level. The exit uses the mirror image, and that single idiom is what stops it from re-firing a sell on every candle while the fast average stays below the slow one.
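
To make the edge-versus-level distinction concrete, here is a minimal sketch of a string-condition evaluator. It is an illustration under assumptions, not the Cerberus implementation: the function names (evaluate, namespace), the whitelist of operators, and the sample bar values are all hypothetical.

```python
import ast
import operator

# Whitelisted operators: arithmetic and comparisons only (assumed DSL subset).
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_CMP_OPS = {ast.Gt: operator.gt, ast.GtE: operator.ge,
            ast.Lt: operator.lt, ast.LtE: operator.le}


def _eval(node, ns):
    """Recursively evaluate a whitelisted AST node against the namespace."""
    if isinstance(node, ast.Expression):
        return _eval(node.body, ns)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        return ns[node.id]  # KeyError on an unknown indicator id
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left, ns), _eval(node.right, ns))
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in _CMP_OPS):
        return _CMP_OPS[type(node.ops[0])](
            _eval(node.left, ns), _eval(node.comparators[0], ns))
    raise ValueError(f"unsupported expression: {ast.dump(node)}")


def evaluate(condition: str, ns: dict) -> bool:
    """Pure function: the same condition and namespace always give the same result."""
    return bool(_eval(ast.parse(condition, mode="eval"), ns))


def namespace(current: dict, previous: dict) -> dict:
    """Merge the current bar's values with the previous bar's under a _prev suffix."""
    return {**current, **{f"{k}_prev": v for k, v in previous.items()}}


# Three consecutive bars: the fast average crosses above the slow one on the second.
bars = [
    {"ma_fast": 99.0, "ma_slow": 100.0},
    {"ma_fast": 101.0, "ma_slow": 100.0},
    {"ma_fast": 102.0, "ma_slow": 100.0},
]
level, edge = [], []
for prev, cur in zip(bars, bars[1:]):
    ns = namespace(cur, prev)
    is_level = evaluate("ma_fast > ma_slow", ns)
    level.append(is_level)
    edge.append(is_level and evaluate("ma_fast_prev <= ma_slow_prev", ns))

print(level)  # [True, True]   the level holds on every bar of the uptrend
print(edge)   # [True, False]  the edge fires only on the crossing bar
```

Walking the syntax tree against a whitelist, rather than calling eval, is one way to keep the evaluator from executing anything the grammar does not name; function calls, attribute access, and imports all fall through to the ValueError.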

(The DSL is deliberately small. Today it does comparisons and arithmetic on indicators. Richer operands, such as excluding entries around a fresh exploit or hack news event, are scaffolded in the grammar but not yet live. The parser accepts those lines and they evaluate to False until the signal-event rule engine lands, so they’re written with an OR prefix: OR-ing in a False leaves the rest of the rule unchanged, whereas AND-ing it in would block the rule entirely. I’d rather ship the honest subset than claim a DSL I haven’t finished.)
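
Why an OR prefix keeps a not-yet-live operand harmless can be shown in a few lines. This sketch assumes one plausible reading of the condition list (unprefixed lines are AND-ed together, each OR-prefixed line is a standalone alternative); the post does not spell out the exact grouping, and the matches function and the news_event operand are hypothetical names.

```python
from typing import Callable, Iterable


def matches(conditions: Iterable[str], evaluate: Callable[[str], bool]) -> bool:
    """Assumed semantics: unprefixed lines are AND-ed; 'OR ' lines are alternatives."""
    required, alternatives = [], []
    for line in conditions:
        if line.startswith("OR "):
            alternatives.append(line[3:].strip())
        else:
            required.append(line)
    all_required = bool(required) and all(evaluate(c) for c in required)
    return all_required or any(evaluate(c) for c in alternatives)


# Hypothetical truth table standing in for the real evaluator on one bar.
truth = {
    "ma_fast < ma_slow": True,
    "ma_fast_prev >= ma_slow_prev": True,
    "news_event('exploit')": False,  # hypothetical unfinished operand, always False
}
lookup = truth.__getitem__

base = ["ma_fast < ma_slow", "ma_fast_prev >= ma_slow_prev"]
print(matches(base, lookup))                                 # True
print(matches(base + ["OR news_event('exploit')"], lookup))  # True: OR-ed False is a no-op
print(matches(base + ["news_event('exploit')"], lookup))     # False: AND-ed False blocks the rule
```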

The schema

The configuration is validated against a Pydantic v2 model on load and on every hot-reload. Validation is strict: extra="forbid" makes unknown fields fail loudly, type mismatches fail loudly, and so does a set of semantic checks. The strategy name has to match [a-z0-9_]+, the pair has to be BASE/QUOTE (or the * wildcard), the interval has to be one of the supported timeframes, indicator ids have to be unique, and an entry’s side has to be buy.
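
The checks listed above could be expressed roughly as follows. This is a sketch, not the shipped schema: the class names, the BASE/QUOTE regex, the list of supported timeframes, and the allowed status values are assumptions made for illustration. Only the constraints themselves come from the description above.

```python
import re
from typing import Literal

from pydantic import (BaseModel, ConfigDict, Field, ValidationError,
                      field_validator, model_validator)

PAIR_RE = re.compile(r"^[A-Z0-9]+/[A-Z0-9]+$")  # assumed BASE/QUOTE shape


class Strict(BaseModel):
    model_config = ConfigDict(extra="forbid")  # unknown fields are errors


class Indicator(Strict):
    id: str
    type: Literal["sma", "rsi", "volume_sma"]  # only the types in the reference file
    period: int = Field(gt=0)


class Rule(Strict):
    side: Literal["buy", "sell"]
    conditions: list[str] = Field(min_length=1)


class Sizing(Strict):
    type: Literal["fixed_fraction"]
    fraction: float = Field(gt=0, le=1)


class Strategy(Strict):
    name: str = Field(pattern=r"^[a-z0-9_]+$")
    pair: str
    interval: Literal["1m", "5m", "15m", "1h", "4h", "1d"]  # assumed timeframe set
    status: Literal["paper", "live"]                        # assumed status values
    indicators: list[Indicator]
    entry: Rule
    exit: Rule
    sizing: Sizing

    @field_validator("pair")
    @classmethod
    def pair_is_wildcard_or_base_quote(cls, value: str) -> str:
        if value != "*" and not PAIR_RE.match(value):
            raise ValueError("pair must be BASE/QUOTE or '*'")
        return value

    @model_validator(mode="after")
    def semantic_checks(self) -> "Strategy":
        ids = [ind.id for ind in self.indicators]
        if len(ids) != len(set(ids)):
            raise ValueError("indicator ids must be unique")
        if self.entry.side != "buy":
            raise ValueError("entry side must be buy")
        return self


config = {
    "name": "ma_cross_rsi", "pair": "*", "interval": "1h", "status": "paper",
    "indicators": [{"id": "ma_fast", "type": "sma", "period": 20},
                   {"id": "ma_slow", "type": "sma", "period": 50}],
    "entry": {"side": "buy", "conditions": ["ma_fast > ma_slow"]},
    "exit": {"side": "sell", "conditions": ["ma_fast < ma_slow"]},
    "sizing": {"type": "fixed_fraction", "fraction": 0.05},
}
strategy = Strategy.model_validate(config)

try:
    Strategy.model_validate({**config, "levrage": 3})  # misspelled, unknown field
except ValidationError:
    print("rejected: unknown field")
```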

The schema is the contract between the human editing the YAML and the engine consuming it. Every constraint expressed in the schema is a constraint I don’t have to remember when editing. A typo in an indicator id doesn’t silently misbehave; it gets caught the moment the file is loaded.

The hot reload

A file watcher (watchfiles) monitors the strategies directory. When a file changes, the loader reads it, parses it through the Pydantic model, and swaps the new strategy into the engine’s active registry only if validation succeeds. If parsing fails, the error is logged and the previously loaded version stays active. The engine never sees an invalid configuration.
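
The validate-then-swap step is the part worth sketching. The version below uses only the standard library and takes the validator as a parameter, so it runs without watchfiles or Pydantic installed; StrategyRegistry, reload, and toy_validate are hypothetical names, and in a real setup a watcher loop would call reload with the parsed file contents.

```python
import logging
from typing import Any, Callable

log = logging.getLogger("strategies")


class StrategyRegistry:
    """Holds the active strategies; a reload either fully succeeds or changes nothing."""

    def __init__(self, validate: Callable[[dict], Any]):
        self._validate = validate  # for example, a Pydantic model's model_validate
        self._active: dict[str, Any] = {}

    def reload(self, key: str, raw: dict) -> bool:
        try:
            strategy = self._validate(raw)  # raises on any invalid config
        except Exception as exc:  # broad on purpose: any failure keeps the old version
            log.error("reload of %s rejected: %s", key, exc)
            return False
        self._active[key] = strategy  # the swap is a single assignment
        return True

    def get(self, key: str) -> Any:
        return self._active[key]


def toy_validate(raw: dict) -> dict:
    """Stand-in validator for the example; a real one would be the schema."""
    if not isinstance(raw.get("fraction"), float):
        raise ValueError("fraction must be a float")
    return dict(raw)


registry = StrategyRegistry(toy_validate)
registry.reload("ma_cross_rsi", {"fraction": 0.05})
registry.reload("ma_cross_rsi", {"fraction": "lots"})  # rejected and logged
print(registry.get("ma_cross_rsi"))  # {'fraction': 0.05}
```

Because validation finishes before the registry is touched, a bad edit never leaves a half-applied strategy behind: readers of the registry see either the old version or the new one.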

The hot reload is not a hack. It’s first-class behavior. The engine is designed so that the active strategies registry is the only piece of state that changes; everything else (data ingestion, signal aggregation, position tracking) is independent of which strategies are loaded. Swapping in a new strategy doesn’t require restarting any service, dropping any connections, or flushing any caches.

This matters because the iteration loop is short. I’m running this on a local machine while watching the dashboard, often during a single market session. The faster the loop, the more strategy variants I evaluate, and the faster the system converges on configurations that actually work.

The invariant that makes it work

The rule evaluator is a pure function from (configuration, indicator values) to (match decisions). The engine around it does the I/O (reading bars, writing signals, tracking positions), but the part that decides whether a strategy fires has no hidden state and no per-strategy side effects. Two runs of the same strategy against the same data produce the same signal events.

This invariant has consequences:

Backtesting is trivially correct. The backtester runs the same parsed conditions and indicator definitions as the live engine — a vectorized pass over historical bars instead of one bar at a time, but the same parsing and the same semantics. There’s no separate “backtest mode” with subtly different behavior. Backtest-to-live drift, the classic source of “the strategy worked great in testing and then lost money on day one,” is structurally prevented.

Strategy comparison is real. I can run many strategy variants against the same data stream and compare their signal sequences. Because the evaluator is pure, the comparison is meaningful: any difference in output is a difference in the strategies, not a difference in execution context.

Signals are reproducible. Every signal the engine emits is written to an append-only log with the strategy that produced it and a snapshot of the indicator values at that moment. The persisted copy of each config is versioned in the database, so I can tie a run back to the rules that were active. Reconstructing why the system did what it did is a query, not an archaeology project.
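
One way the reproducibility property might look in code is sketched below. The post says only that configs are versioned in the database and that signals carry an indicator snapshot; using a content hash as the version id, and the record layout itself, are assumptions made for this example.

```python
import hashlib
import json


def config_version(config: dict) -> str:
    """Stable content hash of a config: the same rules always give the same version id."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def signal_record(config: dict, bar_time: str, side: str, indicators: dict) -> str:
    """One append-only log line: what fired, under which rules, on which values."""
    return json.dumps({
        "strategy": config["name"],
        "config_version": config_version(config),
        "bar_time": bar_time,
        "side": side,
        "indicators": indicators,  # serialized here, so later changes cannot alter it
    }, sort_keys=True)


config = {"name": "ma_cross_rsi", "interval": "1h"}
snapshot = {"ma_fast": 101.0, "ma_slow": 100.0, "rsi": 55.2}
line = signal_record(config, "2026-05-29T14:00:00Z", "buy", snapshot)

# Replaying with the same inputs reproduces the record byte for byte.
assert line == signal_record(config, "2026-05-29T14:00:00Z", "buy", dict(snapshot))
# Editing the rules changes the version, so old signals stay attributable.
assert config_version({**config, "interval": "4h"}) != config_version(config)
```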

Why YAML and not code

The obvious alternative to YAML-as-strategy is Python-as-strategy: each strategy is a class implementing a Strategy protocol, the engine loads classes via a plugin system, strategies can use arbitrary Python.

I considered this. I rejected it. Here’s why.

Hot reload of code is dangerous. Live-reloading an already-validated YAML file is safe. Live-reloading Python code, even with importlib.reload, has classic failure modes: stale references, partial reload, type identity changes that break isinstance checks elsewhere. The blast radius of a bad reload is much larger when the reload is code.

Code is too expressive. A Python strategy can do anything: open files, make network calls, sleep, throw, mutate global state. The engine has no way to enforce purity on Python. It has every way to enforce purity on YAML: the YAML can’t express side effects, because the DSL has no side-effecting operators.

Code is harder to validate. Pydantic validates the YAML on load. Validating Python requires running it, which means a bug in a strategy can take down the engine. The YAML-plus-schema design moves validation to load time and bounds the failure mode.

Code is harder to diff. A pull request changing a strategy in YAML is small, explicit, and human-readable. A Python class change has to be read carefully to see which conditions actually changed. The signal-to-noise ratio on review is much higher with declarative YAML.

The cost of YAML-as-strategy is expressiveness. There are strategies I can’t express in the current DSL — anything that needs per-bar state beyond standard indicators, or anything reaching outside the indicator-plus-signal world I’ve defined. Adding new primitives to the DSL is a deliberate code change I make when a class of strategies justifies it. The expressiveness tradeoff is intentional.

Where the pattern generalizes

The pattern isn’t trading-specific. It works anywhere you have:

A class of decisions made repeatedly by an engine, where the decision logic varies by configuration but the engine’s structure doesn’t.

A configuration that can be expressed declaratively — combinations of named primitives, bounded by a schema.

An invariant that “the decision logic is a pure function of its inputs” is enforceable and worth enforcing.

A need for safe iteration on the decision logic without redeploying the system.

Anywhere those four conditions hold, the YAML-plus-Pydantic-plus-hot-reload pattern earns its keep. I’ve reached for variants of it in restaurant-tech pricing rules and, years ago, in industrial control logic. Different domains, same underlying architecture.

Where it doesn’t fit: systems where the decision logic legitimately requires per-decision state, where the configuration is too complex for declarative expression, or where reloading rules at runtime is a safety concern rather than a benefit (most physical-world control systems fall here).

The summary

Strategies are data, not code. The rule evaluator is a pure function. The configuration is validated, hot-reloaded, and versioned. Backtesting and live execution share the same parsed rules because there’s no per-strategy code path to diverge.

The architecture is straightforward. The discipline of staying inside the constraints is harder than the architecture is. The payoff is iteration speed and structural correctness — the right combination to optimize for in any system where the strategy is the product and the engine is infrastructure.

That’s the case.