Chapter 03

Storage

Where rules live, how they get there, and how the engine reads them. Five DocumentForge collections, two equivalent publish paths, three cached resolution steps per request.

Overview

RuleForge keeps rules in DocumentForge, the sibling open-source document database. The engine never writes to that store at runtime — it only reads. Authoring tools (the AERO admin app, or the ruleforge publish CLI) are the only writers, and they produce identical state regardless of which path you take.

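Five collections cover the entire surface. Their names are conventional and currently hardcoded in the engine (apart from the optional prefix described under Multiple engines, one DocumentForge), and they match DocumentForge's seed and the AERO admin app's writes.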

The five collections

Collection | Purpose | Mutability
rules | One header per rule. The "main branch": id, endpoint, method, current version pointer. | Mutable (bumped on each publish)
ruleversions | One immutable snapshot per (ruleId, version). Each holds the full rule JSON the engine evaluates. | Immutable
environments | One doc per environment (dev, staging, prod). Pins which version of each rule that environment runs. | Mutable (updated on bind)
referencesets | One header per lookup table (price matrix, FX rates, tax rates, …). | Mutable
referencesetversions | Immutable versioned data: columns + rows. | Immutable

Versions are immutable for a reason: in-flight evaluations of v3 never see v4 mid-walk, even if a publish happens. Environment bindings are the only things that change live, and they're cached for 30 s, so version cutovers happen on a TTL boundary or pod restart.

Document shapes

Exact JSON shapes the engine expects in each collection.

rules

{
  "id":             "rule-pnr-taxes",
  "name":           "PNR tax engine",
  "description":    "Per-pax tax itemisation",
  "tags":           ["tax"],
  "category":       "Tax",
  "endpoint":       "/v1/tax/pnr",
  "method":         "POST",
  "status":         "published",
  "currentVersion": 3,
  "updatedAt":      "2026-04-27T18:00:00.000Z",
  "updatedBy":      "andrew"
}

ruleversions

The id convention is rv-{ruleId}-{version}. The snapshot field is the full rule — same shape as documented in Rule Schema.

{
  "id":          "rv-rule-pnr-taxes-3",
  "ruleId":      "rule-pnr-taxes",
  "version":     3,
  "snapshot": {
    "id":             "rule-pnr-taxes",
    "name":           "PNR tax engine",
    "endpoint":       "/v1/tax/pnr",
    "method":         "POST",
    "currentVersion": 3,
    "inputSchema":    { /* JSON Schema */ },
    "outputSchema":   { /* JSON Schema */ },
    "nodes":          [ /* DAG nodes */ ],
    "edges":          [ /* edges */ ]
  },
  "publishedAt": "2026-04-27T18:00:00.000Z",
  "publishedBy": "andrew"
}
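The id convention can be captured in a pair of small helpers. This is a minimal sketch, not part of RuleForge: the function names are hypothetical, and the parser assumes the version is always the final hyphen-separated numeric segment (rule ids themselves may contain hyphens).

```python
import re

# Matches rv-{ruleId}-{version}; the rule id may itself contain hyphens,
# so the version is taken to be the trailing numeric segment.
_RV_ID = re.compile(r"^rv-(?P<rule_id>.+)-(?P<version>\d+)$")


def rule_version_id(rule_id: str, version: int) -> str:
    """Build a ruleversions document id (hypothetical helper)."""
    return f"rv-{rule_id}-{version}"


def parse_rule_version_id(doc_id: str) -> tuple[str, int]:
    """Split a ruleversions document id back into (ruleId, version)."""
    match = _RV_ID.match(doc_id)
    if match is None:
        raise ValueError(f"not a ruleversions id: {doc_id!r}")
    return match.group("rule_id"), int(match.group("version"))
```

Round-tripping the example above, rule_version_id("rule-pnr-taxes", 3) produces the id shown in the document, and parsing that id recovers the same pair.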

environments

{
  "id":   "env-staging",
  "name": "staging",
  "ruleBindings": {
    "rule-pnr-taxes":        3,
    "rule-bag-policy":       7,
    "rule-seat-assignments": 1
  },
  "refBindings": { },
  "active": true
}

The engine reads ruleBindings; refBindings is reserved for future per-environment reference-set pinning.

referencesets

{
  "id":             "ref-tax-rates",
  "name":           "Air passenger tax rates",
  "currentVersion": 1,
  "updatedAt":      "2026-04-27T00:00:00.000Z"
}

referencesetversions

{
  "refId":   "ref-tax-rates",
  "version": 1,
  "columns": ["origin", "ageCategory", "code", "amount", "currency"],
  "rows": [
    { "origin": "LHR", "ageCategory": "ADT", "code": "GB1", "amount": 26, "currency": "GBP" },
    { "origin": "LHR", "ageCategory": "CHD", "code": "GB1", "amount": 13, "currency": "GBP" }
  ]
}

Publishing a rule

Two equivalent paths produce identical state. Each path performs the same three writes:

  1. Insert (or replace) a ruleversions doc with the snapshot.
  2. Update rules[ruleId] — bump currentVersion, set status: "published".
  3. Update environments[env].ruleBindings[ruleId] = version.
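
The three writes can be sketched against an in-memory stand-in for DocumentForge. Everything here is an assumption made for illustration: the store is a plain dict of collections keyed by document id, and the publish function and its parameters are hypothetical, not the admin app's or the CLI's actual code.

```python
def publish(store: dict, snapshot: dict, env_id: str, user: str, now: str) -> None:
    """Apply the three publish writes to an in-memory store (hypothetical layout)."""
    rule_id = snapshot["id"]
    version = snapshot["currentVersion"]

    # 1. Insert (or replace) the ruleversions doc holding the snapshot.
    version_id = f"rv-{rule_id}-{version}"
    store["ruleversions"][version_id] = {
        "id": version_id,
        "ruleId": rule_id,
        "version": version,
        "snapshot": snapshot,
        "publishedAt": now,
        "publishedBy": user,
    }

    # 2. Update the rule header: bump currentVersion, mark as published.
    header = store["rules"].setdefault(rule_id, {"id": rule_id})
    header.update(
        name=snapshot.get("name"),
        endpoint=snapshot["endpoint"],
        method=snapshot["method"],
        status="published",
        currentVersion=version,
        updatedAt=now,
        updatedBy=user,
    )

    # 3. Pin the version in the target environment.
    store["environments"][env_id]["ruleBindings"][rule_id] = version
```

Because each write is keyed by rule id and version, running the function twice with the same inputs leaves the store unchanged, which is the property both publish paths rely on.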

Path A — AERO admin app (visual editor)

The author drags nodes onto the canvas, configures them via the inspector, and clicks "Publish to environment". The admin app's publish endpoint then runs the three writes server-side.

Path B — ruleforge publish CLI

Same writes, scripted from a local rule JSON file:

ruleforge publish \
  --rule fixtures/rules/rule-pnr-taxes.v1.json \
  --env  staging \
  --df-base https://documentforge.onrender.com

This is the path used in CI, in scripted rollouts, and to seed demo fixtures. The command is idempotent: re-running it with the same input leaves the store in the same state.

Request resolution

Given an inbound POST /v1/tax/pnr, the engine performs three lookups, each cached:

// 1. Rule header by endpoint + method
SELECT id, currentVersion FROM rules
WHERE endpoint = '/v1/tax/pnr' AND method = 'POST'
// → ruleId = "rule-pnr-taxes"

// 2. Environment binding for that rule
GET environments/by/name/'staging'
// → ruleBindings["rule-pnr-taxes"] = 3

// 3. The immutable snapshot at that version
SELECT * FROM ruleversions
WHERE ruleId = 'rule-pnr-taxes' AND version = 3
// → snapshot ready to walk

In practice the engine performs these three lookups at boot rather than on every request. At boot it enumerates every binding in the active environment and registers an HTTP route for each, so the only per-request work is the DAG walk. Adding a new endpoint means publishing a rule and then restarting pods, waiting out the env-binding TTL, or calling POST /admin/refresh (see Admin endpoints).
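
The three lookups above can be sketched against the same kind of in-memory store. This is an illustration under stated assumptions, not the engine's code: the resolve function is hypothetical, the store is a dict of collections keyed by document id, and linear scans stand in for DocumentForge queries.

```python
def resolve(store: dict, env_name: str, method: str, path: str) -> dict:
    """Resolve (method, path) to the snapshot pinned by the named environment."""
    # 1. Rule header by endpoint + method.
    header = next(
        (r for r in store["rules"].values()
         if r["endpoint"] == path and r["method"] == method),
        None,
    )
    if header is None:
        raise LookupError(f"no rule for {method} {path}")
    rule_id = header["id"]

    # 2. Environment binding for that rule. The pinned version wins,
    #    even when the header's currentVersion is newer.
    env = next(
        (e for e in store["environments"].values() if e["name"] == env_name),
        None,
    )
    if env is None or rule_id not in env["ruleBindings"]:
        raise LookupError(f"{rule_id} is not bound in {env_name}")
    version = env["ruleBindings"][rule_id]

    # 3. The immutable snapshot at that version.
    doc = next(
        (v for v in store["ruleversions"].values()
         if v["ruleId"] == rule_id and v["version"] == version),
        None,
    )
    if doc is None:
        raise LookupError(f"missing snapshot {rule_id}@{version}")
    return doc["snapshot"]
```

The point worth noticing is step 2: the environment binding, not the header's currentVersion, decides which snapshot is walked.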

Caching

Cache | TTL | Reason
(ruleId, version) → snapshot | Indefinite | Immutable per version
(referenceId, version) → rows | Indefinite | Immutable per version
environments[env] → ruleBindings | 30 s (configurable) | Mutable; the only thing that changes live
Auto-router endpoint registrations | Boot-time | New endpoints require a restart or POST /admin/refresh

Steady-state behavior: after warmup, the engine never touches DocumentForge per request. See Performance for the warm-vs-cold benchmark numbers.
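
A TTL-bounded cache for the environment bindings might look like the sketch below. It is an assumption-laden illustration rather than the engine's implementation: the TtlCell class, its parameters, and the injectable clock are all hypothetical. Versioned snapshots need no such logic, since an immutable (ruleId, version) entry can sit in a plain dict indefinitely.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TtlCell(Generic[T]):
    """Holds one value and reloads it once the TTL has elapsed."""

    def __init__(
        self,
        load: Callable[[], T],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[T] = None
        self._expires_at = float("-inf")  # forces a load on first access

    def get(self) -> T:
        now = self._clock()
        if now >= self._expires_at:
            # Expired (or never loaded): hit the backing store once.
            self._value = self._load()
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]

    def invalidate(self) -> None:
        """Drop the cached value so the next get() reloads it."""
        self._expires_at = float("-inf")
```

With a 30 s TTL, a newly published binding becomes visible at the first read after the cell expires, which matches the "TTL boundary" cutover described earlier.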

Local file source

For dev, CI, and offline work, the engine can read from a local directory instead of DocumentForge. Same engine code path; the only difference is which IRuleSource implementation is wired up.

RULEFORGE_RULE_SOURCE=local
RULEFORGE_FIXTURES_DIR=./fixtures/rules
RULEFORGE_REFS_DIR=./fixtures/refs

Layout:

fixtures/
  rules/
    _endpoint-bindings.json     // {"POST /v1/tax/pnr": "rule-pnr-taxes@1"}
    rule-pnr-taxes.v1.json      // the rule snapshot, same shape as ruleversions[*].snapshot
    rule-bag-policy.v7.json
  refs/
    ref-tax-rates.json          // { id, name, columns, rows, currentVersion }
    ref-cabin-class.json
  scenarios/                    // test request payloads (not read by the engine)

The bindings file is the analogue of the environments doc — it pins which version of each rule the engine should bind. Filenames encode the version: {ruleId}.v{N}.json.
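
The two naming conventions in this layout are simple enough to parse by hand. The helpers below are a hypothetical sketch, assuming binding keys are always "METHOD path" and binding values are always "ruleId@version", as in the example comment above.

```python
def parse_binding(key: str, target: str) -> dict:
    """Parse one _endpoint-bindings.json entry into its parts."""
    method, endpoint = key.split(" ", 1)          # "POST /v1/tax/pnr"
    rule_id, _, version = target.rpartition("@")  # "rule-pnr-taxes@1"
    if not rule_id or not version.isdigit():
        raise ValueError(f"expected ruleId@version, got {target!r}")
    return {
        "method": method,
        "endpoint": endpoint,
        "ruleId": rule_id,
        "version": int(version),
    }


def snapshot_filename(rule_id: str, version: int) -> str:
    """File name for a rule snapshot: {ruleId}.v{N}.json."""
    return f"{rule_id}.v{version}.json"
```

Parsing the sample entry and feeding the result to snapshot_filename yields rule-pnr-taxes.v1.json, the file the engine would be expected to load for that binding.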

Mirroring instances

For perf testing, dev work, or bringing up a fresh DocumentForge in a new environment, the CLI can copy collections between instances:

ruleforge mirror \
  --from https://documentforge.onrender.com \
  --to   http://localhost:5000

The operation is idempotent: each source doc is matched to the target by its logical id field; existing docs are replaced and new docs are inserted. It defaults to all eight AERO collections (rules, ruleversions, environments, referencesets, referencesetversions, nodetemplates, scenarios, connections); narrow the set with --collections.
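
The replace-or-insert matching can be sketched as a keyed merge. This is an illustration only: the mirror_collection function and its key parameter are hypothetical, and it assumes target docs with no counterpart in the source are left untouched, which the description above does not state either way.

```python
from typing import Callable, Hashable


def mirror_collection(
    source_docs: list[dict],
    target_docs: list[dict],
    key: Callable[[dict], Hashable] = lambda doc: doc["id"],
) -> list[dict]:
    """Return the target collection after mirroring source into it."""
    merged = {key(doc): doc for doc in target_docs}
    for doc in source_docs:
        # Same logical id: replace. Unknown id: insert.
        merged[key(doc)] = doc
    return list(merged.values())
```

Applying the function a second time with the same source returns an identical list, which is what makes a repeated mirror run safe. For collections without an id field, such as the referencesetversions example earlier, a composite key like (refId, version) could be passed instead.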

Mirroring prod to a co-located dfdb sidecar gives you ~600× faster cold-path lookups (2.5 ms vs 1500 ms for cross-region) without any engine changes. See the Render deployment notes for the sidecar topology.

Multiple engines, one DocumentForge

Set RULEFORGE_COLLECTION_PREFIX on each engine instance to namespace its collection names. Two engines on a shared DocumentForge stay isolated — no rule-ID collisions, no shared cache invalidation surface.

# tax engine
RULEFORGE_COLLECTION_PREFIX=aerotoys.tax.
  → reads/writes aerotoys.tax.rules, aerotoys.tax.ruleversions,
    aerotoys.tax.environments, aerotoys.tax.referencesets,
    aerotoys.tax.referencesetversions

# offer engine, same DF
RULEFORGE_COLLECTION_PREFIX=aerotoys.offer.
  → aerotoys.offer.rules, aerotoys.offer.ruleversions, …

# single-tenant (default)
RULEFORGE_COLLECTION_PREFIX=
  → rules, ruleversions, environments, …

The CLI mirrors the same flag — pass --prefix aerotoys.tax. to ruleforge run, ruleforge publish, or ruleforge bench when targeting a prefixed namespace. The prefix is recorded by the source and surfaces on /admin/bindings for verification.

Choose a prefix once per engine and stick with it — changing the prefix means re-publishing every rule under the new collection names. Trailing punctuation (. or _) is conventional but not required.
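
The effect of the prefix is plain string concatenation, as this short sketch shows. The collection_names helper is hypothetical; only the five collection names and the example prefixes come from this chapter.

```python
COLLECTIONS = (
    "rules",
    "ruleversions",
    "environments",
    "referencesets",
    "referencesetversions",
)


def collection_names(prefix: str = "") -> dict[str, str]:
    """Map each logical collection to its physical, prefixed name."""
    # The prefix is prepended verbatim, so any trailing "." or "_"
    # must be part of the prefix itself.
    return {name: f"{prefix}{name}" for name in COLLECTIONS}
```

An empty prefix gives the single-tenant names, while "aerotoys.tax." and "aerotoys.offer." give two disjoint sets of collections on the same DocumentForge.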

Admin endpoints

Two ops surfaces sit alongside the rule routes. Both require the engine API key (X-AERO-Key or Authorization: Bearer); /health is the only fully public route.

Routing is dynamic. Every request hits a single catch-all that resolves (method, path) → rule live against the source. New endpoints become reachable on the next request after POST /admin/refresh: no redeploy is required, even when adding paths that did not exist when the engine booted.

GET /admin/bindings

Returns the live binding set and cache stats. Use this to confirm a publish landed and to inspect cache health.

curl -H 'X-AERO-Key: …' https://ruleforge.onrender.com/admin/bindings

{
  "bindings": [
    { "endpoint": "/v1/tax/pnr",      "method": "POST", "ruleId": "tax-pnr",      "version": 7 },
    { "endpoint": "/v1/tax/quote",    "method": "POST", "ruleId": "tax-quote",    "version": 3 },
    { "endpoint": "/v1/tax/refund",   "method": "POST", "ruleId": "tax-refund",   "version": 2 }
    // … 14 in total
  ],
  "bindingCount": 14,
  "cache": {
    "ruleSnapshots": 14,
    "rulesLastRefreshedAt": "2026-04-28T09:14:22Z",
    "referenceSets": 6,
    "refsLastRefreshedAt": "2026-04-28T09:14:22Z"
  },
  "routing": "dynamic"
}
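
Confirming that a publish landed comes down to scanning the bindings array in this response. The helper below is a hypothetical sketch that works on the already-decoded JSON body; fetching it (with the API key header) is left to whatever HTTP client you use.

```python
def binding_is_live(
    response: dict, method: str, endpoint: str, rule_id: str, version: int
) -> bool:
    """True if the /admin/bindings payload lists exactly this binding."""
    return any(
        b.get("method") == method
        and b.get("endpoint") == endpoint
        and b.get("ruleId") == rule_id
        and b.get("version") == version
        for b in response.get("bindings", [])
    )
```

A CI step could call this after a publish and fail the pipeline when the expected ruleId and version are not yet listed.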

POST /admin/refresh

Drops the rule-snapshot and reference-set caches and re-enumerates bindings from DocumentForge. The response echoes the new live binding list so callers can verify in one round-trip.

curl -X POST -H 'X-AERO-Key: …' https://ruleforge.onrender.com/admin/refresh

{
  "ok": true,
  "refreshedAt": "2026-04-28T09:21:07Z",
  "bindingCount": 15,
  "bindings": [
    // … 15 entries — the 14 you had plus the freshly-published one
  ],
  "note": "Source caches dropped and bindings re-enumerated. New endpoints AND new versions are live immediately — no redeploy required."
}

The engine logs Rebound POST /v1/… → ruleId@version for each live binding on every refresh, so the deploy stream gives you an audit trail.

Tax-team integration loop

Publishing a brand-new endpoint plus its rule and rolling it live in one sitting:

# 1. publish the rule version + endpoint binding to DocumentForge
ruleforge publish \
  --df https://documentforge.onrender.com \
  --df-key $DF_KEY \
  --prefix aerotoys.tax. \
  --env staging \
  --rule rules/tax-refund.v2.json

# 2. tell the running engine to re-read bindings + drop caches
curl -X POST -H "X-AERO-Key: $RF_KEY" \
  https://ruleforge.onrender.com/admin/refresh

# 3. fire the new endpoint — works immediately, no redeploy
curl -H "X-AERO-Key: $RF_KEY" \
  -H 'content-type: application/json' \
  -d @scenarios/refund-pnr.json \
  https://ruleforge.onrender.com/v1/tax/refund

The same loop covers version bumps to existing endpoints (step 1 publishes v8, step 2 swaps the live version, step 3 hits the unchanged URL with the new logic). The only thing that still requires a redeploy is upgrading the engine itself, that is, code changes to RuleForge rather than to your rule definitions.