Snapshot scenario content into world meta.json at creation time

resolved 49ec92a7-a5fc-48a1-ae16-e4a1a7648946

created_at: 2026-04-25
updated_at: 2026-04-25
code_context: src/worlds.rs, src/mcp.rs
priority: P2
ticket_type: feature
resolved_at: 2026-04-25
resolution: accepted

Body

Motivation

A world's meta.json currently stores its scenario by slug only. The slug points at whatever /scenarios/{slug}.json contains at read time. The instant a scenario file is edited, every previously-seeded world referencing that slug loses the ability to say what it was actually seeded from — its slug now refers to different content. For simulation correctness this is fine, since seeding only happens at turn 0 and subsequent turns don't re-read the scenario. For research correctness — running comparable variants, tracing behavior to specific inputs, asking "what scenario produced this run" — it is broken.

This ticket gives every world durable, self-contained provenance. The world's meta.json carries a full snapshot of the scenario content that seeded it, plus a sha256 hash of that content. Editing the source scenario file thereafter has no effect on existing worlds. New worlds get the new file. The hash is the indexing key for asking "which worlds ran this exact scenario content."

The change is a hard schema cut. New fields are required. Old meta.json files without them fail to load, on purpose. The two existing worlds in production (ant-verify and vending-room-1) are deleted manually before this change deploys; their research value has been extracted and they are not worth carrying through a schema migration. After deployment, the first new world is the first world with provenance, and from that point forward every world is self-describing.


The change

Schema

Add two required fields to WorldMeta:

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct WorldMeta {
    pub slug: String,
    pub name: String,
    pub scenario: String,
    pub created_at: DateTime<Utc>,

    /// Full scenario content at world-creation time. Frozen here so
    /// later edits to the source scenario file do not retroactively
    /// change what this world was seeded from. Required — meta.json
    /// without this field fails to load.
    pub scenario_snapshot: Value,

    /// sha256 hex digest of the canonical serialized form of
    /// `scenario_snapshot`, computed at creation time. The indexing
    /// key for "which worlds ran this exact scenario content."
    /// Required.
    pub scenario_hash: String,
}

Value is serde_json::Value. Storing the snapshot as a Value rather than as Scenario directly is deliberate: Scenario may grow new fields over time, and we do not want a schema bump on the live Scenario type to break loading of historical world meta.json files. A Value is forward-compatible by construction — whatever a scenario looked like the day a world was seeded is what gets read back, no matter how Scenario evolves later.

No #[serde(default)]. No #[serde(skip_serializing_if)]. The fields are required on the wire and on disk. An old meta.json without them produces a serde error at load time, the world is not admitted to the registry, and load_all logs the failure and skips the directory. This is the same shape load_all already uses for directories whose name fails the slug grammar.
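The log-and-skip behavior can be sketched without serde. This is a minimal, dependency-free illustration under stated assumptions, not the real load_all: RawMeta, parse_meta, and load_all_sketch are hypothetical names, and a flat string map stands in for a parsed meta.json.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed meta.json: flat key -> value strings.
type RawMeta = BTreeMap<String, String>;

/// Fields that must be present, mirroring the required WorldMeta fields.
const REQUIRED: [&str; 6] = [
    "slug",
    "name",
    "scenario",
    "created_at",
    "scenario_snapshot",
    "scenario_hash",
];

/// Return the slug if every required field is present, otherwise an
/// error naming the first missing field.
fn parse_meta(raw: &RawMeta) -> Result<String, String> {
    for field in REQUIRED.iter() {
        if !raw.contains_key(*field) {
            return Err(format!("missing field `{}`", field));
        }
    }
    Ok(raw["slug"].clone())
}

/// Admit only the worlds whose metadata parses; log and skip the rest.
fn load_all_sketch(dirs: &[(&str, RawMeta)]) -> Vec<String> {
    let mut admitted = Vec::new();
    for (dir, raw) in dirs {
        match parse_meta(raw) {
            Ok(slug) => admitted.push(slug),
            Err(e) => eprintln!("skipping {}: {}", dir, e),
        }
    }
    admitted
}
```

The point of the shape is that one stale directory costs a log line, not a failed boot: every other world is still admitted.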

src/worlds.rs

In create_world, after the scenario seeds the world but before meta.write, compute the snapshot and hash:

let snapshot = serde_json::to_value(scenario)
    .expect("Scenario serializes — covered by ScenarioCatalog tests");
let hash = canonical_scenario_hash(&snapshot);

let meta = WorldMeta {
    slug: slug.as_str().to_string(),
    name: name.unwrap_or_else(|| format!("{} #{}", scenario.scenario_slug, slug)),
    scenario: scenario.scenario_slug.as_str().to_string(),
    created_at: Utc::now(),
    scenario_snapshot: snapshot,
    scenario_hash: hash,
};

Add the canonical-hash helper as a pub(crate) free function:

/// Compute a sha256 hex digest of a JSON value in canonical form.
/// Canonical means: object keys sorted lexicographically at every
/// depth, no insignificant whitespace, no trailing newline. This
/// makes the hash stable across serialization orderings and makes
/// "are these two scenarios the same content" a simple string
/// comparison.
pub(crate) fn canonical_scenario_hash(value: &Value) -> String {
    use sha2::{Digest, Sha256};
    let canonical = canonicalize_json(value);
    let bytes = serde_json::to_vec(&canonical)
        .expect("canonicalized JSON serializes");
    let digest = Sha256::digest(&bytes);
    hex_encode(&digest)
}

/// Recursively rebuild a JSON value with object keys in sorted order.
/// Arrays preserve order; scalars unchanged.
fn canonicalize_json(value: &Value) -> Value {
    match value {
        Value::Object(map) => {
            let mut sorted: BTreeMap<String, Value> = BTreeMap::new();
            for (k, v) in map {
                sorted.insert(k.clone(), canonicalize_json(v));
            }
            Value::Object(sorted.into_iter().collect())
        }
        Value::Array(arr) => {
            Value::Array(arr.iter().map(canonicalize_json).collect())
        }
        _ => value.clone(),
    }
}

fn hex_encode(bytes: &[u8]) -> String {
    const HEX: &[u8] = b"0123456789abcdef";
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        out.push(HEX[(b >> 4) as usize] as char);
        out.push(HEX[(b & 0x0f) as usize] as char);
    }
    out
}

sha2 is already in Cargo.toml at =0.10.8 (used by human_auth.rs and oauth.rs). No new dependency. The hex encoding is hand-rolled to avoid pulling in another crate; the function is small and is covered indirectly by the hash-format assertions (64 characters, lowercase hex) in the tests below.
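As a quick sanity check on the hand-rolled encoder, the sketch below reproduces hex_encode from this ticket and compares it against an encoder built on the standard formatter. hex_encode_fmt is a hypothetical helper introduced only for this comparison; it is not part of the spec.

```rust
use std::fmt::Write;

/// The ticket's hand-rolled encoder, reproduced so the sketch is
/// self-contained.
fn hex_encode(bytes: &[u8]) -> String {
    const HEX: &[u8] = b"0123456789abcdef";
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        // High nibble first, then low nibble.
        out.push(HEX[(b >> 4) as usize] as char);
        out.push(HEX[(b & 0x0f) as usize] as char);
    }
    out
}

/// Hypothetical reference encoder using the standard `{:02x}` format.
fn hex_encode_fmt(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        // Writing into a String cannot fail.
        write!(out, "{:02x}", b).unwrap();
    }
    out
}
```

For example, hex_encode(&[0x00, 0x0f, 0xa5, 0xff]) yields "000fa5ff", and a 32-byte input (the size of a sha256 digest) always yields 64 characters.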

Add use serde_json::Value; and use std::collections::BTreeMap; to the imports at the top of the file. (HashMap is already imported.)

src/mcp.rs::handle_get_world

The get_world MCP tool currently returns:

Ok(json!({
    "message": "...",
    "world_slug": slug,
    "name": handle.meta.name,
    "scenario": handle.meta.scenario,
    "world": serde_json::to_value(&rt.world).unwrap_or(Value::Null),
}))

Add the two new fields to the response, taken directly from the (now required) meta fields:

"scenario_snapshot": handle.meta.scenario_snapshot.clone(),
"scenario_hash": handle.meta.scenario_hash.clone(),

That is the entire MCP-layer change. No view builder is updated. No HTML page is changed. The data is exposed through get_world because that's where world metadata belongs, and any downstream consumer (MCP caller now, anything later) can read it.
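A downstream consumer could use scenario_hash as a grouping key. The following is a minimal sketch of that idea, assuming the consumer has already extracted world_slug and scenario_hash from get_world responses; WorldSummary and worlds_by_hash are hypothetical names, not part of this ticket's code.

```rust
use std::collections::BTreeMap;

/// Hypothetical slice of a get_world response: only the fields needed here.
struct WorldSummary {
    world_slug: String,
    scenario_hash: String,
}

/// Group world slugs by scenario_hash, so each key answers
/// "which worlds ran this exact scenario content".
fn worlds_by_hash(worlds: &[WorldSummary]) -> BTreeMap<String, Vec<String>> {
    let mut index: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for w in worlds {
        index
            .entry(w.scenario_hash.clone())
            .or_insert_with(Vec::new)
            .push(w.world_slug.clone());
    }
    index
}
```

Because the hash is computed over the canonical form, two worlds land under the same key exactly when their snapshots have the same content, regardless of what the scenario slug points at today.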

Deletion of existing worlds

Before deploying this change to production, the two existing worlds (ant-verify, vending-room-1) are deleted via the existing delete_world MCP tool. This is a manual deploy step, not part of the code change, but it is part of the ticket's acceptance: deploy must not be cut while either world still exists, because their meta.json files lack the new required fields and would fail to load on first boot, leaving load_all to log and skip them.

The deploy sequence, including the deletes, is:

  1. Land the code change locally; verify all tests green.
  2. Delete ant-verify via delete_world against the live server.
  3. Delete vending-room-1 via delete_world against the live server.
  4. Confirm list_worlds returns zero worlds.
  5. Deploy.

If for any reason the deletes can't be performed before deploy (server unavailable, permissions, etc.), the deploy must be held. The handler should not attempt any code-level migration or grandfathering.


Tests

src/worlds.rs::tests

Add the following tests. Two cover snapshot mechanics on create_world, one covers the load failure for a meta.json that lacks the new fields, and three cover the hash function itself.

#[test]
fn create_world_embeds_scenario_snapshot_and_hash() {
    let tmp = TempDir::new().unwrap();
    let data_root = tmp.path();
    ensure_worlds_root(data_root).unwrap();

    let scenario = ScenarioCatalog::global().get("locked_vending_room").unwrap();
    let h = create_world(
        data_root,
        scenario,
        Slug::new("snap-smoke").unwrap(),
        None,
    ).unwrap();

    let snapshot = &h.meta.scenario_snapshot;
    assert_eq!(
        snapshot.get("scenario_slug").and_then(|v| v.as_str()),
        Some("locked_vending_room"),
    );
    assert_eq!(
        snapshot.get("entities").and_then(|v| v.as_array()).map(|a| a.len()),
        Some(6),
        "vending scenario seeds with 6 entities",
    );

    let hash = &h.meta.scenario_hash;
    assert_eq!(hash.len(), 64, "sha256 hex is 64 chars");
    assert!(
        hash.chars().all(|c| c.is_ascii_hexdigit() && !c.is_ascii_uppercase()),
        "hash is lowercase hex",
    );
}

#[test]
fn snapshot_survives_disk_roundtrip() {
    let tmp = TempDir::new().unwrap();
    let data_root = tmp.path();
    ensure_worlds_root(data_root).unwrap();

    let scenario = ScenarioCatalog::global().get("ant_on_plate").unwrap();
    let h = create_world(
        data_root,
        scenario,
        Slug::new("snap-rt").unwrap(),
        None,
    ).unwrap();
    let original_hash = h.meta.scenario_hash.clone();
    drop(h);

    let all = load_all(data_root).unwrap();
    let reloaded = all.get("snap-rt").unwrap();
    assert_eq!(reloaded.meta.scenario_hash, original_hash);

    // Re-hashing the reloaded snapshot must reproduce the stored
    // hash: the snapshot content survived the write and read
    // unchanged, so its canonical form is byte-identical.
    let recomputed = canonical_scenario_hash(&reloaded.meta.scenario_snapshot);
    assert_eq!(recomputed, original_hash);
}

#[test]
fn load_all_skips_meta_json_missing_required_fields() {
    // A meta.json without scenario_snapshot or scenario_hash is
    // invalid under the new schema. load_all logs and skips it
    // rather than admitting it to the registry. This is the same
    // path used for directories whose name fails slug grammar.
    let tmp = TempDir::new().unwrap();
    let data_root = tmp.path();
    ensure_worlds_root(data_root).unwrap();
    let dir = world_dir(data_root, "stale");
    fs::create_dir_all(&dir).unwrap();
    fs::create_dir_all(dir.join("turns")).unwrap();

    // Pre-snapshot meta.json shape.
    let stale_meta = serde_json::json!({
        "slug": "stale",
        "name": "stale world",
        "scenario": "ant_on_plate",
        "created_at": "2026-01-01T00:00:00Z",
    });
    fs::write(
        dir.join("meta.json"),
        serde_json::to_vec_pretty(&stale_meta).unwrap(),
    ).unwrap();

    // Write a turn 0 file so that attach_world could otherwise
    // succeed. This isolates the meta.json read as the step that fails.
    let world = crate::kernel::World::with_environment(
        "stale".to_string(),
        Utc::now(),
        300,
        "test",
    );
    crate::kernel::Runtime::new(world, Director::default(), &dir).unwrap();

    let all = load_all(data_root).unwrap();
    assert!(
        all.get("stale").is_none(),
        "world without snapshot fields must not be admitted",
    );
}

#[test]
fn canonical_hash_is_key_order_independent() {
    let a = serde_json::json!({ "a": 1, "b": 2, "c": [3, 4, 5] });
    let b = serde_json::json!({ "c": [3, 4, 5], "b": 2, "a": 1 });
    assert_eq!(canonical_scenario_hash(&a), canonical_scenario_hash(&b));
}

#[test]
fn canonical_hash_changes_on_value_edit() {
    let a = serde_json::json!({ "x": "hello" });
    let b = serde_json::json!({ "x": "hello!" });
    assert_ne!(canonical_scenario_hash(&a), canonical_scenario_hash(&b));
}

#[test]
fn canonical_hash_distinguishes_array_order() {
    let a = serde_json::json!({ "xs": [1, 2, 3] });
    let b = serde_json::json!({ "xs": [3, 2, 1] });
    assert_ne!(canonical_scenario_hash(&a), canonical_scenario_hash(&b));
}
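The three hash properties tested above (key-order independence, sensitivity to value edits, sensitivity to array order) all follow from the canonical form itself. The sketch below illustrates that with a dependency-free miniature: Json, quote, and canonical_string are hypothetical stand-ins for serde_json::Value and canonicalize_json, and the comparison is on canonical strings rather than sha256 digests, since equal canonical bytes imply equal digests. String escaping is deliberately simplified.

```rust
/// Hypothetical miniature JSON type. Objects keep insertion order so
/// that two differently ordered but equal objects can be modeled.
enum Json {
    Num(i64),
    Str(String),
    Arr(Vec<Json>),
    Obj(Vec<(String, Json)>),
}

/// Quote a string, escaping only `"` and `\` (a simplification).
fn quote(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        if c == '"' || c == '\\' {
            out.push('\\');
        }
        out.push(c);
    }
    out.push('"');
    out
}

/// Compact canonical form: object keys sorted at every depth,
/// array order preserved, no whitespace.
fn canonical_string(value: &Json) -> String {
    match value {
        Json::Num(n) => n.to_string(),
        Json::Str(s) => quote(s),
        Json::Arr(items) => {
            let parts: Vec<String> = items.iter().map(canonical_string).collect();
            format!("[{}]", parts.join(","))
        }
        Json::Obj(fields) => {
            // Sort a view of the fields by key; the input is untouched.
            let mut sorted: Vec<&(String, Json)> = fields.iter().collect();
            sorted.sort_by(|a, b| a.0.cmp(&b.0));
            let parts: Vec<String> = sorted
                .iter()
                .map(|(k, v)| format!("{}:{}", quote(k), canonical_string(v)))
                .collect();
            format!("{{{}}}", parts.join(","))
        }
    }
}
```

Both orderings of the object in canonical_hash_is_key_order_independent canonicalize to {"a":1,"b":2,"c":[3,4,5]} here, while reversing an array changes the string and therefore would change the digest.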

Existing tests

The existing tests construct WorldMeta directly in a few places (e.g. views.rs::tests::seed_handle). Update those constructions to include the two new fields. Use the live scenario as the snapshot source where the test doesn't care about content:

let scenario = ScenarioCatalog::global().get("ant_on_plate").unwrap();
let snapshot = serde_json::to_value(scenario).unwrap();
let hash = crate::worlds::canonical_scenario_hash(&snapshot);
let meta = WorldMeta {
    slug: slug.to_string(),
    name: format!("test #{}", slug),
    scenario: scenario.scenario_slug.as_str().to_string(),
    created_at: Utc::now(),
    scenario_snapshot: snapshot,
    scenario_hash: hash,
};

The existing views.rs tests that use seed_handle will continue to work once seed_handle is updated to populate the new fields. No views.rs production code changes — the view builder doesn't read or forward the new fields in this ticket.


Acceptance


Explicitly out of scope


No open knobs

The spec is prescriptive. Every prose string, every field, every test, every deploy step is specified. If a question arises during implementation, the handler should leave a comment on the ticket rather than guessing.

Proposed resolution

Worlds now carry durable, self-contained provenance: every meta.json snapshots the full scenario content at world-creation time and a sha256 hash over the canonical form. Edits to scenario source files no longer retroactively change what existing worlds were seeded from. Hard schema cut: pre-snapshot meta.json files fail to load and are skipped.

Shipped commits (branch feat/scenario-snapshot merged to main)

HEAD of main is dc95c95. Deployed to pod chukwa-5c4d4f8dd8-42lg7.

Code delivered (per spec)

Test verification (worktree, re-verified before merge)

Grep guards

Pre-deploy world deletion (per spec)

The two existing worlds were deleted from the caller side before deploy:

Then code was merged + pushed + deployed. Pod rolled cleanly, no boot-time panics, load_all admitted zero worlds — correct behavior for an empty registry. No CrashLoopBackOff.

Live smoke (joint caller + handler, deployed)

Caller drove the world-touching MCP calls (my client schemas for create_world / get_world / run_turn / delete_world are stale — schema-staleness flagged by the caller as a separate ticket candidate, out of scope here). Handler verified on-disk state from the pod via kubectl exec.

Step 1 (caller create_world): seeded snap-smoke from ant_on_plate. Returned slug, name, scenario, created_at, simulation_time, turn=0.

Step 2 (handler disk read): Read /var/lib/chukwa/worlds/snap-smoke/meta.json (1429 bytes). Top-level keys exactly the 6 required fields. scenario_snapshot carries scenario_slug=ant_on_plate, chronon_seconds=300, four entities (ant, crumb, sugar_grain, sesame_seed), description and environment prose. scenario_hash = ea62cf01120b9ef2f7a1434a0328c3ab52411cd2f67ea5ce01afb086325b2a92, length 64, all lowercase hex. Independently canonicalized the snapshot in Python (recursive sorted-key BTreeMap-equivalent, no whitespace, preserved array order) and SHA-256'd it: matches ea62cf01...2a92 exactly.

Step 2b (cross-check vs source file): Read /app/repo/scenarios/ant_on_plate.json from the pod. Deep-equal to meta.json::scenario_snapshot. Same canonical hash. Confirms the snapshot embedded at creation time IS the source content at that moment.

Step 3 (caller get_world): Returned scenario_hash = ea62cf01...2a92, snapshot fields matching the disk. MCP layer is forwarding the new meta fields untouched.

Quadruple-match anchor: disk scenario_hash, disk scenario_snapshot recanonicalized, source file recanonicalized, and get_world MCP response — all four = ea62cf01...2a92.

Step 4 (caller run_turn): attempt 5c6477e7-fff2-4d8c-859c-4fada37b1b41 queued, ran, committed in 26s. Delta: turn 0→1, elapsed 300s, 4 audit events, ant entity touched, summary clean. Run-turn path is undisturbed by the schema cut.

Step 5 (handler post-turn meta.json byte-stability): Caller bundled steps 4 + 6, so by the time of this check snap-smoke/meta.json had already been deleted from disk. Byte-stability is therefore argued structurally rather than shown empirically: grep -n meta.write src/worlds.rs shows exactly one call site, in create_world. No code in the run_turn / kernel / persistence path writes meta.json. The unit test snapshot_survives_disk_roundtrip exercises load-after-create. Combined with the clean run_turn commit reported by the caller, the only way meta.json could change post-turn would be through a hidden write site, and the grep shows there is none. (If you'd like a stronger empirical bind on a future ticket, I can add a unit test that runs a turn between create and reload.)

Step 6 (caller delete_world): snap-smoke deleted at 2026-04-25T10:43:36Z. list_worlds count=0.

Step 7 (handler post-delete check): /var/lib/chukwa/worlds/snap-smoke is GONE on the pod. kubectl exec test -d returns exit 1.

One observation worth flagging

ls /var/lib/chukwa/worlds/ shows a UUID-shaped residual directory 3cc96ff3-be4b-4684-9b44-c632a6fd8a5e that's NOT in the registry (list_worlds count=0). Likely a pre-slug-refactor artifact. Under the new schema its meta.json lacks the snapshot fields and load_all correctly skips it — confirmed by the empty registry. Not a defect on this ticket; flagging in case the caller wants a future sweep ticket to clean orphan dirs.

Scope discipline

Per standing guidance I am not confirming — only proposing.
