Promote cognition setup to scenario data

resolved 3ff9de23-13f5-4e39-84e5-ee186ed92082

created_at
2026-04-25
updated_at
2026-04-25
code_context
src/scenarios.rs, src/minds.rs, src/kernel.rs, src/worlds.rs, scenarios/ant_on_plate.json, scenarios/locked_vending_room.json
priority
P2
ticket_type
feature
resolved_at
2026-04-25
resolution
accepted

Body

Motivation

The three system prompts that drive every turn — perceive, intend, adjudicate — currently live as hardcoded constants in src/minds.rs. The adjudication JSON schema, the user-prompt template for adjudication, the corrective retry template, and the retry budget are also hardcoded. Every world that runs against this codebase, regardless of scenario, uses the same cognition.

This means the most consequential variables in the simulation — the prompts that shape every perception, every intent, and every outcome decision — are invisible to scenario authoring. Two scenarios differ only in their entities and environment; the cognition that interprets them is identical. This collapses the dimension of variation that matters most for the research we want to do.

This ticket moves the cognition setup into scenario data as a required field. Each scenario file gains a cognition object containing the three system prompts, the adjudication user template, the corrective retry template, the adjudication JSON schema, and the retry budget. Every scenario must specify a cognition; there is no default. The kernel reads the cognition from the world (which carries it in the seeded snapshot) at turn time instead of from compiled-in constants.

The two existing scenario files (ant_on_plate.json, locked_vending_room.json) are updated to include their cognition objects — populated with the exact current values — so behavior on those two scenarios is unchanged after this lands. Any new scenario must specify its own; no fallback to a default exists.

The change is a hard schema cut. The Scenario struct, the on-disk scenario file format, the World struct, and the on-disk world turn-file format all gain new required fields. There is no backwards compatibility for any of them. All worlds in production are deleted before deploy, the same way they would be for any other schema cut.

Tests that construct World directly go through the ScenarioCatalog to obtain a real cognition. There is no test placeholder, no test-only constructor, no Option<Cognition> to support cheap test fixtures. Tests use real data; that's their job.


Caller / handler back-and-forth — read this before starting

This ticket WILL require live coordination between the handler and the caller during the smoke phase. This is stated up front so nobody is surprised by it.

The handler's MCP client schemas are known stale on create_world, get_world, run_turn, and delete_world — confirmed in two prior tickets. The handler cannot drive world-touching MCP calls from their session. Every such call in the smoke must come from the caller; the handler verifies on-disk state via kubectl exec.

If at any point the handler hits a permissions block (e.g., guardrails refusing destructive operations on production state), the handler MUST post a comment on this ticket immediately naming the block and what would unblock it. Surfacing blocks fast is mandatory; silent stalling is not acceptable. The caller will respond.

Pacing expectation: ~10 caller-handler round trips for the full smoke, each ~30s of caller MCP work plus ~30s of handler verification. Total smoke time roughly 30-60 minutes plus deploy time. Faster than the prior schema cut because the pattern is established.

The detailed smoke protocol is in the Acceptance section below.


The change

src/scenarios.rs — Cognition type and integration

Add a new type:

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Cognition {
    /// System prompt for the perceive step.
    pub perceive_system: String,

    /// System prompt for the intend step.
    pub intend_system: String,

    /// System prompt for the adjudicate step.
    pub adjudicate_system: String,

    /// User-message template for the initial adjudication prompt.
    /// Supports three substitution tokens, replaced via plain string
    /// replacement (no templating engine):
    ///   {world}   — rendered world snapshot
    ///   {agent}   — rendered acting agent
    ///   {intent}  — the agent's stated intent
    /// All three tokens must appear at least once in the template;
    /// load-time validation enforces this.
    pub adjudicate_user_template: String,

    /// User-message template for the corrective retry prompt sent
    /// when an adjudication is rejected. Supports one substitution
    /// token:
    ///   {complaint}  — the rejection message
    /// Must appear at least once; validated at load time.
    pub adjudicate_corrective_template: String,

    /// JSON schema enforced on the adjudicator's output. Stored as
    /// a free-form `Value` so future cognition objects can extend
    /// the schema without a Scenario type change.
    pub adjudication_schema: Value,

    /// Maximum number of corrective retries permitted before an
    /// adjudication is considered failed. The adjudicator gets
    /// (1 + retry_budget) attempts total.
    pub adjudication_retry_budget: u32,
}

There is no Cognition::default(), no Cognition::placeholder(), no test-only constructor. The only way to obtain a Cognition is to deserialize one from a scenario file via the catalog, or to construct one inline by literally specifying every field. The latter is permitted but never necessary in the test changes below.

Add the field to Scenario:

pub struct Scenario {
    pub scenario_slug: Slug,
    pub description: String,
    pub chronon_seconds: i64,
    pub environment: String,
    pub entities: Vec<Entity>,
    pub cognition: Cognition,
}

The intermediate ScenarioFile struct used for deserialization gains the same cognition: Cognition field. Required, no #[serde(default)].

In Scenario::seed, the world is built from the scenario's chronon_seconds, environment, and entities — same as today. Add: the new World.cognition field is populated from self.cognition.clone() at seed time.
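A minimal sketch of the seed-time copy described above. The structs are trimmed stand-ins (assumptions for brevity: only a few fields, and no simulation time argument), not the real Scenario, World, or Cognition definitions. The point it illustrates is that the world owns its own clone of the cognition, so later changes to the scenario value do not reach an already-seeded world.

```rust
/// Trimmed stand-in for the real `Cognition` (assumption: two fields only).
#[derive(Debug, Clone, PartialEq)]
pub struct Cognition {
    pub perceive_system: String,
    pub adjudication_retry_budget: u32,
}

/// Trimmed stand-in for the real `Scenario`.
pub struct Scenario {
    pub environment: String,
    pub cognition: Cognition,
}

/// Trimmed stand-in for the real `World`.
pub struct World {
    pub slug: String,
    pub environment: String,
    pub cognition: Cognition,
}

impl Scenario {
    /// Simplified seed: the real signature also takes a simulation time.
    pub fn seed(&self, slug: String) -> World {
        World {
            slug,
            environment: self.environment.clone(),
            // The world carries its own copy, set once at seeding.
            cognition: self.cognition.clone(),
        }
    }
}
```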

In ScenarioCatalog::load, after parsing each ScenarioFile, validate the cognition's templates contain their required tokens:

fn validate_cognition(slug: &str, cognition: &Cognition) -> Result<(), String> {
    for token in ["{world}", "{agent}", "{intent}"] {
        if !cognition.adjudicate_user_template.contains(token) {
            return Err(format!(
                "scenario {:?}: adjudicate_user_template must contain {}",
                slug, token,
            ));
        }
    }
    if !cognition.adjudicate_corrective_template.contains("{complaint}") {
        return Err(format!(
            "scenario {:?}: adjudicate_corrective_template must contain {{complaint}}",
            slug,
        ));
    }
    Ok(())
}

Validation failures during catalog load are panics, same as the existing filename-mismatch and slug-grammar checks. A bad scenario file is a build-time fault.
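A minimal sketch of how the load path might turn a validation error into a panic. ScenarioCatalog::load itself is not shown in this ticket, so the check_all helper and the trimmed Cognition (only the two template fields) are hypothetical names introduced for illustration. validate_cognition repeats the rules specified above.

```rust
/// Trimmed stand-in for the real `Cognition`: only the two template
/// fields that load-time validation inspects (assumption for brevity).
pub struct Cognition {
    pub adjudicate_user_template: String,
    pub adjudicate_corrective_template: String,
}

/// Same rules as the ticket's `validate_cognition`.
fn validate_cognition(slug: &str, cognition: &Cognition) -> Result<(), String> {
    for token in ["{world}", "{agent}", "{intent}"] {
        if !cognition.adjudicate_user_template.contains(token) {
            return Err(format!(
                "scenario {:?}: adjudicate_user_template must contain {}",
                slug, token,
            ));
        }
    }
    if !cognition.adjudicate_corrective_template.contains("{complaint}") {
        return Err(format!(
            "scenario {:?}: adjudicate_corrective_template must contain {{complaint}}",
            slug,
        ));
    }
    Ok(())
}

/// Hypothetical load-time hook: the first invalid cognition aborts the load.
pub fn check_all(scenarios: &[(&str, Cognition)]) {
    for (slug, cognition) in scenarios {
        if let Err(message) = validate_cognition(slug, cognition) {
            panic!("invalid scenario file: {}", message);
        }
    }
}
```

Because the check panics rather than returning an error, a scenario file with a missing token stops the process at catalog load instead of surfacing later at turn time.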

src/kernel.rs — World gains a cognition field

Add to World:

pub struct World {
    pub slug: String,
    pub simulation_time: DateTime<Utc>,
    pub chronon_seconds: i64,
    pub turn: u64,
    pub environment: String,
    pub entities: HashMap<String, Entity>,

    /// The cognition that interprets this world. Set once at
    /// seeding from the scenario; never mutated by the kernel.
    pub cognition: Cognition,
}

No #[serde(default)]. Required on disk in turn files. Old turn files without it fail to deserialize, and attach returns an error. The world is not admitted; the registry skips it and logs.

World::with_environment changes signature to accept a cognition:

pub fn with_environment(
    slug: impl Into<String>,
    simulation_time: DateTime<Utc>,
    chronon_seconds: i64,
    environment: impl Into<String>,
    cognition: Cognition,
) -> Self

World::new (the legacy convenience constructor that takes no environment) is deleted. It exists only to support tests that don't care about environment prose; under the same principle, we don't carry constructors that exist only for test convenience. Any current caller of World::new becomes a World::with_environment call with whatever environment string the test wants and a real cognition obtained from the catalog. (At time of writing, World::new is unused in production code; only kernel/persistence tests reference it.)

src/minds.rs — Read prompts and schema from the world's cognition

Change all three cognition-step functions to read prompts from the world's cognition field:

pub fn perceive(world: &World, agent: &Entity) -> Result<String, LlmError> {
    let system = &world.cognition.perceive_system;
    // ... existing user formatting unchanged ...
    llm::complete_text(system, &user).map(normalize_text)
}

pub fn intend(world: &World, agent: &Entity, perception: &str) -> Result<String, LlmError> {
    let system = &world.cognition.intend_system;
    // ... existing user formatting unchanged ...
    llm::complete_text(system, &user).map(normalize_text)
}

pub fn adjudicate(
    world: &World,
    agent: &Entity,
    intent: &str,
) -> Result<AdjudicationOutcome, AdjudicationError> {
    let system = &world.cognition.adjudicate_system;
    let schema = world.cognition.adjudication_schema.clone();
    let budget = world.cognition.adjudication_retry_budget;

    let initial_user = world.cognition.adjudicate_user_template
        .replace("{world}", &render_world(world))
        .replace("{agent}", &render_entity(agent))
        .replace("{intent}", intent);

    // ... loop unchanged ...
    // Inline closure replaces the old corrective_prompt fn:
    let corrective = |complaint: &str| -> String {
        world.cognition.adjudicate_corrective_template
            .replace("{complaint}", complaint)
    };
    // ... use `corrective(&rejection)` where corrective_prompt(&rejection) was called ...
}
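The substitution in adjudicate can be sketched in isolation as two small functions. The function names here are hypothetical; the replacement order (world, then agent, then intent) follows the snippet above.

```rust
/// Fill the adjudication user template by plain string replacement,
/// in the same order as the ticket's `adjudicate`: world, agent, intent.
pub fn render_adjudicate_user(
    template: &str,
    world: &str,
    agent: &str,
    intent: &str,
) -> String {
    template
        .replace("{world}", world)
        .replace("{agent}", agent)
        .replace("{intent}", intent)
}

/// Fill the corrective retry template with the rejection message.
pub fn render_corrective(template: &str, complaint: &str) -> String {
    template.replace("{complaint}", complaint)
}
```

One property of chained replacement worth knowing: text inserted by an earlier step is scanned by the later steps. If the rendered world happened to contain the literal text {intent}, it would be replaced along with the template's own token. The intent is substituted last, so tokens inside it are left alone.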

intend now takes world as an argument because it needs access to the cognition. The kernel call site changes accordingly: minds::intend(agent, perception) → minds::intend(&w, agent, perception). The intend function does not currently use the world for anything else; it will still ignore everything except the cognition. That's fine.

Delete:

The validate_adjudication function is unchanged; it's structural validation, not cognition data.
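The retry loop itself is unchanged and not reproduced in this ticket. As a rough sketch of the budget semantics only (1 + retry_budget attempts in total), the function below uses two closures as stand-ins: ask for the LLM call and validate for validate_adjudication. Both closure parameters, the function name, and the return shape are assumptions. The sketch also sends only the corrective prompt on a retry, leaving out whatever conversation history the real loop keeps.

```rust
/// Returns the accepted response and the number of attempts made, or
/// the last complaint once the retry budget is spent.
pub fn adjudicate_with_retries<A, V>(
    initial_user: &str,
    corrective_template: &str,
    retry_budget: u32,
    mut ask: A,
    validate: V,
) -> Result<(String, u32), String>
where
    A: FnMut(&str) -> String,
    V: Fn(&str) -> Result<(), String>,
{
    let mut prompt = initial_user.to_string();
    let mut attempts = 0u32;
    loop {
        attempts += 1;
        let response = ask(&prompt);
        match validate(&response) {
            Ok(()) => return Ok((response, attempts)),
            Err(complaint) => {
                // `attempts - 1` corrective retries have been used so far;
                // stop once the budget has been fully consumed.
                if attempts > retry_budget {
                    return Err(complaint);
                }
                prompt = corrective_template.replace("{complaint}", &complaint);
            }
        }
    }
}
```

With the budget of 2 used by both existing scenarios, an adjudicator that never produces a valid response is asked three times before the adjudication is considered failed.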

src/kernel.rs — call sites in run_turn

Three updates inside the run_turn loop:

  1. minds::perceive(&w, agent) — unchanged signature, still works.
  2. minds::intend(agent, perception) → minds::intend(&w, agent, perception), which adds the world.
  3. minds::adjudicate(&w, &agent_snapshot, &intent) — unchanged signature, still works (already takes the world).

The kernel does not branch on cognition contents; it just passes the world through. The kernel never reads world.cognition directly.

src/worlds.rs — World snapshot serialization

The scenario_snapshot field on WorldMeta already stores the full serialized scenario content. With this ticket, that snapshot now includes the cognition as a nested object — automatic, since it's just another field on Scenario. No worlds.rs production code change required.

The world's persisted turn files (turn_NNNNNN.json written by TurnStore) now include the cognition field on every committed world snapshot. This is a turn-file format change. Old turn files don't have the field, so they fail to deserialize on attach. This is the same hard-cut behavior already enforced for meta.json.

Scenario file updates

scenarios/ant_on_plate.json and scenarios/locked_vending_room.json each gain a cognition object containing exactly the values currently hardcoded in src/minds.rs:

"cognition": {
  "perceive_system": "You write a short perception for one agent in a prose simulation. Return plain text only, no markdown. Describe only what the agent could reasonably notice from the world state provided. Mention relevant nearby props and obvious physical constraints. Keep it to 1-4 sentences.",
  "intend_system": "You are choosing one immediate next-turn intention for an agent in a simulation. Return exactly one plain-text sentence in first person. Do not narrate outcomes that have not happened yet. Prefer the most achievable action that advances the goal this turn. Prefer reachable food or progress over distant or impossible targets.",
  "adjudicate_system": "You are the simulation director. Decide what actually happens this turn. Return JSON only. Never create or delete entities. Only mutate the acting agent, the world environment, or existing entities listed in the snapshot. If the intent is physically impossible, narrate the failed attempt and leave the unreachable target unchanged.",
  "adjudicate_user_template": "World snapshot:\n{world}\n\nActing agent:\n{agent}\n\nIntent:\n{intent}\n\nDecide the outcome for exactly one turn.\nRules:\n- `agent_state_after` must describe the acting agent's resulting physical state.\n- `agent_memory_append` should be one concise past-tense sentence, or an empty string.\n- `environment_after` must be null when unchanged.\n- `entity_mutations` may reference only ids already present in the world snapshot.\n- If nothing else changes, use an empty `entity_mutations` array.",
  "adjudicate_corrective_template": "Your previous response was rejected. {complaint}\n\nReturn a corrected JSON object matching the same schema. Keep everything else about the adjudication; only fix what was wrong.",
  "adjudication_schema": {
    "type": "object",
    "additionalProperties": false,
    "properties": {
      "narration": { "type": "string" },
      "agent_state_after": { "type": "string" },
      "agent_memory_append": { "type": "string" },
      "environment_after": { "type": ["string", "null"] },
      "entity_mutations": {
        "type": "array",
        "items": {
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "entity_id": { "type": "string" },
            "state": { "type": "string" }
          },
          "required": ["entity_id", "state"]
        }
      }
    },
    "required": ["narration", "agent_state_after", "agent_memory_append", "environment_after", "entity_mutations"]
  },
  "adjudication_retry_budget": 2
}

Both scenario files get this same cognition object byte-identical. The values reflect the current hardcoded behavior. After this ticket lands, the two existing scenarios behave exactly as they did before; the only difference is that their cognition is now data instead of code.

The user template's "Rules:" prose moves verbatim from minds.rs::adjudicate's initial_user format string into the template. Same wording, same line breaks, same bullet structure.

Test fixture migrations

Every test fixture that constructs a World directly uses the scenario catalog to obtain a real cognition. There is no test-only constructor.

In src/kernel.rs::tests::mini_world:

fn mini_world() -> World {
    let scenario = ScenarioCatalog::global()
        .get("ant_on_plate")
        .expect("ant_on_plate scenario in catalog");
    let mut w = World::with_environment(
        "kernel-transitions-test",
        Utc.with_ymd_and_hms(2026, 1, 1, 12, 0, 0).unwrap(),
        60,
        "A plain room.",
        scenario.cognition.clone(),
    );
    w.add_entity(
        Entity::agent("alice", "Alice", "standing still.", "observe.", "").unwrap(),
    ).unwrap();
    w.add_entity(Entity::prop("crumb", "Crumb", "a small bread crumb.").unwrap())
        .unwrap();
    w
}

In src/minds.rs::tests::world_with_ant_and_crumb:

fn world_with_ant_and_crumb() -> World {
    let scenario = crate::scenarios::ScenarioCatalog::global()
        .get("ant_on_plate")
        .expect("ant_on_plate scenario in catalog");
    let mut world = World::with_environment(
        "minds-test",
        Utc.with_ymd_and_hms(2026, 1, 1, 12, 0, 0).unwrap(),
        300,
        "plate",
        scenario.cognition.clone(),
    );
    world.add_entity(Entity::agent("ant", "Ant", "center", "eat", "").unwrap())
        .unwrap();
    world.add_entity(Entity::prop("crumb", "Crumb", "east").unwrap())
        .unwrap();
    world
}

In src/views.rs::tests::seed_handle: similar — clone the cognition from ScenarioCatalog::global().get("ant_on_plate").

In src/worlds.rs::tests::load_all_skips_meta_json_missing_required_fields: the test constructs a World to satisfy Runtime::new so that a turn 0 file exists alongside the malformed meta.json. Same pattern — clone the cognition from the catalog.

The World::new legacy constructor is unused in production and only referenced by tests that pre-date environment prose. Delete it. Update its callers (if any) to World::with_environment with explicit empty environment and a real cognition.

Test code is not exempt from honest construction. If a test wants a World, it gets one the way production gets one: via the catalog.


Tests

src/scenarios.rs::tests

Add catalog-load validation tests:

#[test]
fn validate_cognition_accepts_complete_templates() {
    let cognition = good_cognition();
    assert!(validate_cognition("test", &cognition).is_ok());
}

#[test]
fn validate_cognition_rejects_missing_world_token() {
    let mut cognition = good_cognition();
    cognition.adjudicate_user_template = "{agent} {intent}".to_string();
    let err = validate_cognition("test", &cognition).unwrap_err();
    assert!(err.contains("{world}"));
}

#[test]
fn validate_cognition_rejects_missing_agent_token() {
    let mut cognition = good_cognition();
    cognition.adjudicate_user_template = "{world} {intent}".to_string();
    let err = validate_cognition("test", &cognition).unwrap_err();
    assert!(err.contains("{agent}"));
}

#[test]
fn validate_cognition_rejects_missing_intent_token() {
    let mut cognition = good_cognition();
    cognition.adjudicate_user_template = "{world} {agent}".to_string();
    let err = validate_cognition("test", &cognition).unwrap_err();
    assert!(err.contains("{intent}"));
}

#[test]
fn validate_cognition_rejects_missing_complaint_token() {
    let mut cognition = good_cognition();
    cognition.adjudicate_corrective_template = "Try again.".to_string();
    let err = validate_cognition("test", &cognition).unwrap_err();
    assert!(err.contains("{complaint}"));
}

fn good_cognition() -> Cognition {
    Cognition {
        perceive_system: "p".into(),
        intend_system: "i".into(),
        adjudicate_system: "a".into(),
        adjudicate_user_template: "{world} {agent} {intent}".into(),
        adjudicate_corrective_template: "{complaint}".into(),
        adjudication_schema: serde_json::json!({}),
        adjudication_retry_budget: 2,
    }
}

Note: good_cognition() is local to this test module, used only to build inputs to validate_cognition. It is not exposed; it is not a "placeholder" available to other test fixtures; it cannot be misused as a stand-in for real cognition elsewhere. It exists in this one module for the express purpose of varying one field at a time to test validation rules.

Update existing seed tests to assert on the cognition field:

let scenario = ScenarioCatalog::global().get("ant_on_plate").unwrap();
assert!(!scenario.cognition.perceive_system.is_empty());
assert!(scenario.cognition.adjudicate_user_template.contains("{world}"));
assert_eq!(scenario.cognition.adjudication_retry_budget, 2);

let world = scenario.seed("test-slug".into(), Utc::now());
assert_eq!(world.cognition.adjudication_retry_budget, 2);
assert!(!world.cognition.perceive_system.is_empty());

Other test files

The four test fixtures (mini_world, world_with_ant_and_crumb, seed_handle, the worlds-test World construction) update per the migration above. The actual tests they support are unchanged; they continue to exercise the same code paths.


Acceptance

Code-level

Pre-deploy worlds purge (caller drives)

The handler asks the caller to purge worlds. The caller does the following and posts a confirmation comment when complete:

The handler does NOT attempt these calls themselves. Their MCP client schemas are stale; they will fail. The handler's job is to wait for the caller's confirmation comment, then proceed with deploy.

Deploy

The handler merges, pushes, and deploys after the worlds-purge confirmation lands. The handler then posts a deploy confirmation comment once the pod is rolled and healthy and list_worlds count = 0 immediately post-rollout (verified via kubectl exec or any mechanism the handler has).

Live smoke (caller and handler in lockstep)

The handler does NOT have working MCP for world-touching calls. Every create_world, get_world, run_turn, delete_world is driven by the caller. The handler verifies on-disk state via kubectl exec. Each step requires both parties; do not skip ahead.

  1. Caller: create_world(scenario="ant_on_plate", slug="cog-smoke-ant"). Post the response.
  2. Handler: Read /var/lib/chukwa/worlds/cog-smoke-ant/turn_000000.json from the pod. Verify the cognition field is present with all seven sub-fields populated. Verify the contents byte-equal scenarios/ant_on_plate.json::cognition from the embedded scenarios directory. Post confirmation.
  3. Caller: run_turn(slug="cog-smoke-ant"). Poll get_turn_status until committed. Post the attempt_id and committed status.
  4. Handler: Read the audit events from disk. Confirm: turn committed cleanly (no adjudication_rejected events for this attempt), perception/intent/adjudication events present, agent state changed sensibly. Post confirmation.
  5. Caller: delete_world(slug="cog-smoke-ant"). Post confirmation.
  6. Caller: create_world(scenario="locked_vending_room", slug="cog-smoke-vending"). Post the response.
  7. Handler: Read /var/lib/chukwa/worlds/cog-smoke-vending/turn_000000.json. Verify cognition field byte-equals scenarios/locked_vending_room.json::cognition. Post confirmation.
  8. Caller: run_turn(slug="cog-smoke-vending"). Poll until committed. Post results.
  9. Handler: Read the audit events. Confirm perception text does NOT mention stock counts or the slot-E defect (verifying perceive_system is steering filtering correctly). Confirm turn committed cleanly. Post confirmation.
  10. Caller: delete_world(slug="cog-smoke-vending"). Post confirmation.
  11. Handler: Verify both world directories are gone. Propose resolution.
  12. Caller: Audit the proposed resolution and accept.

The acceptance bar for steps 4 and 9 is "behavior on the two existing scenarios is indistinguishable from before, modulo LLM stochasticity," not "byte-identical traces." LLM sampling means we will not get identical words; what we want is the same character of behavior.

Block-surfacing requirement

If the handler hits a permissions block at any point during deploy or smoke (guardrails refusing destructive operations, stale schemas refusing tool calls, anything that stops forward motion), the handler MUST post a comment on this ticket immediately stating:

Silent stalling is not acceptable. The caller checks this ticket periodically; visible blocks get unblocked fast.


Explicitly out of scope


No open knobs

The spec is prescriptive. Every field, every prose string, every test, every smoke step is specified. If a question arises during implementation, the handler should leave a comment on the ticket rather than guessing.

Proposed resolution

Cognition is now scenario data. Every world carries the prompts/schema/budget that interpret it; the kernel reads from world.cognition; behavior on both existing scenarios is indistinguishable from the prior hardcoded version through the live run path.

Shipped commits (branch feat/cognition-in-scenarios merged to main)

HEAD of main is b7654ef. Deployed to pod chukwa-6dcd95c698-htq6t.

Code delivered (per spec)

Test verification (worktree, re-run before merge)

All 12 grep guards pass

The acceptance section's grep guards in src/:

The 6 prose-string matches now live in scenarios/*.json (where they belong, per spec).

Pre-deploy worlds purge

Registry was already empty (count=0) when I checked before merge. No caller-side purge needed.

Deploy

Live smoke (joint caller + handler, deployed)

12-step lockstep protocol per the spec. Caller drove all four world-touching MCP calls; I verified on disk via kubectl exec. All steps complete:

$ kubectl -n chukwa exec chukwa-6dcd95c698-htq6t -- ls -la /var/lib/chukwa/worlds/
total 8
drwxr-xr-x 2 chukwa chukwa 4096 Apr 25 12:21 .
drwxrwxrwx 5 root   root   4096 Apr 24 20:52 ..
$ test -d cog-smoke-ant: GONE
$ test -d cog-smoke-vending: GONE

Both worlds removed cleanly. list_worlds count=0.

Acceptance bar (per ticket)

"behavior on the two existing scenarios is indistinguishable from before, modulo LLM stochasticity," not "byte-identical traces."

Met for both ant_on_plate and locked_vending_room:

The substrate is intact. The same LLM character of output emerges through the new data-driven cognition path as through the prior hardcoded one.

Block-surfacing demonstrated

This ticket exercised the block-comment rule from feedback memory feedback_post_block_on_ticket.md. No handler-side blocks during this ticket cycle, but the rule was applied successfully on prior tickets (d57b0450, 6b3644db) and the memory remains in MEMORY.md for future use.

Process notes for the record

Per standing guidance I am not confirming — only proposing.
