
Semantic entity IDs + retry-with-feedback for adjudication

resolved ba4c3707-27ef-4370-9112-62ee66a5d821

created_at: 2026-04-23
updated_at: 2026-04-23
code_context: src/kernel.rs, src/minds.rs, src/llm.rs, src/persistence.rs, src/scenarios.rs
priority: P2
ticket_type: feature
resolved_at: 2026-04-23
resolution: accepted

Body

Two changes that compose cleanly and should land together. Part A replaces UUID entity identifiers with semantically meaningful strings so the LLM can reference entities naturally. Part B wraps minds::adjudicate in a retry-with-feedback loop so structurally or semantically invalid responses are returned to the model with specific corrections rather than killing the turn outright. Both are motivated by the same observation: the director's output goes through a serde-strict JSON schema, and when it fails the model never learns what it did wrong.

These interact: with semantic IDs, the retry loop's feedback messages become short and actionable ("valid entity ids: ant, crumb, sesame_seed, sugar_grain") rather than having to reproduce UUID rosters.

================================================================ PART A — Semantic entity IDs

Retire Entity::entity_id: Uuid. Replace with Entity::id: String, a normalized semantic identifier that is unique per-world and stable across turns. The UUID gave us uniqueness and stability but at the cost of being something no LLM can reliably transcribe. A semantic id gives us all four properties we actually want:

  1. Stable (never changes across turns)
  2. Unique within a world (kernel enforces at add_entity)
  3. Semantically meaningful (crumb, ant, plate.crumb)
  4. Normalizable (case-insensitive, whitespace-tolerant)

Grammar

An id is a dot-separated sequence of parts; each part is a short token over a restricted alphabet.

  id   := part ("." part)*
  part := [a-z0-9_-]+       -- non-empty; no internal whitespace

All of these are valid: ant, crumb, sugar_grain, ant-alice, first_ant, crumb_east, plate.crumb, plate.crumb_east, room.plate.crumb, ant.1, thing_42.

Invalid: "" (empty), "ant." (empty trailing part), ".ant", "ant..alpha", "first ant!" (unsupported char).

Normalization

Applied on every id input — both at add_entity time and when the model returns an id inside EntityStateMutation. Same function, called in both places. No "fallback" semantics; normalization IS the canonical form.

Steps, in order:

  1. Trim leading/trailing whitespace.
  2. Lowercase everything.
  3. Replace each run of internal whitespace with a single _.
  4. Validate against the grammar. Reject with a message that cites the offending character on failure.

Outcomes:

  "crumb"        -> crumb
  "Crumb"        -> crumb
  " CRUMB "      -> crumb
  "first ant"    -> first_ant
  "Ant Alpha"    -> ant_alpha
  "plate.crumb"  -> plate.crumb
  "PLATE.Crumb"  -> plate.crumb
  "ant.1"        -> ant.1
  ""             -> error (empty)
  "ant."         -> error (empty trailing part)
  "first ant!"   -> error (unsupported char '!')
  "crumb__east"  -> crumb__east (double underscore is allowed; a distinct id from crumb_east)

No silent collapse of repeated separators. If a scenario author writes crumb__east, they get what they typed. Collision checks at add_entity catch actual duplicates.
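The four steps above can be sketched as a single function. This is a minimal illustration, not the contents of src/entity_id.rs: the function body, the String error type, and the error wording are assumptions made for the example.

```rust
/// Hypothetical sketch of the normalizer described above.
/// Returns the canonical id, or a message naming what was wrong.
pub fn normalize(input: &str) -> Result<String, String> {
    // Steps 1-2: trim the ends, lowercase everything.
    let lowered = input.trim().to_lowercase();

    // Step 3: replace each run of internal whitespace with a single '_'.
    let mut id = String::with_capacity(lowered.len());
    let mut in_whitespace = false;
    for ch in lowered.chars() {
        if ch.is_whitespace() {
            if !in_whitespace {
                id.push('_');
            }
            in_whitespace = true;
        } else {
            id.push(ch);
            in_whitespace = false;
        }
    }

    // Step 4: validate against the grammar (dot-separated, non-empty parts).
    if id.is_empty() {
        return Err("entity id is empty".to_string());
    }
    for part in id.split('.') {
        if part.is_empty() {
            return Err(format!("entity id {id:?} has an empty part"));
        }
        let bad = part
            .chars()
            .find(|c| !matches!(c, 'a'..='z' | '0'..='9' | '_' | '-'));
        if let Some(bad) = bad {
            return Err(format!(
                "entity id {id:?} contains unsupported character {bad:?}"
            ));
        }
    }
    Ok(id)
}
```

Repeated underscores pass through untouched because only whitespace runs are collapsed, which matches the crumb__east outcome listed above.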

Methodology — choosing ids

Ordinals are allowed by the grammar for rare cases where no descriptor is natural, but they're a last resort.

Data model

Before:

  pub struct Entity {
      pub entity_id: Uuid,
      pub name: String,
      pub state: String,
      pub kind: EntityKind,
  }

After:

  pub struct Entity {
      pub id: String,    // normalized semantic id; required; unique per world
      pub name: String,  // display name for prose (can contain spaces, case)
      pub state: String,
      pub kind: EntityKind,
  }

World::entities stays HashMap<String, Entity> — the key is now the semantic id directly instead of a UUID stringified. World::add_entity normalizes, validates, collision-checks, and returns Result.
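A rough sketch of the normalize, validate, collision-check sequence in add_entity follows. The trimmed-down Entity and World types, the normalize_stub helper, and the String error type are all stand-ins invented for the example; the ticket only states that add_entity returns a Result.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Reduced stand-in for the real Entity (the `kind` field is omitted here).
#[derive(Debug, Clone, PartialEq)]
pub struct Entity {
    pub id: String,
    pub name: String,
    pub state: String,
}

#[derive(Debug, Default)]
pub struct World {
    pub entities: HashMap<String, Entity>,
}

/// Hypothetical stand-in for the real normalizer: trims and lowercases only.
fn normalize_stub(raw: &str) -> Result<String, String> {
    let id = raw.trim().to_lowercase();
    if id.is_empty() {
        Err("entity id is empty".to_string())
    } else {
        Ok(id)
    }
}

impl World {
    /// Normalizes the id, rejects duplicates, and stores the entity under
    /// its canonical id. The error type is an assumption for this sketch.
    pub fn add_entity(&mut self, mut entity: Entity) -> Result<(), String> {
        let id = normalize_stub(&entity.id)?;
        match self.entities.entry(id) {
            Entry::Occupied(slot) => Err(format!(
                "entity id {:?} already exists in this world",
                slot.key()
            )),
            Entry::Vacant(slot) => {
                // Store the canonical form so key and Entity::id always agree.
                entity.id = slot.key().clone();
                slot.insert(entity);
                Ok(())
            }
        }
    }
}
```

Because the collision check runs on the normalized form, "Crumb" and " CRUMB " are treated as the same id and the second insert is rejected.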

Every call site that took &Uuid now takes &str:

MCP get_entity tool already takes entity_id as a caller-supplied string; only its description changes ("UUID of the entity" -> "semantic entity id (e.g. 'crumb' or 'plate.crumb'); case-insensitive on input"). No signature change.

Ant scenario with the new shape

src/scenarios.rs:

  Scenario::AntOnPlate => {
      let mut world = World::with_environment(now, DEFAULT_CHRONON_SECONDS,
          "A small circular white plate, well-lit by an overhead lamp. ...");
      world.add_entity(Entity::agent("ant", "Ant",
          "at the center of the plate, feeling hungry.",
          "find and eat food.", "")).unwrap();
      world.add_entity(Entity::prop("crumb", "Crumb",
          "a small bread crumb resting 3cm east of center.")).unwrap();
      world.add_entity(Entity::prop("sugar_grain", "Sugar grain",
          "a sugar crystal on the plate's western edge, 5cm from center.")).unwrap();
      world.add_entity(Entity::prop("sesame_seed", "Sesame seed",
          "suspended in the air 2cm above the plate's center; ...")).unwrap();
      world
  }

Rendered snapshot in the adjudicate prompt:

Model first-turn response, natural form:

  {
    "narration": "You set out east, reach the crumb, and eat it.",
    "agent_state_after": "beside where the crumb was, still hungry but less so.",
    "agent_memory_append": "Turn 1: ate the crumb.",
    "environment_after": null,
    "entity_mutations": [
      { "entity_id": "crumb", "state": "consumed" }
    ]
  }

No retry needed. Zero transcription burden.

================================================================ PART B — Retry-with-feedback for adjudicate

What's broken today

minds::adjudicate is one-shot. Build prompt, call complete_json, return Result. If the model returns malformed JSON, serde fails and complete_json returns LlmError::Serialization. If the model returns well-formed JSON that references an entity_id the world doesn't contain, apply_adjudication catches it after mutations have already started landing on the working copy. Either way the whole turn attempt dies and the model never learns what it did wrong.

complete_json already retries once, narrowly, for infrastructure (HTTP 400 where the router rejects response_format: json_schema — re-sends with the schema inlined). That stays; it's unrelated.

Goal: when the model produces something structurally invalid or semantically wrong, return to the model with the original conversation plus a corrective user message naming exactly what was wrong. Bounded retry budget. Every rejection audit-logged. Budget exhaustion fails the turn with the full conversation attached.

Where the loop lives

Inside minds::adjudicate. Not inside run_turn (cognitive-layer policy doesn't belong in kernel scheduling). Not inside llm::complete_json (transport doesn't know what a valid Adjudication means — the validation rules depend on &World).

New shapes in llm.rs

Expose multi-turn conversations:

  #[derive(Debug, Clone, Serialize)]
  pub struct ChatMessage {
      pub role: String,    // "system" | "user" | "assistant"
      pub content: String,
  }

One new low-level function that returns the raw assistant text alongside the parse result, so a caller running a retry loop can show the model what it previously said:

  pub struct JsonCompletion<T> {
      pub raw_text: String,
      pub parsed: Result<T, String>, // String = serde's complaint
  }

  pub fn chat_json_raw<T: DeserializeOwned>(
      messages: &[ChatMessage],
      schema_name: &str,
      schema: Value,
  ) -> Result<JsonCompletion<T>, LlmError>

complete_json becomes a short wrapper that constructs [system, user] and calls chat_json_raw. Existing call sites untouched. The existing schema-inline infrastructure fallback for router 400s stays inside chat_json_raw — same behavior, invisible to callers.
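The retry loop later in this ticket calls ChatMessage::system, ChatMessage::user and ChatMessage::assistant without showing them. One plausible shape for those constructors, and for the [system, user] pair the wrapper builds, is sketched below. The with_role helper and the one_shot_messages function are hypothetical names, and the Serialize derive is left out so the sketch has no dependencies.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String, // "system" | "user" | "assistant"
    pub content: String,
}

impl ChatMessage {
    // Hypothetical private helper shared by the three constructors.
    fn with_role(role: &str, content: impl Into<String>) -> Self {
        Self {
            role: role.to_string(),
            content: content.into(),
        }
    }

    pub fn system(content: impl Into<String>) -> Self {
        Self::with_role("system", content)
    }

    pub fn user(content: impl Into<String>) -> Self {
        Self::with_role("user", content)
    }

    pub fn assistant(content: impl Into<String>) -> Self {
        Self::with_role("assistant", content)
    }
}

/// Builds the two-message conversation a one-shot wrapper would pass on
/// to the multi-turn entry point (function name assumed for illustration).
pub fn one_shot_messages(system: &str, user: &str) -> Vec<ChatMessage> {
    vec![ChatMessage::system(system), ChatMessage::user(user)]
}
```

Accepting impl Into<String> lets callers pass either a &str constant such as a system prompt or an owned String such as the model's raw text without an extra clone.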

New shapes in minds.rs

  #[derive(Debug, Clone)]
  pub struct FailedAdjudicationAttempt {
      pub attempt_number: u32,  // 1-indexed
      pub raw_response: String, // what the model said
      pub rejection: String,    // why we rejected it
  }

  #[derive(Debug, Clone)]
  pub enum AdjudicationError {
      // Retries exhausted. Every attempt in attempts, oldest first.
      RetriesExhausted { attempts: Vec<FailedAdjudicationAttempt> },
      // Transport / infrastructure failure: retry-with-feedback can't fix.
      // Bubbles up with one attempt burned.
      Transport(LlmError),
  }

  pub struct AdjudicationOutcome {
      pub adjudication: Adjudication,
      pub attempts: Vec<FailedAdjudicationAttempt>, // rejections before success
  }

The split matters: a router outage (LlmError::Transport, HttpStatus 5xx, InvalidResponse shape-level failures) gets one attempt and bubbles up. Only model-content failures (serde parse failures on valid transport, or semantic validation failures) burn the retry budget.

Retryable with feedback:

Non-retryable (bubble up as AdjudicationError::Transport):

Validator

  fn validate_adjudication(
      world: &World,
      agent_id: &str,
      draft: &Adjudication,
  ) -> Result<(), String> {
      if draft.agent_state_after.trim().is_empty() {
          return Err("agent_state_after must be a non-empty prose sentence.".into());
      }
      for (i, mutation) in draft.entity_mutations.iter().enumerate() {
          // Normalize. Grammar violations are model-correctable: show valid ids.
          let normalized = match entity_id::normalize(&mutation.entity_id) {
              Ok(n) => n,
              Err(e) => return Err(format!(
                  "entity_mutations[{i}].entity_id {:?} is not a valid entity id: {}. \
                   Valid entity ids in this world:\n{}",
                  mutation.entity_id, e, render_entity_roster(world))),
          };
          if !world.entities.contains_key(&normalized) {
              return Err(format!(
                  "entity_mutations[{i}].entity_id {:?} (normalized to {:?}) does not \
                   match any entity in this world. Valid entity ids:\n{}",
                  mutation.entity_id, normalized, render_entity_roster(world)));
          }
      }
      Ok(())
  }

  fn render_entity_roster(world: &World) -> String {
      // Sort the rows so the roster is stable regardless of HashMap order.
      let mut rows: Vec<_> = world.entities.values()
          .map(|e| format!(" - {} ({})", e.id, e.name))
          .collect();
      rows.sort();
      rows.join("\n")
  }

Error messages are written to be directly useful to the model.

Retry loop

  const ADJUDICATION_RETRY_BUDGET: u32 = 2; // 3 attempts total; env-overridable
  const ADJUDICATION_BUDGET_ENV: &str = "CHUKWA_ADJUDICATE_RETRY_BUDGET";

  pub fn adjudicate(
      world: &World,
      agent: &Entity,
      intent: &str,
  ) -> Result<AdjudicationOutcome, AdjudicationError> {
      let system = ADJUDICATION_SYSTEM_PROMPT;
      let initial_user = render_adjudication_user_prompt(world, agent, intent);
      let schema = adjudication_schema();

      let mut messages = vec![
          ChatMessage::system(system),
          ChatMessage::user(initial_user),
      ];
      let mut attempts: Vec<FailedAdjudicationAttempt> = Vec::new();
      let budget = read_budget_env().unwrap_or(ADJUDICATION_RETRY_BUDGET);

      for attempt_number in 1..=(budget + 1) {
          let completion = llm::chat_json_raw::<Adjudication>(
              &messages, "adjudication", schema.clone(),
          ).map_err(AdjudicationError::Transport)?;

          match completion.parsed {
              Ok(draft) => match validate_adjudication(world, &agent.id, &draft) {
                  Ok(()) => return Ok(AdjudicationOutcome { adjudication: draft, attempts }),
                  Err(complaint) => {
                      attempts.push(FailedAdjudicationAttempt {
                          attempt_number,
                          raw_response: completion.raw_text.clone(),
                          rejection: complaint.clone(),
                      });
                      if attempt_number > budget {
                          return Err(AdjudicationError::RetriesExhausted { attempts });
                      }
                      messages.push(ChatMessage::assistant(completion.raw_text));
                      messages.push(ChatMessage::user(corrective_prompt(&complaint)));
                  }
              },
              Err(serde_complaint) => {
                  let rejection = format!(
                      "Your previous response did not match the required JSON schema: {}",
                      serde_complaint);
                  attempts.push(FailedAdjudicationAttempt {
                      attempt_number,
                      raw_response: completion.raw_text.clone(),
                      rejection: rejection.clone(),
                  });
                  if attempt_number > budget {
                      return Err(AdjudicationError::RetriesExhausted { attempts });
                  }
                  messages.push(ChatMessage::assistant(completion.raw_text));
                  messages.push(ChatMessage::user(corrective_prompt(&rejection)));
              }
          }
      }
      unreachable!()
  }

  fn corrective_prompt(complaint: &str) -> String {
      format!(
          "Your previous response was rejected. {complaint}\n\n\
           Return a corrected JSON object matching the same schema. \
           Keep everything else about the adjudication; only fix what was wrong."
      )
  }
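The control flow of the loop can be exercised without a network by swapping the LLM and validator for closures. The sketch below is a simplified analogue, not the real minds::adjudicate: retry_with_feedback, FailedAttempt, parse_budget, and the string-based transcript are names and shapes invented for the example. parse_budget shows one possible reading of "env-overridable", where an absent or unparseable value falls back to the default.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct FailedAttempt {
    pub attempt_number: u32, // 1-indexed
    pub raw_response: String,
    pub rejection: String,
}

/// Hypothetical helper: interprets an optional override string as a retry
/// budget, falling back to `default` when absent or not a valid u32.
pub fn parse_budget(raw: Option<&str>, default: u32) -> u32 {
    raw.and_then(|s| s.trim().parse::<u32>().ok())
        .unwrap_or(default)
}

/// Simplified retry-with-feedback loop. `model` sees the transcript so far
/// and returns raw text; `validate` accepts it or explains the rejection.
/// Ok carries the accepted text plus earlier rejections; Err carries every
/// rejected attempt, oldest first.
pub fn retry_with_feedback<M, V>(
    budget: u32,
    mut model: M,
    validate: V,
) -> Result<(String, Vec<FailedAttempt>), Vec<FailedAttempt>>
where
    M: FnMut(&[String]) -> String,
    V: Fn(&str) -> Result<(), String>,
{
    let mut transcript = vec!["user: initial prompt".to_string()];
    let mut attempts = Vec::new();

    // budget retries means budget + 1 attempts in total.
    for attempt_number in 1..=budget.saturating_add(1) {
        let raw = model(&transcript);
        match validate(&raw) {
            Ok(()) => return Ok((raw, attempts)),
            Err(rejection) => {
                // Replay the model's own words, then say what was wrong.
                transcript.push(format!("assistant: {raw}"));
                transcript.push(format!(
                    "user: Your previous response was rejected. {rejection}"
                ));
                attempts.push(FailedAttempt {
                    attempt_number,
                    raw_response: raw,
                    rejection,
                });
            }
        }
    }
    // Every attempt was rejected: the budget is exhausted.
    Err(attempts)
}
```

In this variant, exhaustion is reported by falling out of the loop rather than by an in-loop budget check, so no unreachable!() is needed. Whether that trade is worth making in the real function is a style choice; behavior for a given budget is the same.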

apply_adjudication

Stays as-is except for one change: it normalizes the entity_id before lookup, so even a successful-validation draft with "Crumb" ends up mutating the crumb entity. Belt-and-suspenders — the validator already guaranteed normalization works, but doing it again here keeps the invariant local and means we're never looking up a non-normalized key.

  for mutation in &adjudication.entity_mutations {
      let normalized = entity_id::normalize(&mutation.entity_id)
          .expect("validator guaranteed normalization succeeds");
      let entity = world.get_mut(&normalized).ok_or_else(|| {
          io::Error::new(io::ErrorKind::NotFound,
              format!("unreachable: validator passed but entity {} missing", normalized))
      })?;
      // ... existing mutation code ...
  }

Audit trail

One new event type in persistence.rs:

  pub fn log_adjudication_rejected(
      &self,
      world_id: Uuid,
      turn: u64,
      attempt_id: Uuid,
      attempt_status: &str,
      entity_id: &str,
      adjudication_attempt: u32,
      raw_response: &str,
      rejection: &str,
  ) -> io::Result<()>

Event type string: "adjudication_rejected". Fields: turn, agent, retry index, model's text, rejection reason.

Extend PendingAuditEvent in kernel.rs:

  AdjudicationRejected {
      entity_id: String,
      attempt_number: u32,
      raw_response: String,
      rejection: String,
  }

run_turn stages one AdjudicationRejected per rejection (in both success-after-retries and budget-exhausted paths), plus the existing intent_adjudicated on success. flush_attempt_events routes AdjudicationRejected variants through log_adjudication_rejected.

Failure path wiring:

  let outcome = minds::adjudicate(&w, &agent_snapshot, &intent)
      .map_err(|e| match e {
          AdjudicationError::RetriesExhausted { attempts } => {
              for a in &attempts {
                  staged_events.push(PendingAuditEvent::AdjudicationRejected {
                      entity_id: agent.id.clone(),
                      attempt_number: a.attempt_number,
                      raw_response: a.raw_response.clone(),
                      rejection: a.rejection.clone(),
                  });
              }
              let last = attempts.last().map(|a| a.rejection.clone())
                  .unwrap_or_else(|| "unknown".into());
              TurnFailure::for_entity("adjudicate", &agent.id,
                  format!("adjudication rejected after retries exhausted: {last}"))
          }
          AdjudicationError::Transport(inner) =>
              TurnFailure::for_entity("adjudicate", &agent.id, inner.to_string()),
      })?;

  // Stage rejections that preceded success; they're part of the story.
  for a in &outcome.attempts {
      staged_events.push(PendingAuditEvent::AdjudicationRejected { ... });
  }
  let adjudication = outcome.adjudication;
  let entities_touched = apply_adjudication(&mut w, &agent.id, &adjudication)?;
  staged_events.push(PendingAuditEvent::Adjudication { ... });

================================================================ Worked examples

Successful retry (ant turn 1, model first confuses id)

Model's first adjudicate response: { ..., "entity_mutations": [ { "entity_id": "THE CRUMB", "state": "consumed" } ] }

  1. Serde tries to deserialize — succeeds (it's a string).
  2. Validator calls entity_id::normalize("THE CRUMB") -> Ok("the_crumb").
  3. world.entities doesn't contain "the_crumb", so the validator returns:

       entity_mutations[0].entity_id "THE CRUMB" (normalized to "the_crumb") does not match any entity in this world. Valid entity ids:
        - ant (Ant)
        - crumb (Crumb)
        - sesame_seed (Sesame seed)
        - sugar_grain (Sugar grain)
  4. adjudicate appends the model's raw response as an assistant message plus a corrective user message carrying the rejection, then retries.
  5. Model returns attempt 2 with entity_id: "crumb". Validator passes.
  6. Returns AdjudicationOutcome { adjudication, attempts: [attempt 1] }.

Audit log:

  turn 1, perception_emitted, ant, committed
  turn 1, intent_formed, ant, committed
  turn 1, adjudication_rejected, ant, committed, attempt=1, rejection="...does not match..."
  turn 1, intent_adjudicated, ant, committed
  turn 1, turn_complete, committed

Budget exhausted

Model persistently returns entity_id that doesn't resolve. Budget = 2.

  1. Attempt 1 rejected, retry.
  2. Attempt 2 rejected, retry.
  3. Attempt 3 rejected; attempt_number > budget, return RetriesExhausted.
  4. run_turn stages three AdjudicationRejected events, wraps in TurnFailure.
  5. Failure path: perception + intent + three rejections + attempt_failed terminator.

Audit log:

  turn 1, perception_emitted, ant, failed
  turn 1, intent_formed, ant, failed
  turn 1, adjudication_rejected, ant, failed, attempt=1
  turn 1, adjudication_rejected, ant, failed, attempt=2
  turn 1, adjudication_rejected, ant, failed, attempt=3
  turn 1, attempt_failed, step=adjudicate, error="...retries exhausted: ..."

Canonical world unchanged. Operator reads log, sees a persistent failure pattern.

================================================================ File-by-file changes

src/entity_id.rs (NEW, ~60 lines with tests):

src/kernel.rs (~80 net lines):

src/minds.rs (~120 net lines):

src/llm.rs (~50 net lines):

src/persistence.rs (~45 net lines):

src/scenarios.rs (~15 net lines):

src/mcp.rs (~6 lines):

src/worlds.rs:

src/lib.rs:

tests/phase0.rs (~10 lines):

tests/ant_scenario.rs (~8 lines):

docs/terms.md (~40 lines):

No new Cargo dependencies.

================================================================ Out of scope, deliberately

================================================================ Open knobs

Proposed resolution

Implemented end-to-end, committed, deployed, and verified in production.

Semantic entity IDs (Part A):

Adjudication retry-with-feedback (Part B):

Extras beyond the spec checklist that were required for correctness:

Docs: new sections in docs/terms.md for entity id grammar/normalization/methodology and adjudication retry semantics.

Receipts:

Deploy (production):

Note on commit scope: the single commit also contains the separate db337f85 cleanup (shared manifest, annotations removal, dead-helper removal, ticket_ops test migration). That work was already in the tree awaiting your confirmation on that ticket, was functionally complete and tested, and overlapped the same src/mcp.rs so splitting was churn. Both sets are now live in production. The db337f85 ticket is still in proposed_resolution; this doesn't close it — confirming or rejecting it separately is still up to you.

Caveat per standing guidance: I am not confirming this ticket — only proposing. Over to you.
