chukwa — ticket ee470925

HOLD — DO NOT PICK UP UNTIL HUMAN AUTHORIZATION.

Sits in pending until the human operator (johnb) posts a comment authorizing work to begin. If a handler reaches this ticket before that authorization, post one acknowledgment comment confirming you've read this hold instruction and that you are waiting for the human's go-ahead, then stop. Do not branch, do not read code, do not draft a plan. The human will return to either authorize, defer, or rewrite.

Background

The substrate has two distinct identifiers for scenarios that are easy to confuse:

scenario_slug — a field on the scenario manifest itself (the stored Scenario struct). Set at scenario assembly time, baked into the content hash, immutable. Example: midnight_library.
scenario_name — a separate human-readable binding stored in the scenario_names table, mapped to a content hash. Mutable, settable/unsettable via set_scenario_name / unset_scenario_name. Example: midnight-library.

These can — and currently do — diverge. For the world first-meeting, the scenario manifest carries scenario_slug: midnight_library (with underscore), while the scenario_names table has the binding midnight-library (with hyphen). Both refer to the same scenario hash 9e525e2adfb0ed0d9f6cbe95f52f192e30836c020b1e9526822c3f32ed0de4d0.

When a world is created, its scenario_label is derived from the manifest's scenario_slug, automatically:

// src/world_store/postgres.rs
let scenario_label = scenario.scenario_slug.as_str().to_string();

So the world stores and surfaces scenario_label: midnight_library — the slug, not the name binding.

get_scenario({ name: ... }) consults only the scenario_names table:

// somewhere in scenario_store/postgres.rs
SELECT hash FROM scenario_names WHERE name = $1

There is no fallback from name → slug, and there is no underscore↔hyphen normalization. The Label documentation explicitly forbids normalization:

No normalization on input — "Subject" is a grammar violation, not a coercion to "subject".

The grammar permits both _ and - as distinct literal characters.
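
A minimal sketch of what "no normalization on input" implies for a validator. Only two facts come from the Label documentation quoted above: "Subject" is rejected rather than coerced, and _ and - are distinct characters. The allowed character set below, and the names parse_label and LabelError, are assumptions made for illustration; the real grammar may differ.

```rust
/// Hypothetical error type; the real Label type may report violations differently.
#[derive(Debug, PartialEq)]
pub enum LabelError {
    Empty,
    IllegalChar(char),
}

/// Validates a label without rewriting it.
/// ASSUMPTION: the allowed set is lowercase ASCII letters, digits, '_' and '-'.
pub fn parse_label(input: &str) -> Result<&str, LabelError> {
    if input.is_empty() {
        return Err(LabelError::Empty);
    }
    for c in input.chars() {
        let ok = c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_' || c == '-';
        if !ok {
            // Reject instead of coercing: "Subject" does not become "subject".
            return Err(LabelError::IllegalChar(c));
        }
    }
    // The input is returned unchanged; '_' is never folded into '-'.
    Ok(input)
}
```

Under these assumptions, midnight_library and midnight-library both validate and remain two different labels.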

The footgun

The MCP get_world / list_worlds output for first-meeting looks like this:

{
  "scenario": "midnight_library",
  "scenario_label": "midnight_library",
  "scenario_hash": "9e525e2adfb0ed0d9f6cbe95f52f192e30836c020b1e9526822c3f32ed0de4d0"
}

A reasonable next move for an operator (or an agent) is:

get_scenario({ name: "midnight_library" })

This returns 404 / UNKNOWN_SCENARIO, because the scenario_names binding is midnight-library, not midnight_library. The output told the caller "scenario": "midnight_library" and the world's scenario_label is "midnight_library", but neither of those fields is a name binding. Both carry the manifest slug. The caller has no way to learn this from the output shape alone.

get_scenario({ hash: "9e525e2…" }) works. get_scenario({ name: "midnight-library" }) works. get_scenario({ name: "midnight_library" }) doesn't. The output the user just read implies it should.
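
The divergence can be reproduced with a small in-memory model. This is a sketch only: ScenarioStore and its methods are hypothetical stand-ins for the scenario_names table and the stored manifests, not the real store API.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the two lookup axes described above.
/// ASSUMPTION: field and method names are illustrative, not the real store API.
pub struct ScenarioStore {
    /// Mirrors the scenario_names table: name binding -> content hash.
    names: HashMap<String, String>,
    /// Mirrors the stored manifests: content hash -> scenario_slug.
    manifests: HashMap<String, String>,
}

impl ScenarioStore {
    /// Builds the state described for the first-meeting world.
    pub fn first_meeting_fixture() -> Self {
        let hash =
            "9e525e2adfb0ed0d9f6cbe95f52f192e30836c020b1e9526822c3f32ed0de4d0".to_string();
        let mut names = HashMap::new();
        names.insert("midnight-library".to_string(), hash.clone());
        let mut manifests = HashMap::new();
        manifests.insert(hash, "midnight_library".to_string());
        ScenarioStore { names, manifests }
    }

    /// Name lookup consults only the bindings, as the SQL query above does.
    pub fn get_by_name(&self, name: &str) -> Option<&str> {
        self.names.get(name).map(String::as_str)
    }

    /// Hash lookup succeeds whenever the manifest exists.
    pub fn get_by_hash(&self, hash: &str) -> Option<&str> {
        self.manifests.get_key_value(hash).map(|(k, _)| k.as_str())
    }

    /// What a world row surfaces as scenario_label: the manifest slug.
    pub fn scenario_label(&self, hash: &str) -> Option<&str> {
        self.manifests.get(hash).map(String::as_str)
    }
}
```

In this model, feeding the value returned by scenario_label back into get_by_name yields None, while the hyphenated binding and the hash both resolve.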

Why this isn't urgent (but is worth fixing)

This is a lookup-time issue that affects callers reading scenarios by name through the MCP. It does not affect run_turn execution: the world is located by its own slug, the scenario hash is already resolved on the world row, and cognition profiles are loaded by their content hashes. The slug-vs-name path is never exercised during simulation. The same hash chain produces the same behavior regardless of whether the scenario was looked up by name, slug, or hash.

So this is a discoverability bug, not a correctness bug. Operators (human or agent) trip on it; the substrate's actual computation is unaffected.

Goal

A caller who reads "scenario": "midnight_library" from the MCP output and tries to look up that scenario should succeed, or should fail in a way that points them to the right field. No silent normalization magic; no hidden fallbacks the user can't predict.

Design candidates

Three plausible directions, each with trade-offs. The handler picks one (or some composition) after reading the code with this lens. I'm naming candidates so the spec is concrete; I am not requiring any specific one.

Candidate A — add explicit slug support to `get_scenario`

Extend the get_scenario MCP tool input to accept { slug: "midnight_library" } as a third lookup mode alongside { name: ... } and { hash: ... }. The implementation queries scenarios.scenario_slug directly. Because slugs are not unique across the scenario store (the same slug can appear on derivative scenarios), the lookup either returns the most recent matching hash, or returns a list of all matching hashes, or errors with AMBIGUOUS_SLUG and asks the caller to disambiguate by hash.

Pros. Honors the existing distinction between slug and name. Adds capability without changing existing semantics. Forces the caller to be explicit about which axis they're querying. Easy to test: round-trip every scenario by all three of (hash, name, slug). No hidden coercion.

Cons. The slug-non-uniqueness question forces a design decision: what is a slug lookup for? If the use case is "give me the canonical scenario for this slug right now," "most recent" is fine. If it is "show me everything tagged with this slug," return a list. Either is defensible; pick consciously and document the choice.
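
The three behaviors for a non-unique slug can be sketched side by side. Everything named here (ScenarioRow, SlugPolicy, SlugLookupError, get_by_slug, and the created_at ordering field) is hypothetical; in particular, the ticket does not say how "most recent" would be determined, so a numeric timestamp is assumed.

```rust
/// One stored scenario row.
/// ASSUMPTION: `created_at` stands in for whatever ordering the real store has.
#[derive(Debug, Clone, PartialEq)]
pub struct ScenarioRow {
    pub hash: String,
    pub slug: String,
    pub created_at: u64,
}

/// The three policies named in Candidate A for a non-unique slug.
#[derive(Debug, Clone, Copy)]
pub enum SlugPolicy {
    MostRecent,
    ListAll,
    ErrorIfAmbiguous,
}

#[derive(Debug, PartialEq)]
pub enum SlugLookupError {
    UnknownScenario,
    /// Carries the candidate hashes so the caller can disambiguate by hash.
    AmbiguousSlug(Vec<String>),
}

/// Returns matching hashes, newest first. No '_'/'-' folding is applied.
pub fn get_by_slug(
    rows: &[ScenarioRow],
    slug: &str,
    policy: SlugPolicy,
) -> Result<Vec<String>, SlugLookupError> {
    let mut matches: Vec<&ScenarioRow> = rows.iter().filter(|r| r.slug == slug).collect();
    if matches.is_empty() {
        return Err(SlugLookupError::UnknownScenario);
    }
    matches.sort_by(|a, b| b.created_at.cmp(&a.created_at));
    let hashes: Vec<String> = matches.iter().map(|r| r.hash.clone()).collect();
    match policy {
        SlugPolicy::MostRecent => Ok(vec![hashes[0].clone()]),
        SlugPolicy::ListAll => Ok(hashes),
        SlugPolicy::ErrorIfAmbiguous if hashes.len() > 1 => {
            Err(SlugLookupError::AmbiguousSlug(hashes))
        }
        SlugPolicy::ErrorIfAmbiguous => Ok(hashes),
    }
}
```

Taking the policy as a parameter is only a device for comparing the options; a real implementation would presumably commit to one of them and document it.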

Candidate B — fall back from name to slug in `get_scenario({ name })`

If scenario_names has no binding for the requested name, fall back to a scenarios.scenario_slug query. Maybe also fall back through underscore↔hyphen normalization. Caller doesn't have to know the difference; lookups "just work."

Pros. Smallest caller-facing API change. Existing get_scenario({ name: ... }) calls that previously 404'd start succeeding. No new tool surface.

Cons. Hidden normalization is exactly what the Label docs warn against ("Subject is a grammar violation, not a coercion"). The fallback obscures the distinction between "binding lookup" and "slug lookup" — a caller can't tell from a successful response which mechanism worked. If two scenarios have identical slugs but different name bindings, the fallback could surface a different scenario than the caller intended. Future tap-shoe footgun.
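
If a fallback were adopted anyway, one way to keep it from being silent is to report which mechanism resolved the lookup. The sketch below assumes hypothetical names throughout (ResolvedVia, Resolution, get_by_name_with_fallback) and models the stores as plain collections. It performs no underscore/hyphen normalization and refuses to guess when a slug is ambiguous.

```rust
use std::collections::HashMap;

/// Which mechanism produced the hash. Surfacing this keeps the fallback visible.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ResolvedVia {
    NameBinding,
    SlugFallback,
}

#[derive(Debug, PartialEq)]
pub struct Resolution {
    pub hash: String,
    pub resolved_via: ResolvedVia,
}

/// ASSUMPTION: `names` models scenario_names (name -> hash) and `slugs`
/// models (hash, scenario_slug) pairs; neither is the real store API.
pub fn get_by_name_with_fallback(
    names: &HashMap<String, String>,
    slugs: &[(String, String)],
    requested: &str,
) -> Option<Resolution> {
    // A name binding always wins when one exists.
    if let Some(hash) = names.get(requested) {
        return Some(Resolution {
            hash: hash.clone(),
            resolved_via: ResolvedVia::NameBinding,
        });
    }
    // Fall back only when exactly one manifest carries this slug; an
    // ambiguous slug resolves to nothing rather than to a guess.
    let mut hits = slugs.iter().filter(|(_, slug)| slug == requested);
    match (hits.next(), hits.next()) {
        (Some((hash, _)), None) => Some(Resolution {
            hash: hash.clone(),
            resolved_via: ResolvedVia::SlugFallback,
        }),
        _ => None,
    }
}
```

A caller that receives resolved_via: SlugFallback can tell that the string it passed was not a name binding, which addresses part of the objection above without removing it.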

Candidate C — rename the world's output field to disambiguate

Rename the world's output scenario / scenario_label field to something that doesn't suggest "you can pass this to get_scenario by name." Options: scenario_slug (matches the manifest field directly), scenario_id, or just don't surface anything name-shaped at all (only scenario_hash).

Pros. Removes the misleading affordance at the source. If the output never says "scenario: foo," nobody tries get_scenario({ name: "foo" }) based on misreading the field. Honest about what the field is.

Cons. Breaking change to the MCP wire shape. Every caller (UI, agent, downstream tooling) that reads the world's scenario field has to update. Less helpful to operators reading a world page who genuinely want to know "what scenario is this," even if they then have to look up the name binding separately. Doesn't help if a future world-creation API allows the slug field to be set by the caller in a way that conflicts with bindings.
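
For illustration, the renamed output might look like the following. WorldScenarioRef and to_json are hypothetical, and the hand-rolled formatter stands in for whatever serialization the MCP server actually uses; it skips escaping on the assumption that slugs and hex hashes contain no characters that need it.

```rust
/// Hypothetical shape of the scenario portion of a world after a Candidate C rename.
/// ASSUMPTION: the real payload has more fields and its own serializer.
pub struct WorldScenarioRef {
    pub scenario_slug: String,
    pub scenario_hash: String,
}

impl WorldScenarioRef {
    /// Emits only slug and hash; there is no field that reads like a name binding.
    /// No JSON escaping is done, so this is safe only for slug/hex-style values.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"scenario_slug\":\"{}\",\"scenario_hash\":\"{}\"}}",
            self.scenario_slug, self.scenario_hash
        )
    }
}
```

The field name now matches the manifest field, so the natural next call is a hash lookup (or a slug lookup, if one exists) rather than get_scenario({ name }).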

Recommended evaluation criteria

The handler should choose by answering:

What does an operator using the MCP actually want? If the typical workflow is "look at a world, click through to its scenario," the world's output should give them something they can navigate. Either a hash (always works), a slug (with explicit slug-lookup support), or a name binding (which is decoupled from the manifest and might not exist).
How often do slug and name diverge in practice? If they almost always agree (because the user creates a name binding that matches the slug), fixing this is mostly cleanup. If they routinely diverge, the design is communicating something the spec should match.
Does the graph browser (just shipped) need an update? The graph-browser routes use get_scenario({ hash }) for detail pages and have a /scenarios/name/:name route. Whichever candidate is chosen, the graph browser's behavior should be consistent — if scenario_label: midnight_library shows up on a world's detail page as a clickable link, that link should resolve, not 404.
What's the smallest fix that closes the footgun? Candidate C is one rename. Candidate A is one new lookup mode. Candidate B is one fallback path. All three are small. The differentiator is which one teaches the caller the distinction without sacrificing usability.

The handler's proposed_resolution should explicitly name the chosen candidate (or composition) and explain why, with the rejected alternatives briefly addressed.

Approach

Single phase, probably. The fix is small. The discipline matches what this week's substrate tickets demonstrated:

Test discipline. Postgres tests run against the sacrificial sidecar at DATABASE_URL=postgres://postgres:postgres@127.0.0.1:5433/postgres. Never the cluster.
Round-trip tests. Whatever the chosen candidate, write tests that round-trip a scenario through every lookup path supported. For each scenario in the test fixture: get_scenario({ hash }) works; get_scenario({ name: "<binding>" }) works (if binding exists); get_scenario({ slug: "<slug>" }) works (if slug-lookup is added); cross-axis confusions return clean errors.
Graph browser consistency. Whatever lookup paths the MCP supports, the graph browser's links must resolve. If scenario_label: midnight_library renders as a clickable anchor on a world detail page, that anchor's href must resolve. If candidate C is chosen and the field is renamed, the graph browser's structural linker needs to know about the new field name.
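
The round-trip discipline above could be sketched against an in-memory store as follows. All names here (Lookup, Fixture, MemStore, round_trip_failures) are hypothetical, the Slug axis applies only if Candidate A is adopted, and slugs are treated as unique to keep the sketch short.

```rust
use std::collections::HashMap;

/// Lookup axes. `Slug` only applies if Candidate A is adopted.
pub enum Lookup<'a> {
    Hash(&'a str),
    Name(&'a str),
    Slug(&'a str),
}

/// Fixture entry: one scenario and the identifiers that should reach it.
pub struct Fixture {
    pub hash: &'static str,
    pub slug: &'static str,
    pub binding: Option<&'static str>,
}

/// In-memory stand-in. ASSUMPTION: names and shapes are illustrative only.
pub struct MemStore {
    hashes: Vec<&'static str>,
    by_name: HashMap<&'static str, &'static str>,
    by_slug: HashMap<&'static str, &'static str>,
}

impl MemStore {
    pub fn from_fixtures(fixtures: &[Fixture]) -> Self {
        let mut store = MemStore {
            hashes: Vec::new(),
            by_name: HashMap::new(),
            by_slug: HashMap::new(),
        };
        for f in fixtures {
            store.hashes.push(f.hash);
            store.by_slug.insert(f.slug, f.hash);
            if let Some(name) = f.binding {
                store.by_name.insert(name, f.hash);
            }
        }
        store
    }

    /// Each axis consults only its own index; there is no cross-axis fallback.
    pub fn get(&self, lookup: Lookup<'_>) -> Result<&'static str, &'static str> {
        let found = match lookup {
            Lookup::Hash(h) => self.hashes.iter().copied().find(|v| *v == h),
            Lookup::Name(n) => self.by_name.get(n).copied(),
            Lookup::Slug(s) => self.by_slug.get(s).copied(),
        };
        found.ok_or("UNKNOWN_SCENARIO")
    }
}

/// Returns a description of every lookup that did not behave as expected.
pub fn round_trip_failures(store: &MemStore, fixtures: &[Fixture]) -> Vec<String> {
    let mut failures = Vec::new();
    for f in fixtures {
        if store.get(Lookup::Hash(f.hash)) != Ok(f.hash) {
            failures.push(format!("hash lookup failed for {}", f.hash));
        }
        if store.get(Lookup::Slug(f.slug)) != Ok(f.hash) {
            failures.push(format!("slug lookup failed for {}", f.slug));
        }
        if let Some(name) = f.binding {
            if store.get(Lookup::Name(name)) != Ok(f.hash) {
                failures.push(format!("name lookup failed for {}", name));
            }
        }
        // Cross-axis confusion: a slug passed as a name must fail cleanly
        // unless this scenario's own binding has that exact spelling.
        if f.binding != Some(f.slug) && store.get(Lookup::Name(f.slug)).is_ok() {
            failures.push(format!("slug {} resolved as a name", f.slug));
        }
    }
    failures
}
```

A test would assert that round_trip_failures returns an empty list for the fixture; the same loop could be pointed at the Postgres implementation on the sidecar.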

Acceptance

The remediation candidate (or composition) is named in the proposed_resolution, with the trade-off documented and rejected alternatives briefly addressed.
The footgun is closed end-to-end. Starting from get_world({ slug: "first-meeting" }), an operator can navigate to the underlying scenario in at most one additional MCP call, without trial-and-error guessing across underscore/hyphen variants.
Round-trip test coverage. For each scenario in the test fixture (memory + postgres impls), every supported lookup mode succeeds, and incorrect lookup modes fail with clean named errors (no 500s, no surprising successes).
Graph browser consistency. If the world detail page in the graph browser surfaces the scenario's identity, the surfaced link resolves. The Phase H–I list / detail / uses routes for Scenario continue to work; the catalog contract test still passes.
No silent fallback magic. If the chosen candidate involves any kind of fallback (candidate B), the fallback path emits a structured indicator (log line, response field, etc.) so callers can tell which mechanism resolved the lookup. If hidden normalization is added, the spec must explicitly defend it; the default should be no normalization.
Backward compatibility for existing valid calls. Existing get_scenario({ name: "<existing-binding>" }) calls continue to succeed unchanged. Existing get_scenario({ hash: "<hash>" }) calls continue to succeed unchanged. This is purely additive (or purely a rename + linker-rule update for candidate C).

Out of scope

Renaming scenario_slug or scenario_name themselves. The internal field naming is fine. This ticket is about how the MCP / world-store output surfaces those fields and what lookups are supported.
CRUD on scenarios. Future work; flagged as deferred earlier today.
Auto-binding scenario names on world creation. If a world's scenario doesn't have a name binding, this ticket doesn't auto-create one. Whether to do that is a separate design question.
Closing the investigation or remediation tickets (4601f21a, 2dc48e22). Independent ticket.

Sequencing

Independent of every other open ticket. Doesn't block multi-agent turns. Doesn't block the /healthz remediation. Cosmetic to the substrate's correctness, real-but-minor for operator ergonomics. Run it whenever convenient.

Related

04d1b392-… — graph browser ticket; the catalog's Scenario reference rules and the /scenarios/name/:name route are what make the world-detail page's scenario link actually navigate. Whichever candidate is chosen, the graph browser's behavior must stay consistent.
4601f21a-… — investigation ticket where this footgun was originally surfaced (in the conversation, not the ticket comments) while diagnosing turn interruptions. Tangentially related; the slug/name confusion was not the cause of the interruptions.
293a300e-… — world-store ticket; introduced the scenario_label = scenario.scenario_slug.as_str().to_string() derivation that surfaces the slug as the world's label.

Resolve scenario_slug / scenario_name confusion in MCP output and lookup

History (2 events)
