Resolve scenario_slug / scenario_name confusion in MCP output and lookup

rejected ee470925-23e0-4c16-97bb-7aeafcd2dfae

created_at: 2026-04-27
updated_at: 2026-04-27
priority: P3
ticket_type: bug
labels: substrate, mcp, ergonomics
resolved_at: 2026-04-27
resolution: rejected

Body

HOLD — DO NOT PICK UP UNTIL HUMAN AUTHORIZATION.

Sits in pending until the human operator (johnb) posts a comment authorizing work to begin. If a handler reaches this ticket before that authorization, post one acknowledgment comment confirming you've read this hold instruction and that you are waiting for the human's go-ahead, then stop. Do not branch, do not read code, do not draft a plan. The human will return to either authorize, defer, or rewrite.

Background

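The substrate has two distinct identifiers for scenarios that are easy to confuse: the scenario_slug carried in the scenario manifest, and the name binding stored in the scenario_names table.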

These can — and currently do — diverge. For the world first-meeting, the scenario manifest carries scenario_slug: midnight_library (with underscore), while the scenario_names table has the binding midnight-library (with hyphen). Both refer to the same scenario hash 9e525e2adfb0ed0d9f6cbe95f52f192e30836c020b1e9526822c3f32ed0de4d0.

When a world is created, its scenario_label is derived from the manifest's scenario_slug, automatically:

// src/world_store/postgres.rs
let scenario_label = scenario.scenario_slug.as_str().to_string();

So the world stores and surfaces scenario_label: midnight_library — the slug, not the name binding.

get_scenario({ name: ... }) consults only the scenario_names table:

// somewhere in scenario_store/postgres.rs
SELECT hash FROM scenario_names WHERE name = $1

There is no fallback from name → slug, and there is no underscore↔hyphen normalization. The Label documentation explicitly forbids normalization:

No normalization on input — "Subject" is a grammar violation, not a coercion to "subject".

The grammar permits both _ and - as distinct literal characters.

The footgun

The MCP get_world / list_worlds output for first-meeting looks like this:

{
  "scenario": "midnight_library",
  "scenario_label": "midnight_library",
  "scenario_hash": "9e525e2adfb0ed0d9f6cbe95f52f192e30836c020b1e9526822c3f32ed0de4d0"
}

A reasonable next move for an operator (or an agent) is:

get_scenario({ name: "midnight_library" })

This returns 404 / UNKNOWN_SCENARIO, because the scenario_names binding is midnight-library, not midnight_library. The output told the caller "scenario": "midnight_library", and the world's scenario_label is "midnight_library", but neither of those fields is a name binding. Both carry the manifest slug. The caller has no way to learn this from the output shape alone.

get_scenario({ hash: "9e525e2…" }) works. get_scenario({ name: "midnight-library" }) works. get_scenario({ name: "midnight_library" }) doesn't. The output the user just read implies it should.
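The mismatch can be reproduced in a few lines. The following is a minimal sketch, not the substrate's code: ScenarioStore, first_meeting_fixture, get_by_name, and scenario_label are hypothetical names, and two in-memory maps stand in for the Postgres tables. It assumes only what the Background section states, namely that name lookup is an exact match against scenario_names and that the world's label is copied from the manifest slug.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the Postgres-backed scenario store.
struct ScenarioStore {
    /// scenario_names: name binding -> scenario hash
    names: HashMap<String, String>,
    /// scenarios: scenario hash -> manifest scenario_slug
    slugs: HashMap<String, String>,
}

#[derive(Debug, PartialEq)]
enum LookupError {
    UnknownScenario,
}

impl ScenarioStore {
    /// Fixture mirroring the first-meeting example from this ticket.
    fn first_meeting_fixture() -> Self {
        let hash =
            "9e525e2adfb0ed0d9f6cbe95f52f192e30836c020b1e9526822c3f32ed0de4d0".to_string();
        let mut names = HashMap::new();
        names.insert("midnight-library".to_string(), hash.clone());
        let mut slugs = HashMap::new();
        slugs.insert(hash, "midnight_library".to_string());
        ScenarioStore { names, slugs }
    }

    /// Mirrors `SELECT hash FROM scenario_names WHERE name = $1`:
    /// exact match only, no slug fallback, no underscore/hyphen normalization.
    fn get_by_name(&self, name: &str) -> Result<&str, LookupError> {
        self.names
            .get(name)
            .map(String::as_str)
            .ok_or(LookupError::UnknownScenario)
    }

    /// What a world surfaces as scenario_label: the manifest slug, not the binding.
    fn scenario_label(&self, hash: &str) -> Option<&str> {
        self.slugs.get(hash).map(String::as_str)
    }
}
```

Feeding the value returned by scenario_label back into get_by_name yields UnknownScenario for this fixture, which is the trap the operator falls into.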

Why this isn't urgent (but is worth fixing)

This is a lookup-time issue that affects callers reading scenarios by name through the MCP. It does not affect run_turn execution: the world is located by its own slug, the scenario hash is already resolved on the world row, and cognition profiles are loaded by their content hashes. The scenario slug-vs-name path is never exercised during simulation. The same hash chain produces the same behavior regardless of whether the scenario was looked up by name, slug, or hash.

So this is a discoverability bug, not a correctness bug. Operators (human or agent) trip on it; the substrate's actual computation is unaffected.

Goal

A caller who reads "scenario": "midnight_library" from the MCP output and tries to look up that scenario should succeed, or should fail in a way that points them to the right field. No silent normalization magic; no hidden fallbacks the user can't predict.

Design candidates

Three plausible directions, each with trade-offs. The handler picks one (or some composition) after reading the code with this lens. I'm naming candidates so the spec is concrete; I am not requiring any specific one.

Candidate A — add explicit slug support to get_scenario

Extend the get_scenario MCP tool input to accept { slug: "midnight_library" } as a third lookup mode alongside { name: ... } and { hash: ... }. The implementation queries scenarios.scenario_slug directly. Because slugs are not unique across the scenario store (the same slug can appear on derivative scenarios), the lookup either returns the most recent matching hash, or returns a list of all matching hashes, or errors with AMBIGUOUS_SLUG and asks the caller to disambiguate by hash.

Pros. Honors the existing distinction between slug and name. Adds capability without changing existing semantics. Forces the caller to be explicit about which axis they're querying. Easy to test: round-trip every scenario by all three of (hash, name, slug). No hidden coercion.

Cons. The slug-non-uniqueness question forces a design decision: what is a slug lookup for? If "give me the canonical scenario for this slug right now" is the use case, "most recent" is fine. If "show me everything tagged with this slug," return a list. Either is defensible; pick consciously and document.
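To make the non-uniqueness decision concrete, here is a rough sketch of the AMBIGUOUS_SLUG variant of Candidate A. Everything in it is an assumption for illustration: SlugLookup and get_by_slug are hypothetical names, and a slice of (scenario_hash, scenario_slug) pairs stands in for the scenarios table. The "most recent" and "return a list" policies would differ only in the final match.

```rust
/// Hypothetical result type for a slug lookup under the "error on ambiguity" policy.
#[derive(Debug, PartialEq)]
enum SlugLookup {
    /// Exactly one scenario carries this slug.
    Unique(String),
    /// No scenario carries this slug.
    UnknownSlug,
    /// Several scenarios carry this slug; the caller must disambiguate by hash.
    AmbiguousSlug(Vec<String>),
}

/// `rows` is a stand-in for the scenarios table: (scenario_hash, scenario_slug).
/// Matching is exact; underscore and hyphen are distinct characters.
fn get_by_slug(rows: &[(&str, &str)], slug: &str) -> SlugLookup {
    let mut hashes: Vec<String> = rows
        .iter()
        .filter(|(_, s)| *s == slug)
        .map(|(h, _)| h.to_string())
        .collect();
    // Sort so the ambiguity report is deterministic.
    hashes.sort();
    match hashes.len() {
        0 => SlugLookup::UnknownSlug,
        1 => SlugLookup::Unique(hashes.remove(0)),
        _ => SlugLookup::AmbiguousSlug(hashes),
    }
}
```

Returning the candidate hashes inside the ambiguity error keeps the caller within one further call of the scenario they want, since get_scenario({ hash: ... }) already works.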

Candidate B — fall back from name to slug in get_scenario({ name })

If scenario_names has no binding for the requested name, fall back to a scenarios.scenario_slug query. Maybe also fall back through underscore↔hyphen normalization. Caller doesn't have to know the difference; lookups "just work."

Pros. Smallest caller-facing API change. Existing get_scenario({ name: ... }) calls that previously 404'd start succeeding. No new tool surface.

Cons. Hidden normalization is exactly what the Label docs warn against ("Subject is a grammar violation, not a coercion"). The fallback obscures the distinction between "binding lookup" and "slug lookup" — a caller can't tell from a successful response which mechanism worked. If two scenarios have identical slugs but different name bindings, the fallback could surface a different scenario than the caller intended. Future tap-shoe footgun.
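If Candidate B were chosen anyway, one way to blunt the "which mechanism worked" objection is to report the resolution path in the response. The sketch below is illustrative only: ResolvedVia, Resolved, and get_by_name_with_fallback are hypothetical names, the inputs are in-memory stand-ins for the two tables, and the fallback deliberately does exact slug matching with no underscore/hyphen normalization and refuses to guess when more than one scenario carries the slug.

```rust
use std::collections::HashMap;

/// Hypothetical indicator of which mechanism resolved the lookup.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ResolvedVia {
    NameBinding,
    SlugFallback,
}

#[derive(Debug, PartialEq)]
struct Resolved {
    hash: String,
    resolved_via: ResolvedVia,
}

/// `names` stands in for scenario_names (name -> hash);
/// `slug_rows` stands in for the scenarios table as (hash, slug) pairs.
fn get_by_name_with_fallback(
    names: &HashMap<&str, &str>,
    slug_rows: &[(&str, &str)],
    requested: &str,
) -> Option<Resolved> {
    // Primary path: the existing exact-match binding lookup.
    if let Some(hash) = names.get(requested) {
        return Some(Resolved {
            hash: hash.to_string(),
            resolved_via: ResolvedVia::NameBinding,
        });
    }
    // Fallback path: exact slug match, accepted only if it is unambiguous.
    let mut matches = slug_rows.iter().filter(|(_, slug)| *slug == requested);
    match (matches.next(), matches.next()) {
        (Some((hash, _)), None) => Some(Resolved {
            hash: hash.to_string(),
            resolved_via: ResolvedVia::SlugFallback,
        }),
        _ => None,
    }
}
```

Existing binding lookups resolve exactly as before, and a caller that cares about the distinction can inspect resolved_via.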

Candidate C — rename the world's output field to disambiguate

Rename the world's output scenario / scenario_label field to something that doesn't suggest "you can pass this to get_scenario by name." Options: scenario_slug (matches the manifest field directly), scenario_id, or just don't surface anything name-shaped at all (only scenario_hash).

Pros. Removes the misleading affordance at the source. If the output never says "scenario: foo," nobody tries get_scenario({ name: "foo" }) based on misreading the field. Honest about what the field is.

Cons. Breaking change to the MCP wire shape. Every caller (UI, agent, downstream tooling) that reads the world's scenario field has to update. Less helpful to operators reading a world page who genuinely want to know "what scenario is this," even if they then have to look up the name binding separately. Doesn't help if a future world-creation API allows the slug field to be set by the caller in a way that conflicts with bindings.

Recommended evaluation criteria

The handler should choose by answering:

The handler's proposed_resolution should explicitly name the chosen candidate (or composition) and explain why, with the rejected alternatives briefly addressed.

Approach

Single phase, probably. The fix is small. The discipline matches what the substrate-tickets-this-week demonstrated:

Acceptance

  1. The remediation candidate (or composition) is named in the proposed_resolution, with the trade-off documented and rejected alternatives briefly addressed.
  2. The footgun is closed end-to-end. Starting from get_world({ slug: "first-meeting" }), an operator can navigate to the underlying scenario in at most one additional MCP call, without trial-and-error guessing across underscore/hyphen variants.
  3. Round-trip test coverage. For each scenario in the test fixture (memory + postgres impls), every supported lookup mode succeeds, and incorrect lookup modes fail with clean named errors (no 500s, no surprising successes).
  4. Graph browser consistency. If the world detail page in the graph browser surfaces the scenario's identity, the surfaced link resolves. The Phase H–I list / detail / uses routes for Scenario continue to work; the catalog contract test still passes.
  5. No silent fallback magic. If the chosen candidate involves any kind of fallback (candidate B), the fallback path emits a structured indicator (log line, response field, etc.) so callers can tell which mechanism resolved the lookup. If hidden normalization is added, the spec must explicitly defend it; the default should be no normalization.
  6. Backward compatibility for existing valid calls. Existing get_scenario({ name: "<existing-binding>" }) calls continue to succeed unchanged. Existing get_scenario({ hash: "<hash>" }) calls continue to succeed unchanged. The change is purely additive (or purely a rename + linker-rule update for candidate C).
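The round-trip coverage in item 3 could take roughly the following shape. This is a sketch under assumptions, not the existing test harness: ScenarioLookup, MemoryStore, and round_trip_failures are hypothetical names, and only the two lookup modes that exist today (hash and name) are modeled. A slug mode would be added to the trait if Candidate A is chosen.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum LookupError {
    UnknownScenario,
}

/// Hypothetical lookup surface that the memory and postgres impls would share.
trait ScenarioLookup {
    fn by_hash(&self, hash: &str) -> Result<String, LookupError>;
    fn by_name(&self, name: &str) -> Result<String, LookupError>;
}

/// Minimal in-memory impl: name binding -> scenario hash.
struct MemoryStore {
    names: HashMap<String, String>,
}

impl ScenarioLookup for MemoryStore {
    fn by_hash(&self, hash: &str) -> Result<String, LookupError> {
        if self.names.values().any(|h| h.as_str() == hash) {
            Ok(hash.to_string())
        } else {
            Err(LookupError::UnknownScenario)
        }
    }

    fn by_name(&self, name: &str) -> Result<String, LookupError> {
        self.names
            .get(name)
            .cloned()
            .ok_or(LookupError::UnknownScenario)
    }
}

/// Checks every (name, hash) fixture entry through each lookup mode and
/// returns a description of each lookup that did not resolve to the expected hash.
fn round_trip_failures<S: ScenarioLookup>(store: &S, fixture: &[(&str, &str)]) -> Vec<String> {
    let mut failures = Vec::new();
    for (name, hash) in fixture {
        if store.by_hash(hash) != Ok(hash.to_string()) {
            failures.push(format!("hash:{hash}"));
        }
        if store.by_name(name) != Ok(hash.to_string()) {
            failures.push(format!("name:{name}"));
        }
    }
    failures
}
```

Because the check is generic over the trait, the same fixture can be run against both store implementations, and the negative cases (for example the underscore variant of a hyphenated binding) can assert on the named error directly.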

Out of scope

Sequencing

Independent of every other open ticket. Doesn't block multi-agent turns. Doesn't block the /healthz remediation. Cosmetic to the substrate's correctness, real-but-minor for operator ergonomics. Run it whenever convenient.

Related

History (2 events)
