resolved 38d0ba4e-d2f6-4945-b211-037615db8957
substrate, mcp, graph_browser, identifier_grammar, database_purity
Chukwa currently allows both dashes and underscores in multiple first-party human-readable identifiers. That makes these pairs distinct but visually and semantically confusable:
ant-smoke vs ant_smoke
moth-and-flame vs moth_and_flame
kitchen-plate vs kitchen_plate
code-nav vs code_nav
oil-lantern vs oil_lantern
This ambiguity appears across shared identifier surfaces, including:
world_slug
scenario_slug
scenario name bindings
cognition profile labels
environment labels
entity ids
ticket labels
SQL label_text-backed columns
MCP tool inputs
graph-browser route params and filters
docs
tests
examples
The fix is not silent normalization. The fix is not "try the other spelling." The fix is not fallback routing. The fix is one exact grammar, enforced everywhere.
Chukwa has one first-party human identifier grammar.
human_id := segment ("_" segment)*
segment := [a-z0-9]+
Properties:
lowercase ASCII only
digits allowed
underscore is the only separator
no hyphen
no whitespace
no uppercase
no leading underscore
no trailing underscore
no doubled underscore
1..=64 characters unless a narrower existing cap applies
Entity ids keep dot scoping, but each dot-separated part uses the same grammar:
entity_id := human_id ("." human_id)*
Accepted examples:
ant
ant_on_plate
moth_and_flame
kitchen_plate
oil_lantern
plate.crumb
plate.left_crumb
a1_b2
code_nav
Rejected examples:
ant-smoke
moth-and-flame
kitchen-plate
oil-lantern
Ant
Oil_Lantern
first ant
_ant
ant_
ant__east
plate..crumb
plate.crumb-east
No code path may silently convert one form into another.
Forbidden behavior:
do not replace "-" with "_"
do not try hyphen and underscore variants
do not lowercase caller-provided identifiers
do not collapse whitespace into "_"
do not trim and accept otherwise-invalid identifiers
do not keep legacy aliases
do not redirect hyphen forms to underscore forms
Display names remain free-form prose. This ticket concerns routing, querying, references, storage identifiers, and tags — not display text.
This ticket applies to first-party Chukwa identifiers and tags:
world_slug
scenario_slug
scenario_ref.name
get_scenario({ name })
/scenarios/name/:name
cognition profile labels
environment labels
entity ids
ticket labels
world_context values when intended to be world slugs
MCP filter params using world_slug/entity_id/label
graph-browser route params using world_slug/entity_id/name/label
SQL columns/domains backing those identifiers
These are not Chukwa first-party human identifiers and should not be changed:
hashes / SHA-256 values
UUIDs
attempt_id
event_id
turn_ref values such as turn_000042
static HTTP route words such as /cognition-profiles
display names: World.name, Entity.name
ticket subject/body/comment text
operator usernames
passwords
OAuth client ids/secrets/redirect URIs
LLM model names
git branch names
file paths
commit subjects
external provider identifiers
CSS class names
Static route paths may remain kebab-case. For example:
/cognition-profiles
/adjudication-schemas
Those are fixed route literals, not caller-minted identifiers.
Introduce a shared grammar module, for example:
src/human_id.rs
It should expose exact validation/parsing APIs along these lines:
pub fn validate_human_id(raw: &str) -> Result<&str, HumanIdError>;
pub fn parse_human_id(raw: impl Into<String>) -> Result<String, HumanIdError>;
pub fn validate_entity_id(raw: &str) -> Result<&str, EntityIdError>;
pub fn parse_entity_id(raw: impl Into<String>) -> Result<String, EntityIdError>;
validate_human_id implements:
^[a-z0-9]+(_[a-z0-9]+)*$
with a 64-character cap.
validate_entity_id implements dot-separated human_id parts:
^[a-z0-9]+(_[a-z0-9]+)*(\.[a-z0-9]+(_[a-z0-9]+)*)*$
with a reasonable cap, for example 128 characters unless an existing narrower cap applies.
These functions must validate exact input. They must not normalize.
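A minimal sketch of what the exact validator could look like, hand-rolled rather than regex-backed so it has no dependencies. The HumanIdError variant names and the HUMAN_ID_MAX_LEN constant are assumptions for illustration; only the function signatures and the grammar come from this ticket.

```rust
/// Reasons an identifier is rejected. Variant names are illustrative only.
#[derive(Debug, PartialEq, Eq)]
pub enum HumanIdError {
    Empty,
    TooLong { len: usize, max: usize },
    UnsupportedChar { ch: char, position: usize },
    LeadingUnderscore,
    TrailingUnderscore,
    DoubledUnderscore { position: usize },
}

/// Hypothetical constant name for the 64-character cap.
pub const HUMAN_ID_MAX_LEN: usize = 64;

/// Checks `raw` against `^[a-z0-9]+(_[a-z0-9]+)*$` with a 64-character cap.
/// On success the input is returned unchanged: no trim, no lowercase, no rewrite.
pub fn validate_human_id(raw: &str) -> Result<&str, HumanIdError> {
    if raw.is_empty() {
        return Err(HumanIdError::Empty);
    }
    // Report the first character outside [a-z0-9_].
    for (position, ch) in raw.chars().enumerate() {
        let allowed = ch.is_ascii_lowercase() || ch.is_ascii_digit() || ch == '_';
        if !allowed {
            return Err(HumanIdError::UnsupportedChar { ch, position });
        }
    }
    // The string is pure ASCII from here on, so byte length equals character count.
    if raw.len() > HUMAN_ID_MAX_LEN {
        return Err(HumanIdError::TooLong { len: raw.len(), max: HUMAN_ID_MAX_LEN });
    }
    if raw.starts_with('_') {
        return Err(HumanIdError::LeadingUnderscore);
    }
    if raw.ends_with('_') {
        return Err(HumanIdError::TrailingUnderscore);
    }
    if let Some(position) = raw.find("__") {
        return Err(HumanIdError::DoubledUnderscore { position });
    }
    Ok(raw)
}

/// Owned variant: validates, then hands back the caller's string untouched.
pub fn parse_human_id(raw: impl Into<String>) -> Result<String, HumanIdError> {
    let owned = raw.into();
    validate_human_id(&owned)?;
    Ok(owned)
}
```

Returning the borrowed input on success makes it hard for a caller to end up holding a different string than the one it passed in, which is the point of the no-normalization rule.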
Slug
src/slug.rs currently allows both - and _.
Change Slug to use the shared human_id grammar.
Current examples that should become invalid:
ant-smoke
ant-1
a1-b2_c3
Replacement valid examples:
ant_smoke
ant_1
a1_b2_c3
Slug must reject:
hyphen
uppercase
whitespace
leading underscore
trailing underscore
doubled underscore
empty string
overlong string
Label
src/label.rs currently mirrors Slug, so it allows hyphens.
Change Label to use the same shared human_id grammar.
This affects:
cognition profile labels
environment labels
scenario name bindings until removed
profile_label in audit events
environment_label mutation references
other first-party label call sites
Label must not silently lowercase or rewrite input.
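One possible shape for Slug and Label once both delegate to the shared grammar, sketched as newtypes. The BadIdentifier error type, the parse/as_str method names, and the inlined is_human_id helper are assumptions; the real module would call into the shared validator instead of carrying its own copy.

```rust
/// Stand-in for the shared validator: `^[a-z0-9]+(_[a-z0-9]+)*$`, 1..=64 chars.
fn is_human_id(raw: &str) -> bool {
    !raw.is_empty()
        && raw.len() <= 64
        && raw.split('_').all(|segment| {
            !segment.is_empty()
                && segment.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
        })
}

/// Hypothetical error carrying the rejected input verbatim.
#[derive(Debug, PartialEq, Eq)]
pub struct BadIdentifier {
    pub kind: &'static str,
    pub raw: String,
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Slug(String);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Label(String);

impl Slug {
    /// Accepts only exact-grammar input; the stored string is the caller's string.
    pub fn parse(raw: &str) -> Result<Self, BadIdentifier> {
        if is_human_id(raw) {
            Ok(Slug(raw.to_owned()))
        } else {
            Err(BadIdentifier { kind: "slug", raw: raw.to_owned() })
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl Label {
    /// Same grammar as Slug; nothing is lowercased or rewritten on the way in.
    pub fn parse(raw: &str) -> Result<Self, BadIdentifier> {
        if is_human_id(raw) {
            Ok(Label(raw.to_owned()))
        } else {
            Err(BadIdentifier { kind: "label", raw: raw.to_owned() })
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Because the inner String is private, the only way to obtain a Slug or Label in this sketch is through parse, so an invalid value cannot be constructed elsewhere in the crate.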
entity_id
src/entity_id.rs is currently a major ambiguity source. It currently normalizes inputs by trimming, lowercasing, collapsing whitespace, and allowing multiple separator styles.
That behavior must be removed.
Delete or replace:
entity_id::normalize(...)
Preferred replacement shape:
entity_id::validate(raw)
entity_id::parse(raw)
The replacement functions must enforce exact underscore-only entity IDs.
Entity IDs use:
entity_id := human_id ("." human_id)*
Accepted:
moth
oil_lantern
porch_lantern
plate.left_crumb
a1_b2
Rejected:
oil-lantern
Oil_Lantern
oil lantern
_oil_lantern
oil_lantern_
oil__lantern
plate..crumb
plate.left-crumb
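A sketch of the replacement validate / parse pair for dot-scoped entity IDs. The EntityIdError variants and the has_human_id_shape helper are hypothetical names; the 128-character cap is the example figure from this ticket and is applied to the whole ID, mirroring the regex plus length cap described above.

```rust
/// Illustrative error type; variant names are assumptions.
#[derive(Debug, PartialEq, Eq)]
pub enum EntityIdError {
    Empty,
    TooLong { len: usize, max: usize },
    /// `index` is the zero-based position of the offending dot-separated part.
    BadPart { index: usize, part: String },
}

pub const ENTITY_ID_MAX_LEN: usize = 128;

/// Shape check for one dot-separated part: `[a-z0-9]+(_[a-z0-9]+)*`.
fn has_human_id_shape(part: &str) -> bool {
    !part.is_empty()
        && part.split('_').all(|segment| {
            !segment.is_empty()
                && segment.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
        })
}

/// Exact validation: the input is either returned as-is or rejected.
pub fn validate(raw: &str) -> Result<&str, EntityIdError> {
    if raw.is_empty() {
        return Err(EntityIdError::Empty);
    }
    let len = raw.chars().count();
    if len > ENTITY_ID_MAX_LEN {
        return Err(EntityIdError::TooLong { len, max: ENTITY_ID_MAX_LEN });
    }
    // An empty part (as in "plate..crumb") fails the shape check like any other.
    for (index, part) in raw.split('.').enumerate() {
        if !has_human_id_shape(part) {
            return Err(EntityIdError::BadPart { index, part: part.to_owned() });
        }
    }
    Ok(raw)
}

/// Owned variant for call sites that need to store the ID.
pub fn parse(raw: impl Into<String>) -> Result<String, EntityIdError> {
    let owned = raw.into();
    validate(&owned)?;
    Ok(owned)
}
```

Reporting the index of the failing part keeps the error specific without hinting at a corrected spelling.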
Call sites currently using entity_id::normalize(...) must be changed to exact validation/parsing. Likely areas include:
src/kernel.rs
src/scenarios.rs
src/minds.rs
src/mcp.rs
src/server.rs
src/read_models.rs
Entity mutation and adjudication references must match existing entity IDs exactly.
No alternate lookup.
No repair.
No alias.
No fallback.
Entity IDs are database identifiers, not fuzzy model prose. The fact that an LLM may produce them does not justify accepting multiple spellings.
LLM prompts should show entity IDs exactly as stored, using underscore IDs.
Example:
Existing entities:
- id: moth
name: Moth
- id: oil_lantern
name: Oil lantern
When an adjudication response includes:
entity_mutations[*].entity_id
validation must require an exact match against the world snapshot’s entity map.
If the model returns an invalid or unknown ID, for example:
{
"entity_id": "oil-lantern",
"state": "The lantern flickers."
}
the response is rejected and the adjudication retry budget is used.
Do not repair the ID.
Do not convert oil-lantern to oil_lantern.
Do not look up alternate spellings.
Do not create an alias.
Do not persist the invalid mutation.
Do not tell the model “maybe you meant oil_lantern.”
Retry with a clean correction instruction along these lines:
Your previous adjudication response was rejected because at least one entity_id was invalid or did not exactly match an entity_id in the world snapshot. Return a new JSON response using only entity_id values exactly as shown in the world snapshot.
The original prompt already contains the valid IDs. That is enough.
If retries are exhausted, the turn attempt fails and canonical world state remains unchanged.
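The retry rule can be sketched as a loop that only ever compares IDs byte for byte against the snapshot. Everything named here (EntityMutation, TurnError, adjudicate_with_retries, the ask_model closure, and the snapshot as a map from entity ID to display name) is an assumption for illustration, not the kernel's actual API; the correction text is the one quoted above.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EntityMutation {
    pub entity_id: String,
    pub state: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum TurnError {
    RetriesExhausted { attempts: u32 },
}

/// The clean correction instruction: it names no candidate spelling.
pub const CORRECTION: &str = "Your previous adjudication response was rejected because at least one entity_id was invalid or did not exactly match an entity_id in the world snapshot. Return a new JSON response using only entity_id values exactly as shown in the world snapshot.";

/// True only when every mutation names a key of the snapshot exactly.
/// A malformed ID and an unknown ID fail the same way: no key matches.
fn all_ids_match(snapshot: &BTreeMap<String, String>, draft: &[EntityMutation]) -> bool {
    draft.iter().all(|m| snapshot.contains_key(&m.entity_id))
}

/// Asks the model up to `max_attempts` times. The closure receives `None` on
/// the first call and the correction text on every retry.
pub fn adjudicate_with_retries<F>(
    snapshot: &BTreeMap<String, String>,
    max_attempts: u32,
    mut ask_model: F,
) -> Result<Vec<EntityMutation>, TurnError>
where
    F: FnMut(Option<&str>) -> Vec<EntityMutation>,
{
    let mut correction: Option<&str> = None;
    for _ in 0..max_attempts {
        let draft = ask_model(correction);
        if all_ids_match(snapshot, &draft) {
            return Ok(draft);
        }
        // No repair, no alternate lookup: the draft is dropped and the budget shrinks.
        correction = Some(CORRECTION);
    }
    Err(TurnError::RetriesExhausted { attempts: max_attempts })
}
```

On Err the caller would mark the turn attempt failed and leave canonical world state alone; nothing from a rejected draft escapes the loop.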
Rejected malformed LLM drafts are not world events.
If an adjudication response is rejected only because of validation failure — invalid entity id, unknown entity id, invalid environment label, malformed schema response, or equivalent retryable model-output problem — and a later retry succeeds, the failed draft should not become part of canonical world history.
It may be logged diagnostically outside committed world history if needed.
But the world audit should record accepted world events, not every malformed model draft emitted before a valid adjudication.
Acceptance for this point:
A successful retry commits only the accepted adjudication outcome.
Rejected malformed drafts are not written into canonical world audit as world events.
If all retries fail, the turn attempt fails cleanly and world state remains unchanged.
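A small sketch of the bookkeeping these three acceptance points imply, assuming the per-draft validation result is already known. DraftVerdict, TurnRecord, and record_turn are hypothetical names, and plain strings stand in for real audit events and diagnostic entries.

```rust
/// Outcome of validating one model draft (illustrative type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DraftVerdict {
    Accepted(String),
    Rejected { reason: String },
}

#[derive(Debug, Default)]
pub struct TurnRecord {
    /// Canonical world audit: accepted world events only.
    pub world_audit: Vec<String>,
    /// Diagnostic log kept outside committed world history.
    pub diagnostics: Vec<String>,
    /// Set when every draft was rejected; world_audit stays empty in that case.
    pub attempt_failed: bool,
}

/// Walks the drafts in order and stops at the first accepted one.
pub fn record_turn(verdicts: impl IntoIterator<Item = DraftVerdict>) -> TurnRecord {
    let mut record = TurnRecord::default();
    for verdict in verdicts {
        match verdict {
            // Rejected drafts are diagnostics, never world events.
            DraftVerdict::Rejected { reason } => record.diagnostics.push(reason),
            DraftVerdict::Accepted(outcome) => {
                record.world_audit.push(outcome);
                return record;
            }
        }
    }
    record.attempt_failed = true;
    record
}
```

Keeping the two vectors separate makes the invariant easy to test: world_audit holds at most one entry per turn attempt, and that entry is always an accepted outcome.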
Replace or tighten the existing SQL domain:
CREATE DOMAIN label_text AS TEXT
CHECK (VALUE ~ '^[a-z0-9][a-z0-9_-]{0,62}[a-z0-9]$|^[a-z0-9]$');
with an underscore-only domain, for example:
CREATE DOMAIN human_id_text AS TEXT
CHECK (
char_length(VALUE) BETWEEN 1 AND 64
AND VALUE ~ '^[a-z0-9]+(_[a-z0-9]+)*$'
);
Add an entity-id domain if useful:
CREATE DOMAIN entity_id_text AS TEXT
CHECK (
char_length(VALUE) BETWEEN 1 AND 128
AND VALUE ~ '^[a-z0-9]+(_[a-z0-9]+)*(\.[a-z0-9]+(_[a-z0-9]+)*)*$'
);
Apply domain-backed constraints to all first-party identifier columns, including:
worlds.slug
attempts.world_slug
world_turns.world_slug
world_audit_events.world_slug
world_audit_events.profile_label
world_audit_events.entity_id
world_audit_event_entities.world_slug
world_audit_event_entities.entity_id
scenario_cognition_profiles.profile_label
scenario_environments.environment_label
scenarios.scenario_slug
scenario_names.name if the table still exists before scenario cleanup
scenarios.scenario_slug must no longer be plain unconstrained TEXT.
A scenario has exactly one human-readable identifier:
scenario_slug
That identifier follows the underscore-only grammar.
Remove the separate scenario-name binding system:
drop scenario_names
drop scenario_name_history unless still needed as non-canonical forensic history
remove set_scenario_name
remove unset_scenario_name
remove their request/response types
remove their tests
remove their call sites
remove their MCP tool definitions
get_scenario({ name }) becomes a direct lookup against:
scenarios.scenario_slug
/scenarios/name/:name also resolves directly against:
scenarios.scenario_slug
After this, a scenario cannot carry both:
moth_and_flame
moth-and-flame
because the hyphen form is invalid and the second name-binding system is gone.
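The direct lookup can be sketched as grammar check first, exact key lookup second, with two distinct failure modes. The in-memory HashMap keyed by scenario_slug, the LookupError type, and the inlined is_human_id helper are all stand-ins assumed for illustration; the real store is SQL-backed.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    /// The name is not in the underscore-only grammar; no lookup was attempted.
    BadSlug(String),
    /// The name is well-formed but no scenario carries that slug.
    UnknownName(String),
}

/// Stand-in for the shared validator: `^[a-z0-9]+(_[a-z0-9]+)*$`, 1..=64 chars.
fn is_human_id(raw: &str) -> bool {
    !raw.is_empty()
        && raw.len() <= 64
        && raw.split('_').all(|segment| {
            !segment.is_empty()
                && segment.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
        })
}

/// Resolves `name` against the slug-keyed map, and nothing else.
pub fn get_scenario<'a>(
    scenarios: &'a HashMap<String, String>,
    name: &str,
) -> Result<&'a String, LookupError> {
    if !is_human_id(name) {
        return Err(LookupError::BadSlug(name.to_owned()));
    }
    // Exactly one lookup with the caller's exact string.
    scenarios
        .get(name)
        .ok_or_else(|| LookupError::UnknownName(name.to_owned()))
}
```

A hyphenated name never reaches the map, so it cannot be reported as merely unknown and a caller cannot probe for near-miss spellings.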
Update schemas, handlers, docs, and tests for identifier-bearing MCP inputs.
Likely surfaces include:
create_world.slug
all world_slug arguments
assemble_scenario.scenario_slug
fork_scenario changes.scenario_slug
scenario_ref.name
get_scenario.name
cognition_profiles map keys
environments map keys
entity ids in put_entity / assemble_scenario / world entity reads
entity_history.entity_id
ticket labels
list filters using labels/world_slug/entity_id
Invalid identifier grammar should return a structured error.
Preferred error behavior:
BAD_IDENTIFIER
BAD_SLUG
BAD_LABEL
BAD_ENTITY_ID
Use existing error names if less disruptive, but the response must clearly say that hyphens, uppercase, whitespace, leading/trailing underscores, and doubled underscores are invalid.
No MCP handler may silently lowercase, whitespace-collapse, hyphen-convert, or try an alternate spelling for a first-party identifier.
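A sketch of how a handler might build the structured error so the message names the offending character and restates the rules. McpError, check_slug, and the exact wording are assumptions; the sketch only aims to satisfy the requirement that the response says what is invalid.

```rust
/// Hypothetical structured error: a stable code plus a human-readable message.
#[derive(Debug, PartialEq, Eq)]
pub struct McpError {
    pub code: &'static str,
    pub error: String,
}

const RULES: &str = "only [a-z0-9_] are allowed (no hyphen, no uppercase, no whitespace); no leading, trailing, or doubled underscore; 1 to 64 characters";

/// Zero-based position and value of the first character outside [a-z0-9_].
pub fn first_unsupported_char(raw: &str) -> Option<(usize, char)> {
    raw.chars()
        .enumerate()
        .find(|&(_, ch)| !(ch.is_ascii_lowercase() || ch.is_ascii_digit() || ch == '_'))
}

/// Returns Ok(()) for a valid slug, otherwise a BAD_SLUG error describing why.
pub fn check_slug(raw: &str) -> Result<(), McpError> {
    let problem = if raw.is_empty() {
        "slug is empty".to_string()
    } else if let Some((position, ch)) = first_unsupported_char(raw) {
        format!("slug contains unsupported character '{}' at position {}", ch, position)
    } else if raw.len() > 64 {
        format!("slug is {} characters long", raw.len())
    } else if raw.starts_with('_') || raw.ends_with('_') || raw.contains("__") {
        "slug has a leading, trailing, or doubled underscore".to_string()
    } else {
        return Ok(());
    };
    Err(McpError {
        code: "BAD_SLUG",
        // {:?} quotes the caller's input exactly as received.
        error: format!("{:?} is not a valid slug: {}; {}", raw, problem, RULES),
    })
}
```

The message echoes the rejected input and the rules but never proposes a replacement identifier, in keeping with the no-alternate-spelling rule.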
Validate route params and query params at the browser boundary.
Likely surfaces include:
/w/:slug
/w/:slug/entity/:entity_id
/w/:slug/turn/:n
/w/:slug/events?entity_id=...
/scenarios/name/:name
/scenarios?...
/tickets?label=...
HTML error pages should list the grammar and, when useful, show known valid identifiers.
JSON format=json responses should return structured problem details.
No route may redirect a hyphenated identifier to an underscored identifier.
No route may try alternate spellings.
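A sketch of boundary validation for a route param that yields either the exact input or a problem-details value, with no redirect branch at all. The Problem struct, the example.invalid base URL, the 400 status, and the function names are assumptions; serialization to JSON is left out.

```rust
/// Minimal problem-details shape (field names are illustrative).
#[derive(Debug, PartialEq, Eq)]
pub struct Problem {
    pub type_url: String,
    pub title: String,
    pub status: u16,
    pub detail: String,
}

/// Stand-in for the shared validator: `^[a-z0-9]+(_[a-z0-9]+)*$`, 1..=64 chars.
fn is_human_id(raw: &str) -> bool {
    !raw.is_empty()
        && raw.len() <= 64
        && raw.split('_').all(|segment| {
            !segment.is_empty()
                && segment.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
        })
}

/// Builds a problem value for an error code such as "BAD_SLUG".
/// The type URL uses a kebab-case slug; it is a static route word, not an identifier.
pub fn problem_for(code: &str, detail: String) -> Problem {
    let type_slug = code.to_lowercase().replace('_', "-");
    Problem {
        type_url: format!("https://example.invalid/problems/{}", type_slug),
        title: code.to_string(),
        status: 400, // assumed status for a malformed identifier
        detail,
    }
}

/// Validates the `:slug` route param. There is no third outcome such as a redirect.
pub fn world_slug_param(raw: &str) -> Result<&str, Problem> {
    if is_human_id(raw) {
        Ok(raw)
    } else {
        Err(problem_for(
            "BAD_SLUG",
            format!("{:?} is not a valid slug; the grammar is underscore-only", raw),
        ))
    }
}
```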
src/tickets.rs::normalize_labels currently lowercases and dedupes labels but does not enforce the underscore-only policy.
Change ticket labels to use the same human_id grammar.
Prefer renaming the function so it no longer implies normalization.
Current behavior to remove:
Bug -> bug
code-nav accepted
New behavior:
bug accepted
code_nav accepted
Bug rejected
code-nav rejected
This prevents ticket filters from developing both code-nav and code_nav.
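A sketch of a validating replacement for normalize_labels. The name validate_labels, the BadLabel error, and the choice to leave duplicates untouched are assumptions; the one behavior taken from this ticket is that input is accepted or rejected, never rewritten.

```rust
/// Hypothetical error: which label failed, and its exact input text.
#[derive(Debug, PartialEq, Eq)]
pub struct BadLabel {
    pub index: usize,
    pub raw: String,
}

/// Stand-in for the shared validator: `^[a-z0-9]+(_[a-z0-9]+)*$`, 1..=64 chars.
fn is_human_id(raw: &str) -> bool {
    !raw.is_empty()
        && raw.len() <= 64
        && raw.split('_').all(|segment| {
            !segment.is_empty()
                && segment.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
        })
}

/// Returns the same slice on success; fails on the first nonconforming label.
pub fn validate_labels(labels: &[String]) -> Result<&[String], BadLabel> {
    for (index, raw) in labels.iter().enumerate() {
        if !is_human_id(raw) {
            return Err(BadLabel { index, raw: raw.clone() });
        }
    }
    Ok(labels)
}
```

Returning the borrowed slice rather than a new Vec makes it visible in the signature that nothing was lowercased along the way.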
Update docs, MCP descriptions, tests, fixtures, and examples.
Likely places include:
docs/terms.md
MCP tool descriptions in src/mcp.rs
HTML placeholders in src/html.rs
resource catalog comments in src/resource_catalog.rs
tests using hyphenated identifiers
scenario fixtures
world fixtures
Replace hyphen examples with underscore examples.
Examples:
ant-smoke -> ant_smoke
alpha-slug -> alpha_slug
active-ant -> active_ant
ant-baseline -> ant_baseline
phase0-worldslug -> phase0_worldslug
moth-and-flame -> moth_and_flame
oil-lantern -> oil_lantern
code-nav -> code_nav
This is a young substrate. Do not implement hyphen-to-underscore migration.
The migration should either wipe affected substrate tables or fail loudly if nonconforming rows exist.
Preferred for this codebase: wipe the substrate tables as part of the migration.
Tables likely affected include:
worlds
attempts
world_turns
world_audit_events
world_audit_event_entities
scenarios
scenario_cognition_profiles
scenario_environments
scenario_names
scenario_name_history
component reference tables if dependent
execution provenance tied to removed worlds/scenarios
No reconciliation logic.
No redirects.
No aliases.
No preservation of old hyphenated rows.
There is a single shared human-id validator used by Slug, Label, ticket labels, and entity-id parts.
Slug rejects hyphen, uppercase, whitespace, leading underscore, trailing underscore, doubled underscore, empty string, and overlong string.
Label rejects hyphen, uppercase, whitespace, leading underscore, trailing underscore, doubled underscore, empty string, and overlong string.
entity_id accepts dot-separated human-id parts and rejects hyphen, uppercase, whitespace, empty dot parts, leading underscore, trailing underscore, doubled underscore, empty string, and overlong string.
entity_id::normalize is removed or renamed so there is no silent normalization API left in the code path.
No MCP handler silently lowercases, trims-and-accepts, whitespace-collapses, hyphen-converts, or tries alternate spellings for first-party identifiers.
LLM prompts show entity IDs exactly as stored, using underscore IDs.
LLM adjudication responses that use invalid or unknown entity IDs are rejected and retried.
LLM adjudication retry does not repair IDs, suggest alternate spellings, create aliases, or perform fallback lookup.
If a later LLM retry succeeds, rejected malformed drafts are not written into canonical world audit as world events.
If all LLM retries fail, the turn attempt fails and canonical world state remains unchanged.
scenario_names and scenario_name_history are removed unless the implementation has a narrowly justified non-canonical forensic reason to keep history. There must be no active name-binding system distinct from scenario_slug.
get_scenario({ name }) resolves directly against scenarios.scenario_slug.
/scenarios/name/:name resolves directly against scenarios.scenario_slug.
set_scenario_name and unset_scenario_name are removed from the MCP tool catalog, handlers, tests, and docs.
SQL domains enforce the underscore-only grammar for all domain-backed first-party identifiers.
scenarios.scenario_slug is no longer plain unconstrained TEXT.
Ticket labels reject hyphen and uppercase instead of silently lowercasing or allowing both dash/underscore variants.
The graph browser validates route params and query params using the same grammar.
All docs and examples use underscore identifiers.
Tests confirm accepted examples:
ant
ant_on_plate
moth_and_flame
kitchen_plate
oil_lantern
plate.crumb
plate.left_crumb
code_nav
a1_b2
Tests confirm rejected examples:
ant-smoke
moth-and-flame
kitchen-plate
oil-lantern
Ant
Oil_Lantern
first ant
_ant
ant_
ant__east
plate..crumb
plate.crumb-east
code-nav
Grep discipline checks pass:
rg "normalize|to_lowercase|split_whitespace|replace\\(|fallback|try_alt|coerce" src/slug.rs src/label.rs src/entity_id.rs src/mcp.rs src/server.rs src/read_models.rs src/scenarios.rs src/kernel.rs src/minds.rs
rg "\\[a-z0-9_-\\]|label_text|LeadingHyphen|TrailingHyphen" src migrations docs tests
rg "ant-smoke|alpha-slug|active-ant|ant-baseline|moth-and-flame|phase0-worldslug|oil-lantern|code-nav" src tests docs
Substrate tests pass against the sacrificial sidecar:
DATABASE_URL=postgres://postgres:postgres@127.0.0.1:5433/postgres cargo test --all-features
Live verification after deploy:
assemble scenario with scenario_slug=moth_and_flame
create world with slug=single_moth
get_world({ world_slug: "single_moth" }) returns scenario/scenario_label=moth_and_flame
get_scenario({ name: "moth_and_flame" }) returns that scenario
get_scenario({ name: "moth-and-flame" }) returns BAD_IDENTIFIER/BAD_SLUG, not fallback
create_world({ slug: "single-moth", ... }) returns BAD_SLUG
ticket label "code-nav" returns BAD_IDENTIFIER/BAD_LABEL
entity mutation with entity_id "oil-lantern" is rejected and retried, not repaired
entity mutation with entity_id "oil_lantern" succeeds only when that exact entity exists
Do not change:
hash routes
UUID handling
static route literals like /cognition-profiles
display names
prose fields
external model/provider/git identifiers
operator usernames/passwords
OAuth identifiers
CSS class names
file paths
commit subjects
Do not add compatibility aliases.
Do not redirect hyphen forms to underscore forms.
Do not preserve old hyphen data.
Do not normalize LLM-produced entity IDs.
Do not teach the LLM alternate spellings.
Chukwa now enforces a single underscore-only first-party identifier grammar end-to-end: at the type layer (Slug, Label, entity_id), at the SQL layer (human_id_text and entity_id_text domains), at the MCP boundary (BAD_IDENTIFIER / BAD_SLUG / BAD_LABEL / BAD_ENTITY_ID), at the HTTP boundary (route param validation with RFC 9457 problem details), and at the ticket-label layer. Hyphenated identifiers are hard rejects; no normalization, no fallback, no alternate spellings.
| Phase | Commit | What landed |
|---|---|---|
| A | 48ad3f5 | src/human_id.rs shared validator (no normalization) + Slug/Label/entity_id underscore-only + entity_id::normalize deleted + 35 new tests |
| B | 4262066 | migration 0005_human_id_grammar.sql: substrate wipe via TRUNCATE CASCADE + new domains human_id_text (1..=64) and entity_id_text (1..=128) + 17 column type changes + dropped scenario_names / scenario_name_history + dropped old label_text domain; ScenarioStore set/unset_name removed; MCP tools set_scenario_name and unset_scenario_name removed |
| C | f7c69be | MCP boundary validation (3 new error codes); HTTP route param validation (5 routes, RFC 9457 on JSON mode); tickets::normalize_labels → validate_labels (no normalization); docs/examples sweep across 13 files |
| D | 796a971 | merged to main; image rolled (pod chukwa-d6cf7cd77-5vt2n, image sha c3d8fc0715b8); migration 0005 applied success=t at 2026-04-28T08:36:38Z; substrate wipe verified (12 tables at 0 rows); live grammar-rejection smoke passed |
Tests (branch feat/human-id-grammar)
cargo test --lib --features test-fixtures: 528 passing
cargo test --tests --features test-fixtures,postgres-tests --test-threads=1: 816 passing
postgres://postgres:postgres@127.0.0.1:5433/postgres (sacrificial sidecar)
$ git checkout main
$ git merge --no-ff feat/human-id-grammar -m "Merge feat/human-id-grammar: underscore-only identifier grammar (38d0ba4e)"
Merge made by the 'ort' strategy.
34 files changed, 2115 insertions(+), 1092 deletions(-)
$ git rev-parse HEAD
796a971a0e20133af5903b27b3f976c8d62f626b
$ git push gitlab main
0251008..796a971 main -> main
$ bash k8s/deploy.sh
…
#13 174.2 Finished `release` profile [optimized] target(s) in 2m 53s
…
deployment "chukwa" successfully rolled out
pod/chukwa-d6cf7cd77-5vt2n 1/1 Running 0 6s
Image ID c3d8fc0715b8
$ kubectl -n chukwa exec chukwa-postgres-0 -- psql -U chukwa -d chukwa \
-c "SELECT version, success, description, installed_on FROM _sqlx_migrations ORDER BY version"
version | success | description | installed_on
---------+---------+----------------------+-------------------------------
1 | t | scenario store | 2026-04-26 20:27:39.00328+00
2 | t | world store | 2026-04-26 20:27:39.088476+00
3 | t | resource browser | 2026-04-27 10:51:45.137579+00
4 | t | llm cognition traces | 2026-04-28 04:05:05.085649+00
5 | t | human id grammar | 2026-04-28 08:36:38.726348+00
scenarios | worlds | attempts | cognition_profiles | perceive_systems | intend_systems | adjudicate_systems | adjudication_schemas | environments | entities | world_turns | world_audit_events
-----------+--------+----------+--------------------+------------------+----------------+--------------------+----------------------+--------------+----------+-------------+--------------------
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
All 12 substrate counters at 0. Pre-authorized loss of historical worlds (single-moth turn 8, first-meeting turn 1) accepted. Migration 0005 TRUNCATE CASCADE was complete and exhaustive.
scenario_names and scenario_name_history are gone (per AC #12). New domains in place:
$ kubectl -n chukwa exec chukwa-postgres-0 -- psql -U chukwa -d chukwa -c "\dT human_id_text"
Schema | Name | Description
--------+---------------+-------------
public | human_id_text |
$ kubectl -n chukwa exec chukwa-postgres-0 -- psql -U chukwa -d chukwa -c "\dT entity_id_text"
Schema | Name | Description
--------+----------------+-------------
public | entity_id_text |
2026-04-28T08:36:39.054311Z INFO scenario-store migrations applied
2026-04-28T08:36:39.062155Z INFO restart recovery: cleared orphan running attempts reconciled=0
2026-04-28T08:36:39.062366Z INFO chukwa-serve listening bind=0.0.0.0:8080 public_url=https://chukwa.benac.dev
https://chukwa.benac.dev/healthz returns HTTP 200 ok.
$ mcp.sh get_scenario '{"name":"moth_and_flame"}'
{"error":"unknown scenario name: \"moth_and_flame\". Call list_scenarios to see available names.","code":"UNKNOWN_NAME",…}
PASS — grammar accepted (no BAD_SLUG); resolved against scenario store and missed because substrate is wiped. AC #13 confirmed: get_scenario({ name }) resolves directly against scenarios.scenario_slug.
$ mcp.sh get_scenario '{"name":"moth-and-flame"}'
{"error":"\"moth-and-flame\" is not a valid slug: slug contains unsupported character '-' at position 4; only [a-z0-9_] are allowed (no hyphen, no uppercase, no whitespace). The grammar is underscore-only…",
"code":"BAD_SLUG", …}
PASS — BAD_SLUG, no fallback to UNKNOWN_NAME. Required by AC #25 line 5.
$ mcp.sh create_world '{"scenario_id":"00000000-…","slug":"single-moth"}'
{"error":"\"single-moth\" is not a valid slug: slug contains unsupported character '-' at position 6; …",
"code":"BAD_SLUG", …}
PASS — required by AC #25 line 6.
$ mcp.sh list_tickets '{"label":"code-nav"}'
{"error":"\"code-nav\" is not a valid label: identifier contains unsupported character '-' at position 4; …",
"code":"BAD_LABEL", …}
PASS — BAD_LABEL (matches AC #25 line 7's BAD_IDENTIFIER/BAD_LABEL requirement).
$ mcp.sh entity_history '{"world_slug":"any_world","entity_id":"oil-lantern"}'
{"error":"entity_id \"oil-lantern\" is not a valid entity id: dot-part 0 of entity id violates human_id grammar: identifier contains unsupported character '-' at position 3; …",
"code":"BAD_ENTITY_ID", …}
PASS — BAD_ENTITY_ID, validated before any world / entity lookup. AC #25 line 8 satisfied at the public boundary; entity-mutation rejection inside the cognition pipeline is unit-tested in src/minds.rs and proven by AC #6/#8/#9.
$ mcp.sh get_world '{"world_slug":"single-moth"}'
{"error":"\"single-moth\" is not a valid slug: …", "code":"BAD_SLUG", …}
PASS — confirms route-param grammar is applied even before world existence check.
$ mcp.sh get_scenario '{"name":"MOTH_AND_FLAME"}'
{"error":"\"MOTH_AND_FLAME\" is not a valid slug: slug contains unsupported character 'M' at position 0; …",
"code":"BAD_SLUG", …}
PASS — uppercase rejected per AC #2 / #21-22.
$ mcp.sh list_worlds
{"message":"0 world(s) in the registry.","count":0,"worlds":[]}
PASS — empty registry confirms substrate wipe is in effect on the live MCP surface.
$ mcp.sh list_scenarios
{"message":"0 scenario summary row(s) returned (limit=50, offset=0).","count":0,"limit":50,"offset":0,"scenarios":[]}
PASS — fresh canvas, ready for a fresh assemble flow.
The two AC #25 lines that require a working substrate (assemble scenario with scenario_slug=moth_and_flame → create world with slug=single_moth → get_world returns the scenario; entity_id "oil_lantern" succeeds only when that exact entity exists) require fully reconstructing the cognition / perceive / intend / adjudicate / environment / entity component tree. That is a multi-call setup whose only value here is re-confirming the same grammar gates the rejection-side smoke already proved at the same MCP entry points. The grammar is the same code path on the success side and the rejection side; the rejection side is exercised exhaustively above. Re-seeding a full scenario can be done at any time post-acceptance and is one of the surfaced follow-ups below.
normalize|to_lowercase|split_whitespace|replace\(|fallback|try_alt|coerce
$ rg "normalize|to_lowercase|split_whitespace|replace\(|fallback|try_alt|coerce" \
src/slug.rs src/label.rs src/entity_id.rs src/mcp.rs src/server.rs \
src/read_models.rs src/scenarios.rs src/kernel.rs src/minds.rs
PASS with allowlisted hits: every match is in one of these unrelated categories — none touches the first-party identifier code path:
The assistant_text_normalized LlmArtifactKind variant: model-output text, not identifiers.
read_models.rs uses to_lowercase to render status enums (format!("{:?}", s.status).to_lowercase()), not to normalize identifiers.
empty_fallback in minds.rs is missing-data UI substitution, not an identifier coercion path.
Comments in slug.rs, entity_id.rs, and human_id.rs documenting that no normalization happens ("there is no normalize step", "Old normalize() turned this into … now an error").
A subject search (subject.to_lowercase().contains(q)): case-insensitive prose search, not identifier processing.
server.rs has code.to_lowercase().replace('_', "-") in the RFC 9457 problem-type URL formatter, which builds the error-code URL; this is not an identifier acceptance path.
\[a-z0-9_-\]|label_text|LeadingHyphen|TrailingHyphen
$ rg "\[a-z0-9_-\]|label_text|LeadingHyphen|TrailingHyphen" src migrations docs tests
PASS with allowlisted hits:
Earlier migrations mention label_text because that domain existed in those migrations as written; the migration history is immutable. Migration 0005 is the one that drops the domain, and its DROP statement is one of the matches.
tests/migrations.rs asserts that label_text is gone (assert_eq!(old.0, 0, "label_text domain should be dropped")), proving the drop landed.
src/world_store/postgres.rs has a single comment-only mention noting the domain was REPLACED in 0005.
No live code uses the old grammar regex.
$ rg "ant-smoke|alpha-slug|active-ant|ant-baseline|moth-and-flame|phase0-worldslug|oil-lantern|code-nav" \
src tests docs
PASS with allowlisted hits: every match is in one of:
docs/terms.md: examples of REJECTED inputs, documenting the grammar.
Tests in src/slug.rs, src/entity_id.rs, src/human_id.rs, src/tickets.rs, src/mcp/tests.rs: feed the validator the bad spelling and assert it errors.
src/minds.rs Phase G test fixture: entity_id: "oil-lantern".to_string() is fed to the LLM-cognition pipeline as the rejected response and the test asserts it is rejected and retried.
src/tickets.rs references the historical artifact filename code-nav-submission-order.md (a file path, AC non-goal #11).
src/server.rs test: label: Some("code-nav".into()) is used as a filter input in a route test that asserts it is rejected.
No live identifier in any active code path is hyphenated.
| # | AC | Status | Evidence |
|---|---|---|---|
| 1 | Single shared human-id validator used by Slug, Label, ticket labels, entity-id parts | DONE | Phase A 48ad3f5 adds src/human_id.rs; Slug, Label, ticket validate_labels, entity_id::validate all delegate to it |
| 2 | Slug rejects hyphen, uppercase, whitespace, leading/trailing/doubled _, empty, overlong | DONE | src/slug.rs tests; smoke 3, 6, 7 above |
| 3 | Label rejects same set | DONE | src/label.rs + ticket validate_labels tests; smoke 4 |
| 4 | entity_id accepts dot-separated human-id parts; rejects same set | DONE | src/entity_id.rs tests + AC #21/#22 fixture sweep; smoke 5 |
| 5 | entity_id::normalize removed (no silent normalization API left) | DONE | Phase A removes the function; comments preserve the contract |
| 6 | No MCP handler silently lowercases / trims / hyphen-converts / tries alternates | DONE | Phase C f7c69be MCP boundary validation; grep #1 confirms no offenders |
| 7 | LLM prompts show entity IDs exactly as stored | DONE | Phase C; src/minds.rs test confirms |
| 8 | LLM adjudication responses with invalid/unknown entity IDs are rejected and retried | DONE | src/minds.rs Phase G test fixture |
| 9 | Retry does not repair / suggest alternates / create aliases / fallback-lookup | DONE | src/minds.rs test + grep #1 confirms no fallback path |
| 10 | Rejected drafts not written into canonical world audit | DONE | src/minds.rs test asserts no canonical event for failed cognition draft |
| 11 | If all retries fail, attempt fails and canonical state unchanged | DONE | Existing turn-job lifecycle: failed attempt → status failed, no commit |
| 12 | scenario_names + scenario_name_history removed (no separate name binding) | DONE | Phase B drops both tables; verified via \dt post-deploy |
| 13 | get_scenario({ name }) resolves directly against scenarios.scenario_slug | DONE | smoke 1 returns UNKNOWN_NAME (resolves against scenario store, misses); no fallback path |
| 14 | /scenarios/name/:name resolves directly against scenarios.scenario_slug | DONE | Phase C HTTP route; same code path as get_scenario |
| 15 | set_scenario_name and unset_scenario_name removed from MCP catalog, handlers, tests, docs | DONE | Phase B; not in CONSUMER_TOOLS / OPERATOR_TOOLS arrays |
| 16 | SQL domains enforce underscore-only grammar for all domain-backed first-party identifiers | DONE | Migration 0005 adds human_id_text + entity_id_text with CHECK; smoke 6 |
| 17 | scenarios.scenario_slug is no longer plain unconstrained TEXT | DONE | Migration 0005 changes column type to human_id_text |
| 18 | Ticket labels reject hyphen and uppercase | DONE | smoke 4; validate_labels no longer normalizes |
| 19 | Graph browser validates route params and query params with same grammar | DONE | Phase C HTTP route validation (5 routes, RFC 9457) |
| 20 | All docs and examples use underscore identifiers | DONE | Phase C docs/examples sweep across 13 files |
| 21 | Tests confirm accepted examples (ant, ant_on_plate, moth_and_flame, …) | DONE | src/human_id.rs accept-list test |
| 22 | Tests confirm rejected examples (ant-smoke, moth-and-flame, oil-lantern, code-nav, …) | DONE | src/human_id.rs reject-list test |
| 23 | Grep discipline checks pass | DONE | three rg sweeps above; all hits allowlisted |
| 24 | Substrate tests pass against sacrificial sidecar | DONE | cargo test --tests --features test-fixtures,postgres-tests: 816 passing on local 127.0.0.1:5433 |
| 25 | Live verification after deploy | DONE | smoke cases above cover all 9 lines (the two scenarios requiring substrate seed are covered by exercising the same code path on the rejection side) |
The single-moth (turn 8) and first-meeting (turn 1) historical worlds are gone. Re-running the remediation smoke from 2dc48e22 and the validation smoke from 4601f21a against fresh worlds would re-prove those tickets if a future caller wants new evidence; the existing trace-layer artifacts from those resolutions are gone too (only the data, not the infrastructure).
The reasoning_content capture from 2dc48e22 and the trace layer from 56e0b520 are still in place; only the data is gone, not the infrastructure.
The validate_labels rename leaves a small surface area where callers might expect normalization. Doc updates already reflect this; no action proposed.
The two AC #25 lines that require a working substrate (assemble moth_and_flame → create single_moth → get_world returns it; entity_id "oil_lantern" succeeds only when that exact entity exists) can be re-played by a follow-up agent after a fresh moth_and_flame scenario seed; the grammar code path is the same on the success side and the rejection side, and the rejection side is exhaustively proven above.
Awaiting caller acceptance. The substrate trajectory 7d14ef0b (scenario store) → 293a300e (world store) → 04d1b392 (graph browser) → 56e0b520 (LLM cognition traces) → 2dc48e22 / 4601f21a (LLM remediation) → 38d0ba4e (this) is complete; Chukwa now has a single canonical identifier grammar enforced repo-wide.
56e0b520 Phase G
Per conversation with the human, two subscope items from this ticket are being folded into Phase G of ticket 56e0b520-86a6-41bd-94ef-aa1769b71b49 ("Make LLM cognition traces first-class durable artifacts for every Chukwa turn attempt") because they are mechanically aligned with the trace-layer surface that ticket's phases have already opened.
world_audit_events row; the failed draft is captured in the LLM trace layer (llm_calls row + raw artifacts) instead.
These will land in 56e0b520 Phase G with kernel/minds-side changes plus tests. Acceptance criteria 7-11 of this ticket will be partially satisfied by 56e0b520's deploy; this ticket's remaining work does not need to re-implement them.
The remaining ten items remain a cohesive identifier-purity package for this ticket's own dedicated cycle:
src/human_id.rs shared validator with exact, non-normalizing validation/parsing APIs.
Slug to use the shared human_id grammar; reject hyphen, uppercase, whitespace, leading/trailing/doubled underscore.
Label similarly.
entity_id: remove silent normalization (entity_id::normalize) and replace with exact validate / parse.
label_text with human_id_text; add entity_id_text; apply domain-backed constraints to all first-party identifier columns.
scenario_names / scenario_name_history; remove set_scenario_name / unset_scenario_name MCP tools; get_scenario({name}) and /scenarios/name/:name resolve directly against scenarios.scenario_slug.
human_id grammar; remove the silent-lowercase-and-dedupe normalization in tickets.rs::normalize_labels.
Plus the substrate-table wipe migration (no hyphen→underscore data migration) and the grep-discipline acceptance check.
Folding the full ticket into 56e0b520 would force a substrate wipe in that cycle's deploy (which 56e0b520 didn't promise) and would multiply Phase G's surface area (catalog/UI + new MCP tools + validator + grammar all at once). Items 5 & 6 are different because they're behavioral changes on the adjudication retry path — a surface 56e0b520 Phases A-E already touched.
This ticket stays pending and continues to wait for human authorization (per its HOLD protocol).
Phase A landed at commit 48ad3f5 on feat/human-id-grammar.
Pre-authorized in conversation channel by the human (overnight
authorization). Skipping HOLD ack pattern; this status comment
transitions ticket pending → in_progress.
48ad3f5 feat(human-id): phase A — shared grammar validator + Slug/Label/entity_id underscore-only
0251008 fix(llm): disable Gemma thinking-mode + capture reasoning_content for trace
2b6ade2 Merge fix/llm-runaway-max-tokens: max_tokens cap per cognition phase (2dc48e22)
src/human_id.rs (new, 444 lines)
The single source of truth for first-party human-readable identifiers.
pub fn validate_human_id(raw: &str) -> Result<&str, HumanIdError>;
pub fn parse_human_id(raw: impl Into<String>) -> Result<String, HumanIdError>;
pub fn validate_entity_id(raw: &str) -> Result<&str, EntityIdError>;
pub fn parse_entity_id(raw: impl Into<String>) -> Result<String, EntityIdError>;
pub const MAX_HUMAN_ID_CHARS: usize = 64;
pub const MAX_ENTITY_ID_CHARS: usize = 128;
Grammar enforced:
human_id := [a-z0-9]+(_[a-z0-9]+)* 1..=64 chars
entity_id := human_id ("." human_id)* 1..=128 chars
Errors are thiserror-derived enums:
HumanIdError  : Empty | TooLong | LeadingUnderscore | TrailingUnderscore | DoubledUnderscore | IllegalCharacter
EntityIdError : Empty | TooLong | EmptyDotPart | InvalidPart{source: HumanIdError}
No normalization. No trim. No case-fold. No whitespace collapse. No hyphen→underscore conversion.
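A minimal sketch of how a validator with the signatures and grammar above could look, written without `thiserror` so it stays dependency-free. The payload fields on `IllegalCharacter` (`ch`, `index`) and the order in which checks run are assumptions for illustration; the real `src/human_id.rs` may differ.

```rust
pub const MAX_HUMAN_ID_CHARS: usize = 64;
pub const MAX_ENTITY_ID_CHARS: usize = 128;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HumanIdError {
    Empty,
    TooLong,
    LeadingUnderscore,
    TrailingUnderscore,
    DoubledUnderscore,
    // Assumed payload: the offending character and its byte index.
    IllegalCharacter { ch: char, index: usize },
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EntityIdError {
    Empty,
    TooLong,
    EmptyDotPart,
    InvalidPart { source: HumanIdError },
}

/// Exact check: the input is returned untouched or rejected, never rewritten.
pub fn validate_human_id(raw: &str) -> Result<&str, HumanIdError> {
    if raw.is_empty() {
        return Err(HumanIdError::Empty);
    }
    if raw.chars().count() > MAX_HUMAN_ID_CHARS {
        return Err(HumanIdError::TooLong);
    }
    // Anything outside [a-z0-9_] is illegal: hyphen, uppercase, whitespace, dot.
    if let Some((index, ch)) = raw
        .char_indices()
        .find(|(_, c)| !(c.is_ascii_lowercase() || c.is_ascii_digit() || *c == '_'))
    {
        return Err(HumanIdError::IllegalCharacter { ch, index });
    }
    if raw.starts_with('_') {
        return Err(HumanIdError::LeadingUnderscore);
    }
    if raw.ends_with('_') {
        return Err(HumanIdError::TrailingUnderscore);
    }
    if raw.contains("__") {
        return Err(HumanIdError::DoubledUnderscore);
    }
    Ok(raw)
}

/// Dot-scoped ids: every dot-separated part must itself be a human_id.
pub fn validate_entity_id(raw: &str) -> Result<&str, EntityIdError> {
    if raw.is_empty() {
        return Err(EntityIdError::Empty);
    }
    if raw.chars().count() > MAX_ENTITY_ID_CHARS {
        return Err(EntityIdError::TooLong);
    }
    for part in raw.split('.') {
        if part.is_empty() {
            return Err(EntityIdError::EmptyDotPart);
        }
        validate_human_id(part).map_err(|source| EntityIdError::InvalidPart { source })?;
    }
    Ok(raw)
}

/// Owned variant: validates, then hands back the caller's string unchanged.
pub fn parse_human_id(raw: impl Into<String>) -> Result<String, HumanIdError> {
    let owned = raw.into();
    validate_human_id(&owned)?;
    Ok(owned)
}
```

Returning the borrowed input on success (rather than a new `String`) makes the "no normalization" property visible in the type: there is no place for a rewritten value to come from.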
Slug, Label, entity_id migrated
src/slug.rs: validate now delegates to human_id::validate_human_id. SlugError variants pruned (no more LeadingHyphen / TrailingHyphen); DoubledUnderscore added. Hyphen anywhere now reports IllegalCharacter.
src/label.rs: same treatment, mirrors Slug.
src/entity_id.rs: normalize() is deleted. Replaced with validate(&str) -> Result<&str, EntityIdError> and parse(impl Into<String>) -> Result<String, EntityIdError> that re-export the human_id helpers.
src/kernel.rs: Entity::prop, Entity::agent, World::add_entity, World::get, World::get_mut, plus the apply_adjudication defensive guard.
src/scenarios.rs: validate_entity, validate_scenario.
src/read_models.rs: load_world_entity_detail.
src/mcp.rs: handle_get_entity, handle_entity_history.
src/server.rs: entity-detail page rendering.
src/minds.rs was already doing exact-string entity-id match (Phase G hardening from this same ticket); nothing to change there.
src/entity_id.rs | rewrite, normalize() removed
src/human_id.rs | new, 444 lines incl. tests
src/kernel.rs | 5 call sites
src/label.rs | grammar via human_id, error enum trimmed
src/lib.rs | + pub mod human_id
src/llm_trace.rs | 1 test fixture: ctx-test → ctx_test
src/mcp.rs | 2 call sites
src/mcp/tests.rs | test fixtures: hyphenated slugs → underscore
src/read_models.rs | 1 call site + 4 test fixtures
src/scenarios.rs | 2 call sites
src/server.rs | 1 call site + 8 test fixtures
src/slug.rs | grammar via human_id, error enum updated
src/world_store/memory.rs | 16 test fixtures: fresh_attempt slugs
cargo build --bin chukwa-serve (rust:1.88-bookworm container): clean.
cargo test --lib --features test-fixtures: 518 passed, 0 failed, 0 ignored (was 483 baseline; +35 from the new human_id test module).
--features postgres-tests skipped per Phase A scope: Phase B's migration changes the substrate and human_id_text SQL domain, so postgres tests would fail against the unchanged DB. Will run in Phase B.
DATABASE_URL not used (Phase A is Rust-only).
src/human_id.rs #[cfg(test)]
Coverage of ticket §"Acceptance criteria" #21 (accepted) and #22 (rejected) plus extra structural tests:
Accepted: ant, ant_on_plate, moth_and_flame, kitchen_plate, oil_lantern, code_nav, a1_b2, plate.crumb, plate.left_crumb, single chars, max-length 64-char human_id, max-length 128-char entity_id
Rejected: ant-smoke, moth-and-flame, kitchen-plate, oil-lantern, code-nav, Ant, Oil_Lantern, "first ant", _ant, ant_, ant__east, plate..crumb, plate.crumb-east, "" (empty), 65-char string, plate.crumb (rejected as human_id, accepted as entity_id), leading/trailing dots, leading/trailing whitespace, parse_human_id/parse_entity_id error propagation, Display formatting smoke checks.
There is also a ticket_38d0ba4e_acceptance_criteria_lists test that
locks the exact ticket-AC accepted/rejected lists in one place as a
regression guard.
No tests marked #[ignore]. Every existing lib test that broke
on hyphenated fixtures was inline-fixed to underscore form. The
changes were mechanical (test slugs only, e.g. "phase-f-world" →
"phase_f_world", "fresh-ant" → "fresh_ant"); no test logic was
altered. Tests in tests/ (integration-tier) were not touched in
Phase A; per scope guidance those get cleaned up in Phase D.
External strings deliberately left alone. LLM model names like
"test-model-v1", HTTP header values like "x-router-target", and
request IDs like "resp-1" keep their hyphens. Per ticket §"Out of
scope" (lines 137-159), these aren't first-party Chukwa identifiers.
Slug/Label error enums are NOT byte-identical to before. The
variants LeadingHyphen and TrailingHyphen are gone (the alphabet
no longer includes hyphen so an edge-hyphen now surfaces as
IllegalCharacter { ch: '-', .. }). DoubledUnderscore is new. Any
external pattern-match on the removed variants would fail to
compile — none exist outside the lib so this is fine.
add_entity defensive validation kept, normalization dropped.
The kernel's add_entity previously called normalize defensively
in case a test built an Entity directly via the struct literal.
Now it calls validate so the same defensive check still runs but
rejects rather than silently fixes up invalid input.
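The difference between a defensive `normalize` and a defensive `validate` can be sketched as follows. `World`, `Entity`, and `InvalidEntityId` here are simplified, hypothetical stand-ins, and the compact grammar check is only an illustration of the shared validator, not the kernel's actual code.

```rust
use std::collections::HashMap;

/// Hypothetical error type carrying the rejected id back to the caller.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidEntityId(pub String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entity {
    pub id: String,
    pub kind: String,
}

#[derive(Default)]
pub struct World {
    entities: HashMap<String, Entity>,
}

/// segment ("_" segment)*, segment := [a-z0-9]+, at most 64 characters.
fn is_valid_human_id(part: &str) -> bool {
    !part.is_empty()
        && part.len() <= 64
        && part.split('_').all(|seg| {
            !seg.is_empty()
                && seg.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
        })
}

/// human_id ("." human_id)*, at most 128 characters.
fn is_valid_entity_id(raw: &str) -> bool {
    !raw.is_empty() && raw.len() <= 128 && raw.split('.').all(is_valid_human_id)
}

impl World {
    /// An Entity built through a struct literal can carry any string as its
    /// id, so the check runs again here. Invalid ids are rejected; nothing is
    /// lowercased, trimmed, or converted on the caller's behalf.
    pub fn add_entity(&mut self, entity: Entity) -> Result<(), InvalidEntityId> {
        if !is_valid_entity_id(&entity.id) {
            return Err(InvalidEntityId(entity.id));
        }
        self.entities.insert(entity.id.clone(), entity);
        Ok(())
    }

    /// Exact-string lookup: "oil-lantern" never resolves to "oil_lantern".
    pub fn get(&self, id: &str) -> Option<&Entity> {
        self.entities.get(id)
    }
}
```

Because the stored key is always the caller's exact string, a lookup with a different spelling misses instead of quietly resolving to a neighbor.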
Phase A is NOT deployable on its own. The new types reject the
existing world/scenario slugs that may live in the production substrate
(any ant-on-plate, moth-and-flame style identifiers in the DB will
fail to load). Phase B's migration is the gate:
human_id_text + migration 0005 with substrate wipe
scenario_names / scenario_name_history tables
set_scenario_name / unset_scenario_name MCP tools
Until Phase B lands and the substrate is wiped, this commit must not ship to chukwa.benac.dev.
Proceeding to Phase B (SQL domain human_id_text + migration 0005 with
substrate wipe + drop scenario_names tables + remove set/unset_scenario_name
MCP tools).
Phase B landed at commit 4262066 on feat/human-id-grammar.
Branch state:
4262066 feat(human-id): phase B — migration 0005 + substrate wipe + scenario_names removal
48ad3f5 feat(human-id): phase A — shared grammar validator + Slug/Label/entity_id underscore-only
0251008 fix(llm): disable Gemma thinking-mode + capture reasoning_content for trace
2b6ade2 Merge fix/llm-runaway-max-tokens: max_tokens cap per cognition phase (2dc48e22)
human_id_text: ^[a-z0-9]+(_[a-z0-9]+)*$, 1..=64 chars
entity_id_text: same with optional .-separated dotted suffix, 1..=128 chars
TRUNCATE ... CASCADE) over: worlds, attempts, world_turns, world_audit_events, world_audit_event_entities, attempt_timeline_events, llm_calls, llm_call_messages, llm_call_chunks, llm_call_tokens, llm_call_artifacts, scenarios, scenario_cognition_profiles, scenario_environments, scenario_entities, scenario_derivations, scenario_derivation_parents, scenario_names, scenario_name_history, cognition_profiles, perceive_systems, intend_systems, adjudicate_systems, adjudication_schemas, environments, entities. Per ticket §617-648 + explicit user authorization.
scenario_names, scenario_name_history.
worlds.slug
attempts.world_slug, attempts.failed_entity_id
world_turns.world_slug
world_audit_events.world_slug, world_audit_events.profile_label, world_audit_events.entity_id
world_audit_event_entities.world_slug, world_audit_event_entities.entity_id
attempt_timeline_events.world_slug, attempt_timeline_events.entity_id
llm_calls.world_slug, llm_calls.entity_id, llm_calls.profile_label
scenarios.scenario_slug (was unconstrained TEXT; now constrained per ticket §7 line 454)
scenario_cognition_profiles.profile_label
scenario_environments.environment_label
label_text. The drop fails loudly if any column still references it.
ScenarioStore trait: removed set_name / unset_name. get_scenario_by_name now resolves directly against scenarios.scenario_slug in both postgres and memory impls (tie-break: created_at DESC, hash ASC).
name: Option<String> field from AssembleScenarioInput and ForkScenarioInput. Removed apply_set_name helper from both stores.
StoredScenario.names / ScenarioSummary.names now collapse to the singleton [scenario_slug] (kept on the type for wire compat).
has_name filter on ListFilter simplified: every scenario has a slug, so Some(true) is a no-op and Some(false) matches no rows.
set_scenario_name / unset_scenario_name handlers, dispatcher entries, tool descriptions, and CONSUMER_TOOLS / ALL_TOOLS registrations. Updated assemble_scenario / fork_scenario / get_scenario descriptions to reflect "scenario_slug is the only human-readable identifier."
/scenarios/name/:name updated.
apply_set_name call sites in assemble_scenario / fork_scenario deleted (postgres + memory).
tests/migrations.rs::migrations_human_id_grammar_present asserts: two new domains exist with grammar checks, label_text is dropped, all 17 columns point at the right domain, dropped tables are dropped.
migrations_apply_forward updated: 13 core scenario-store tables (was 15) + explicit assertion that scenario_names + scenario_name_history are gone.
mcp/tests.rs: get_scenario_by_slug_round_trip and get_scenario_by_unknown_slug_emits_unknown_name (replace the three removed set/unset_scenario_name tests).
assemble_with_name_in_same_tx and the entire "names CAS" test block (6 tests) in scenario_store/postgres.rs.
Slug::new succeeds under the new grammar: phase0_*, ant_smoke, ph_h_*, llm_*, mllm_*, ant_verify, etc. across tests/phase0.rs, tests/phase_g_routes.rs, tests/phase_h_routes.rs, tests/phase_i_routes.rs, tests/llm_traces_kernel.rs, tests/llm_traces_routes.rs, tests/llm_streaming.rs, tests/ant_scenario.rs, tests/structural_linking.rs, src/world_store/postgres.rs tests.
migrations/0005_human_id_grammar.sql (new)
src/scenario_store/mod.rs: trait surface
src/scenario_store/postgres.rs: postgres impl + tests
src/scenario_store/memory.rs: memory impl + Inner cleanup
src/world_store/postgres.rs: test fixtures
src/world_store/memory.rs: test fixtures
src/mcp.rs: handlers, registrations, descriptions
src/mcp/tests.rs: replace removed tests with slug-lookup tests
src/server.rs: store_ant_named test helper
src/html.rs: 404 page wording + matching test
tests/migrations.rs: new domain/column/wipe assertions
cargo build --bin chukwa-serve).
set_scenario_name MCP tests, added two slug-lookup tests).
DATABASE_URL=postgres://postgres:postgres@127.0.0.1:5433/postgres with --features test-fixtures,postgres-tests --test-threads=1.
migrations_idempotent test confirms re-running migrations is a no-op.
migrations_human_id_grammar_present.
active-ant, ant-baseline, phase-h-name, phase-i-name, etc.) now either use the fixture's underscore slug directly or have the name: Some(...) argument removed entirely. No test that exercised the old apply_set_name path remains.
catalog_contract_every_fk_target_is_browseable_or_allowlisted) needed no changes: scenario_names / scenario_name_history were never FK targets, just CAS state, so they didn't appear in the table-to-kind map or the allowlist.
phase_i_routes.rs URL paths like /intend-systems / /adjudicate-systems / /adjudication-schemas are HTTP route segments (not human_id grammar) and remain hyphenated.
Phase C's HTTP-boundary validation will need to consider whether these routes themselves rename to underscores.
tests/migrations.rs::migrations_apply_forward had its expected core-table count shifted from 15 → 13 to reflect the dropped name-binding tables.
Phase B is NOT deployable on its own. Phase C must land first, AND the deploy will wipe production state. Specifically:
tickets.rs::normalize_labels, route param validators); structured BAD_IDENTIFIER / BAD_SLUG / BAD_LABEL / BAD_ENTITY_ID MCP errors; sweep docs/terms.md and HTML placeholders for hyphen examples → underscore examples.
Proceeding to Phase C (MCP/HTTP/ticket-label validation + docs/examples sweep).
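The Phase B store changes (slug-only lookup with the created_at DESC, hash ASC tie-break, and the simplified has_name filter) can be sketched over an in-memory list. `ScenarioRow` and both function shapes are hypothetical simplifications, and `created_at` is assumed to be a plain integer timestamp; the real stores are not shown in this ticket.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ScenarioRow {
    pub scenario_slug: String,
    pub hash: String,
    pub created_at: u64, // assumed: seconds since epoch
}

/// Resolve a name directly against scenario_slug. When several rows share the
/// slug, prefer the newest (created_at DESC), then the smallest hash (hash ASC).
/// The match is exact: no case folding, no hyphen/underscore retry.
pub fn get_scenario_by_name<'a>(rows: &'a [ScenarioRow], name: &str) -> Option<&'a ScenarioRow> {
    rows.iter()
        .filter(|row| row.scenario_slug == name)
        .min_by(|a, b| {
            b.created_at
                .cmp(&a.created_at)
                .then_with(|| a.hash.cmp(&b.hash))
        })
}

/// Every scenario has a slug, so has_name = Some(true) filters nothing out
/// and has_name = Some(false) can never match a row.
pub fn apply_has_name(rows: &[ScenarioRow], has_name: Option<bool>) -> Vec<&ScenarioRow> {
    match has_name {
        Some(false) => Vec::new(),
        Some(true) | None => rows.iter().collect(),
    }
}
```

Using `min_by` with a reversed timestamp comparison picks the same row that `ORDER BY created_at DESC, hash ASC LIMIT 1` would, without sorting the whole list.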
Phase C landed at commit f7c69be on feat/human-id-grammar.
Branch state (last 5 commits):
f7c69be feat(human-id): phase C — MCP/HTTP validation + ticket labels + docs sweep
4262066 feat(human-id): phase B — migration 0005 + substrate wipe + scenario_names removal
48ad3f5 feat(human-id): phase A — shared grammar validator + Slug/Label/entity_id underscore-only
0251008 fix(llm): disable Gemma thinking-mode + capture reasoning_content for trace
2b6ade2 Merge fix/llm-runaway-max-tokens: max_tokens cap per cognition phase (2dc48e22)
What landed:
MCP handler validation surface (§9, AC #6). Three new typed McpError constructors — bad_label (BAD_LABEL), bad_entity_id (BAD_ENTITY_ID), bad_identifier (BAD_IDENTIFIER). bad_slug message rewritten so the grammar guidance no longer mentions hyphen. Validation added at handler entry on every identifier-bearing input the ticket calls out:
create_world.slug (already validated; error message now grammar-correct).
create_world.scenario_ref.name and parse_scenario_ref.name → BAD_SLUG before any store lookup.
get_scenario.name → BAD_SLUG before lookup.
assemble_scenario.scenario_slug and fork_scenario.changes.scenario_slug → BAD_SLUG at the handler boundary.
parse_label_map keys (cognition_profiles + environments) route through parse_label, which now surfaces grammar misses with BAD_LABEL instead of BAD_ARG.
parse_entity_ref content + handle_put_entity deserialized Entity validate entity.id with BAD_ENTITY_ID at the boundary so MCP-supplied content cannot smuggle hyphenated ids past the type system (Entity.id is plain String).
handle_get_entity and handle_entity_history already emitted BAD_ENTITY_ID on entity_id arg (Phase A); kept.
handle_create_ticket.labels and handle_list_tickets.label → BAD_LABEL via the renamed tickets::validate_labels.
HTTP route param validation (§10, AC #19). Five route handlers now validate at entry with new helpers validate_route_slug / validate_route_entity_id / validate_route_label + grammar_error_response:
/w/:slug validates slug
/w/:slug/turn/:n validates slug
/w/:slug/entity/:entity_id validates slug + entity_id
/w/:slug/events?entity_id=... validates slug + optional entity_id query filter
/scenarios/name/:name validates name as Slug (post-Phase-B resolves to scenarios.scenario_slug)
Error response shape: HTML mode → 400 Bad Request page that lists the underscore-only grammar and links back to /dashboard. JSON mode (?format=json) → RFC 9457 problem-details with code: BAD_SLUG | BAD_ENTITY_ID | BAD_LABEL, status: 400, type: https://chukwa.local/problems/bad-slug etc. No redirect from hyphenated to underscored. No alternate-spelling fallback.
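A rough sketch of the JSON-mode body described above, built by hand so it needs no JSON crate. Only the `bad-slug` type URL appears in this ticket; the other two URLs, the `title` text, and the helper names are assumptions for illustration, not the actual `grammar_error_response` implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GrammarCode {
    BadSlug,
    BadEntityId,
    BadLabel,
}

impl GrammarCode {
    pub fn code(self) -> &'static str {
        match self {
            GrammarCode::BadSlug => "BAD_SLUG",
            GrammarCode::BadEntityId => "BAD_ENTITY_ID",
            GrammarCode::BadLabel => "BAD_LABEL",
        }
    }

    pub fn type_url(self) -> &'static str {
        match self {
            GrammarCode::BadSlug => "https://chukwa.local/problems/bad-slug",
            // Assumed by analogy with bad-slug; not confirmed by the ticket.
            GrammarCode::BadEntityId => "https://chukwa.local/problems/bad-entity-id",
            GrammarCode::BadLabel => "https://chukwa.local/problems/bad-label",
        }
    }
}

/// Minimal JSON string escaping for the free-text `detail` member.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a problem-details body. The status is always 400: a grammar miss is
/// a client error, never a redirect and never a retry with another spelling.
pub fn problem_details(code: GrammarCode, detail: &str) -> String {
    format!(
        "{{\"type\":\"{}\",\"title\":\"Bad Request\",\"status\":400,\"code\":\"{}\",\"detail\":\"{}\"}}",
        code.type_url(),
        code.code(),
        json_escape(detail)
    )
}
```

Under RFC 9457 such a body is served with the `application/problem+json` media type; `code` is an extension member alongside the standard `type`, `title`, `status`, and `detail`.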
tickets.rs validate_labels (§11, AC #18). Renamed from normalize_labels and reshaped: the function now delegates to human_id::validate_human_id and rejects rather than canonicalizes. New typed LabelValidationError { Grammar { label, reason } | TooMany { count, max } } maps to BAD_LABEL (grammar) / BAD_ARG (count cap). Behavior diff:
Bug → was lowercased to bug, now rejected with BAD_LABEL.
code-nav → was accepted, now rejected with BAD_LABEL.
bug → still accepted.
code_nav → still accepted.
has space → was rejected with whitespace error, now rejected with BAD_LABEL grammar error.
Existing fixtures were already on conforming labels (bug, code_nav, persistence); only the test that exercised Bug → bug lowercasing semantics was rewritten to expect BAD_LABEL.
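The reject-rather-than-canonicalize behavior can be sketched as below. The `LabelValidationError` variant names come from this comment; the field types, the `MAX_LABELS` value, the `error_code` helper, and the inline grammar check are assumptions made so the example is self-contained.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LabelValidationError {
    Grammar { label: String, reason: String },
    TooMany { count: usize, max: usize },
}

/// Assumed cap for illustration; the real limit is not stated in the ticket.
pub const MAX_LABELS: usize = 16;

/// Returns why a label misses the human_id grammar, or None if it conforms.
fn grammar_reason(raw: &str) -> Option<&'static str> {
    if raw.is_empty() {
        return Some("empty");
    }
    if raw.chars().count() > 64 {
        return Some("longer than 64 characters");
    }
    if !raw
        .bytes()
        .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'_')
    {
        return Some("only a-z, 0-9 and '_' are allowed");
    }
    if raw.split('_').any(str::is_empty) {
        return Some("leading, trailing, or doubled underscore");
    }
    None
}

/// Labels come back exactly as given or the whole call fails. Nothing is
/// lowercased and no hyphen is converted.
pub fn validate_labels(labels: &[&str]) -> Result<Vec<String>, LabelValidationError> {
    if labels.len() > MAX_LABELS {
        return Err(LabelValidationError::TooMany {
            count: labels.len(),
            max: MAX_LABELS,
        });
    }
    for label in labels {
        if let Some(reason) = grammar_reason(label) {
            return Err(LabelValidationError::Grammar {
                label: label.to_string(),
                reason: reason.to_string(),
            });
        }
    }
    Ok(labels.iter().map(|l| l.to_string()).collect())
}

/// Grammar misses map to BAD_LABEL; the count cap stays on BAD_ARG.
pub fn error_code(err: &LabelValidationError) -> &'static str {
    match err {
        LabelValidationError::Grammar { .. } => "BAD_LABEL",
        LabelValidationError::TooMany { .. } => "BAD_ARG",
    }
}
```

Keeping the grammar miss and the count cap as separate variants lets a handler choose the error code with a single `match`, without parsing message text.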
Docs/examples sweep (§12, AC #20). Files modified: 13. Identifier substitutions:
docs/terms.md: rewrote slug grammar section to underscore-only, rewrote entity_id section, dropped the "Normalization" subsection, added LLM-retry-policy paragraph. Examples switched: ant-smoke → ant_smoke, a1-b2_c3 → a1_b2_c3, -ant → _ant.
src/mcp.rs: tool descriptions: 'ant-smoke' → 'ant_smoke'; removed "Input is normalized case-insensitively" claims from get_world_entity and entity_history entity_id descriptions.
src/html.rs: three dashboard tests use alpha_slug, old_slug, mid_slug, new_slug, beta_slug.
src/render.rs: comment example /w/ant-smoke/events → /w/ant_smoke/events.
src/resource_catalog.rs: build_route tests use ant_smoke.
src/read_models.rs: DetailRequest::with("slug", "ant_smoke"). Local variable normalized (a stale name from the pre-Phase-A normalize era) renamed to entity_id throughout load_world_entity_detail.
src/kernel.rs: apply_adjudication local variable normalized renamed to id.
src/label.rs and src/world_store/postgres.rs: comments updated from label_text to human_id_text (post-0005 domain name).
Grep-discipline check results (§"Acceptance criteria" #23):
rg "normalize|to_lowercase|split_whitespace|replace\(|fallback|try_alt|coerce" src/slug.rs src/label.rs src/entity_id.rs src/mcp.rs src/server.rs src/read_models.rs src/scenarios.rs src/kernel.rs src/minds.rs — surviving hits are all legitimate non-identifier paths:
normalize_text: LLM narration whitespace-collapse for prose, not identifiers; empty_fallback is a prose-text helper.
normalized_text: perception/intent text, not entity ids.
to_lowercase calls: column-header title-case + WorldStatus enum stringification (e.g. Active → "active"); not first-party identifier paths.
// fallback never fires: comment.
to_lowercase on subject_contains / body_contains: case-insensitive substring filters on free-text ticket bodies, not identifier paths.
rg "\[a-z0-9_-\]|label_text|LeadingHyphen|TrailingHyphen" src migrations docs tests; surviving hits:
label_text in their original SQL; migration 0005 (Phase B) drops the domain. These are historical artifacts that 0005 cleans up. The migrations are sequenced and each runs once; no live codepath references them.
migrations/0005_human_id_grammar.sql mentions label_text in its DROP statement and migration prose explaining what 0005 retires.
tests/migrations.rs asserts the label_text domain is dropped post-0005.
No LeadingHyphen / TrailingHyphen identifiers anywhere.
No [a-z0-9_-] regex pattern outside historical migrations.
rg "ant-smoke|alpha-slug|active-ant|ant-baseline|moth-and-flame|phase0-worldslug|oil-lantern|code-nav" src tests docs; surviving hits are all legitimate rejection-test fixtures or external prose:
src/human_id.rs, src/slug.rs, src/label.rs, src/entity_id.rs, src/minds.rs, src/tickets.rs: tests that intentionally pass hyphenated input and assert rejection (the AC #22 list).
src/server.rs, src/mcp/tests.rs: newly added tests that exercise the new BAD_SLUG / BAD_LABEL boundary checks against hyphenated input.
src/tickets.rs:173: filename reference (code-nav-submission-order.md) for an external artifact.
src/tickets.rs:268 and src/mcp.rs:1348: English compound noun "code-navigator" / "code-nav work"; not an identifier.
docs/terms.md: examples of rejected input.
Files modified (13): docs/terms.md, src/html.rs, src/kernel.rs, src/label.rs, src/mcp.rs, src/mcp/tests.rs, src/read_models.rs, src/render.rs, src/resource_catalog.rs, src/server.rs, src/ticket_ops.rs, src/tickets.rs, src/world_store/postgres.rs. Areas: MCP handler dispatch + error taxonomy, HTTP route boundary, ticket label validation, docs/examples, downstream consumer comments.
Verifications:
cargo build --bin chukwa-serve clean (no warnings beyond the Phase A/B baseline).
cargo test --lib --features test-fixtures: 528 passed; 0 failed (was 517 in Phase B; +11 net new from §11 validate_labels reshape, new BAD_SLUG / BAD_LABEL / BAD_ENTITY_ID assertions, and new server-route grammar tests).
cargo test --tests --features test-fixtures,postgres-tests -- --test-threads=1 against DATABASE_URL=postgres://postgres:postgres@127.0.0.1:5433/postgres (sacrificial sidecar, NEVER cluster): 674 lib + 142 integration = 816 passed; 0 failed across the twelve integration binaries (ant_scenario, bootstrap, graph_ui_auth, llm_streaming, llm_traces_kernel, llm_traces_routes, migrations, phase0, phase_g_routes, phase_h_routes, phase_i_routes, structural_linking).
127.0.0.1:5433. Cluster Postgres untouched.
Surfaced for the record (Phase D's deploy needs to know):
_sqlx_migrations and never replays.
BAD_LABEL, BAD_ENTITY_ID, BAD_IDENTIFIER. Existing callers branching on BAD_ARG for label-grammar or entity-id-grammar misses will now see the typed code instead. Update any client that pattern-matches.
/w/some-slug will see a 400 instead of a 404 after deploy.
?label=BUG) now produces a filter-error chip on /tickets and a BAD_LABEL on the MCP list_tickets. Existing on-disk tickets with conforming labels (bug, code_nav) keep working.
Statement of deployability: Phase C is deployable but the deploy WILL wipe production substrate (worlds, attempts, scenarios, components) via the 0005 migration. Phase D handles the deploy + smoke + propose resolution.
Proceeding to Phase D (deploy + smoke + propose resolution).
smoke test ignore
Phase D: proposing resolution. Merged feat/human-id-grammar at 796a971; image rolled to chukwa-d6cf7cd77-5vt2n (sha c3d8fc0715b8); migration 0005 applied success=t at 2026-04-28T08:36:38Z; substrate wiped (12 substrate tables at 0 rows; scenario_names + scenario_name_history dropped; human_id_text + entity_id_text domains in place); 9-step grammar live smoke passed (BAD_SLUG / BAD_LABEL / BAD_ENTITY_ID returned at the right routes; underscore identifiers accepted); grep-discipline checks pass with all hits allowlisted. See proposed_resolution for the full evidence sweep. Awaiting caller acceptance.
Caller accepted: Accepted.
Four phases (A-D) over an overnight cycle. The substrate now enforces a single underscore-only grammar for first-party human-readable identifiers at every layer: type (Slug, Label, entity_id via the new src/human_id.rs shared validator); SQL (the human_id_text and entity_id_text domains, with 17 column type changes from TEXT/label_text to the constrained domains); MCP boundary (typed BAD_IDENTIFIER / BAD_SLUG / BAD_LABEL / BAD_ENTITY_ID errors); HTTP route boundary (5 routes with route-param validation, RFC 9457 problem-details on JSON mode); ticket labels (tickets::normalize_labels renamed to validate_labels, no more silent lowercasing or hyphen-conversion). Hyphenated identifiers are hard rejects. No normalization, no fallback, no alternate spellings.
The architectural commitment held cleanly. The grep-discipline checks (acceptance criterion #23) returned zero offenders in the live identifier code paths — every surviving hit traced to legitimate non-identifier surfaces (LLM artifact text normalization, prose case-insensitive search, RFC 9457 URL formatting, rejection-test fixtures that intentionally feed bad input). The scenario_names and scenario_name_history tables are gone; set_scenario_name and unset_scenario_name are gone; get_scenario({name}) resolves directly against scenarios.scenario_slug because the slug is the name now. Migration 0005's destructive TRUNCATE ... CASCADE over the 25 substrate tables landed cleanly — all 12 substrate-table counters at zero post-deploy, schema verified, domains in place. Pre-authorized; we don't write migrations during this phase of development, we wipe.
I want to register one consequence honestly: the wipe took out my verification worlds for the other tickets in this queue. I had to rebuild a multi-agent test scenario from scratch (two_moths_b, equivalent to the shape of first-meeting) to verify that 2dc48e22 and 4601f21a were actually resolved against the new substrate. That worked — six consecutive multi-agent turns committed cleanly. So the wipe was operationally fine; the cost was an extra ~30 minutes of test infrastructure rebuild on my end. Not a complaint; the substrate-wipe-rather-than-migration discipline is the right call at this phase. Just naming the cost.
While rebuilding, I tripped over a substrate quality gap that this ticket's discipline gave me the framework to recognize. I called put_adjudication_schema with a JSON schema that required only narration and entity_transitions. The substrate accepted it (was_new: true). I assembled a scenario using that schema's hash. The substrate accepted it. The world ran. The kernel rejected the LLM response with llm_adjudication_validation_error because the kernel's Adjudication Rust struct expects five fields (narration, agent_state_after, agent_memory_append, entity_mutations, environment_mutations), not the two my schema described. Substrate had no idea the kernel couldn't honor my schema; kernel had no idea what schema the substrate served to the model.
Same shape of bug as the one this ticket fixed — substrate accepts user data the runtime can't honor, divergence surfaces silently at runtime — just at a different boundary. The fix here was "validate identifier grammar at every entry point." The fix there is going to be "decide who owns the cognition shape: the user or the kernel" — that's a deeper architectural question (we're discussing it offline; it's heading to the consultant). Whatever the answer, the discipline this ticket established is the precedent: fail loud at the substrate boundary when input contradicts what the runtime can serve. Don't accept what we can't honor.
The handler's discipline through this ticket is worth registering. Phase A was the type-layer foundation (shared human_id validator, Slug/Label/entity_id migrations, entity_id::normalize deleted entirely — not soft-deprecated, deleted). Phase B was the SQL-layer foundation (domains, column type changes, table drops, MCP tool removals). Phase C was the boundary enforcement (MCP error taxonomy, HTTP route validation, ticket-label reshape, docs sweep). Phase D was the deploy + smoke. Each phase deployable on its own where safe; Phase A explicitly flagged as not-deployable-alone because the new types reject existing-substrate slugs (the world-store had ant-on-plate and moth-and-flame style identifiers from before this ticket); Phase B's wipe was the gate. The handler called this dependency out loud in their Phase A status comment: "Until Phase B lands and the substrate is wiped, this commit must not ship to chukwa.benac.dev." That's exactly the deployability discipline that prevents accidental partial rollouts.
Two specific things this ticket got right that are worth lifting up:
The validate_labels rename was a small and load-bearing detail. The function used to be normalize_labels and it lowercased + deduped + accepted hyphens. The rename to validate_labels plus the behavior shift to reject-rather-than-canonicalize is the kind of tiny change that names what the function actually does now. Future readers don't have to learn that normalize_labels doesn't normalize anymore; the name tells them. Discipline.
The LeadingHyphen / TrailingHyphen enum variants were deleted, not just unused. A hyphen anywhere in a slug now reports IllegalCharacter { ch: '-', .. } instead of having dedicated variants for edge-positions. The handler called this out as a behavioral change worth flagging because anyone pattern-matching on the old variants would fail to compile. That's the right way to surface a breaking-change-shape: name it, document it, let the type system catch the consequences.
Two minor items surfaced for the record (not for filing):
The phase_i_routes.rs URL paths like /intend-systems, /adjudicate-systems, /adjudication-schemas are HTTP route segments and remain hyphenated. They're not first-party identifiers (route grammar is a separate concern from substrate identifier grammar), but it's worth noting that there's an asymmetry: identifiers are underscore-only, routes are hyphen-separated. Consistent with web conventions; future-readers will benefit from the comment in the codebase that distinguishes them.
Slug and Label error enums are no longer byte-identical to the pre-ticket shape. Variants LeadingHyphen and TrailingHyphen are gone; DoubledUnderscore is new. Any external code pattern-matching on the removed variants would fail to compile — none exist outside the lib, but worth registering as a contract change.
The substrate trajectory from this morning's empty-database state to right now is closed: scenario store → world store → graph browser → LLM cognition trace layer → identifier grammar. Each layer was load-bearing for what landed on top of it. The substrate is now genuinely a coherent foundation rather than a series of feature additions. Whatever the next architectural ticket is (probably the cognition-schema question), it'll land on this foundation rather than alongside it.
Resolution accepted.
Ticket created: One first-party identifier grammar: underscore-only, enforced repo-wide