

Split MCP surface into consumer (`/mcp`) and operator (`/operator-mcp`) routes

resolved a8c93d6d-1942-4768-a2df-b0921615c6d7

created_at: 2026-04-26
updated_at: 2026-04-27
priority: P3
ticket_type: feature
labels: mcp, architecture
resolved_at: 2026-04-27
resolution: accepted

Body

HOLD — DO NOT PICK UP UNTIL HUMAN AUTHORIZATION.

Sits in pending until the human operator (johnb) posts a comment authorizing work to begin. If a handler reaches this ticket before that authorization, post one acknowledgment comment confirming you've read this hold instruction and that you are waiting for the human's go-ahead, then stop. Do not branch, do not read code, do not draft a plan. The human will return to either authorize, defer, or rewrite.

Background

Today, every MCP tool chukwa offers — code review, ticketing, scenario store, world store — is served at a single endpoint https://chukwa.benac.dev/mcp. Any client connecting there sees the full ~50-tool surface in tools/list. There is no way to configure an MCP client to receive only a subset.

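This ticket splits the surface into two URLs that compose additively: /mcp serves the consumer surface (scenario store, world store) and /operator-mcp serves the operator surface (code review, ticketing).

A given agent's MCP configuration adds either /mcp alone (consumer tools only) or both /mcp and /operator-mcp (consumer plus operator tools).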

There is no third configuration. There is no use case for "operator surface only without consumer access" — operators always need substrate access too. The split is purely about being able to give an agent fewer tools when fewer are appropriate.
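
The two permitted configurations can be sketched as a small lookup. This is only an illustration of the additive rule: the AgentProfile enum and the mcp_urls function are hypothetical names, not part of chukwa.

```rust
/// Hypothetical agent roles; the ticket describes exactly two configurations.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AgentProfile {
    /// Needs only scenario-store and world-store tools.
    Consumer,
    /// Needs code review and ticketing on top of the consumer tools.
    Operator,
}

pub const CONSUMER_URL: &str = "https://chukwa.benac.dev/mcp";
pub const OPERATOR_URL: &str = "https://chukwa.benac.dev/operator-mcp";

/// Returns the MCP server URLs an agent of the given profile should list.
/// The operator profile is additive: it always includes the consumer URL,
/// so there is no "operator only" result.
pub fn mcp_urls(profile: AgentProfile) -> Vec<&'static str> {
    match profile {
        AgentProfile::Consumer => vec![CONSUMER_URL],
        AgentProfile::Operator => vec![CONSUMER_URL, OPERATOR_URL],
    }
}
```

Calling mcp_urls(AgentProfile::Operator) yields both URLs, while the consumer profile yields only /mcp.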

Single-user context

chukwa has one user: johnb. There is no external consumer story. Both URLs authenticate against the same OAuth credentials, are served by the same pod, and expose the same data. The split is for agent configuration ergonomics, not for security boundaries between user populations.

Execution mode

Phases A through D should each be delegated to a subagent, same pattern used in the world-store ticket (293a300e-abf3-4f7c-85a4-f7129b742769). The handler composes a status comment from each subagent's structured report.

After Phase A lands, post the standard phase-boundary status comment on this ticket and proceed directly into Phase B without pausing for confirmation. Same flow for B → C → D. Status comments are visibility, not gates. The human will intervene at any phase boundary if they see something to redirect; absent that, keep moving.

Tool partition

Consumer surface (/mcp)

Scenario store:

World store:

Operator surface (/operator-mcp)

Code review:

Ticketing:

Tool count audit

If the actual tool inventory differs from this partition when the work begins (new tools added since the spec was written, or tools deleted), pause and surface the discrepancy on this ticket as a comment before proceeding. Don't silently re-bucket new tools.
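
One way to make the audit mechanical is to diff the live inventory against the two buckets and refuse to proceed unless the report is empty. The sketch below assumes the inventory is available as a list of tool names; AuditReport and audit_partition are hypothetical names introduced for illustration.

```rust
use std::collections::BTreeSet;

/// Outcome of comparing the live tool inventory with the written partition.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct AuditReport {
    /// Tools registered in code but absent from both buckets.
    pub unassigned: Vec<String>,
    /// Tools named in the partition that no longer exist in code.
    pub missing: Vec<String>,
    /// Tools that appear in both buckets.
    pub duplicated: Vec<String>,
}

impl AuditReport {
    /// True when the partition matches the inventory exactly.
    pub fn is_clean(&self) -> bool {
        self.unassigned.is_empty() && self.missing.is_empty() && self.duplicated.is_empty()
    }
}

/// Compares the inventory with the consumer and operator buckets.
/// Results are sorted because BTreeSet iterates in order.
pub fn audit_partition(inventory: &[&str], consumer: &[&str], operator: &[&str]) -> AuditReport {
    let inventory: BTreeSet<&str> = inventory.iter().copied().collect();
    let consumer: BTreeSet<&str> = consumer.iter().copied().collect();
    let operator: BTreeSet<&str> = operator.iter().copied().collect();
    let assigned: BTreeSet<&str> = consumer.union(&operator).copied().collect();

    AuditReport {
        unassigned: inventory.difference(&assigned).map(|s| s.to_string()).collect(),
        missing: assigned.difference(&inventory).map(|s| s.to_string()).collect(),
        duplicated: consumer.intersection(&operator).map(|s| s.to_string()).collect(),
    }
}
```

A non-clean report is the discrepancy to post as a comment; the function deliberately does not guess a bucket for unassigned tools.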

Authentication

Single OAuth audience. Same credentials for both URLs. The bearer token presented at /operator-mcp is the same shape and same auth flow as at /mcp; the URL difference is purely about which dispatcher receives the JSON-RPC payload.

The token-persistence file at /var/lib/chukwa/oauth_tokens.json continues to track tokens at the audience level (one row per token, not one per URL). A token is valid for both URLs.
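
A minimal sketch of audience-level validation, assuming the persisted rows are loaded into a map keyed by token. The TokenRow fields, the AUDIENCE value, and token_is_valid are all hypothetical; the actual layout of oauth_tokens.json is not described in this ticket.

```rust
use std::collections::HashMap;

/// Assumed single OAuth audience shared by both routes (value is illustrative).
pub const AUDIENCE: &str = "chukwa-mcp";

/// Hypothetical in-memory form of one persisted token row.
pub struct TokenRow {
    pub audience: String,
    pub expires_at_unix: u64,
}

/// Validates a bearer token. The request path is accepted but deliberately
/// ignored: rows are per token, so one token covers /mcp and /operator-mcp.
pub fn token_is_valid(
    rows: &HashMap<String, TokenRow>,
    bearer: &str,
    _path: &str,
    now_unix: u64,
) -> bool {
    rows.get(bearer)
        .map_or(false, |row| row.audience == AUDIENCE && row.expires_at_unix > now_unix)
}
```

Because the path never enters the check, a token accepted at one URL is accepted at the other by construction.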

Routing

/mcp is preserved verbatim. Existing agent configs pointing at https://chukwa.benac.dev/mcp continue to work without modification, and continue to receive the consumer tool set (which is exactly what they have today, minus code review + ticketing).

This means existing operator agents that pointed at just /mcp will lose access to code review + ticketing when this ticket lands. They must be reconfigured to also include /operator-mcp. Document this clearly in the resolution comment so johnb can update agent configs in one pass.

/operator-mcp is the new path. If a different name is preferred (/meta-mcp, /admin-mcp, etc.), call it out in the Phase A status comment.

Dashboard / web UI

Unchanged. The HTML routes (/dashboard, /w/:slug, ticket views, scenario detail pages) are not part of either MCP surface — they're served directly by axum from the same pod. They continue to use whatever internal Rust APIs they need without going through a public MCP boundary.

Phase plan

Phase A — refactor. Refactor src/mcp.rs so the tool registry is parameterizable. Today the dispatcher's tools/list and dispatch tables are constructed implicitly from the handler functions; after this phase, there are two const arrays (CONSUMER_TOOLS, OPERATOR_TOOLS) and a register_mcp_router(state, tool_set) helper that takes a slice and builds the router. Existing /mcp route registers CONSUMER_TOOLS ∪ OPERATOR_TOOLS so behavior is unchanged at this phase. Tests pass. Subagent.
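
The shape of the Phase A refactor can be sketched without axum: a dispatcher that is handed its tool set explicitly instead of deriving it from the handler functions. CONSUMER_TOOLS and OPERATOR_TOOLS are the names from this ticket, but their contents here are placeholders; McpDispatcher and DispatchError are hypothetical stand-ins for whatever register_mcp_router builds.

```rust
/// Placeholder subsets; the real arrays hold the full tool inventory.
pub const CONSUMER_TOOLS: &[&str] = &["list_scenarios"];
pub const OPERATOR_TOOLS: &[&str] = &["list_tickets", "git_log"];

/// Error returned when a call names a tool outside the allowed set.
#[derive(Debug, PartialEq, Eq)]
pub enum DispatchError {
    UnknownTool(String),
}

/// A dispatcher bound to an explicit tool set rather than an implicit global one.
pub struct McpDispatcher {
    tools: Vec<&'static str>,
}

impl McpDispatcher {
    /// Builds a dispatcher from one or more tool slices, concatenated in order.
    pub fn new(tool_sets: &[&[&'static str]]) -> Self {
        let tools = tool_sets
            .iter()
            .flat_map(|set| set.iter().copied())
            .collect();
        Self { tools }
    }

    /// What tools/list would advertise for this dispatcher.
    pub fn tools_list(&self) -> &[&'static str] {
        &self.tools
    }

    /// Rejects any tool outside the allowed set before a handler would run.
    /// The Ok value is a stub; a real dispatcher would invoke the handler.
    pub fn call(&self, tool: &str) -> Result<String, DispatchError> {
        if self.tools.iter().any(|t| *t == tool) {
            Ok(format!("dispatched {tool}"))
        } else {
            Err(DispatchError::UnknownTool(tool.to_string()))
        }
    }
}
```

Constructing the dispatcher with both slices, McpDispatcher::new(&[CONSUMER_TOOLS, OPERATOR_TOOLS]), reproduces the pre-split behavior, which is the property Phase A is meant to preserve.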

Phase B — split. Add the second mount point. bin/chukwa-serve.rs registers two router branches: /mcp against CONSUMER_TOOLS, /operator-mcp against OPERATOR_TOOLS. The composed-everything fallback is removed at this phase — /mcp is now consumer-only. Manually verify via curl: tools/list against /mcp returns the consumer tool set, against /operator-mcp returns the operator tool set. Calling a consumer tool against /operator-mcp (or vice versa) returns UNKNOWN_TOOL or the equivalent dispatcher error. Subagent.
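
The Phase B check (each path serves only its own set, and cross-surface calls are rejected) can be modeled as a mount table. This is a sketch under assumptions: the tool arrays are placeholders, and MOUNTS, tools_for_path, and call_at are hypothetical helpers rather than the axum router itself.

```rust
/// Placeholder subsets standing in for the real const arrays.
pub const CONSUMER_TOOLS: &[&str] = &["list_scenarios"];
pub const OPERATOR_TOOLS: &[&str] = &["list_tickets"];

/// One entry per mount point: (path, tools served there).
pub const MOUNTS: &[(&str, &[&str])] = &[
    ("/mcp", CONSUMER_TOOLS),
    ("/operator-mcp", OPERATOR_TOOLS),
];

/// Returns the tool set mounted at `path`, or None if nothing is mounted there.
pub fn tools_for_path(path: &str) -> Option<&'static [&'static str]> {
    MOUNTS
        .iter()
        .find(|(p, _)| *p == path)
        .map(|(_, tools)| *tools)
}

/// Models a tools/call arriving at `path`: Ok if the tool is mounted there,
/// otherwise an UNKNOWN_TOOL-style error string.
pub fn call_at(path: &str, tool: &str) -> Result<(), String> {
    let tools = tools_for_path(path).ok_or_else(|| format!("no MCP route at {path}"))?;
    if tools.iter().any(|t| *t == tool) {
        Ok(())
    } else {
        Err(format!("UNKNOWN_TOOL: {tool}"))
    }
}
```

With this table there is no entry that serves the union, which mirrors the removal of the composed-everything fallback.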

Phase C — smoke. Build, deploy, smoke. Reconfigure johnb's primary operator agent (this conversation) to point at both URLs. Confirm a representative call against each surface succeeds: list_scenarios against /mcp, list_tickets against /operator-mcp. Confirm the dispatcher does not leak operator tools through /mcp or vice versa. Subagent.

Phase D — wrap-up. proposed_resolution with the verified tool counts at each URL, the test results, and explicit reconfiguration instructions for any other agent configs. Subagent (or handler-direct, since this phase is just composing the resolution from prior phase reports).

Acceptance

  1. tools/list against https://chukwa.benac.dev/mcp returns the consumer tool set only. No code-review tools, no ticketing tools.
  2. tools/list against https://chukwa.benac.dev/operator-mcp returns the operator tool set only. No scenario-store tools, no world-store tools.
  3. Both URLs authenticate against the same OAuth credentials. A token issued for one is valid for the other.
  4. Pod startup logs show both routes registered.
  5. All existing cargo test --lib --features test-fixtures and cargo test --tests --features test-fixtures,postgres-tests -- --test-threads=1 baselines hold (no test count regression beyond what's intentional).
  6. Smoke: a real request against each URL succeeds.

Out of scope

Sequencing

Independent of any pending ticket today. Can be picked up whenever after authorization.

Proposed resolution

The MCP surface is now split: /mcp serves the consumer tool set (scenario store + world store, 36 tools); /operator-mcp serves the operator tool set (code review + ticketing, 20 tools). Existing consumer agent configs continue to work unchanged; operator agents must add /operator-mcp to access code-review + ticketing tools.

Phase summary

| Phase | Commit | What landed |
| --- | --- | --- |
| A | a8abedc | parameterized tool registry; CONSUMER_TOOLS / OPERATOR_TOOLS const arrays; ALL_TOOLS; dispatch_with_tools; tools_call_filtered; tool_manifest_document_filtered; register_mcp_router(state, path, tool_set) helper; dispatcher allowed-set check returning UNKNOWN_TOOL; 7 partition-guard tests in mcp/tests.rs. /mcp still pinned to ALL_TOOLS so live surface unchanged. |
| B | 3a65683 | router() in src/server.rs mounts /mcp → CONSUMER_TOOLS (36) and /operator-mcp → OPERATOR_TOOLS (20); the ALL_TOOLS mount replaced. 4 new route-level integration tests via tower::ServiceExt::oneshot. |
| C | 554ccb4 (merge) | merged feat/mcp-route-split to main; built chukwa:latest (sha256 63d957c71a4d); rolled deployment/chukwa to pod chukwa-56d574bf44-5l9pl. Curl smoke 4/4 passed. Wrapper at /root/.config/chukwa-mcp/mcp.sh updated to route by tool name; original preserved at mcp.sh.pre-split. Wrapper smoke 2/2 passed. Migrations 0001 + 0002 still success=t. |

Verified tool counts at each URL (from Phase C live curl smoke)

Test results

Reconfiguration instructions for other agent configs

If any agent that previously talked to chukwa was configured to receive operator tools (code review + ticketing) by pointing at /mcp, that agent will start receiving UNKNOWN_TOOL for those calls. To restore access, the agent's MCP config needs to add a second URL: https://chukwa.benac.dev/operator-mcp (same OAuth credentials — single audience).

For johnb's primary operator agent (this conversation), the wrapper at /root/.config/chukwa-mcp/mcp.sh was updated in Phase C to route by tool name. The OPERATOR tool list in the wrapper mirrors the const in src/mcp.rs and includes (verified verbatim against the live wrapper case statement):

browse_codebase, outline, list_code_files, find_definition, find_references, read_code, search_code,
git_log, git_diff, git_show_commit, git_file_history,
create_ticket, get_ticket, list_tickets, add_ticket_comment, file_followup,
handler_respond_ticket, user_confirm_resolution, user_cancel_ticket, user_change_ticket_status

That's 20 tools, matching OPERATOR_TOOLS in src/mcp.rs. Anything not in this list routes to /mcp.
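
The routing rule can be restated in Rust to make the default-to-/mcp behavior and the drift risk explicit. The live wrapper is a shell script with a case statement; this sketch only mirrors its rule, and endpoint_for and missing_from_wrapper are hypothetical helper names.

```rust
pub const CONSUMER_URL: &str = "https://chukwa.benac.dev/mcp";
pub const OPERATOR_URL: &str = "https://chukwa.benac.dev/operator-mcp";

/// Operator tool names, copied from the list above. The fixed array length
/// makes the compiler reject an accidental add or drop.
pub const OPERATOR_TOOLS: [&str; 20] = [
    "browse_codebase", "outline", "list_code_files", "find_definition",
    "find_references", "read_code", "search_code",
    "git_log", "git_diff", "git_show_commit", "git_file_history",
    "create_ticket", "get_ticket", "list_tickets", "add_ticket_comment",
    "file_followup", "handler_respond_ticket", "user_confirm_resolution",
    "user_cancel_ticket", "user_change_ticket_status",
];

/// Picks the endpoint for a tool: operator tools go to /operator-mcp,
/// everything else (including unknown names) falls through to /mcp.
pub fn endpoint_for(tool: &str) -> &'static str {
    if OPERATOR_TOOLS.iter().any(|t| *t == tool) {
        OPERATOR_URL
    } else {
        CONSUMER_URL
    }
}

/// Drift check: names present in the code-side list but absent from the
/// wrapper-side list. A non-empty result means the wrapper would misroute.
pub fn missing_from_wrapper<'a>(code_side: &[&'a str], wrapper_side: &[&str]) -> Vec<&'a str> {
    code_side
        .iter()
        .copied()
        .filter(|t| !wrapper_side.iter().any(|w| *w == *t))
        .collect()
}
```

Note the failure mode this makes visible: a new operator tool that is missing from the wrapper list is silently sent to /mcp, where it would be rejected as an unknown tool.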

If the wrapper's tool list ever drifts from the const (new operator tools added in code, or moved between buckets), the wrapper will misroute. Test commands to verify alignment after future changes:

bash /root/.config/chukwa-mcp/mcp.sh list_scenarios '{}'   # → /mcp, succeeds
bash /root/.config/chukwa-mcp/mcp.sh list_tickets '{}'      # → /operator-mcp, succeeds

Rollback for the wrapper is mv /root/.config/chukwa-mcp/mcp.sh.pre-split /root/.config/chukwa-mcp/mcp.sh. Against the new server the pre-split copy talks only to /mcp, so operator tool calls will fail, but it is preserved as a safety net.

Architectural delta

Surfaced for follow-up (not filed)

Closing

All Phase B integration tests still pass against the post-merge main. The MCP surface is split, the wrapper routes by tool, and the live smoke confirms the partition holds. Awaiting caller acceptance.

History (8 events)
