docs: MCP HTTP/SSE transport design spec

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
root
2026-06-04 19:59:39 +10:00
parent f780043f2d
commit 858ef53eea


@@ -0,0 +1,82 @@
# MCP HTTP/SSE Transport — Design
**Date:** 2026-06-04 · **Component:** Void 2.0 · **Status:** Approved (design)
## Goal
Expose Void's existing tool registry to **external agents** over the network via the MCP **Streamable HTTP** transport, authenticated, **read + suggest-only** — every mutation routes into the existing `pending_changes` inbox for owner approval. No external agent can ever write directly or escape its assigned Space.
## Background (what already exists)
- `lib/mcp/companion-stdio.js` — an MCP `Server` over **stdio** that Dross (the local `claude` CLI) consumes via `--mcp-config`. Wraps a tool registry; builds tool `ctx` from env.
- `lib/ai/agent/tools/index.js`: `companionRegistry` = `search`, `read`, `context`, `propose_change`. A comment there already anticipates "a future MCP server can import this same registry."
- `propose_change` — gates via `canAct(ctx.agent, action, entity_type)`, stamps `ctx.space_id` onto created space-scoped entities, writes a `pending_changes` row (`agent_id`, `applied:false` **always**).
- `lib/api/middleware/agent_auth.js`: `agentOrOwner` resolves CF Access identity (owner) → `OWNER_TOKEN` bearer → agent bearer via `agents.verifyToken`. Agents carry `capabilities` + `scopes` JSON.
- `lib/auth/cf_access.js`: `verifyAccessJwt` / `accessOwnerEmail` (RS256 vs team JWKS, audience, exp/nbf; fails closed).
## Decisions (locked)
1. **Transport:** Streamable HTTP (SDK 1.x `StreamableHTTPServerTransport`), **stateless** (fresh transport per request; no session store; server sends no unsolicited notifications).
2. **Auth:** CF Access **service token** at the edge **+** per-agent Void **bearer** at the app. Bearer MUST resolve to an `agent` record — owner / CF-only identities are **rejected** on `/mcp` (external agents never inherit owner powers).
3. **Scope:** each external-agent token is bound to **one Space** via `agent.scopes.space_id`. Reads + proposals are confined to that Space. The bound Space is **forced** server-side; any space argument from the client is ignored.
4. **Approach:** in-app (Approach A), not a separate microservice. Security is CF Access + bearer + space-scope enforcement, not process isolation.
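
Decisions 2 and 3 can be read as a single pure check on whatever the bearer resolves to. The sketch below is illustrative only: `authorizeMcpPrincipal` and the `kind` field are hypothetical names, and `principal` stands in for the result of `agents.verifyToken` (assumed here to be `null` for an unknown token). The status codes and error strings are the ones listed under "Error handling" in this spec.

```javascript
// Hypothetical helper, not part of the existing codebase.
// `principal` is assumed to be the record a bearer token resolves to,
// or null/undefined when the token is missing or invalid.
function authorizeMcpPrincipal(principal) {
  if (!principal) {
    return { ok: false, status: 401, error: 'unauthorized' };
  }
  // Owner and CF-only identities are rejected on /mcp:
  // external agents must never inherit owner powers.
  if (principal.kind !== 'agent') {
    return { ok: false, status: 401, error: 'agent_required' };
  }
  // Each external-agent token must be bound to exactly one Space.
  const spaceId = principal.scopes && principal.scopes.space_id;
  if (!spaceId) {
    return { ok: false, status: 403, error: 'no_space_scope' };
  }
  return { ok: true, agent: principal, space_id: spaceId };
}
```

Keeping the decision free of Express types would let the middleware stay a thin wrapper and make the 401/403 cases easy to unit-test.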
## Architecture
```
external agent
  └─ HTTPS → mcp.void.hynesy.com (CF Access service-token policy)
     └─ cloudflared tunnel → void-server :3000 POST/GET /mcp
        └─ mcpAuth (CF service-token verified + agent bearer → agent, space-scoped)
           └─ StreamableHTTPServerTransport (stateless)
              └─ MCP Server ── externalRegistry (search, read, context, propose_change)
                 └─ ctx = { agent, space_id: agent.scopes.space_id, view:null, actor }
                 └─ propose_change → pending_changes (applied:false)
```
## Components
| File | Responsibility |
|---|---|
| `lib/mcp/external-registry.js` | **Dedicated** registry: `search`, `read`, `context`, `propose_change`. Built separately from `companionRegistry` so future Dross tools never auto-expose to the internet. |
| `lib/mcp/http.js` | `createExternalMcpServer()` (same `Server` + `ListTools`/`CallTool` handler pattern as `companion-stdio.js`, backed by `externalRegistry`); `handleMcp` Express handler (`POST`/`GET`) using a per-request stateless `StreamableHTTPServerTransport`. Builds `ctx` from `req` (not env). |
| `lib/api/middleware/mcp_auth.js` | (a) verify CF Access service token via `cf_access.js`; (b) require `Authorization: Bearer` → `agents.verifyToken` → must be an `agent` with `scopes.space_id` set, else 401/403; attach `req.agent`. |
| `lib/mcp/context.js` | Add `buildCtxFromAgent(agent)` → `{ agent, space_id: agent.scopes.space_id, view:null, actor:{kind:'agent', id:agent.id} }`. |
| `server.js` | Mount `app.all('/mcp', mcpAuth, handleMcp)` (before the 404). |
| `lib/ai/agent/tools/read.js` | **Add space-scope enforcement** (see below). |
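
As a rough sketch of the `lib/mcp/context.js` row, `buildCtxFromAgent` can be written directly from the shape given in the table. The second function, `pinSpace`, is a hypothetical helper (not named anywhere in this spec) showing one way a handler could discard a client-supplied space argument and use the bound one instead.

```javascript
// Builds the tool ctx for an external agent, following the shape in the table.
// Assumes mcpAuth has already rejected agents without scopes.space_id.
function buildCtxFromAgent(agent) {
  return {
    agent,
    space_id: agent.scopes.space_id, // forced server-side, never from the client
    view: null,                      // external agents never get a view
    actor: { kind: 'agent', id: agent.id },
  };
}

// Hypothetical helper: drop any space_id the client sent and pin the ctx one.
function pinSpace(args, ctx) {
  const { space_id: _ignored, ...rest } = args || {};
  return { ...rest, space_id: ctx.space_id };
}
```

For example, `pinSpace({ q: 'x', space_id: 'other' }, ctx)` yields arguments whose `space_id` is the agent's bound Space, regardless of what the client sent.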
## Space-scope enforcement (security crux)
Current state verified:
- `search.js` passes `space_id: ctx.space_id` → already restricts to the bound Space. ✔
- `read.js` has **no** space check → a scoped token could read an `entity_id` from another Space. **Must fix:** `read` returns not-found/empty when the target entity's `space_id !== ctx.space_id` (for space-scoped entity types). For owner/Dross (`ctx.space_id` null = unrestricted) behavior is unchanged.
- `context.js` is space-bound via `ctx.space_id`; external `ctx.view` is always `null`, so it returns only Space context. ✔ (Test asserts no cross-space leak via view.)
The bound Space is never client-controlled: `mcpAuth` attaches the verified agent, `buildCtxFromAgent` sets `ctx.space_id` from `agent.scopes.space_id`, and tool handlers never trust a client-supplied space. External agents are provisioned with **propose-only** capabilities (no apply), enforced by `canAct`.
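
The `read.js` fix described above could look roughly like the guard below. This is a sketch under assumptions: `enforceReadScope` is a hypothetical name, the `entity_type` field on the loaded entity is assumed, and the members of `SPACE_SCOPED_TYPES` are placeholders, since this spec does not list which entity types are space-scoped.

```javascript
// Placeholder list: the real set of space-scoped entity types is not given here.
const SPACE_SCOPED_TYPES = new Set(['note', 'task']);

// Returns the entity if ctx may read it, otherwise null (treated as not-found).
function enforceReadScope(entity, ctx) {
  if (!entity) return null;
  // Owner/Dross: a null ctx.space_id means unrestricted, behavior unchanged.
  if (ctx.space_id == null) return entity;
  // Entity types that are not space-scoped are not filtered.
  if (!SPACE_SCOPED_TYPES.has(entity.entity_type)) return entity;
  // Cross-space reads look identical to a missing entity.
  return entity.space_id === ctx.space_id ? entity : null;
}
```

Returning the same not-found result for "missing" and "other Space" avoids confirming to a scoped agent that an `entity_id` exists elsewhere.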
## Error handling
- Missing/invalid bearer → 401 `unauthorized`. Non-agent (owner/CF-only) token on `/mcp` → 401 `agent_required`.
- Agent without `scopes.space_id` → 403 `no_space_scope`.
- Tool errors → MCP `{ isError:true }` (existing pattern), never 500-leak internals.
- Basic per-token rate limit on `/mcp`.
- Every external `tools/call` written to the audit log (actor=agent, tool, space).
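
Two of the bullets above can be sketched without any framework code. Both functions are hypothetical and make assumptions beyond this spec: `toToolError` returns a fixed message (the existing pattern may include more detail), and `createRateLimiter` is an in-memory fixed-window limiter whose `limit` and `windowMs` values are not specified here.

```javascript
// Maps a thrown error to an MCP tool result without exposing internals.
// The detail goes to the server-side log callback only.
function toToolError(err, log = () => {}) {
  log(err);
  return { isError: true, content: [{ type: 'text', text: 'tool_failed' }] };
}

// Fixed-window, in-memory, per-token limiter. `now` is injectable for tests.
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const windows = new Map(); // tokenId -> { start, count }
  return function allow(tokenId) {
    const t = now();
    const w = windows.get(tokenId);
    // Start a new window on first use or once the old one has elapsed.
    if (!w || t - w.start >= windowMs) {
      windows.set(tokenId, { start: t, count: 1 });
      return true;
    }
    if (w.count >= limit) return false;
    w.count += 1;
    return true;
  };
}
```

An in-memory limiter only holds for a single void-server process, which matches the in-app approach chosen in this design; a multi-process deployment would need shared state.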
## Testing (supertest, serial — `fileParallelism:false`)
1. `initialize` + `tools/list` over `/mcp` returns exactly the 4 external tools.
2. No bearer → 401; owner token → 401 `agent_required`; agent without space scope → 403.
3. `read` of an entity in **another** space → not-found (cross-space block).
4. `search` only returns hits from the bound space.
5. `propose_change` → row in `pending_changes` with correct `agent_id` + stamped `space_id`, `applied:false`.
6. Unit: `external-registry` exposes only the 4 tools; `buildCtxFromAgent` forces the bound space.
## Deployment
- New hostname **`mcp.void.hynesy.com`** → tunnel ingress → void-server `/mcp`; attach a **CF Access service-token** policy.
- Provision the first external-agent record: bearer token, `scopes.space_id = <chosen space>`, propose-only capabilities.
- Ships behind the hardened deploy (`/health` version gate + auto-rollback). Bump to `2.0.0-alpha.14`.
## Out of scope (YAGNI)
Session/stateful MCP, server-initiated notifications/subscriptions, multi-space tokens, OAuth flows, a separate MCP process. Revisit only if a real client needs them.