# Plan 5b — Companion model backend: Claude CLI subprocess (Max subscription)

> Amendment to Plan 5. Replaces the Anthropic-API-key model backend with the
> `claude` CLI subprocess approach (subscription auth), mirroring Void 1.0's
> `lib/agent.js`. REQUIRED SUB-SKILL for execution: subagent-driven-development.

**Why:** Claude Max has no API key, and the Agent SDK can't use subscription auth headlessly (ToS). Void 1.0 already drives Claude by spawning the locally authenticated `claude` CLI; it strips `ANTHROPIC_API_KEY` from the child env so the CLI uses the Max subscription. We replicate that. The CLI owns the agentic loop; our four tools are exposed to it via a local MCP server.

**Unchanged from Plan 5:** the tool defs + `companionRegistry` (T5–T9), migration 007 + `findOrCreateForSpace` (T3–T4), the per-Space conversation model, persistence, the right-rail UI (T13–T14), and the SSE event vocabulary the UI consumes (`delta` / `tool` / `draft` / `error` / `done`).

**Removed/replaced:** `lib/ai/anthropic.js` (API-key adapter) and `lib/ai/agent/runtime.js` (`runTurn`) are no longer the execution path. Keep the files for now, but the companion route stops importing them. The `@anthropic-ai/sdk` dependency either stays (harmless) or is removed in cleanup.
## Architecture

```
Browser rail ──POST /turn (SSE)──▶ void-server companion route
   │  writes per-turn: system-prompt file + mcp-config (env carries space/agent/view)
   ▼
 spawn `claude -p --output-format stream-json --include-partial-messages
        --session-id <conversation.id> --append-system-prompt <persona>
        --mcp-config <cfg> --strict-mcp-config
        --allowedTools mcp__void__search,mcp__void__read,mcp__void__context,mcp__void__propose_change
        <userText>`
   (child env: ANTHROPIC_API_KEY/ANTHROPIC_AUTH_TOKEN deleted → Max subscription auth)
   │  stdout = stream-json lines
   ▼
 map events → SSE (text_delta→delta, tool_use_start→tool, tool_result→(detect draft), result→done)
   │        claude calls tools ──stdio MCP──▶ lib/mcp/companion-stdio.js
   ▼                                          (reuses companionRegistry handlers + pool;
 persist assistant message + draft_ids         reads SPACE_ID/VOID_AGENT_JSON/VOID_VIEW_JSON from env)
```
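The "map events → SSE" step in the diagram can be sketched as a small pure function. This is a minimal sketch, assuming the driver has already normalized CLI output into the event shapes listed in Task B2; `toSseEvent` is a hypothetical helper name, and the payload field names are assumptions to check against what the rail actually reads.

```javascript
// Hypothetical helper: maps a normalized driver event to an SSE
// { event, data } pair using the vocabulary the rail consumes.
// Payload field names are assumptions, not the confirmed wire format.
function toSseEvent(e) {
  switch (e.type) {
    case 'delta':
      return { event: 'delta', data: { text: e.text } };
    case 'tool':
      return { event: 'tool', data: { tool: e.tool, status: e.status } };
    case 'result':
      return { event: 'done', data: { usage: e.usage ?? null } };
    case 'error':
      return { event: 'error', data: { message: e.message } };
    default:
      // tool_result (and any unknown type) yields no direct SSE event here;
      // draft detection inspects tool_result separately.
      return null;
  }
}
```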

**MCP transport decision:** stdio. The route writes a per-turn `--mcp-config` JSON declaring one server, `void` = `{command:"node", args:["/opt/void-server/lib/mcp/companion-stdio.js"], env:{SPACE_ID, VOID_AGENT_JSON, VOID_VIEW_JSON, DATABASE_URL,…}}`. `claude` spawns it; it serves the 4 tools over stdio and runs the same handlers against the DB. This adds no new HTTP attack surface, and context flows via env. `--strict-mcp-config` ensures only our server is used; built-in tools are excluded by not allow-listing them.
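A minimal sketch of the per-turn config object described above, assuming the CLI accepts the same top-level `mcpServers` shape used by project `.mcp.json` files (verify against the installed version). `buildMcpConfig` is a hypothetical helper, the env var names follow the ones used in Task B3, and the sample values are placeholders; the route would serialize the result to a temp file and pass that path via `--mcp-config`.

```javascript
// Hypothetical helper: builds the per-turn MCP config declaring the single
// `void` stdio server. The top-level `mcpServers` key is an assumption about
// the CLI's config format.
function buildMcpConfig({ serverPath, spaceId, agent, view, databaseUrl }) {
  return {
    mcpServers: {
      void: {
        command: 'node',
        args: [serverPath],
        env: {
          SPACE_ID: String(spaceId),
          VOID_AGENT_JSON: JSON.stringify(agent),
          VOID_VIEW_JSON: JSON.stringify(view ?? null),
          DATABASE_URL: databaseUrl,
        },
      },
    },
  };
}

// Placeholder values for illustration only.
const cfg = buildMcpConfig({
  serverPath: '/opt/void-server/lib/mcp/companion-stdio.js',
  spaceId: 'space-1',
  agent: { kind: 'agent', id: 'agent-1' },
  view: null,
  databaseUrl: 'postgres://localhost/void',
});
console.log(JSON.stringify(cfg, null, 2));
```

Env values are strings because process environments only carry strings; the MCP server parses the JSON ones back on its side.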

**Draft detection:** `propose_change`'s MCP result already contains `pending_change_id`. The route detects a `tool_result` for `propose_change` (or reads the structured result), emits a `draft` SSE event, and collects the id for the assistant message metadata. (Alternatively, the tool returns the id in `structuredContent` and the route maps it.)

---
## Task B1: MCP server exposing the four tools (stdio)

**Files:** Create `lib/mcp/companion-stdio.js`; Create `lib/mcp/context.js` (builds tool `ctx` from env); Test `tests/mcp/companion_tools.test.js`

- [ ] **Step 1: Failing test.** Import the registry-backed dispatch and assert that each of the 4 tools is exposed under the right name, and that calling `propose_change` through the MCP dispatch writes a `pending_changes` row (reuse the existing handler). Use a small exported dispatch function, `callMcpTool(name, args, ctx)`, so the test doesn't need a live stdio transport.
```javascript
// tests/mcp/companion_tools.test.js (sketch; implementer fills in real assertions)
import { describe, it, expect, beforeAll } from 'vitest';
import { pool } from '../../lib/db/pool.js';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import { listMcpTools, callMcpTool } from '../../lib/mcp/companion-stdio.js';

let spaceId, agentId;

beforeAll(async () => {
  await resetDb();
  await migrateUp();
  // Seed one Space and look up the companion agent row.
  ({ rows: [{ id: spaceId }] } = await pool.query(
    `INSERT INTO spaces(slug,name) VALUES('s','S') RETURNING id`));
  ({ rows: [{ id: agentId }] } = await pool.query(
    `SELECT id FROM agents WHERE slug='companion'`));
});

describe('companion MCP tools', () => {
  it('exposes the four tools', () => {
    expect(listMcpTools().map((t) => t.name).sort())
      .toEqual(['context', 'propose_change', 'read', 'search']);
  });

  it('propose_change writes a pending_changes row via MCP dispatch', async () => {
    const ctx = {
      agent: { kind: 'agent', id: agentId, capabilities: { read: true, suggest: true, write: false }, scopes: {} },
      space_id: spaceId,
    };
    const out = await callMcpTool('propose_change',
      { entity_type: 'task', action: 'create', payload: { space_id: spaceId, title: 'X' } }, ctx);
    // TODO(implementer): also assert that the pending_changes row exists.
    expect(out.pending_change_id).toBeTruthy();
  });
});
```
- [ ] **Step 2:** Run, confirm fail.
- [ ] **Step 3:** Implement using the MCP SDK (`@modelcontextprotocol/sdk`) stdio server. Register each tool from `companionRegistry.listTools()` (name, description, JSON-Schema `input_schema`). On a tool call, build `ctx` from env via `lib/mcp/context.js` (`{ agent: JSON.parse(env.VOID_AGENT_JSON), space_id: env.SPACE_ID, view: env.VOID_VIEW_JSON?… }`) and invoke the registry handler; return the result as MCP `content` (JSON-stringified) + `structuredContent`. Export thin `listMcpTools()` / `callMcpTool(name,args,ctx)` for the unit test. When run as `main`, start the stdio transport. Add `@modelcontextprotocol/sdk` to deps.
- [ ] **Step 4:** Run, confirm pass.
- [ ] **Step 5:** Commit.
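The env-to-`ctx` step in Step 3 could look roughly like the following. This is a sketch under stated assumptions, not the final `lib/mcp/context.js`: `buildCtxFromEnv` is a hypothetical name, and treating a missing `VOID_VIEW_JSON` as `null` is a guess at the elided part of the plan. It is written as a plain function; the real module would export it.

```javascript
// Sketch of lib/mcp/context.js: builds the tool ctx from environment variables.
// Fails fast when required context is missing, so a misconfigured spawn is
// caught at the first tool call rather than producing a half-built ctx.
function buildCtxFromEnv(env) {
  if (!env.VOID_AGENT_JSON) throw new Error('VOID_AGENT_JSON is not set');
  if (!env.SPACE_ID) throw new Error('SPACE_ID is not set');
  return {
    agent: JSON.parse(env.VOID_AGENT_JSON),
    space_id: env.SPACE_ID,
    // Assumption: the view is optional and defaults to null when absent.
    view: env.VOID_VIEW_JSON ? JSON.parse(env.VOID_VIEW_JSON) : null,
  };
}
```

Taking `env` as a parameter (instead of reading `process.env` directly) keeps the function testable without mutating the test process environment.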

## Task B2: Claude subprocess driver

**Files:** Create `lib/ai/claude_cli.js`; Test `tests/ai/claude_cli.test.js`

- [ ] Implement `runClaudeTurn({ sessionId, systemPrompt, userText, mcpConfigPath, allowedTools, onEvent, claudeExe=process.env.CLAUDE_EXE||'claude', cwd })` that:
  - spawns the CLI with `--print --output-format stream-json --include-partial-messages --session-id <sessionId> --append-system-prompt <systemPrompt> --mcp-config <mcpConfigPath> --strict-mcp-config --allowedTools <list> <userText>` (verify exact flag names against `claude --help` on CT 311; V1 used `--append-system-prompt-file`, so confirm whether the file variant or the inline variant is correct in 2.1.159);
  - **deletes `ANTHROPIC_API_KEY` and `ANTHROPIC_AUTH_TOKEN` from the child env** (forces subscription auth);
  - parses stdout stream-json lines and calls `onEvent` with normalized events `{type:'delta',text}` | `{type:'tool',tool,status}` | `{type:'tool_result',name,result}` | `{type:'result',usage,cost}` | `{type:'error',message}`;
  - returns `{ text, toolTrace, usage }` on close, and handles non-zero exit and timeout.
- [ ] **TEST WITHOUT THE REAL CLI:** make `claudeExe` injectable and point the test at a fake script (`tests/fixtures/fake-claude.js`, `#!/usr/bin/env node`) that emits canned stream-json lines (a text block, a `tool_use` for `propose_change`, a `tool_result`, a `result`). Assert that `onEvent` receives the mapped events and that the return shape is right. No subscription, no network.
- [ ] Commit.
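The "parses stdout stream-json lines" step is independent of the exact event schema, so it can be sketched ahead of the flag and schema verification. This is a minimal sketch; `createJsonLineParser` is a hypothetical helper, and it assumes the caller has set `child.stdout.setEncoding('utf8')` so chunks arrive as strings.

```javascript
// Hypothetical helper: buffers stdout chunks and emits one parsed JSON object
// per complete line. A chunk may end mid-line, so the tail is held back until
// the rest of the line arrives.
function createJsonLineParser(onObject, onBadLine = () => {}) {
  let buffer = '';
  function emit(line) {
    const trimmed = line.trim();
    if (!trimmed) return;
    let obj;
    try {
      obj = JSON.parse(trimmed);
    } catch {
      onBadLine(trimmed); // surface non-JSON output instead of crashing the turn
      return;
    }
    onObject(obj);
  }
  return {
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // incomplete trailing line (possibly empty)
      lines.forEach(emit);
    },
    flush() {
      // Call on 'close' to handle a final line without a trailing newline.
      const rest = buffer;
      buffer = '';
      emit(rest);
    },
  };
}

// Usage: a JSON line split across two chunks is still parsed once.
const seen = [];
const parser = createJsonLineParser((o) => seen.push(o));
parser.push('{"type":"a"}\n{"ty');
parser.push('pe":"b"}\n');
parser.flush();
console.log(seen.length); // 2
```

The fake-claude fixture can exercise the split-chunk path by writing a line in two pieces.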

## Task B3: Rework companion route onto the CLI driver

**Files:** Modify `lib/api/routes/companion.js`; Modify `tests/api/companion.test.js`

- [ ] The `POST …/turn` handler:
  - persists the user message (unchanged);
  - resolves `{agent, convo}` (unchanged);
  - writes a per-turn system prompt (the existing `SYSTEM` text + a note that `propose_change` drafts go to the owner's inbox) and a per-turn `--mcp-config` temp JSON declaring the `void` stdio server with env `{SPACE_ID, VOID_AGENT_JSON, VOID_VIEW_JSON, DATABASE_URL}` (+ whatever the pool needs);
  - calls `runClaudeTurn({ sessionId: convo.id, ... , claudeExe: req.app.locals.claudeExe||'claude', onEvent: e => send(...) })`, mapping driver events → the existing SSE event names the UI expects (`delta`/`tool`/`draft`/`done`/`error`), and detects `propose_change` results → emits `draft` events + collects ids;
  - persists ONE assistant message with `{tool_trace, draft_ids, usage}` (unchanged shape);
  - cleans up temp files.
- [ ] **Integration test:** inject `req.app.locals.claudeExe` = the fake-claude fixture path (same approach as the old `app.locals.callModel`). Assert that SSE emits tool/draft/delta/done and that the user + assistant rows are persisted. If the fake can drive the real MCP `propose_change`, also assert that a `pending_changes` row is created; if the fake can't run the MCP server, have it emit a `tool_result` for `propose_change` and have the route detect the draft from that. Either way, keep the assertion that `assistant.metadata.draft_ids` has length 1. Keep it network-free.
- [ ] Remove the now-unused imports of `runTurn`/`makeCallModel` from the route. Commit.
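The draft-detection step in the handler above can be sketched as a pure function over the driver's normalized `tool_result` event. This is a sketch only: `extractDraftId` is a hypothetical helper, and both result shapes it accepts (a structured object, or MCP-style text content holding JSON) are assumptions to confirm against real CLI output.

```javascript
// Hypothetical helper: pulls pending_change_id out of a normalized
// tool_result event for propose_change. Returns null when the event is not a
// propose_change result or carries no id.
function extractDraftId(event) {
  if (event.type !== 'tool_result') return null;
  // Assumption: the CLI may report the prefixed name (mcp__void__propose_change).
  if (!String(event.name).endsWith('propose_change')) return null;
  const r = event.result;
  // Shape 1 (assumed): structured result object.
  if (r && typeof r === 'object' && !Array.isArray(r) && r.pending_change_id) {
    return r.pending_change_id;
  }
  // Shape 2 (assumed): a JSON string, or an array of { type:'text', text } items.
  const texts = Array.isArray(r) ? r.map((c) => c?.text) : [r];
  for (const t of texts) {
    if (typeof t !== 'string') continue;
    try {
      const parsed = JSON.parse(t);
      if (parsed && parsed.pending_change_id) return parsed.pending_change_id;
    } catch {
      // Not JSON; keep looking.
    }
  }
  return null;
}
```

In the route, a non-null return would trigger one `draft` SSE event and append the id to the `draft_ids` collected for the assistant message.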

## Task B4: UI event-name reconciliation (only if needed)

**Files:** possibly Modify `public/components/rightrail.js`

- [ ] Confirm the route still emits exactly `delta`/`tool`/`draft`/`error`/`done` with the same field names the rail reads. If B3 introduced any new event names (e.g. `tool_use_start` vs `tool`), reconcile in the rail (render a chip per tool event; accumulate deltas). Likely a no-op. Commit only if changed.
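One way to make the "emits exactly these names" check mechanical is to guard the route's send path. The sketch below is an assumption about how the route writes frames, not its actual code: `formatSse` and `send` are hypothetical helpers, and `res` is anything with a `write(string)` method.

```javascript
// Formats one Server-Sent Events frame: an `event:` line, a single `data:`
// line with a JSON payload, and a blank line that terminates the frame.
// JSON.stringify never emits raw newlines, so one data line is sufficient.
function formatSse(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// The vocabulary the rail consumes, per this plan.
const ALLOWED_EVENTS = new Set(['delta', 'tool', 'draft', 'error', 'done']);

// Hypothetical guard: refuses to emit names the rail does not know about,
// so an accidental `tool_use_start` fails loudly in tests.
function send(res, event, data) {
  if (!ALLOWED_EVENTS.has(event)) {
    throw new Error(`unexpected SSE event name: ${event}`);
  }
  res.write(formatSse(event, data));
}
```

With a guard like this, the integration test from B3 would catch a leaked driver-level event name before the rail ever sees it.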

## Task B5: CT 311 enablement + redeploy + smoke

- [ ] Ensure the `claude` CLI is present on CT 311 (done: v2.1.159) and the user has run `claude login` (subscription). Verify `claude -p "hi"` works with the API-key env unset.
- [ ] Ensure the new `@modelcontextprotocol/sdk` dep is installed on CT 311 (`push.sh` runs `npm install`).
- [ ] Confirm `CLAUDE_EXE` resolves on CT 311 for the `void` systemd user (PATH); set `CLAUDE_EXE=/path/to/claude` in `/opt/void-server/.env` if the service PATH doesn't include the global npm bin.
- [ ] Snapshot CT 310+311; `TARGET=root@192.168.1.216 ./deploy/push.sh`; verify `/health`.
- [ ] **Live smoke:** open the rail in a Space → ask a question (expect streamed answer; tool chips if it searches) → "create a task to X" → inline draft card → approve → task exists + clears from Inbox.
- [ ] Update CHANGELOG + `docs/plan-5-complete.md` (note the CLI-subprocess backend) + memory. Bump to alpha-6 if the deployed alpha-5 behavior materially changed.
## Open risks / verify-during-build

- Exact `claude` 2.1.159 flag spellings (`--append-system-prompt` vs `--append-system-prompt-file`; `--allowedTools` value format: space-separated list vs repeated flag). Verify against `claude --help` on CT 311 in B2. Also check whether `--output-format stream-json` under `--print` additionally requires `--verbose`.
- stream-json schema in 2.1.159 (event `type`s): sample real output once logged in to confirm the mapping (V1's `processEvent` is the reference and should be close).
- The `void` systemd service user must have a logged-in `claude` credential. `claude login` stores creds in the invoking user's home (`~/.claude`/keychain). The service runs as user `void`, so the login must be done AS the `void` user (e.g. `su void -c "claude"`, then `/login`), not as root. Flag this in B5.
- The MCP stdio child inherits env from `claude` (which inherits from void-server's spawn), so `DATABASE_URL` and the space context must be set on the `claude` spawn env for them to propagate.
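The env handling behind the last risk, together with the API-key stripping from B2, can be sketched as one small function. `buildChildEnv` is a hypothetical helper; which extra variables the turn actually needs is left to the route.

```javascript
// Hypothetical helper: builds the env for the `claude` child process.
// Copies the parent env, layers on per-turn values that must reach the MCP
// child, then removes the API-key variables so the CLI uses the logged-in
// subscription. Deleting last means `extra` cannot reintroduce a key.
function buildChildEnv(parentEnv, extra = {}) {
  const env = { ...parentEnv, ...extra };
  delete env.ANTHROPIC_API_KEY;
  delete env.ANTHROPIC_AUTH_TOKEN;
  return env;
}

// Intended use (not executed here):
//   spawn(claudeExe, args, { env: buildChildEnv(process.env, { SPACE_ID: spaceId }) });
```

The parent env object is never mutated, so the server process keeps its own variables intact across turns.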