# Plan 5b — Companion model backend: Claude CLI subprocess (Max subscription)

> Amendment to Plan 5. Replaces the Anthropic-API-key model backend with the
> `claude` CLI subprocess approach (subscription auth), mirroring Void 1.0's
> `lib/agent.js`. REQUIRED SUB-SKILL for execution: subagent-driven-development.

**Why:** Claude Max has no API key, and the Agent SDK can't use subscription auth headlessly (ToS). Void 1.0 already drives Claude by spawning the locally authenticated `claude` CLI (it strips `ANTHROPIC_API_KEY` from the child env so the CLI uses the Max subscription). We replicate that. The CLI owns the agentic loop; our four tools are exposed to it via a local MCP server.

**Unchanged from Plan 5:** the tool defs + `companionRegistry` (T5–T9), migration 007 + `findOrCreateForSpace` (T3–T4), the per-Space conversation model, persistence, the right-rail UI (T13–T14), and the SSE event vocabulary the UI consumes (`delta` / `tool` / `draft` / `error` / `done`).

**Removed/replaced:** `lib/ai/anthropic.js` (API-key adapter) and `lib/ai/agent/runtime.js` (`runTurn`) are no longer the execution path. Keep the files for now; the companion route simply stops importing them. The `@anthropic-ai/sdk` dependency can stay (harmless) or be removed in cleanup.

## Architecture
```
Browser rail ──POST /turn (SSE)──▶ void-server companion route
  │ writes per-turn: system-prompt file + mcp-config (env carries space/agent/view)
  spawn `claude -p --output-format stream-json --include-partial-messages
           --session-id <conversation.id> --append-system-prompt <persona>
           --mcp-config <cfg> --strict-mcp-config
           --allowedTools mcp__void__search,mcp__void__read,mcp__void__context,mcp__void__propose_change
           <userText>`
    (child env: ANTHROPIC_API_KEY/ANTHROPIC_AUTH_TOKEN deleted → Max subscription auth)
  │ stdout = stream-json lines
  map events → SSE (text_delta→delta, tool_use_start→tool, tool_result→(detect draft), result→done)
  │   claude calls tools ──stdio MCP──▶ lib/mcp/companion-stdio.js
  ▼                                      (reuses companionRegistry handlers + pool;
  persist assistant message + draft_ids   reads SPACE_ID/AGENT_ID/VOID_VIEW from env)
```

**MCP transport decision:** stdio. The route writes a per-turn `--mcp-config` JSON declaring one server, `void` = `{command:"node", args:["/opt/void-server/lib/mcp/companion-stdio.js"], env:{SPACE_ID, AGENT_ID, VOID_VIEW_JSON, DATABASE_URL,…}}`. The `claude` CLI spawns that server, which serves the 4 tools over stdio and runs the same handlers against the DB. This adds no new HTTP attack surface, and context flows via env. `--strict-mcp-config` ensures only our server is used; built-in tools are excluded by not allow-listing them.

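
A minimal sketch of the config object the route might serialize for each turn. It assumes the CLI accepts a top-level `mcpServers` map (the shape used by Claude Code MCP config files; verify on CT 311). `buildMcpConfig` and its parameter names are hypothetical, and only the env keys named above are filled in.

```javascript
// Hypothetical helper: builds the per-turn MCP config for the `void` stdio server.
// ASSUMPTION: the CLI reads a top-level `mcpServers` map; confirm against the installed version.
function buildMcpConfig({ spaceId, agentId, viewJson, databaseUrl }) {
  return {
    mcpServers: {
      void: {
        command: 'node',
        args: ['/opt/void-server/lib/mcp/companion-stdio.js'],
        // Env values must be strings, so the view travels as serialized JSON.
        // Only the keys the plan names are shown; anything else the pool needs is omitted.
        env: {
          SPACE_ID: String(spaceId),
          AGENT_ID: String(agentId),
          VOID_VIEW_JSON: viewJson ?? '{}',
          DATABASE_URL: databaseUrl,
        },
      },
    },
  };
}

// Example: the object the route would JSON.stringify into a temp file.
const exampleConfig = buildMcpConfig({
  spaceId: 'space-1',
  agentId: 'agent-1',
  viewJson: JSON.stringify({ kind: 'board' }),
  databaseUrl: 'postgres://localhost/void_test',
});
console.log(JSON.stringify(exampleConfig, null, 2));
```

Writing the file per turn keeps Space and agent context out of any shared state, at the cost of a temp file that has to be cleaned up afterwards.
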

**Draft detection:** `propose_change`'s MCP result already contains `pending_change_id`. The route detects a `tool_result` for `propose_change` (or reads the structured result), emits a `draft` SSE event, and collects the id for the assistant message metadata. (Alternatively the tool returns the id in `structuredContent`; the route maps it.)

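
One way the detection could look, as a sketch only. `extractDraftId` is a hypothetical helper, and the three result shapes it accepts (plain handler object, MCP-style `structuredContent`, JSON text part) are assumptions until real CLI output has been sampled.

```javascript
// Hypothetical helper: pulls pending_change_id out of a normalized tool_result event.
// ASSUMPTION: the event looks like { type: 'tool_result', name, result }.
function extractDraftId(event) {
  if (!event || event.type !== 'tool_result') return null;
  // The CLI may report the tool under its MCP-qualified name.
  const isPropose =
    event.name === 'propose_change' || event.name === 'mcp__void__propose_change';
  if (!isPropose) return null;

  // Gather every place the id could plausibly live.
  const candidates = [];
  const result = event.result;
  if (typeof result === 'string') {
    candidates.push(result);
  } else if (result && typeof result === 'object') {
    candidates.push(result, result.structuredContent);
    for (const part of Array.isArray(result.content) ? result.content : []) {
      if (part && part.type === 'text') candidates.push(part.text);
    }
  }

  for (const candidate of candidates) {
    let value = candidate;
    if (typeof value === 'string') {
      try { value = JSON.parse(value); } catch { continue; } // not JSON: skip it
    }
    if (value && typeof value === 'object' && value.pending_change_id) {
      return value.pending_change_id;
    }
  }
  return null;
}

console.log(extractDraftId({
  type: 'tool_result',
  name: 'mcp__void__propose_change',
  result: { structuredContent: { pending_change_id: 'pc_1' } },
}));
```

Returning `null` instead of throwing lets the route treat a missing id as "no draft this turn" and keep streaming.
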
---
## Task B1: MCP server exposing the four tools (stdio)
**Files:** Create `lib/mcp/companion-stdio.js`; Create `lib/mcp/context.js` (builds tool `ctx` from env); Test `tests/mcp/companion_tools.test.js`
- [ ] **Step 1: Failing test.** Import the registry-backed dispatch and assert that each of the 4 tools is exposed under the right name, and that calling `propose_change` through the MCP dispatch writes a `pending_changes` row (reuse the existing handler). Export a small dispatch function (`callMcpTool(name, args, ctx)` in the sketch) so the test doesn't need a live stdio transport.
```javascript
// tests/mcp/companion_tools.test.js (sketch; implementer fills in real assertions)
import { describe, it, expect, beforeAll } from 'vitest';
import { pool } from '../../lib/db/pool.js';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import { listMcpTools, callMcpTool } from '../../lib/mcp/companion-stdio.js';

let spaceId, agentId;

beforeAll(async () => {
  await resetDb();
  await migrateUp();
  // Seed one Space and look up the companion agent row.
  ({ rows: [{ id: spaceId }] } = await pool.query(
    `INSERT INTO spaces(slug,name) VALUES('s','S') RETURNING id`));
  ({ rows: [{ id: agentId }] } = await pool.query(
    `SELECT id FROM agents WHERE slug='companion'`));
});

describe('companion MCP tools', () => {
  it('exposes the four tools', () => {
    const names = listMcpTools().map((t) => t.name).sort();
    expect(names).toEqual(['context', 'propose_change', 'read', 'search']);
  });

  it('propose_change writes a pending_changes row via MCP dispatch', async () => {
    const ctx = {
      agent: { kind: 'agent', id: agentId, capabilities: { read: true, suggest: true, write: false }, scopes: {} },
      space_id: spaceId,
    };
    const out = await callMcpTool('propose_change',
      { entity_type: 'task', action: 'create', payload: { space_id: spaceId, title: 'X' } }, ctx);
    // TODO: also assert that the pending_changes row itself exists.
    expect(out.pending_change_id).toBeTruthy();
  });
});
```
- [ ] **Step 2:** Run, confirm fail.
- [ ] **Step 3:** Implement using the MCP SDK (`@modelcontextprotocol/sdk`) stdio server. Register each tool from `companionRegistry.listTools()` (name, description, JSON-Schema `input_schema`). On a tool call, build `ctx` from env via `lib/mcp/context.js` (`{ agent: JSON.parse(env.VOID_AGENT_JSON), space_id: env.SPACE_ID, view: env.VOID_VIEW_JSON?… }`) and invoke the registry handler; return the result as MCP `content` (JSON-stringified) + `structuredContent`. Export thin `listMcpTools()` / `callMcpTool(name,args,ctx)` for the unit test. When run as `main`, start the stdio transport. Add `@modelcontextprotocol/sdk` to deps.
- [ ] **Step 4:** Run, confirm pass.
- [ ] **Step 5:** Commit.
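
The two small pieces Step 3 describes, env-to-`ctx` and result wrapping, might look roughly like this. It is a sketch under assumptions: `ctxFromEnv` and `toMcpResult` are hypothetical names, and treating an unset `VOID_VIEW_JSON` as `null` is a guess, since the plan leaves that expression elided.

```javascript
// Hypothetical sketch of lib/mcp/context.js: builds the tool ctx from env vars.
function ctxFromEnv(env) {
  if (!env.VOID_AGENT_JSON || !env.SPACE_ID) {
    // Fail loudly: a tool call without agent or Space context must not reach the DB.
    throw new Error('VOID_AGENT_JSON and SPACE_ID must be set for the MCP server');
  }
  return {
    agent: JSON.parse(env.VOID_AGENT_JSON),
    space_id: env.SPACE_ID,
    // ASSUMPTION: an absent view means "no view context".
    view: env.VOID_VIEW_JSON ? JSON.parse(env.VOID_VIEW_JSON) : null,
  };
}

// Wraps a handler's plain-object result in the MCP tool-result shape:
// a JSON-stringified text part plus the same object as structuredContent.
function toMcpResult(result) {
  return {
    content: [{ type: 'text', text: JSON.stringify(result) }],
    structuredContent: result,
  };
}

// Example with a fake env, standing in for process.env inside the stdio server.
const exampleCtx = ctxFromEnv({
  VOID_AGENT_JSON: JSON.stringify({ kind: 'agent', id: 'agent-1' }),
  SPACE_ID: 'space-1',
});
console.log(exampleCtx, toMcpResult({ pending_change_id: 'pc_1' }));
```

Keeping `callMcpTool` free of the wrapping (it returns the handler result directly, as the test above expects) means only the stdio entry point needs `toMcpResult`.
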
## Task B2: Claude subprocess driver
**Files:** Create `lib/ai/claude_cli.js`; Test `tests/ai/claude_cli.test.js`
- [ ] Implement `runClaudeTurn({ sessionId, systemPrompt, userText, mcpConfigPath, allowedTools, onEvent, claudeExe=process.env.CLAUDE_EXE||'claude', cwd })` that:
  - spawns the CLI with `--print --output-format stream-json --include-partial-messages --session-id <sessionId> --append-system-prompt <systemPrompt> --mcp-config <mcpConfigPath> --strict-mcp-config --allowedTools <list> <userText>` (verify exact flag names against `claude --help` on CT 311; V1 used `--append-system-prompt-file`, so confirm whether the file variant or the inline variant is correct in 2.1.159);
  - **deletes `ANTHROPIC_API_KEY` and `ANTHROPIC_AUTH_TOKEN` from the child env** (forces subscription auth);
  - parses stdout stream-json lines and calls `onEvent` with normalized events `{type:'delta',text}` | `{type:'tool',tool,status}` | `{type:'tool_result',name,result}` | `{type:'result',usage,cost}` | `{type:'error',message}`;
  - returns `{ text, toolTrace, usage }` on close; handles non-zero exit + timeout.
- [ ] **TEST WITHOUT THE REAL CLI:** make `claudeExe` injectable and point the test at a fake script (`tests/fixtures/fake-claude.js`, `#!/usr/bin/env node`) that emits canned stream-json lines (a text block, a `tool_use` for `propose_change`, a `tool_result`, a `result`). Assert `onEvent` receives mapped events and the return shape is right. No subscription, no network.
- [ ] Commit.
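
The spawn arguments and scrubbed env described above can be assembled by a pure function, which also makes them easy to unit-test. This is a sketch: `buildClaudeInvocation` is a hypothetical helper, the flag names are copied from this plan and not yet verified, and the comma-joined `--allowedTools` value is an assumption.

```javascript
// Hypothetical helper: assembles argv and env for the `claude` child process.
// Flag spellings come from this plan and are NOT yet checked against `claude --help`.
function buildClaudeInvocation(opts, parentEnv) {
  const { sessionId, systemPrompt, userText, mcpConfigPath, allowedTools } = opts;
  const args = [
    '--print',
    '--output-format', 'stream-json',
    '--include-partial-messages',
    '--session-id', sessionId,
    '--append-system-prompt', systemPrompt,
    '--mcp-config', mcpConfigPath,
    '--strict-mcp-config',
    // ASSUMPTION: one comma-separated value; the accepted format is an open risk.
    '--allowedTools', allowedTools.join(','),
    userText,
  ];
  // Copy the parent env, then drop API credentials so the CLI falls back
  // to the locally stored subscription login.
  const env = { ...parentEnv };
  delete env.ANTHROPIC_API_KEY;
  delete env.ANTHROPIC_AUTH_TOKEN;
  return { args, env };
}

const invocation = buildClaudeInvocation(
  {
    sessionId: '00000000-0000-4000-8000-000000000000',
    systemPrompt: 'You are the companion.',
    userText: 'create a task to X',
    mcpConfigPath: '/tmp/void-mcp.json',
    allowedTools: ['mcp__void__search', 'mcp__void__propose_change'],
  },
  { PATH: '/usr/bin', ANTHROPIC_API_KEY: 'should-be-removed' },
);
console.log(invocation.args.join(' '), Object.keys(invocation.env));
```

The result would be passed to `spawn(claudeExe, args, { env, cwd })` from `node:child_process`. Passing an argument array without a shell means the user's text needs no quoting. Some CLI versions appear to reject `--print` with `stream-json` unless `--verbose` is also given, so that flag may need adding once checked on CT 311.
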
## Task B3: Rework companion route onto the CLI driver
**Files:** Modify `lib/api/routes/companion.js`; Modify `tests/api/companion.test.js`
- [ ] The `POST …/turn` handler:
  - persists the user message (unchanged);
  - resolves `{agent, convo}` (unchanged);
  - writes a per-turn system prompt (the existing `SYSTEM` text + a note that `propose_change` drafts go to the owner's inbox) and a per-turn `--mcp-config` temp JSON declaring the `void` stdio server with env `{SPACE_ID, VOID_AGENT_JSON, VOID_VIEW_JSON, DATABASE_URL}` (+ whatever the pool needs);
  - calls `runClaudeTurn({ sessionId: convo.id, ... , claudeExe: req.app.locals.claudeExe||'claude', onEvent: e => send(...) })`, mapping driver events → existing SSE event names the UI expects (`delta`/`tool`/`draft`/`done`/`error`); detects `propose_change` results → `draft` events + collects ids;
  - persists ONE assistant message with `{tool_trace, draft_ids, usage}` (unchanged shape);
  - cleans up temp files.
- [ ] **Integration test:** inject `req.app.locals.claudeExe` = the fake-claude fixture path (same approach as the old `app.locals.callModel`). Assert that SSE emits tool/draft/delta/done, that the user and assistant rows are persisted, and that a `pending_changes` row is created. Open question: can the fake trigger the real MCP `propose_change`? If the fake can't run the MCP server, have it emit a `tool_result` for `propose_change` instead and have the route create/detect the draft from that. Either way, keep the assertion that `assistant.metadata.draft_ids` has length 1. Keep it network-free.
- [ ] Remove the now-unused imports of `runTurn`/`makeCallModel` from the route. Commit.
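
The driver-event to SSE mapping in the handler could be sketched as below. `sseFrame` and `driverEventToSse` are hypothetical names, and the payload field names (`text`, `tool`, `status`, `pending_change_id`, `usage`, `message`) are assumptions; the real ones are whatever the rail already reads.

```javascript
// Formats one Server-Sent Events frame: an `event:` line, a `data:` line, a blank line.
function sseFrame(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical mapping from normalized driver events to the SSE names the rail expects.
// Returns zero or more frames; `draftIds` is mutated to collect ids for the assistant message.
function driverEventToSse(e, draftIds) {
  switch (e.type) {
    case 'delta':
      return [sseFrame('delta', { text: e.text })];
    case 'tool':
      return [sseFrame('tool', { tool: e.tool, status: e.status })];
    case 'tool_result': {
      // ASSUMPTION: the driver passes propose_change's result through as an object.
      const isPropose = /(^|__)propose_change$/.test(e.name || '');
      const id = isPropose && e.result ? e.result.pending_change_id : undefined;
      if (!id) return []; // other tool results are not forwarded to the UI
      draftIds.push(id);
      return [sseFrame('draft', { pending_change_id: id })];
    }
    case 'result':
      return [sseFrame('done', { usage: e.usage })];
    case 'error':
      return [sseFrame('error', { message: e.message })];
    default:
      return []; // unknown driver events are dropped, never leaked under a new name
  }
}

const collectedDraftIds = [];
const exampleFrames = driverEventToSse(
  { type: 'tool_result', name: 'mcp__void__propose_change', result: { pending_change_id: 'pc_9' } },
  collectedDraftIds,
);
console.log(exampleFrames.join(''), collectedDraftIds);
```

In the route, each returned frame would be written to the response with `res.write(frame)`, and `collectedDraftIds` would end up in the assistant message's `draft_ids`.
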
## Task B4: UI event-name reconciliation (only if needed)
**Files:** possibly Modify `public/components/rightrail.js`
- [ ] Confirm the route still emits exactly `delta`/`tool`/`draft`/`error`/`done` with the same field names the rail reads. If B3 introduced any new event names (e.g. `tool_use_start` vs `tool`), reconcile in the rail (render a chip per tool event; accumulate deltas). Likely a no-op. Commit only if changed.
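
For the reconciliation check, the rail's handling can be pictured as a small reducer: accumulate deltas, add one chip per tool event, ignore anything unrecognized. This is an illustrative sketch only; `applyRailEvent`, the state shape, and the payload field names are hypothetical and may not match what `rightrail.js` does today.

```javascript
// Hypothetical reducer: folds one SSE event into the rail's render state.
function applyRailEvent(state, event, data) {
  switch (event) {
    case 'delta':
      return { ...state, text: state.text + data.text };
    case 'tool':
      return { ...state, chips: [...state.chips, { tool: data.tool, status: data.status }] };
    case 'draft':
      return { ...state, drafts: [...state.drafts, data.pending_change_id] };
    case 'error':
      return { ...state, error: data.message, done: true };
    case 'done':
      return { ...state, done: true };
    default:
      return state; // unknown event names are ignored rather than rendered
  }
}

const initialRailState = { text: '', chips: [], drafts: [], error: null, done: false };

// Example: a short turn with one tool chip and two text deltas.
const exampleTurn = [
  ['tool', { tool: 'search', status: 'start' }],
  ['delta', { text: 'Found ' }],
  ['delta', { text: '3 notes.' }],
  ['done', {}],
];
const finalState = exampleTurn.reduce(
  (state, [event, data]) => applyRailEvent(state, event, data),
  initialRailState,
);
console.log(finalState);
```
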
## Task B5: CT 311 enablement + redeploy + smoke
- [ ] Ensure the `claude` CLI is present on CT 311 (done: v2.1.159) and that the user has run `claude login` (subscription). Verify that `claude -p "hi"` works with the API-key env unset.
- [ ] Confirm the new `@modelcontextprotocol/sdk` dep is installed on CT 311 (`push.sh` runs `npm install`).
- [ ] Confirm `CLAUDE_EXE` resolves on CT 311 for the `void` systemd user (PATH); set `CLAUDE_EXE=/path/to/claude` in `/opt/void-server/.env` if the service PATH doesn't include the global npm bin.
- [ ] Snapshot CT 310+311; `TARGET=root@192.168.1.216 ./deploy/push.sh`; verify `/health`.
- [ ] **Live smoke:** open the rail in a Space → ask a question (expect streamed answer; tool chips if it searches) → "create a task to X" → inline draft card → approve → task exists + clears from Inbox.
- [ ] Update CHANGELOG + `docs/plan-5-complete.md` (note the CLI-subprocess backend) + memory. Bump to alpha-6 if the deployed alpha-5 behavior materially changed.
## Open risks / verify-during-build
- Exact `claude` 2.1.159 flag spellings (`--append-system-prompt` vs `--append-system-prompt-file`; `--allowedTools` value format — space-separated list vs repeated). Verify against `claude --help` on CT 311 in B2.
- stream-json schema in 2.1.159 (event `type`s) — sample real output once logged-in to confirm the mapping (V1's `processEvent` is the reference and should be close).
- The `void` systemd service user must have a logged-in `claude` credential. `claude login` stores creds in the invoking user's home (`~/.claude`/keychain). The service runs as user `void`, so the login must be done AS the `void` user (e.g. `su void -c "claude"`, then `/login`), not as root. Flag this in B5.
- MCP stdio child inherits env from claude (which inherits from void-server's spawn) → DATABASE_URL/space context must be set on the claude spawn env so it propagates.
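
Until real stream-json output has been sampled, the driver's parser can be kept tolerant so that a schema surprise shows up in logs instead of failing the turn. The sketch below is one such reader. `createStreamJsonReader` and `normalize` are hypothetical names, and the two raw event shapes it recognizes are assumptions to be replaced by whatever 2.1.159 emits.

```javascript
// ASSUMED raw event shapes; replace once real CLI output has been sampled.
function normalize(raw) {
  if (!raw || typeof raw !== 'object') return null;
  const delta = raw.type === 'stream_event' && raw.event ? raw.event.delta : null;
  if (delta && delta.type === 'text_delta') {
    return { type: 'delta', text: delta.text };
  }
  if (raw.type === 'result') {
    return { type: 'result', usage: raw.usage, cost: raw.total_cost_usd };
  }
  return null; // not recognized yet
}

// Splits arbitrary stdout chunks into complete lines and JSON-parses each one.
// Unparseable or unrecognized lines go to onUnknown instead of throwing.
function createStreamJsonReader(onEvent, onUnknown = () => {}) {
  let buffer = '';
  const handleLine = (line) => {
    const trimmed = line.trim();
    if (!trimmed) return;
    let raw;
    try { raw = JSON.parse(trimmed); } catch { onUnknown(trimmed); return; }
    const event = normalize(raw);
    if (event) onEvent(event); else onUnknown(raw);
  };
  return {
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep the trailing partial line for the next chunk
      lines.forEach(handleLine);
    },
    end() { // flush whatever is left when the child closes stdout
      handleLine(buffer);
      buffer = '';
    },
  };
}

// Example: a JSON line split across two chunks, as a pipe may deliver it.
const seenEvents = [];
const demoReader = createStreamJsonReader((e) => seenEvents.push(e));
demoReader.push('{"type":"result","usage":{"input_');
demoReader.push('tokens":1},"total_cost_usd":0}\n');
demoReader.end();
console.log(seenEvents);
```

When wiring this to the child process, calling `child.stdout.setEncoding('utf8')` first keeps multi-byte characters from being split across chunks.
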