Void-Homelab/docs/superpowers/plans/2026-06-01-void-v2-plan5b-claude-cli-pivot.md

Plan 5b — Companion model backend: Claude CLI subprocess (Max subscription)

Amendment to Plan 5. Replaces the Anthropic-API-key model backend with the claude CLI subprocess approach (subscription auth), mirroring Void 1.0's lib/agent.js. REQUIRED SUB-SKILL for execution: subagent-driven-development.

Why: a Claude Max subscription does not come with an API key, and the Agent SDK can't use subscription auth headlessly (ToS). Void 1.0 already drives Claude by spawning the locally authenticated claude CLI (it strips ANTHROPIC_API_KEY from the child env so the CLI falls back to the Max subscription). We replicate that. The CLI owns the agentic loop; our four tools are exposed to it via a local MCP server.

Unchanged from Plan 5: the tool defs + companionRegistry (T5–T9), migration 007 + findOrCreateForSpace (T3–T4), the per-Space conversation model, persistence, the right-rail UI (T13–T14), and the SSE event vocabulary the UI consumes (delta / tool / draft / error / done).

Removed/replaced: lib/ai/anthropic.js (API-key adapter) and lib/ai/agent/runtime.js (runTurn) are no longer the execution path. Keep the files for now but the companion route stops importing them. @anthropic-ai/sdk dependency stays (harmless) or is removed in cleanup.

Architecture

Browser rail ──POST /turn (SSE)──▶ void-server companion route
                                     │ writes per-turn: system-prompt file + mcp-config (env carries space/agent/view)
                                     ▼
                                spawn `claude -p --output-format stream-json --include-partial-messages
                                       --session-id <conversation.id> --append-system-prompt <persona>
                                       --mcp-config <cfg> --strict-mcp-config
                                       --allowedTools mcp__void__search,mcp__void__read,mcp__void__context,mcp__void__propose_change
                                       <userText>`
                                       (child env: ANTHROPIC_API_KEY/ANTHROPIC_AUTH_TOKEN deleted → Max subscription auth)
                                     │ stdout = stream-json lines
                                     ▼
                                map events → SSE (text_delta→delta, tool_use_start→tool, tool_result→(detect draft), result→done)
                                     │ claude calls tools ──stdio MCP──▶ lib/mcp/companion-stdio.js
                                     ▼                                    (reuses companionRegistry handlers + pool;
                                persist assistant message + draft_ids       reads SPACE_ID/AGENT_ID/VOID_VIEW from env)

MCP transport decision: stdio. The route writes a per-turn --mcp-config JSON declaring one server void = {command:"node", args:["/opt/void-server/lib/mcp/companion-stdio.js"], env:{SPACE_ID, AGENT_ID, VOID_VIEW_JSON, DATABASE_URL,…}}. claude spawns it; it serves the 4 tools over stdio and runs the same handlers against the DB. No new HTTP attack surface; context flows via env. --strict-mcp-config ensures only our server is used; built-in tools are excluded by not allow-listing them.
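
As an illustration of the per-turn config described above, here is a minimal sketch of the object the route could serialize to a temp file. The helper name `buildMcpConfig`, the top-level `mcpServers` key, and the example values are assumptions to be checked against the installed CLI; the server name, command, script path, and env variable names come from this plan.

```javascript
// Sketch only: build the per-turn MCP config object for `--mcp-config`.
// Assumption: the CLI reads servers from a top-level `mcpServers` map.
function buildMcpConfig({ spaceId, agentId, view, databaseUrl }) {
  return {
    mcpServers: {
      void: {
        command: 'node',
        args: ['/opt/void-server/lib/mcp/companion-stdio.js'],
        // Env values must be strings, so the view object is JSON-encoded.
        env: {
          SPACE_ID: String(spaceId),
          AGENT_ID: String(agentId),
          VOID_VIEW_JSON: JSON.stringify(view ?? null),
          DATABASE_URL: databaseUrl,
        },
      },
    },
  };
}

// Example values are placeholders, not real ids or credentials.
const exampleConfig = buildMcpConfig({
  spaceId: 's1',
  agentId: 'a1',
  view: { kind: 'board' },
  databaseUrl: 'postgres://localhost/void',
});
console.log(JSON.stringify(exampleConfig, null, 2));
```

The route would write this JSON to a temp file and pass its path via `--mcp-config`, deleting the file once the turn finishes.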

Draft detection: propose_change's MCP result already contains pending_change_id. The route detects a tool_result for propose_change (or reads the structured result) and emits a draft SSE event + collects the id for the assistant message metadata. (Alternatively the tool returns the id in structuredContent; the route maps it.)
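
The detection step could look roughly like the sketch below. `extractDraftId` is a hypothetical helper, and the three result shapes it accepts (a plain object, an object carrying `structuredContent`, or MCP text content holding JSON) are assumptions until real CLI output has been sampled.

```javascript
// Hypothetical helper: pull pending_change_id out of a normalized tool_result event.
function extractDraftId(event) {
  if (!event || event.type !== 'tool_result') return null;
  // The CLI may prefix MCP tool names (mcp__void__propose_change), so match on the suffix.
  if (!String(event.name || '').endsWith('propose_change')) return null;
  const result = event.result;
  if (result == null) return null;

  // Case 1: an already-structured result, or one carrying structuredContent.
  if (typeof result === 'object' && !Array.isArray(result)) {
    const structured = result.structuredContent || result;
    if (structured.pending_change_id) return structured.pending_change_id;
  }

  // Case 2: a raw JSON string, or MCP text content blocks holding JSON.
  let texts = [];
  if (typeof result === 'string') texts = [result];
  else if (Array.isArray(result)) texts = result.map((b) => b && b.text);
  else if (Array.isArray(result.content)) texts = result.content.map((b) => b && b.text);

  for (const text of texts) {
    if (typeof text !== 'string') continue;
    try {
      const parsed = JSON.parse(text);
      if (parsed && parsed.pending_change_id) return parsed.pending_change_id;
    } catch {
      // Not JSON; keep looking.
    }
  }
  return null;
}
```

Returning `null` for anything unrecognized keeps the route from emitting a draft event on a tool result it cannot interpret.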


Task B1: MCP server exposing the four tools (stdio)

Files: Create lib/mcp/companion-stdio.js; Create lib/mcp/context.js (builds tool ctx from env); Test tests/mcp/companion_tools.test.js

  • Step 1: Failing test — import the registry-backed dispatch and assert each of the 4 tools is exposed with the right name + that calling propose_change through the MCP dispatch writes a pending_changes row (reuse the existing handler). Use a small exported dispatch(toolName, args, ctx) so the test doesn't need a live stdio transport.
// tests/mcp/companion_tools.test.js (sketch — implementer fills real assertions)
import { describe, it, expect, beforeAll } from 'vitest';
import { pool } from '../../lib/db/pool.js';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import { listMcpTools, callMcpTool } from '../../lib/mcp/companion-stdio.js';

let spaceId, agentId;
beforeAll(async () => {
  await resetDb();
  await migrateUp();
  // Parenthesized destructuring assigns into the outer `let` bindings.
  ({ rows: [{ id: spaceId }] } = await pool.query(`INSERT INTO spaces(slug,name) VALUES('s','S') RETURNING id`));
  ({ rows: [{ id: agentId }] } = await pool.query(`SELECT id FROM agents WHERE slug='companion'`));
});

it('exposes the four tools', () => {
  expect(listMcpTools().map(t=>t.name).sort()).toEqual(['context','propose_change','read','search']);
});
it('propose_change writes a pending_changes row via MCP dispatch', async () => {
  const ctx = { agent:{kind:'agent',id:agentId,capabilities:{read:true,suggest:true,write:false},scopes:{}}, space_id:spaceId };
  const out = await callMcpTool('propose_change', {entity_type:'task',action:'create',payload:{space_id:spaceId,title:'X'}}, ctx);
  expect(out.pending_change_id).toBeTruthy();
});
  • Step 2: Run, confirm fail.
  • Step 3: Implement using the MCP SDK (@modelcontextprotocol/sdk) stdio server. Register each tool from companionRegistry.listTools() (name, description, JSON-Schema input_schema). On a tool call, build ctx from env via lib/mcp/context.js ({ agent: JSON.parse(env.VOID_AGENT_JSON), space_id: env.SPACE_ID, view: env.VOID_VIEW_JSON?… }) and invoke the registry handler; return the result as MCP content (JSON-stringified) + structuredContent. Export thin listMcpTools() / callMcpTool(name,args,ctx) for the unit test. When run as main, start the stdio transport. Add @modelcontextprotocol/sdk to deps.
  • Step 4: Run, confirm pass.
  • Step 5: Commit.
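
A minimal sketch of the two pure pieces of Step 3: building ctx from env and wrapping a handler result for MCP. The function names `buildCtxFromEnv` and `toMcpResult` are hypothetical, and the `null` fallback for a missing view is an assumption, since the plan leaves that expression elided. The SDK server wiring is deliberately left out.

```javascript
// Hypothetical sketch of lib/mcp/context.js: build the tool ctx from env vars.
function buildCtxFromEnv(env) {
  if (!env.SPACE_ID) throw new Error('SPACE_ID is required');
  if (!env.VOID_AGENT_JSON) throw new Error('VOID_AGENT_JSON is required');
  return {
    agent: JSON.parse(env.VOID_AGENT_JSON),
    space_id: env.SPACE_ID,
    // Assumption: a missing view becomes null.
    view: env.VOID_VIEW_JSON ? JSON.parse(env.VOID_VIEW_JSON) : null,
  };
}

// Wrap a registry handler result as an MCP tool result:
// JSON-stringified text content plus the same value as structuredContent.
function toMcpResult(result) {
  return {
    content: [{ type: 'text', text: JSON.stringify(result) }],
    structuredContent: result,
  };
}

// Example with placeholder values.
const ctx = buildCtxFromEnv({
  SPACE_ID: 's1',
  VOID_AGENT_JSON: JSON.stringify({ kind: 'agent', id: 'a1' }),
});
console.log(ctx.view, toMcpResult({ pending_change_id: 'pc_1' }).content[0].text);
```

Failing fast on a missing SPACE_ID or VOID_AGENT_JSON makes a misconfigured spawn visible in the MCP server's stderr instead of surfacing later as a handler error.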

Task B2: Claude subprocess driver

Files: Create lib/ai/claude_cli.js; Test tests/ai/claude_cli.test.js

  • Implement runClaudeTurn({ sessionId, systemPrompt, userText, mcpConfigPath, allowedTools, onEvent, claudeExe=process.env.CLAUDE_EXE||'claude', cwd }) that:
    • spawns the CLI with --print --output-format stream-json --include-partial-messages --session-id <sessionId> --append-system-prompt <systemPrompt> --mcp-config <mcpConfigPath> --strict-mcp-config --allowedTools <list> <userText> (verify exact flag names against claude --help on CT 311; V1 used --append-system-prompt-file — confirm whether the file variant or inline is correct in 2.1.159);
    • deletes ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN from the child env (forces subscription auth);
    • parses stdout stream-json lines and calls onEvent with normalized events {type:'delta',text} | {type:'tool',tool,status} | {type:'tool_result',name,result} | {type:'result',usage,cost} | {type:'error',message};
    • returns { text, toolTrace, usage } on close; handles non-zero exit + timeout.
  • TEST WITHOUT THE REAL CLI: make claudeExe injectable and point the test at a fake script (tests/fixtures/fake-claude.js, #!/usr/bin/env node) that emits canned stream-json lines (a text block, a tool_use for propose_change, a tool_result, a result). Assert onEvent receives mapped events and the return shape is right. No subscription, no network.
  • Commit.
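
Two pieces of the driver are pure and can be sketched without the real CLI: building argv and splitting stdout into JSON lines. The helper names are hypothetical. The flag spellings are copied from this plan and still need checking against `claude --help`, the comma-joined tool list is an assumption, and some CLI versions may also require `--verbose` when stream-json is combined with print mode.

```javascript
// Sketch: argv for the CLI call (flag names unverified, taken from the plan).
function buildClaudeArgs({ sessionId, systemPrompt, userText, mcpConfigPath, allowedTools }) {
  return [
    '--print',
    '--output-format', 'stream-json',
    '--include-partial-messages',
    '--session-id', sessionId,
    '--append-system-prompt', systemPrompt,
    '--mcp-config', mcpConfigPath,
    '--strict-mcp-config',
    '--allowedTools', allowedTools.join(','), // assumption: comma-separated list
    userText,
  ];
}

// Line-buffered JSON parser. A stdout chunk can end in the middle of a line,
// so the trailing partial line is held back until more data (or flush) arrives.
// Assumes string chunks, e.g. after child.stdout.setEncoding('utf8').
function createLineParser(onJson, onBadLine = () => {}) {
  let buffer = '';
  const handle = (line) => {
    const trimmed = line.trim();
    if (!trimmed) return;
    let parsed;
    try {
      parsed = JSON.parse(trimmed);
    } catch (err) {
      onBadLine(trimmed, err);
      return;
    }
    onJson(parsed);
  };
  return {
    push(chunk) {
      buffer += String(chunk);
      const lines = buffer.split('\n');
      buffer = lines.pop(); // incomplete last line, or '' if chunk ended on '\n'
      lines.forEach(handle);
    },
    flush() {
      handle(buffer);
      buffer = '';
    },
  };
}
```

In the driver, `push` would be wired to the child's stdout `data` event and `flush` to `close`. Passing argv as an array to `spawn` (no shell) means the system prompt and user text need no quoting.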

Task B3: Rework companion route onto the CLI driver

Files: Modify lib/api/routes/companion.js; Modify tests/api/companion.test.js

  • The POST …/turn handler:
    • persists the user message (unchanged);
    • resolves {agent, convo} (unchanged);
    • writes a per-turn system prompt (the existing SYSTEM text + a note that propose_change drafts go to the owner's inbox) and a per-turn --mcp-config temp JSON declaring the void stdio server with env {SPACE_ID, VOID_AGENT_JSON, VOID_VIEW_JSON, DATABASE_URL} (+ whatever the pool needs);
    • calls runClaudeTurn({ sessionId: convo.id, ... , claudeExe: req.app.locals.claudeExe||'claude', onEvent: e => send(...) }), mapping driver events → existing SSE event names the UI expects (delta/tool/draft/done/error); detect propose_change results → draft events + collect ids;
    • persists ONE assistant message with {tool_trace, draft_ids, usage} (unchanged shape);
    • cleans up temp files.
  • Integration test: inject req.app.locals.claudeExe = the fake-claude fixture path (same approach as the old app.locals.callModel). Assert that SSE emits tool/draft/delta/done and that the user + assistant rows are persisted. For the draft, prefer having the fake trigger the real MCP propose_change so that a pending_changes row is created; if the fake can't run the MCP server, have it emit a tool_result for propose_change and let the route create/detect the draft from that. Either way, keep the assertion that assistant.metadata.draft_ids has length 1. Keep it network-free.
  • Remove the now-unused imports of runTurn/makeCallModel from the route. Commit.
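
The event mapping in the handler might be sketched as follows. `sseFrame` and `createTurnMapper` are hypothetical names, and the payload field names inside each frame are assumptions; the existing rail code is the authority on what it actually reads. The sketch assumes the driver has already parsed the propose_change result into an object.

```javascript
// Format one Server-Sent Events frame: `event:` line, `data:` line, blank line.
function sseFrame(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Map normalized driver events onto the rail's SSE vocabulary
// (delta / tool / draft / done / error) and collect draft ids along the way.
function createTurnMapper(send) {
  const draftIds = [];
  return {
    draftIds,
    onEvent(e) {
      switch (e.type) {
        case 'delta':
          send(sseFrame('delta', { text: e.text }));
          break;
        case 'tool':
          send(sseFrame('tool', { tool: e.tool, status: e.status }));
          break;
        case 'tool_result': {
          // Only propose_change results carrying an id become draft events.
          const id = e.result && e.result.pending_change_id;
          if (id && String(e.name).endsWith('propose_change')) {
            draftIds.push(id);
            send(sseFrame('draft', { pending_change_id: id }));
          }
          break;
        }
        case 'result':
          send(sseFrame('done', { usage: e.usage, draft_ids: draftIds }));
          break;
        case 'error':
          send(sseFrame('error', { message: e.message }));
          break;
        default:
          break; // unknown driver events are dropped, not forwarded
      }
    },
  };
}
```

In the route, `send` would be `(frame) => res.write(frame)`, and `mapper.draftIds` would feed the single assistant message persisted at the end of the turn.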

Task B4: UI event-name reconciliation (only if needed)

Files: possibly Modify public/components/rightrail.js

  • Confirm the route still emits exactly delta/tool/draft/error/done with the same field names the rail reads. If B3 introduced any new event names (e.g. tool_use_start vs tool), reconcile in the rail (render a chip per tool event; accumulate deltas). Likely a no-op. Commit only if changed.
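
For reference when checking the rail, the behaviour "a chip per tool event; accumulate deltas" can be expressed as a small reducer. This is a sketch rather than the code in public/components/rightrail.js: the state shape, the function name, and the payload field names are all assumptions.

```javascript
// Hypothetical rail-side reducer: fold SSE events into render state.
const initialRailState = { text: '', chips: [], drafts: [], error: null, done: false };

function applyRailEvent(state, name, data) {
  switch (name) {
    case 'delta': // accumulate streamed text
      return { ...state, text: state.text + (data.text || '') };
    case 'tool': // one chip per tool event
      return { ...state, chips: [...state.chips, { tool: data.tool, status: data.status }] };
    case 'draft': // inline draft card, keyed by pending change id
      return { ...state, drafts: [...state.drafts, data.pending_change_id] };
    case 'error':
      return { ...state, error: data.message, done: true };
    case 'done':
      return { ...state, done: true };
    default: // unknown event names leave the state untouched
      return state;
  }
}

// Example: replay a short turn.
const events = [
  ['delta', { text: 'Hel' }],
  ['tool', { tool: 'search', status: 'start' }],
  ['delta', { text: 'lo' }],
  ['done', {}],
];
const finalState = events.reduce((s, [name, data]) => applyRailEvent(s, name, data), initialRailState);
console.log(finalState.text, finalState.chips.length, finalState.done);
```

Because unknown names fall through to the default branch, a stray tool_use_start from the route would be ignored rather than break rendering, which makes a mismatch easy to spot in testing.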

Task B5: CT 311 enablement + redeploy + smoke

  • Ensure claude CLI present on CT 311 (done — v2.1.159) and the user has run claude login (subscription). Verify claude -p "hi" works with API-key env unset.
  • Confirm the new @modelcontextprotocol/sdk dep is installed on CT 311 (push.sh runs npm install).
  • Confirm CLAUDE_EXE resolves on CT 311 for the void systemd user (PATH); set CLAUDE_EXE=/path/to/claude in /opt/void-server/.env if the service PATH doesn't include the global npm bin.
  • Snapshot CT 310+311; TARGET=root@192.168.1.216 ./deploy/push.sh; verify /health.
  • Live smoke: open the rail in a Space → ask a question (expect streamed answer; tool chips if it searches) → "create a task to X" → inline draft card → approve → task exists + clears from Inbox.
  • Update CHANGELOG + docs/plan-5-complete.md (note the CLI-subprocess backend) + memory. Bump to alpha-6 if the deployed alpha-5 behavior materially changed.

Open risks / verify-during-build

  • Exact claude 2.1.159 flag spellings (--append-system-prompt vs --append-system-prompt-file; --allowedTools value format — space-separated list vs repeated). Verify against claude --help on CT 311 in B2.
  • stream-json schema in 2.1.159 (event types) — sample real output once logged-in to confirm the mapping (V1's processEvent is the reference and should be close).
  • The void systemd service user must have a logged-in claude credential. claude login stores creds in the invoking user's home (~/.claude/keychain). The service runs as user void, so the login must be done AS the void user (e.g. su void -c "claude", then /login inside the session), not as root. Flag this in B5.
  • MCP stdio child inherits env from claude (which inherits from void-server's spawn) → DATABASE_URL/space context must be set on the claude spawn env so it propagates.
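
The env handling for the claude spawn can be sketched as a small pure function. `buildChildEnv` is a hypothetical name; the two deleted keys and the propagated variable names come from this plan, and the example values are placeholders.

```javascript
// Sketch: env for the claude child process.
// Copies the parent env (never mutates process.env), layers per-turn context
// on top, then removes the API-key variables so the CLI uses subscription auth.
function buildChildEnv(parentEnv, extra = {}) {
  const env = { ...parentEnv, ...extra };
  delete env.ANTHROPIC_API_KEY;
  delete env.ANTHROPIC_AUTH_TOKEN;
  return env;
}

// Example with placeholder values.
const parentEnv = { PATH: '/usr/bin', ANTHROPIC_API_KEY: 'sk-placeholder' };
const childEnv = buildChildEnv(parentEnv, {
  DATABASE_URL: 'postgres://localhost/void',
  SPACE_ID: 's1',
});
console.log(Object.keys(childEnv).sort().join(','));
```

Deleting the keys after the merge means a caller cannot reintroduce them through `extra` by accident. The result would be passed as the `env` option of `spawn`.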