diff --git a/docs/superpowers/specs/2026-06-04-little-blue-design.md b/docs/superpowers/specs/2026-06-04-little-blue-design.md
new file mode 100644
index 0000000..b80a1f6
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-04-little-blue-design.md
@@ -0,0 +1,80 @@
+# Little Blue — Design
+
+**Date:** 2026-06-04 · **Component:** Void 2.0 · **Phase:** Plan 7 (Agent Layer), brick 2 · **Status:** Approved (design)
+
+## Goal
+
+Give Little Blue, the homelab caretaker agent, the ability to **fix things** — restart lab services and power-manage Proxmox guests — through a least-privilege, tiered-approval, fully-audited action framework. The LLM is treated as the least-trusted component by design: it can only name an action id from a fixed, version-controlled whitelist; it never constructs commands.
+
+## Background
+
+- Void app runs on CT 311 (`.216`) and today has **no execution access** to any host (SSH to Z = Permission denied; no Proxmox token).
+- Shared agent-chat foundation exists (`run_turn.js`, `agent_chat.js`, `personas/`) — Little Blue reuses it like Yerin did.
+- `config/services.json` provides the service→host map (action targets). `pending_changes` provides the approval lifecycle to mirror.
+- Little Blue today = the read-only health band UI + `littleblue_avatar.js`. No `little-blue` agent seeded yet.
+
+## Decisions (locked)
+
+1. **Tiered actions:** `safe` → execute directly + audit; `risky` → queue for owner approval, then execute.
+2. **Two execution channels:** scoped **Proxmox API token** for guest power; **SSH forced-command wrapper** for in-guest service restarts. Both enforce the whitelist **server-side**.
+3. **One brick:** action framework + manual UI + conversational Little Blue together.
+4. **Whitelist = `config/actions.json`** (version-controlled, immutable at runtime) — the single source of truth.
+5. **Dedicated `agent_actions` queue** (not `pending_changes`) — isolate command execution from entity-CRUD apply.
+
+## Architecture
+
+### 1. Action whitelist — `config/actions.json`
+Array of action defs. Nothing outside this file is runnable.
+```json
+[ { "id": "restart-caddy-ct100", "label": "Restart Caddy on mediastack", "kind": "service_restart", "host": "ct100", "service": "caddy", "tier": "safe" },
+  { "id": "stop-ct107", "label": "Stop iVentoy (CT 107)", "kind": "guest_power", "node": "z", "vmid": 107, "op": "stop", "tier": "risky" } ]
+```
+Tier convention: `service_restart` → `safe`; guest `start` → `safe`; guest `stop`/`shutdown`/`reboot` → `risky`. The `tier` is explicit per action (config wins).
+
+`lib/actions/registry.js`: `loadActions()` (read + validate config), `getAction(id)`, `listActions()`. Validation rejects unknown `kind`/`op`/`tier` at load.
+
+### 2. Execution channels — `lib/actions/channels/`
+- `proxmox.js`: `power(node, vmid, op)` → Proxmox REST (`POST /nodes/{node}/lxc|qemu/{vmid}/status/{op}`) with `Authorization: PVEAPIToken=...`. Env: `PROXMOX_API_URL`, `PROXMOX_API_TOKEN`. Token scoped server-side to `VM.PowerMgmt` on whitelisted guests only.
+- `ssh.js`: `restart(host, actionId)` → spawns `ssh -i -o BatchMode=yes voidact@ `. The host's `authorized_keys` pins `command="/usr/local/bin/void-act"` (forced). The wrapper holds its OWN whitelist copy, reads `SSH_ORIGINAL_COMMAND` (the action id), maps it to `systemctl restart `, and refuses anything else. The Void can send only an id; the host decides what runs.
+- Both adapters take an injectable transport (fetch / spawn) for tests; they NEVER interpolate a raw shell command from caller input.
+
+### 3. Action service — `lib/actions/service.js`
+- `runAction(id, actor)`: `getAction(id)` (404 if absent). `safe` → execute via channel now, audit, return `{ executed:true, result }`. `risky` → `agentActions.create({...status:'pending'})`, return `{ queued:true, action_row_id }`. No execution for risky.
+- `approveAction(rowId, owner)`: claim pending row → execute via channel → `status:'executed'`+result, or `status:'failed'`+error; audit.
+- `rejectAction(rowId, owner)`: `status:'rejected'`; audit.
+- `execute(action)`: dispatch on `kind` → `proxmox.power` / `ssh.restart`. Single choke point; audits every call.
+
+### 4. `agent_actions` table — new migration `016_agent_actions.sql`
+`id uuid pk, action_id text, params jsonb, agent_id uuid null, tier text, status text default 'pending' (pending|executed|failed|rejected), result jsonb, requested_by jsonb, resolved_by jsonb, created_at, resolved_at`. Repo `lib/db/repos/agent_actions.js`: `create`, `listPending`, `getById`, `resolve(id,status,result,by)` (claims `WHERE status='pending'`), `recent(limit)`.
+
+### 5. API — `lib/api/routes/actions.js` (owner-gated, mounted at `/api/actions`)
+- `GET /` → whitelist (+ optional live guest/service status later).
+- `POST /:id/run` → `runAction` (safe executes, risky queues).
+- `GET /pending` → queued risky actions.
+- `POST /pending/:rowId/approve` | `/pending/:rowId/reject`.
+- `GET /recent` → recent executed/failed (audit view).
+
+### 6. Little Blue — agent + tools + chat
+- Migration seeds `little-blue` agent (`kind:'claude'`, capabilities `{ read:true, act:true }`).
+- MCP `blueRegistry` (`lib/ai/agent/tools/blue/`): `service_status` (read health/registry), `list_actions` (the whitelist she may use), `propose_action(action_id)` → calls `service.runAction` (safe runs + reports; risky queues + tells the owner it needs approval). She only passes ids; never commands.
+- `companion-stdio.js` already selects registry by `VOID_TOOL_REGISTRY`; add `blue` → `blueRegistry`.
+- Persona (`personas/index.js` key `little-blue`): the small blue water-creature caretaker who keeps the lab alive — warm, protective, practical; calls `list_actions`/`service_status` before proposing; explains what she'll do and that risky fixes wait for the owner's nod.
+- Chat surface: `#/little-blue` view (reuses `agent_chat`, `toolLabels` for blue tools) + a **manual Actions panel**: whitelisted actions with Run buttons (risky → creates an approval card), and a pending-actions queue with Approve/Reject. Sidebar nav entry.
+
+## Safety model
+Least-privilege server-side on both channels (SSH forced-command wrapper + scoped PVE token) → even a fully compromised Void only triggers whitelisted actions. Tiered approval gates destructive ops. Full audit of request→approve→execute→result. The whitelist lives in version control, not a runtime-editable store. No code path constructs a shell command from agent/user free text.
+
+## Provisioning (deploy-time, owner-authorized)
+Create the scoped Proxmox API token (`VM.PowerMgmt` on chosen guests); generate the Void's SSH keypair; deploy `/usr/local/bin/void-act` wrapper + restricted `authorized_keys` (`command=`, `no-port-forwarding,no-pty`) on each target host; populate `config/actions.json` with the real action set; set `PROXMOX_API_URL`/`PROXMOX_API_TOKEN` + SSH key path in the app `.env`. Done interactively at the deploy task.
+
+## Testing (vitest + supertest, serial)
+- `registry`: loads/validates `actions.json`; rejects bad defs.
+- `channels/proxmox`: mocked fetch → asserts correct URL/op/token header; never called for non-whitelisted actions.
+- `channels/ssh`: mocked spawn → asserts argv is `[..., 'voidact@', '']` with **no** shell string; rejects ids with shell metachars.
+- `service`: safe → executes via stub channel; risky → queues (no channel call); approve → executes; reject → no execution; all audited.
+- `agent_actions` repo: create/listPending/resolve claim-once.
+- routes: run (safe/risky), pending list, approve/reject (owner-gated).
+- Little Blue route: `propose_action` safe runs + reports; risky queues; uses fake-claude fixture. Frontend: manual.
+
+## Out of scope (YAGNI / later)
+Live guest/service status polling in the actions list; runtime-editable whitelist; arbitrary-command tool; multi-step remediation playbooks; Little Blue acting on a schedule/cron (that's the scheduled-agent track). SSH wrapper supports only `service_restart`; Proxmox token only `VM.PowerMgmt`.
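
A minimal sketch of the load-time validation and the id-only SSH invocation described under Architecture sections 1 and 2. This is not the shipped `registry.js` or `ssh.js`. The helper names `validateAction` and `buildSshArgv`, the `ID_PATTERN` slug rule, and the `keyPath`/`hostAddress` parameters are assumptions for illustration; the kinds, ops, tiers, and the `voidact` user come from the spec.

```javascript
// Sketch only: helper names and the id pattern are assumptions, not the spec's code.
const KINDS = new Set(['service_restart', 'guest_power']);
const TIERS = new Set(['safe', 'risky']);
const POWER_OPS = new Set(['start', 'stop', 'shutdown', 'reboot']);
const ID_PATTERN = /^[a-z0-9][a-z0-9-]*$/; // assumed rule: lowercase, digits, hyphens

// Reject a definition with an unknown kind/op/tier or an id that is not a plain slug.
function validateAction(def) {
  if (typeof def.id !== 'string' || !ID_PATTERN.test(def.id)) {
    throw new Error(`invalid action id: ${JSON.stringify(def.id)}`);
  }
  if (!KINDS.has(def.kind)) throw new Error(`unknown kind: ${def.kind}`);
  if (!TIERS.has(def.tier)) throw new Error(`unknown tier: ${def.tier}`);
  if (def.kind === 'guest_power' && !POWER_OPS.has(def.op)) {
    throw new Error(`unknown op: ${def.op}`);
  }
  return def;
}

// Build an argv array for spawn; no shell string is assembled at any point.
// keyPath and hostAddress are hypothetical values supplied by configuration.
function buildSshArgv(keyPath, hostAddress, actionId) {
  if (!ID_PATTERN.test(actionId)) {
    throw new Error('action id contains disallowed characters');
  }
  return ['-i', keyPath, '-o', 'BatchMode=yes', `voidact@${hostAddress}`, actionId];
}
```

Passing this array to `child_process.spawn('ssh', argv)` without the `shell` option keeps the id as a single argument on the Void side; the forced-command wrapper on the host would still decide what actually runs.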