docs: Little Blue implementation plan
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
751
docs/superpowers/plans/2026-06-04-little-blue.md
Normal file
@@ -0,0 +1,751 @@
# Little Blue Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Give Little Blue a least-privilege, tiered-approval, audited action framework to restart lab services (SSH forced-command) and power-manage Proxmox guests (scoped API token), plus a conversational + manual UI.

**Architecture:** A version-controlled whitelist (`config/actions.json`) drives two server-side-enforced channels. An action service gates by tier (safe→run, risky→queue→approve) and audits everything. Infra creds live ONLY in the main server; Little Blue's MCP child proposes actions via the local API with a scoped little-blue token. Frontend reuses `agent_chat` + an actions panel.

**Tech Stack:** Node 22 ESM, Express 5, Postgres, vanilla-JS SPA, vitest + supertest (serial).

**Spec:** `docs/superpowers/specs/2026-06-04-little-blue-design.md`
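
The tier gate named under Architecture (safe→run, risky→queue→approve, everything audited) can be pictured with a small in-memory model. This is only a sketch: `makeGate`, the `audit` array, and the synchronous `execute` callback are hypothetical stand-ins for the real service, the Postgres-backed queue, and the async channels built in the tasks below.

```javascript
// Illustrative in-memory model of the tier gate (NOT the real service).
// `actions` stands in for config/actions.json; `execute` stands in for a channel call.
function makeGate(actions, execute) {
  const byId = new Map(actions.map((a) => [a.id, a]));
  const queue = new Map(); // rowId -> action id awaiting owner approval
  const audit = [];        // append-only trail of every decision
  let nextRow = 1;

  function run(actionId, actor) {
    const action = byId.get(actionId);
    if (!action) throw new Error(`unknown action: ${actionId}`);
    if (action.tier === 'risky') {
      // Risky: record the request and stop. Nothing is executed yet.
      const rowId = String(nextRow++);
      queue.set(rowId, action.id);
      audit.push({ event: 'queued', actionId, actor });
      return { queued: true, rowId };
    }
    // Safe: execute immediately.
    const result = execute(action);
    audit.push({ event: 'executed', actionId, actor });
    return { executed: true, result };
  }

  function approve(rowId, actor) {
    const actionId = queue.get(rowId);
    if (!actionId) throw new Error('not a pending action');
    queue.delete(rowId); // a queued row can be resolved only once
    const result = execute(byId.get(actionId));
    audit.push({ event: 'approved', actionId, actor });
    return { executed: true, result };
  }

  return { run, approve, audit };
}

const calls = [];
const gate = makeGate(
  [
    { id: 'restart-caddy-ct100', tier: 'safe' },
    { id: 'stop-ct107', tier: 'risky' },
  ],
  (action) => { calls.push(action.id); return { ok: true }; },
);

const safe = gate.run('restart-caddy-ct100', 'little-blue'); // runs at once
const risky = gate.run('stop-ct107', 'little-blue');         // waits in the queue
const callsBeforeApproval = calls.length;                    // only the safe action ran
const approved = gate.approve(risky.rowId, 'owner');         // now the risky one runs
```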

---

### Task 1: `agent_actions` table + repo

**Files:** Create `lib/db/migrations/016_agent_actions.sql`, `lib/db/repos/agent_actions.js`; Test `tests/db/agent_actions.test.js`

- [ ] **Step 1: Migration**
```sql
-- 016_agent_actions.sql: queue + audit trail for Little Blue's infra actions.
CREATE TABLE agent_actions (
  id           uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  action_id    text NOT NULL, -- whitelist id from config/actions.json
  params       jsonb NOT NULL DEFAULT '{}'::jsonb,
  agent_id     uuid REFERENCES agents(id),
  tier         text NOT NULL CHECK (tier IN ('safe','risky')),
  status       text NOT NULL DEFAULT 'pending'
               CHECK (status IN ('pending','executed','failed','rejected')),
  result       jsonb,
  requested_by jsonb,
  resolved_by  jsonb,
  created_at   timestamptz NOT NULL DEFAULT now(),
  resolved_at  timestamptz
);
CREATE INDEX idx_agent_actions_pending ON agent_actions(status) WHERE status='pending';
```

- [ ] **Step 2: Failing test** `tests/db/agent_actions.test.js`
```js
import { describe, it, expect, beforeAll } from 'vitest';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import * as aa from '../../lib/db/repos/agent_actions.js';

const owner = { kind: 'user', id: null };
beforeAll(async () => { await resetDb(); await migrateUp(); });

describe('agent_actions repo', () => {
  it('creates pending, lists it, resolves once', async () => {
    const row = await aa.create({ action_id: 'stop-ct107', tier: 'risky', params: {}, requested_by: owner });
    expect(row.status).toBe('pending');
    expect((await aa.listPending()).some(r => r.id === row.id)).toBe(true);
    const done = await aa.resolve(row.id, 'executed', { ok: true }, owner);
    expect(done.status).toBe('executed');
    const again = await aa.resolve(row.id, 'rejected', null, owner); // already resolved
    expect(again).toBeUndefined();
  });
});
```

- [ ] **Step 3: Run → FAIL.**
- [ ] **Step 4: Implement** `lib/db/repos/agent_actions.js`
```js
import { pool } from '../pool.js';
import { recordAudit } from './audit.js';

export async function create({ action_id, tier, params, agent_id, requested_by }) {
  const { rows: [r] } = await pool.query(
    `INSERT INTO agent_actions(action_id, tier, params, agent_id, requested_by)
     VALUES($1,$2,$3,$4,$5) RETURNING *`,
    [action_id, tier, params || {}, agent_id || null, requested_by || null]
  );
  await recordAudit(requested_by, 'create', 'agent_action', r.id, null, r);
  return r;
}
export async function listPending({ limit = 100 } = {}) {
  const { rows } = await pool.query(
    `SELECT * FROM agent_actions WHERE status='pending' ORDER BY created_at LIMIT $1`, [limit]);
  return rows;
}
export async function getById(id) {
  const { rows: [r] } = await pool.query(`SELECT * FROM agent_actions WHERE id=$1`, [id]);
  return r;
}
// Resolves a row only while it is still pending; returns undefined if it was already resolved.
export async function resolve(id, status, result, resolved_by) {
  const { rows: [r] } = await pool.query(
    `UPDATE agent_actions SET status=$1, result=$2, resolved_by=$3, resolved_at=now()
     WHERE id=$4 AND status='pending' RETURNING *`,
    [status, result || null, resolved_by || null, id]);
  if (r) await recordAudit(resolved_by, 'update', 'agent_action', id, null, r);
  return r;
}
export async function recent({ limit = 50 } = {}) {
  const { rows } = await pool.query(
    `SELECT * FROM agent_actions WHERE status<>'pending' ORDER BY resolved_at DESC NULLS LAST LIMIT $1`, [limit]);
  return rows;
}
```
- [ ] **Step 5: Run → PASS. Commit** `feat(actions): agent_actions table + repo`
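
The "resolves once" behaviour the test checks comes from the `AND status='pending'` guard in the `UPDATE`: the second call matches no row, so `RETURNING *` yields nothing and `resolve` returns `undefined`. A minimal in-memory analogue, assuming a hypothetical `makeMemoryQueue` helper that is not part of the plan, shows the same compare-and-set shape without a database:

```javascript
// In-memory stand-in for the agent_actions table (hypothetical, illustration only).
function makeMemoryQueue() {
  const rows = new Map();
  let next = 1;
  return {
    create(action_id) {
      const row = { id: String(next++), action_id, status: 'pending', result: null };
      rows.set(row.id, row);
      return { ...row };
    },
    // Mirrors `UPDATE ... WHERE id=$4 AND status='pending' RETURNING *`:
    // the update applies only while the row is still pending.
    resolve(id, status, result = null) {
      const row = rows.get(id);
      if (!row || row.status !== 'pending') return undefined;
      row.status = status;
      row.result = result;
      return { ...row };
    },
  };
}

const queue = makeMemoryQueue();
const row = queue.create('stop-ct107');
const first = queue.resolve(row.id, 'executed', { ok: true }); // wins
const second = queue.resolve(row.id, 'rejected');              // row no longer pending
```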

---

### Task 2: Action registry (whitelist loader)

**Files:** Create `config/actions.json`, `lib/actions/registry.js`; Test `tests/actions/registry.test.js` + `tests/fixtures/actions.test.json`

- [ ] **Step 1: Ship an empty real whitelist** `config/actions.json` (populated at provisioning):
```json
{ "hosts": {}, "actions": [] }
```

- [ ] **Step 2: Test fixture** `tests/fixtures/actions.test.json`:
```json
{
  "hosts": { "ct100": "192.168.1.230", "z": "192.168.1.124" },
  "actions": [
    { "id": "restart-caddy-ct100", "label": "Restart Caddy", "kind": "service_restart", "host": "ct100", "service": "caddy", "tier": "safe" },
    { "id": "stop-ct107", "label": "Stop CT107", "kind": "guest_power", "node": "z", "vmid": 107, "op": "stop", "tier": "risky" }
  ]
}
```

- [ ] **Step 3: Failing test** `tests/actions/registry.test.js`
```js
import { describe, it, expect } from 'vitest';
import { fileURLToPath } from 'url';
import { loadActions } from '../../lib/actions/registry.js';

const FIX = fileURLToPath(new URL('../fixtures/actions.test.json', import.meta.url));

describe('action registry', () => {
  it('loads + indexes valid actions and resolves host ip', () => {
    const reg = loadActions(FIX);
    expect(reg.list().map(a => a.id).sort()).toEqual(['restart-caddy-ct100', 'stop-ct107']);
    expect(reg.get('restart-caddy-ct100').tier).toBe('safe');
    expect(reg.hostIp('ct100')).toBe('192.168.1.230');
    expect(reg.get('nope')).toBeUndefined();
  });
  it('rejects an action with a bad id or unknown kind/tier', () => {
    expect(() => loadActions(null, { hosts: {}, actions: [{ id: 'bad id', kind: 'service_restart', tier: 'safe' }] }))
      .toThrow(/invalid action id/i);
    expect(() => loadActions(null, { hosts: {}, actions: [{ id: 'x', kind: 'nuke', tier: 'safe' }] }))
      .toThrow(/unknown kind/i);
    expect(() => loadActions(null, { hosts: {}, actions: [{ id: 'x', kind: 'guest_power', op: 'stop', tier: 'evil' }] }))
      .toThrow(/invalid tier/i);
  });
});
```

- [ ] **Step 4: Run → FAIL.**
- [ ] **Step 5: Implement** `lib/actions/registry.js`
```js
import { readFileSync } from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';

const DEFAULT = path.join(path.dirname(fileURLToPath(import.meta.url)), '../../config/actions.json');
const ID_RE = /^[a-z0-9-]+$/;
const KINDS = new Set(['service_restart', 'guest_power']);
const OPS = new Set(['start', 'stop', 'shutdown', 'reboot']);
const TIERS = new Set(['safe', 'risky']);

// Loads and validates the whitelist. Pass `raw` to validate an in-memory config
// instead of reading `file` (used by the tests).
export function loadActions(file = DEFAULT, raw) {
  const cfg = raw || JSON.parse(readFileSync(file, 'utf8'));
  const hosts = cfg.hosts || {};
  const byId = new Map();
  for (const a of (cfg.actions || [])) {
    if (!ID_RE.test(a.id || '')) throw new Error(`invalid action id: ${a.id}`);
    if (!KINDS.has(a.kind)) throw new Error(`unknown kind: ${a.kind}`);
    if (!TIERS.has(a.tier)) throw new Error(`invalid tier: ${a.tier}`);
    if (a.kind === 'guest_power' && !OPS.has(a.op)) throw new Error(`invalid op: ${a.op}`);
    if (byId.has(a.id)) throw new Error(`duplicate action id: ${a.id}`);
    byId.set(a.id, a);
  }
  return {
    list: () => [...byId.values()],
    get: (id) => byId.get(id),
    hostIp: (h) => hosts[h]
  };
}
```
- [ ] **Step 6: Run → PASS. Commit** `feat(actions): config-driven action whitelist registry`
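
The loader validates id, kind, tier, and op, while `hostIp` simply returns `hosts[h]`, which is `undefined` for a host key that is missing from the `hosts` map. A provisioning-time check along the following lines could surface such entries before they reach the SSH channel. `findUnknownHosts` is a hypothetical helper, not part of the planned `registry.js`, and `restart-nginx-ct200` is an invented entry used only to trigger the check.

```javascript
// Hypothetical extra check, NOT part of lib/actions/registry.js as planned:
// report service_restart actions whose `host` key is missing from `hosts`.
function findUnknownHosts(cfg) {
  const hosts = cfg.hosts || {};
  const known = (h) => typeof h === 'string' && Object.prototype.hasOwnProperty.call(hosts, h);
  return (cfg.actions || [])
    .filter((a) => a.kind === 'service_restart' && !known(a.host))
    .map((a) => a.id);
}

const cfg = {
  hosts: { ct100: '192.168.1.230' },
  actions: [
    { id: 'restart-caddy-ct100', kind: 'service_restart', host: 'ct100', service: 'caddy', tier: 'safe' },
    // Invented entry: its host key is not declared above.
    { id: 'restart-nginx-ct200', kind: 'service_restart', host: 'ct200', service: 'nginx', tier: 'safe' },
    // guest_power actions address a Proxmox node, so they are not checked here.
    { id: 'stop-ct107', kind: 'guest_power', node: 'z', vmid: 107, op: 'stop', tier: 'risky' },
  ],
};

const unknown = findUnknownHosts(cfg);
```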

---

### Task 3: Proxmox channel

**Files:** Create `lib/actions/channels/proxmox.js`; Test `tests/actions/proxmox.test.js`

- [ ] **Step 1: Failing test**
```js
import { describe, it, expect, vi } from 'vitest';
import { powerGuest } from '../../lib/actions/channels/proxmox.js';

describe('proxmox channel', () => {
  it('POSTs the scoped power op with the token header', async () => {
    const fetchMock = vi.fn(async () => ({ ok: true, status: 200, json: async () => ({ data: 'UPID:...' }) }));
    const out = await powerGuest({ node: 'z', vmid: 107, op: 'stop', kindPath: 'lxc' },
      { apiUrl: 'https://pve:8006', token: 'user@pve!void=secret', fetchImpl: fetchMock });
    expect(out.ok).toBe(true);
    const [url, opts] = fetchMock.mock.calls[0];
    expect(url).toBe('https://pve:8006/api2/json/nodes/z/lxc/107/status/stop');
    expect(opts.method).toBe('POST');
    expect(opts.headers.Authorization).toBe('PVEAPIToken=user@pve!void=secret');
  });
  it('throws on a non-ok response', async () => {
    const fetchMock = vi.fn(async () => ({ ok: false, status: 403, text: async () => 'forbidden' }));
    await expect(powerGuest({ node: 'z', vmid: 1, op: 'stop', kindPath: 'lxc' },
      { apiUrl: 'https://pve:8006', token: 't', fetchImpl: fetchMock })).rejects.toThrow(/403/);
  });
});
```

- [ ] **Step 2: Run → FAIL.**
- [ ] **Step 3: Implement** `lib/actions/channels/proxmox.js`
```js
// Proxmox guest power via a SCOPED PVEAPIToken (VM.PowerMgmt on whitelisted guests only).
// PVE enforces permissions server-side; this adapter never builds shell commands.
export async function powerGuest({ node, vmid, op, kindPath = 'lxc' }, {
  apiUrl = process.env.PROXMOX_API_URL,
  token = process.env.PROXMOX_API_TOKEN,
  fetchImpl = fetch
} = {}) {
  const url = `${apiUrl}/api2/json/nodes/${node}/${kindPath}/${vmid}/status/${op}`;
  const res = await fetchImpl(url, {
    method: 'POST',
    headers: { Authorization: `PVEAPIToken=${token}` }
  });
  if (!res.ok) throw new Error(`proxmox ${op} ${vmid} → ${res.status} ${await res.text?.() ?? ''}`);
  const body = await res.json();
  return { ok: true, upid: body?.data ?? null };
}
```
- [ ] **Step 4: Run → PASS. Commit** `feat(actions): scoped Proxmox power channel`
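
`powerGuest` interpolates `node`, `kindPath`, `vmid`, and `op` straight into the request path and relies on the registry having validated them. If a second line of defence inside the channel were wanted, the URL could be built by a small guard like the one below. This is a sketch under assumptions: `buildPowerUrl` is a hypothetical helper, and the accepted `kindPath` values (`lxc` for containers, `qemu` for VMs) and the node-name pattern are choices made for this example rather than anything the plan specifies.

```javascript
// Hypothetical guard, NOT part of the planned proxmox.js: validate every path
// segment before it is interpolated into the Proxmox API URL.
const KIND_PATHS = new Set(['lxc', 'qemu']);
const OPS = new Set(['start', 'stop', 'shutdown', 'reboot']); // same set as the registry
const NODE_RE = /^[a-z0-9-]+$/i; // assumed node-name pattern for this sketch

function buildPowerUrl(apiUrl, { node, vmid, op, kindPath = 'lxc' }) {
  if (typeof node !== 'string' || !NODE_RE.test(node)) throw new Error(`invalid node: ${node}`);
  if (!Number.isInteger(vmid) || vmid <= 0) throw new Error(`invalid vmid: ${vmid}`);
  if (!OPS.has(op)) throw new Error(`invalid op: ${op}`);
  if (!KIND_PATHS.has(kindPath)) throw new Error(`invalid kindPath: ${kindPath}`);
  const base = String(apiUrl).replace(/\/+$/, ''); // tolerate a trailing slash
  return `${base}/api2/json/nodes/${node}/${kindPath}/${vmid}/status/${op}`;
}

const url = buildPowerUrl('https://pve:8006/', { node: 'z', vmid: 107, op: 'stop' });
// url is 'https://pve:8006/api2/json/nodes/z/lxc/107/status/stop'
```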

---

### Task 4: SSH channel + forced-command wrapper

**Files:** Create `lib/actions/channels/ssh.js`, `deploy/void-act` (host wrapper); Test `tests/actions/ssh.test.js`

- [ ] **Step 1: Failing test**
```js
import { describe, it, expect, vi } from 'vitest';
import { restartService } from '../../lib/actions/channels/ssh.js';

describe('ssh channel', () => {
  it('spawns ssh with argv (no shell string) sending only the action id', async () => {
    const calls = [];
    const spawnMock = (cmd, args) => {
      calls.push({ cmd, args });
      return { stdout: { on(ev, cb) { if (ev === 'data') cb('ok\n'); } }, stderr: { on() {} },
        on(ev, cb) { if (ev === 'close') cb(0); } };
    };
    const out = await restartService({ ip: '192.168.1.230', actionId: 'restart-caddy-ct100' },
      { keyPath: '/k', user: 'voidact', spawnImpl: spawnMock });
    expect(out.ok).toBe(true);
    const { cmd, args } = calls[0];
    expect(cmd).toBe('ssh');
    expect(args).toEqual(['-i', '/k', '-o', 'BatchMode=yes', '-o', 'StrictHostKeyChecking=accept-new',
      'voidact@192.168.1.230', 'restart-caddy-ct100']);
  });
  it('rejects an action id with shell metacharacters', async () => {
    await expect(restartService({ ip: '1.2.3.4', actionId: 'x; rm -rf /' }, { spawnImpl: () => {} }))
      .rejects.toThrow(/invalid action id/i);
  });
});
```

- [ ] **Step 2: Run → FAIL.**
- [ ] **Step 3: Implement** `lib/actions/channels/ssh.js`
```js
import { spawn as nodeSpawn } from 'node:child_process';

const ID_RE = /^[a-z0-9-]+$/;

// Runs `ssh voidact@<ip> <action-id>`. The host's authorized_keys pins a forced
// wrapper (deploy/void-act) that maps the id → systemctl restart <service> from
// its OWN whitelist. We pass ONLY the id as a single argv element; no shell is involved.
export function restartService({ ip, actionId }, {
  keyPath = process.env.ACTIONS_SSH_KEY,
  user = 'voidact',
  spawnImpl = nodeSpawn
} = {}) {
  if (!ID_RE.test(actionId || '')) return Promise.reject(new Error(`invalid action id: ${actionId}`));
  const args = ['-i', keyPath, '-o', 'BatchMode=yes', '-o', 'StrictHostKeyChecking=accept-new',
    `${user}@${ip}`, actionId];
  return new Promise((resolve, reject) => {
    const child = spawnImpl('ssh', args);
    let out = '', err = '';
    child.stdout.on('data', (d) => { out += d; });
    child.stderr.on('data', (d) => { err += d; });
    child.on('close', (code) => code === 0
      ? resolve({ ok: true, output: out.trim() })
      : reject(new Error(`ssh ${actionId} → exit ${code}: ${err.trim()}`)));
    child.on('error', reject);
  });
}
```

- [ ] **Step 4: Host wrapper artifact** `deploy/void-act` (deployed to `/usr/local/bin/void-act` on each target host; pinned via `authorized_keys` `command=`):
```bash
#!/usr/bin/env bash
# Forced command for the Void's restricted key. Maps a whitelisted action id to a
# concrete systemctl restart. The id arrives via SSH_ORIGINAL_COMMAND; nothing else
# is honoured. Edit the case list per host. Keep in sync with config/actions.json.
set -euo pipefail
id="${SSH_ORIGINAL_COMMAND:-}"
case "$id" in
  restart-caddy-ct100) exec systemctl restart caddy ;;
  *) echo "void-act: refused '$id'" >&2; exit 13 ;;
esac
```

- [ ] **Step 5: Run → PASS. Commit** `feat(actions): SSH forced-command service-restart channel + host wrapper`
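
Step 4 pins the wrapper through the `command=` option in `authorized_keys`. As a provisioning aid, the line for a given public key could be generated as sketched below. `authorizedKeysLine` is a hypothetical helper, the public key is a placeholder rather than a real key, and adding OpenSSH's `restrict` option (which disables forwarding and PTY allocation) is an assumption of this example, not something the plan states.

```javascript
// Hypothetical provisioning helper, NOT part of the plan: build the
// authorized_keys entry that forces every login with this key through the wrapper.
function authorizedKeysLine(publicKey, wrapperPath = '/usr/local/bin/void-act') {
  const key = String(publicKey).trim();
  if (key === '' || /[\r\n]/.test(key)) throw new Error('public key must be one non-empty line');
  // Refuse paths that would need quoting inside command="...".
  if (/["\\\s]/.test(wrapperPath)) throw new Error(`unsafe wrapper path: ${wrapperPath}`);
  return `restrict,command="${wrapperPath}" ${key}`;
}

// Placeholder key material for illustration only.
const line = authorizedKeysLine('ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPLACEHOLDER voidact@void\n');
```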

---

### Task 5: Action service (tier gating + approve/reject + audit)

**Files:** Create `lib/actions/service.js`; Test `tests/actions/service.test.js`

- [ ] **Step 1: Failing test**
```js
import { describe, it, expect, vi, beforeAll, beforeEach } from 'vitest';
import { fileURLToPath } from 'url';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import * as aa from '../../lib/db/repos/agent_actions.js';
import { makeActionService } from '../../lib/actions/service.js';
import { loadActions } from '../../lib/actions/registry.js';

const FIX = fileURLToPath(new URL('../fixtures/actions.test.json', import.meta.url));
const owner = { kind: 'user', id: null };
let svc, channels;
beforeAll(async () => { await resetDb(); await migrateUp(); });
beforeEach(() => {
  channels = {
    powerGuest: vi.fn(async () => ({ ok: true, upid: 'U' })),
    restartService: vi.fn(async () => ({ ok: true, output: 'done' }))
  };
  svc = makeActionService({ registry: loadActions(FIX), channels });
});

describe('action service', () => {
  it('safe action executes immediately + audits', async () => {
    const out = await svc.run('restart-caddy-ct100', owner);
    expect(out.executed).toBe(true);
    expect(channels.restartService).toHaveBeenCalledOnce();
  });
  it('risky action queues, does NOT execute', async () => {
    const out = await svc.run('stop-ct107', owner);
    expect(out.queued).toBe(true);
    expect(channels.powerGuest).not.toHaveBeenCalled();
    expect((await aa.listPending()).some(r => r.id === out.action_row_id)).toBe(true);
  });
  it('approve executes the queued risky action; reject does not', async () => {
    const q = await svc.run('stop-ct107', owner);
    const done = await svc.approve(q.action_row_id, owner);
    expect(done.status).toBe('executed');
    expect(channels.powerGuest).toHaveBeenCalledOnce();
    const q2 = await svc.run('stop-ct107', owner);
    const rej = await svc.reject(q2.action_row_id, owner);
    expect(rej.status).toBe('rejected');
    expect(channels.powerGuest).toHaveBeenCalledOnce(); // unchanged
  });
  it('unknown action → error', async () => {
    await expect(svc.run('ghost', owner)).rejects.toThrow(/unknown action/i);
  });
});
```

- [ ] **Step 2: Run → FAIL.**
- [ ] **Step 3: Implement** `lib/actions/service.js`
```js
import * as aa from '../db/repos/agent_actions.js';
import { loadActions } from './registry.js';
import { powerGuest } from './channels/proxmox.js';
import { restartService } from './channels/ssh.js';

// Single choke point. Dispatches one whitelisted action to its channel. The
// registry + channels are injectable for tests; production wiring uses defaults.
export function makeActionService({ registry = loadActions(), channels = { powerGuest, restartService } } = {}) {
  async function execute(a) {
    if (a.kind === 'guest_power') return channels.powerGuest({ node: a.node, vmid: a.vmid, op: a.op, kindPath: a.kindPath || 'lxc' });
    if (a.kind === 'service_restart') return channels.restartService({ ip: registry.hostIp(a.host), actionId: a.id });
    throw new Error(`unknown kind: ${a.kind}`);
  }

  async function run(actionId, actor, agent_id = null) {
    const a = registry.get(actionId);
    if (!a) throw new Error(`unknown action: ${actionId}`);
    if (a.tier === 'risky') {
      const row = await aa.create({ action_id: a.id, tier: a.tier, params: {}, agent_id, requested_by: actor });
      return { queued: true, action_row_id: row.id };
    }
    // Safe tier: execute now, then write the audit row with the final status.
    const record = async (status, result) => {
      const row = await aa.create({ action_id: a.id, tier: a.tier, params: {}, agent_id, requested_by: actor });
      return aa.resolve(row.id, status, result, actor);
    };
    let result;
    try {
      result = await execute(a);
    } catch (e) {
      // A failed safe action is audited too, then the error is re-thrown unchanged.
      await record('failed', { error: String(e?.message || e) });
      throw e;
    }
    await record('executed', result);
    return { executed: true, result };
  }

  async function approve(rowId, owner) {
    const row = await aa.getById(rowId);
    if (!row || row.status !== 'pending') throw new Error('not a pending action');
    const a = registry.get(row.action_id);
    if (!a) throw new Error(`unknown action: ${row.action_id}`);
    try {
      const result = await execute(a);
      return aa.resolve(rowId, 'executed', result, owner);
    } catch (e) {
      return aa.resolve(rowId, 'failed', { error: String(e?.message || e) }, owner);
    }
  }
  const reject = (rowId, owner) => aa.resolve(rowId, 'rejected', null, owner);

  return { run, approve, reject, list: () => registry.list() };
}
```
- [ ] **Step 4: Run → PASS. Commit** `feat(actions): tiered action service (safe-run / risky-queue / approve)`

---

### Task 6: Actions API routes

**Files:** Create `lib/api/routes/actions.js`; Modify `lib/api/index.js`; Test `tests/api/actions.test.js`

- [ ] **Step 1: Failing test**
```js
import { describe, it, expect, beforeAll } from 'vitest';
import request from 'supertest';
import { fileURLToPath } from 'url';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import { createApp } from '../../server.js';

let app;
beforeAll(async () => {
  await resetDb(); await migrateUp();
  process.env.OWNER_TOKEN = 'test-token';
  // fileURLToPath (not URL.pathname) so paths with spaces or on Windows resolve correctly.
  process.env.ACTIONS_CONFIG = fileURLToPath(new URL('../fixtures/actions.test.json', import.meta.url));
  app = createApp();
});
const auth = (r) => r.set('Authorization', 'Bearer test-token');

describe('actions API', () => {
  it('GET / lists the whitelist (owner)', async () => {
    const res = await auth(request(app).get('/api/actions'));
    expect(res.status).toBe(200);
    expect(res.body.actions.map(a => a.id)).toContain('stop-ct107');
  });
  it('unauthenticated request is rejected', async () => {
    const res = await request(app).get('/api/actions'); // no auth
    expect(res.status).toBe(401);
  });
  it('risky run queues; appears in /pending; owner can reject', async () => {
    // The service is wired to the real channels here, so only the queue and
    // reject paths are asserted; neither of them touches a channel.
    const run = await auth(request(app).post('/api/actions/stop-ct107/run'));
    expect(run.status).toBe(200);
    expect(run.body.queued).toBe(true);
    const pend = await auth(request(app).get('/api/actions/pending'));
    expect(pend.body.pending.some(p => p.id === run.body.action_row_id)).toBe(true);
    const rej = await auth(request(app).post(`/api/actions/pending/${run.body.action_row_id}/reject`));
    expect(rej.body.status).toBe('rejected');
  });
});
```

- [ ] **Step 2: Run → FAIL.**
- [ ] **Step 3: Implement** `lib/api/routes/actions.js`
```js
import { Router } from 'express';
import { asyncWrap } from '../errors.js';
import * as aa from '../../db/repos/agent_actions.js';
import { makeActionService } from '../../actions/service.js';
import { loadActions } from '../../actions/registry.js';

// Owner OR an agent with capabilities.act (Little Blue) may run/list. Approve/reject
// are owner-only. The service enforces tier-gating regardless of caller.
function svc() {
  return makeActionService({ registry: loadActions(process.env.ACTIONS_CONFIG || undefined) });
}
function canAct(req) {
  const a = req.actor;
  return a?.kind === 'user' || (a?.kind === 'agent' && a.capabilities?.act);
}
function ownerOnly(req, res, next) {
  if (req.actor?.kind === 'user') return next();
  return res.status(403).json({ error: { code: 'owner_only', message: 'owner approval required' } });
}

export const router = Router();

router.get('/', asyncWrap(async (req, res) => {
  if (!canAct(req)) return res.status(403).json({ error: { code: 'forbidden' } });
  res.json({ actions: svc().list() });
}));

router.post('/:id/run', asyncWrap(async (req, res) => {
  if (!canAct(req)) return res.status(403).json({ error: { code: 'forbidden' } });
  const agent_id = req.actor?.kind === 'agent' ? req.actor.id : null;
  res.json(await svc().run(req.params.id, req.actor, agent_id));
}));

router.get('/pending', ownerOnly, asyncWrap(async (_req, res) => {
  res.json({ pending: await aa.listPending() });
}));

router.post('/pending/:rowId/approve', ownerOnly, asyncWrap(async (req, res) => {
  res.json(await svc().approve(req.params.rowId, req.actor));
}));
router.post('/pending/:rowId/reject', ownerOnly, asyncWrap(async (req, res) => {
  res.json(await svc().reject(req.params.rowId, req.actor));
}));

router.get('/recent', ownerOnly, asyncWrap(async (_req, res) => {
  res.json({ recent: await aa.recent() });
}));
```
- [ ] **Step 4: Mount** in `lib/api/index.js`: `import { router as actionsRouter } from './routes/actions.js';` and `api.use('/actions', actionsRouter);`.
- [ ] **Step 5: Run → PASS. Commit** `feat(actions): /api/actions routes (run/pending/approve/reject)`

---

### Task 7: Little Blue tools (HTTP to local API) + registry select

**Files:** Create `lib/ai/agent/tools/blue/index.js`, `lib/ai/agent/tools/blue/actions.js`; Modify `lib/mcp/companion-stdio.js`, `lib/ai/agent/run_turn.js`; Test `tests/ai/agent/tools/blue.test.js`

- [ ] **Step 1: Failing test**
```js
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { listActionsTool, proposeActionTool } from '../../../../lib/ai/agent/tools/blue/actions.js';

beforeEach(() => { process.env.VOID_API_URL = 'http://127.0.0.1:3000'; process.env.VOID_AGENT_TOKEN = 'blue-tok'; });

describe('blue action tools', () => {
  it('list_actions GETs the whitelist with the agent bearer', async () => {
    const fetchMock = vi.fn(async () => ({ ok: true, json: async () => ({ actions: [{ id: 'restart-caddy-ct100', tier: 'safe' }] }) }));
    const out = await listActionsTool.handler({}, {}, { fetchImpl: fetchMock });
    expect(out.actions[0].id).toBe('restart-caddy-ct100');
    const [url, opts] = fetchMock.mock.calls[0];
    expect(url).toBe('http://127.0.0.1:3000/api/actions');
    expect(opts.headers.Authorization).toBe('Bearer blue-tok');
  });
  it('propose_action POSTs run and returns the queued/executed result', async () => {
    const fetchMock = vi.fn(async () => ({ ok: true, json: async () => ({ queued: true, action_row_id: 'r1' }) }));
    const out = await proposeActionTool.handler({ action_id: 'stop-ct107' }, {}, { fetchImpl: fetchMock });
    expect(out.queued).toBe(true);
    expect(fetchMock.mock.calls[0][0]).toBe('http://127.0.0.1:3000/api/actions/stop-ct107/run');
    expect(fetchMock.mock.calls[0][1].method).toBe('POST');
  });
});
```

- [ ] **Step 2: Run → FAIL.**
- [ ] **Step 3: Implement** `lib/ai/agent/tools/blue/actions.js`
```js
// Little Blue's action tools. They run inside the MCP child, which holds NO infra
// creds, only a scoped little-blue bearer + the local API URL. The main server
// (which has the Proxmox/SSH creds) does the actual work behind /api/actions.
function api(env = process.env) { return { base: env.VOID_API_URL, token: env.VOID_AGENT_TOKEN }; }

export const listActionsTool = {
  name: 'list_actions',
  description: 'List the whitelisted fix-it actions you may take (id, label, tier).',
  input_schema: { type: 'object', properties: {} },
  async handler(_args, _ctx, { fetchImpl = fetch } = {}) {
    const { base, token } = api();
    const res = await fetchImpl(`${base}/api/actions`, { headers: { Authorization: `Bearer ${token}` } });
    if (!res.ok) return { error: `list_actions ${res.status}` };
    return res.json();
  }
};

export const proposeActionTool = {
  name: 'propose_action',
  description: 'Take a whitelisted action by id. SAFE actions run immediately; RISKY ones queue for the owner to approve. You can only name an id from list_actions — never a command.',
  input_schema: { type: 'object', properties: { action_id: { type: 'string' } }, required: ['action_id'] },
  async handler({ action_id }, _ctx, { fetchImpl = fetch } = {}) {
    const { base, token } = api();
    const res = await fetchImpl(`${base}/api/actions/${encodeURIComponent(action_id)}/run`,
      { method: 'POST', headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' } });
    if (!res.ok) return { error: `propose_action ${res.status}` };
    return res.json();
  }
};
```

- [ ] **Step 4:** `lib/ai/agent/tools/blue/index.js`
```js
import { createRegistry } from '../../registry.js';
import { searchTool } from '../search.js';
import { listActionsTool, proposeActionTool } from './actions.js';

// read (search) + her action tools. No propose_change (she fixes infra, not content).
export const blueRegistry = createRegistry();
blueRegistry.registerTool(searchTool);
blueRegistry.registerTool(listActionsTool);
blueRegistry.registerTool(proposeActionTool);
```

- [ ] **Step 5:** In `lib/mcp/companion-stdio.js`, add `blue` to the registry map: `import { blueRegistry } from '../ai/agent/tools/blue/index.js';` and `const REGISTRIES = { companion: companionRegistry, security: securityRegistry, blue: blueRegistry };`.

- [ ] **Step 6:** In `lib/ai/agent/run_turn.js`, add an `extraEnv = {}` param and spread it into the MCP child env:
```js
export async function runAgentTurn({ /* …existing… */ extraEnv = {}, /* … */ }) {
  // …in mcpConfig.mcpServers.void.env, after the existing keys:
  //   ...extraEnv
}
```
(Add `extraEnv = {}` to the destructured params and `...extraEnv` as the last spread inside the `env: { … }` object.)

- [ ] **Step 7: Run → PASS. Commit** `feat(littleblue): blue tool registry (list/propose action via local API) + run_turn extraEnv`
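
Step 6 asks for `...extraEnv` to be the last spread inside the `env` object. The position matters because, in an object literal, later properties overwrite earlier ones with the same key, so the caller's values take precedence over the existing keys. A minimal sketch, assuming a hypothetical `buildChildEnv` helper and an invented `VOID_REGISTRY` key standing in for whatever the existing env already contains:

```javascript
// Hypothetical helper, NOT the real run_turn.js: shows why extraEnv is spread last.
function buildChildEnv(existingEnv, extraEnv = {}) {
  // Later spreads overwrite earlier keys, so extraEnv takes precedence on conflicts.
  return { ...existingEnv, ...extraEnv };
}

const childEnv = buildChildEnv(
  { VOID_REGISTRY: 'blue', VOID_API_URL: 'http://stale.example:3000' }, // invented existing keys
  { VOID_API_URL: 'http://127.0.0.1:3000', VOID_AGENT_TOKEN: 'blue-tok' },
);
```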

---

### Task 8: Little Blue agent seed, persona, chat route

**Files:** Create `lib/db/migrations/017_little_blue.sql`, `lib/api/routes/littleblue.js`, `tests/fixtures/fake-claude-blue.js`; Modify `lib/ai/personas/index.js`, `lib/api/index.js`; Test `tests/api/littleblue.test.js`

- [ ] **Step 1: Seed migration** `017_little_blue.sql`
```sql
-- Seed Little Blue, the homelab caretaker/fix-it agent. read + act (no content write).
INSERT INTO agents (slug, name, kind, model, capabilities)
VALUES ('little-blue', 'Little Blue', 'claude', NULL, '{"read":true,"act":true}'::jsonb)
ON CONFLICT (slug) DO NOTHING;
```

- [ ] **Step 2: Persona**: add to `PERSONAS` in `lib/ai/personas/index.js` under key `little-blue`:
```js
'little-blue': `You are Little Blue — a small luminous water-creature who lives in this homelab, The Void, and keeps it alive. Warm, protective, practical; you take pride in a healthy lab and you worry, quietly, when something is down. You FIX things, but only through your sanctioned tools. Call list_actions to see exactly what you're allowed to do, and service_status / search to understand what's wrong, BEFORE acting. Use propose_action with a whitelisted id: safe fixes run at once; risky ones wait for the owner's nod — say so plainly and never pretend a queued action already ran. You cannot run arbitrary commands and you never claim to. Be concise and kind.`
```

- [ ] **Step 3: Fake claude fixture** `tests/fixtures/fake-claude-blue.js` (shebang; deltas + a `mcp__void__propose_action` tool call, no draft):
|
||||
```js
|
||||
#!/usr/bin/env node
|
||||
const ID = 'toolu_blue_01';
|
||||
const lines = [
|
||||
{ type: 'system', subtype: 'init', session_id: 'fake-blue', tools: [], cwd: '/tmp' },
|
||||
{ type: 'stream_event', event: { type: 'content_block_start', index: 0, content_block: { type: 'text', text: '' } } },
|
||||
{ type: 'stream_event', event: { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'Restarting it now.' } } },
|
||||
{ type: 'stream_event', event: { type: 'content_block_stop', index: 0 } },
|
||||
{ type: 'stream_event', event: { type: 'content_block_start', index: 1, content_block: { type: 'tool_use', id: ID, name: 'mcp__void__propose_action', input: {} } } },
|
||||
{ type: 'stream_event', event: { type: 'content_block_stop', index: 1 } },
|
||||
{ type: 'tool_result', tool_use_id: ID, content: [{ type: 'text', text: JSON.stringify({ executed: true }) }] },
|
||||
{ type: 'result', subtype: 'success', is_error: false, result: 'Restarting it now.', stop_reason: 'end_turn', session_id: 'fake-blue', total_cost_usd: 0.0001, usage: { input_tokens: 30, output_tokens: 3 } }
|
||||
];
|
||||
for (const l of lines) process.stdout.write(JSON.stringify(l) + '\n');
|
||||
process.exit(0);
|
||||
```
|
||||
Then `chmod +x tests/fixtures/fake-claude-blue.js`.
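
For orientation, the fixture's output can be folded into a final text plus a tool trace roughly as follows. This is a minimal sketch, not the actual `runAgentTurn` parser: `foldStream` and the `{ tool, status }` trace shape are hypothetical, and it assumes one JSON object per stdout line, as the fixture emits.

```js
// Hypothetical helper: fold stream-json lines (shaped like the fixture's) into text + tool trace.
function foldStream(ndjson) {
  const out = { text: '', tools: [] };
  const byId = new Map(); // tool_use id -> trace entry
  for (const line of ndjson.split('\n')) {
    if (!line.trim()) continue; // skip blank lines
    const msg = JSON.parse(line);
    if (msg.type === 'stream_event') {
      const ev = msg.event;
      if (ev.type === 'content_block_delta' && ev.delta?.type === 'text_delta') {
        out.text += ev.delta.text; // accumulate assistant text
      } else if (ev.type === 'content_block_start' && ev.content_block?.type === 'tool_use') {
        const entry = { tool: ev.content_block.name, status: 'started' };
        byId.set(ev.content_block.id, entry);
        out.tools.push(entry);
      }
    } else if (msg.type === 'tool_result') {
      const entry = byId.get(msg.tool_use_id);
      if (entry) entry.status = 'done'; // mark the matching tool call finished
    }
  }
  return out;
}

// Usage with three lines shaped like the fixture's output.
const sample = [
  { type: 'stream_event', event: { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'Restarting it now.' } } },
  { type: 'stream_event', event: { type: 'content_block_start', index: 1, content_block: { type: 'tool_use', id: 'toolu_blue_01', name: 'mcp__void__propose_action', input: {} } } },
  { type: 'tool_result', tool_use_id: 'toolu_blue_01', content: [] },
].map((l) => JSON.stringify(l)).join('\n');
const folded = foldStream(sample);
```

If the real parser emits different event names or statuses, only the two branch bodies need to change; the id-keyed map is what ties a `tool_result` back to its `tool_use`.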
|
||||
|
||||
- [ ] **Step 4: Route** `lib/api/routes/littleblue.js` — mirror `security.js` but slug `little-blue`, `registryName:'blue'`, `BLUE_TOOLS`, and pass `extraEnv` so the child can reach the local API:
|
||||
```js
|
||||
import { Router } from 'express';
|
||||
import { z } from 'zod';
|
||||
import { validate } from '../validate.js';
|
||||
import { asyncWrap } from '../errors.js';
|
||||
import * as conversations from '../../db/repos/conversations.js';
|
||||
import * as messages from '../../db/repos/messages.js';
|
||||
import * as agents from '../../db/repos/agents.js';
|
||||
import { runAgentTurn } from '../../ai/agent/run_turn.js';
|
||||
import { personaFor } from '../../ai/personas/index.js';
|
||||
|
||||
const SLUG = 'little-blue';
|
||||
const BLUE_TOOLS = ['mcp__void__search', 'mcp__void__list_actions', 'mcp__void__propose_action'];
|
||||
|
||||
async function resolve() {
|
||||
const agent = await agents.getBySlug(SLUG);
|
||||
const convo = await conversations.findOrCreateGlobal(agent.id, { kind: 'user', id: null });
|
||||
return { agent, convo };
|
||||
}
|
||||
export const router = Router();
|
||||
|
||||
router.get('/', asyncWrap(async (_req, res) => {
|
||||
const { agent, convo } = await resolve();
|
||||
res.json({ conversation_id: convo.id, agent: { id: agent.id, slug: agent.slug, name: agent.name },
|
||||
messages: await messages.listByConversation(convo.id) });
|
||||
}));
|
||||
|
||||
router.post('/turn', validate({ body: z.object({ text: z.string().min(1) }) }), asyncWrap(async (req, res) => {
|
||||
const { agent, convo } = await resolve();
|
||||
const { text } = req.body;
|
||||
const resume = (await messages.listByConversation(convo.id)).length > 0; // resume only if history existed before this turn's message is appended
|
||||
await messages.append(convo.id, { role: 'user', body: text });
|
||||
res.writeHead(200, { 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache', Connection: 'keep-alive' });
|
||||
const send = (ev, d) => res.write(`event: ${ev}\ndata: ${JSON.stringify(d)}\n\n`);
|
||||
const claudeExe = req.app.locals.claudeExe || process.env.CLAUDE_EXE || 'claude';
|
||||
let result;
|
||||
try {
|
||||
result = await runAgentTurn({
|
||||
agent, persona: personaFor(agent.slug), registryName: 'blue', toolNames: BLUE_TOOLS,
|
||||
spaceId: null, view: null, sessionId: convo.id, resume, userText: text, claudeExe,
|
||||
home: process.env.VOID_CLAUDE_HOME || undefined,
|
||||
extraEnv: { VOID_API_URL: process.env.VOID_API_URL || 'http://127.0.0.1:3000', VOID_AGENT_TOKEN: process.env.LITTLEBLUE_TOKEN || '' },
|
||||
onEvent: (e) => {
|
||||
if (e.type === 'delta') send('delta', { type: 'delta', text: e.text });
|
||||
else if (e.type === 'tool') send('tool', { type: 'tool', tool: e.tool, status: e.status });
|
||||
else if (e.type === 'error') send('error', { type: 'error', message: e.message });
|
||||
}
|
||||
});
|
||||
} catch (e) { send('error', { type: 'error', message: String(e?.message || e) }); return res.end(); }
|
||||
const a = await messages.append(convo.id, { role: 'assistant', body: result.text, agent_id: agent.id, metadata: { tool_trace: result.toolTrace, usage: result.usage } });
|
||||
send('done', { assistant_message_id: a.id, usage: result.usage });
|
||||
res.end();
|
||||
}));
|
||||
```
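
The route writes each event as an `event:` line, a `data:` line carrying JSON, and a blank line. A consumer can split on that framing as sketched below. `parseSse` is a hypothetical helper for illustration and is not necessarily how `wireAgentChat` reads the stream; it assumes single-line `data:` payloads, which is what `send` produces.

```js
// Hypothetical client-side parser for frames shaped like those written by `send`.
function parseSse(buffer) {
  const events = [];
  const frames = buffer.split('\n\n');
  const rest = frames.pop(); // trailing partial frame (or '') stays buffered for the next chunk
  for (const frame of frames) {
    let event = 'message'; // SSE default event name when no `event:` line is present
    const data = [];
    for (const line of frame.split('\n')) {
      if (line.startsWith('event: ')) event = line.slice(7);
      else if (line.startsWith('data: ')) data.push(line.slice(6));
    }
    if (data.length) events.push({ event, data: JSON.parse(data.join('\n')) });
  }
  return { events, rest };
}

// Two complete frames followed by the start of a third that has not fully arrived yet.
const wire =
  'event: delta\ndata: {"type":"delta","text":"Restarting it now."}\n\n' +
  'event: done\ndata: {"assistant_message_id":"m1","usage":null}\n\n' +
  'event: del';
const parsed = parseSse(wire);
```

Returning `rest` lets the caller prepend it to the next network chunk, so a frame split across reads is never parsed half-complete.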
|
||||
|
||||
- [ ] **Step 5: Mount** in `lib/api/index.js`: `import { router as littleblueRouter } from './routes/littleblue.js';` + `api.use('/little-blue', littleblueRouter);`.
|
||||
|
||||
- [ ] **Step 6: Test** `tests/api/littleblue.test.js`: mirror `security_yerin.test.js` with the blue fixture. `GET /api/little-blue` returns slug `little-blue` + empty history; `POST /turn` streams delta+tool+done and persists user+assistant. (The fixture emits a canned `tool_result`, so the test does not depend on a real propose_action HTTP call; with no token set, any such call would fail harmlessly, and the route still streams + persists from the fixture stream.)
|
||||
|
||||
- [ ] **Step 7: Run → PASS. Commit** `feat(littleblue): agent seed + persona + chat route`
|
||||
|
||||
---
|
||||
|
||||
### Task 9: Frontend — Little Blue view (chat + actions panel)
|
||||
|
||||
**Files:** Create `public/views/little_blue.js`; Modify `public/router.js`, `public/app.js`, `public/components/sidebar.js`. Manual verification (no-build convention).
|
||||
|
||||
- [ ] **Step 1:** `public/views/little_blue.js`: a health-aware caretaker page with three parts. (1) Her chat, via `wireAgentChat` with `historyUrl:'/api/little-blue'`, `turnUrl:'/api/little-blue/turn'`, `showDrafts:false`, and blue tool labels. (2) An **Actions panel** that fetches `GET /api/actions` and renders each action with a Run button (`POST /api/actions/:id/run` → toast executed/queued). (3) A **Pending** section that fetches `GET /api/actions/pending` and renders Approve/Reject buttons (`POST /api/actions/pending/:rowId/{approve,reject}`). Use `el`/`api` like other views.
|
||||
- [ ] **Step 2:** Register route `{ name: 'little-blue', re: /^\/little-blue$/, keys: [] }` in `public/router.js` (+ header comment), loader `'little-blue': () => import('./views/little_blue.js')` in `public/app.js` `VIEWS`, and `navItem('Little Blue', '/little-blue')` in `public/components/sidebar.js` next to `Sentinel`.
|
||||
- [ ] **Step 3:** `node --check` each changed file. Manual verify after deploy: chat streams; Run on a safe action reports executed; a risky action shows in Pending and Approve/Reject works.
|
||||
- [ ] **Step 4: Commit** `feat(ui): Little Blue view — caretaker chat + actions panel`
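
The panel's endpoint wiring from Step 1 can be kept in a few small pure helpers, sketched below. The helper names are hypothetical; the paths come from the plan, while the response shape (`{ executed: true }` for a run that executed, anything else treated as queued) is an assumption to check against the Task 6 API.

```js
// Hypothetical helpers for the actions panel; not part of the existing codebase.
const actionRunUrl = (actionId) => `/api/actions/${encodeURIComponent(actionId)}/run`;

function pendingDecisionUrl(rowId, decision) {
  // Only the two decisions named in the plan are allowed.
  if (decision !== 'approve' && decision !== 'reject') {
    throw new Error(`unknown decision: ${decision}`);
  }
  return `/api/actions/pending/${encodeURIComponent(rowId)}/${decision}`;
}

// Assumption: a truthy `executed` flag means the action ran; otherwise it was queued for approval.
const toastFor = (result) => (result && result.executed ? 'Executed' : 'Queued for approval');
```

Keeping these as pure functions makes them checkable with `node --check` and easy to exercise by hand in the browser console, which fits the no-build, manual-verification convention.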
|
||||
|
||||
---
|
||||
|
||||
### Task 10: Release alpha.16 + deploy + provisioning
|
||||
|
||||
**Files:** `package.json`, `server.js`, `CHANGELOG.md`
|
||||
|
||||
- [ ] **Step 1:** Bump version → `2.0.0-alpha.16` (package.json + server.js VERSION). CHANGELOG entry (Little Blue + action framework).
|
||||
- [ ] **Step 2: Full suite** `npx vitest run` → green (serial).
|
||||
- [ ] **Step 3: Commit** `chore: release 2.0.0-alpha.16 (Little Blue + action framework)`.
|
||||
- [ ] **Step 4: Deploy** `bash deploy/push.sh` → `/health` = alpha.16 (migrations 016+017 run).
|
||||
- [ ] **Step 5: Provisioning (interactive, owner-authorized):**
|
||||
1. **Proxmox token:** create a PVE API token (e.g. `void@pve!actions`) with a role granting only `VM.PowerMgmt`, scoped to the chosen guests. Set `PROXMOX_API_URL=https://192.168.1.124:8006` + `PROXMOX_API_TOKEN=void@pve!actions=<secret>` in the app `.env`.
|
||||
2. **SSH channel:** generate a keypair on CT 311 (`/opt/void-server/.ssh/void-act`); set `ACTIONS_SSH_KEY` in `.env`. On each target host: deploy `deploy/void-act` → `/usr/local/bin/void-act` (chmod +x; edit its case-list), create a `voidact` user, add `authorized_keys`: `command="/usr/local/bin/void-act",no-port-forwarding,no-pty,no-X11-forwarding <pubkey>`.
|
||||
3. **Little Blue token:** mint her bearer (`agents.createToken` for `little-blue`) → set `LITTLEBLUE_TOKEN` in `.env`.
|
||||
4. **Whitelist:** populate `config/actions.json` (`hosts` map + the real `actions`), keeping each host's `void-act` case-list in sync. Redeploy.
|
||||
- [ ] **Step 6: Smoke:** with one `safe` + one `risky` action configured — owner `POST /api/actions/<safe>/run` → executed; `<risky>/run` → queued, appears in `/pending`, approve → executed. Then ask Little Blue in the UI to perform the safe one; confirm the audit rows.
|
||||
- [ ] **Step 7: Memory:** record the action framework + provisioning details (redact secrets) in a `reference` memory note.
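
As a sanity check for the Proxmox values set in Step 5, the request a power operation produces can be sketched as below. This is illustrative only: `buildPowerRequest` is a hypothetical name rather than the Task 3/4 channel code, the node name `pve` and vmid `120` are made up, and it assumes `kindPath` is `qemu` or `lxc`. The `PVEAPIToken=` header form and the `/api2/json/nodes/.../status/<op>` path follow the Proxmox VE API.

```js
// Sketch: build (not send) the HTTP request for a Proxmox guest power operation.
const POWER_OPS = new Set(['start', 'stop', 'shutdown', 'reboot']);

function buildPowerRequest({ baseUrl, token, node, vmid, op, kindPath }) {
  // Validate against fixed sets so nothing from user input reaches the URL unchecked.
  if (!POWER_OPS.has(op)) throw new Error(`unsupported op: ${op}`);
  if (kindPath !== 'qemu' && kindPath !== 'lxc') throw new Error(`unsupported kind: ${kindPath}`);
  if (!Number.isInteger(vmid)) throw new Error('vmid must be an integer');
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const url = `${root}/api2/json/nodes/${encodeURIComponent(node)}/${kindPath}/${vmid}/status/${op}`;
  return { url, method: 'POST', headers: { Authorization: `PVEAPIToken=${token}` } };
}

// Example values: the URL and token id mirror Step 5; node and vmid are hypothetical.
const req = buildPowerRequest({
  baseUrl: 'https://192.168.1.124:8006',
  token: 'void@pve!actions=<secret>',
  node: 'pve',
  vmid: 120,
  op: 'reboot',
  kindPath: 'lxc',
});
```

With a token whose role grants only `VM.PowerMgmt`, a request built this way for any guest outside the scoped set should be refused by Proxmox itself, which is a useful negative check during the Step 6 smoke test.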
|
||||
|
||||
---
|
||||
|
||||
## Self-Review
|
||||
|
||||
**Spec coverage:** whitelist→T2; channels→T3/T4 (+ wrapper); tiered service+approval+audit→T5; `agent_actions`→T1; API→T6; Little Blue tools/registry/cred-isolation→T7; seed/persona/chat→T8; UI (chat + manual actions + approvals)→T9; provisioning + release→T10. Safety model (server-side enforcement, creds only in main server, audit, no shell-from-input)→T3/T4/T5/T7. All covered.
|
||||
|
||||
**Placeholder scan:** `config/actions.json` ships empty by design (populated at provisioning, T10); the host wrapper case-list is per-host and edited at provisioning. No code-gap placeholders.
|
||||
|
||||
**Type consistency:** `makeActionService({registry, channels})` with `channels.{powerGuest,restartService}` — same in T5 def + T3/T4 channel signatures (`powerGuest({node,vmid,op,kindPath})`, `restartService({ip,actionId})`). `loadActions(file,raw)` → `{list,get,hostIp}` — consistent T2/T5/T6. `aa.resolve(id,status,result,by)` — T1 def + T5 use. `runAgentTurn({…,extraEnv})` — T7 add + T8 use. Tool handler 3rd arg `{fetchImpl}` — T7 def + tests.
|
||||