root d500b6fa00 docs: Little Blue implementation plan

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-04 21:38:28 +10:00

Little Blue Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Give Little Blue a least-privilege, tiered-approval, audited action framework to restart lab services (SSH forced-command) and power-manage Proxmox guests (scoped API token), plus a conversational + manual UI.

Architecture: A version-controlled whitelist (config/actions.json) drives two server-side-enforced channels. An action service gates by tier (safe→run, risky→queue→approve) and audits everything. Infra creds live ONLY in the main server; Little Blue's MCP child proposes actions via the local API with a scoped little-blue token. Frontend reuses agent_chat + an actions panel.
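
The tier gate described above can be sketched in a few lines. This is a minimal in-memory model, assuming a hypothetical `makeGate` helper and a plain array in place of the Postgres queue; it illustrates safe→run versus risky→queue→approve and is not the plan's actual service.

```javascript
// Minimal in-memory model of the tier gate (illustrative only).
// `makeGate`, `queue`, and `execute` are hypothetical names, not part of the plan.
function makeGate(whitelist, execute) {
  const queue = []; // stands in for the agent_actions table
  return {
    run(id) {
      const action = whitelist[id];
      if (!action) throw new Error(`unknown action: ${id}`);
      if (action.tier === 'risky') {
        queue.push({ id, status: 'pending' });
        return { queued: true, position: queue.length - 1 };
      }
      return { executed: true, result: execute(action) };
    },
    approve(position) {
      const row = queue[position];
      if (!row || row.status !== 'pending') throw new Error('not a pending action');
      row.status = 'executed';
      return { executed: true, result: execute(whitelist[row.id]) };
    },
  };
}

const ran = [];
const gate = makeGate(
  { 'restart-caddy-ct100': { tier: 'safe' }, 'stop-ct107': { tier: 'risky' } },
  (a) => { ran.push(a.tier); return 'ok'; }
);
const safe = gate.run('restart-caddy-ct100'); // runs at once
const risky = gate.run('stop-ct107');         // only queued until approved
```

The caller never chooses the tier; it comes from the whitelist entry, which is why the gate can be enforced server-side regardless of who calls it.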

Tech Stack: Node 22 ESM, Express 5, Postgres, vanilla-JS SPA, vitest + supertest (serial).

Spec: docs/superpowers/specs/2026-06-04-little-blue-design.md

Task 1: `agent_actions` table + repo

Files: Create lib/db/migrations/016_agent_actions.sql, lib/db/repos/agent_actions.js; Test tests/db/agent_actions.test.js

Step 1: Migration

-- 016_agent_actions.sql — queue + audit trail for Little Blue's infra actions.
CREATE TABLE agent_actions (
  id           uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  action_id    text NOT NULL,                 -- whitelist id from config/actions.json
  params       jsonb NOT NULL DEFAULT '{}'::jsonb,
  agent_id     uuid REFERENCES agents(id),
  tier         text NOT NULL CHECK (tier IN ('safe','risky')),
  status       text NOT NULL DEFAULT 'pending'
               CHECK (status IN ('pending','executed','failed','rejected')),
  result       jsonb,
  requested_by jsonb,
  resolved_by  jsonb,
  created_at   timestamptz NOT NULL DEFAULT now(),
  resolved_at  timestamptz
);
CREATE INDEX idx_agent_actions_pending ON agent_actions(status) WHERE status='pending';

Step 2: Failing test tests/db/agent_actions.test.js

import { describe, it, expect, beforeAll } from 'vitest';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import * as aa from '../../lib/db/repos/agent_actions.js';

const owner = { kind: 'user', id: null };
beforeAll(async () => { await resetDb(); await migrateUp(); });

describe('agent_actions repo', () => {
  it('creates pending, lists it, resolves once', async () => {
    const row = await aa.create({ action_id: 'stop-ct107', tier: 'risky', params: {}, requested_by: owner });
    expect(row.status).toBe('pending');
    expect((await aa.listPending()).some(r => r.id === row.id)).toBe(true);
    const done = await aa.resolve(row.id, 'executed', { ok: true }, owner);
    expect(done.status).toBe('executed');
    const again = await aa.resolve(row.id, 'rejected', null, owner);  // already resolved
    expect(again).toBeUndefined();
  });
});

Step 3: Run → FAIL.
Step 4: Implement lib/db/repos/agent_actions.js

import { pool } from '../pool.js';
import { recordAudit } from './audit.js';

export async function create({ action_id, tier, params, agent_id, requested_by }) {
  const { rows: [r] } = await pool.query(
    `INSERT INTO agent_actions(action_id, tier, params, agent_id, requested_by)
     VALUES($1,$2,$3,$4,$5) RETURNING *`,
    [action_id, tier, params || {}, agent_id || null, requested_by || null]
  );
  await recordAudit(requested_by, 'create', 'agent_action', r.id, null, r);
  return r;
}
export async function listPending({ limit = 100 } = {}) {
  const { rows } = await pool.query(
    `SELECT * FROM agent_actions WHERE status='pending' ORDER BY created_at LIMIT $1`, [limit]);
  return rows;
}
export async function getById(id) {
  const { rows: [r] } = await pool.query(`SELECT * FROM agent_actions WHERE id=$1`, [id]);
  return r;
}
export async function resolve(id, status, result, resolved_by) {
  const { rows: [r] } = await pool.query(
    `UPDATE agent_actions SET status=$1, result=$2, resolved_by=$3, resolved_at=now()
     WHERE id=$4 AND status='pending' RETURNING *`,
    [status, result || null, resolved_by || null, id]);
  if (r) await recordAudit(resolved_by, 'update', 'agent_action', id, null, r);
  return r;
}
export async function recent({ limit = 50 } = {}) {
  const { rows } = await pool.query(
    `SELECT * FROM agent_actions WHERE status<>'pending' ORDER BY resolved_at DESC NULLS LAST LIMIT $1`, [limit]);
  return rows;
}

Step 5: Run → PASS. Commit feat(actions): agent_actions table + repo
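
The `AND status='pending'` condition in `resolve` is what makes a row resolvable only once. A minimal in-memory analogue, assuming a hypothetical `resolveOnce` function and a `Map` in place of the table, shows the same compare-and-set behaviour the test in Step 2 relies on.

```javascript
// In-memory analogue of: UPDATE ... WHERE id=$4 AND status='pending' RETURNING *
// `rows` and `resolveOnce` are hypothetical stand-ins for the table and repo function.
const rows = new Map([['r1', { id: 'r1', status: 'pending', result: null }]]);

function resolveOnce(id, status, result) {
  const row = rows.get(id);
  // Mirrors the WHERE clause: no matching pending row means no update and no returned row.
  if (!row || row.status !== 'pending') return undefined;
  row.status = status;
  row.result = result;
  return row;
}

const first = resolveOnce('r1', 'executed', { ok: true });
const second = resolveOnce('r1', 'rejected', null); // already resolved
```

Because the check and the write happen in one statement in SQL, a second resolver simply gets no row back rather than overwriting the first outcome.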

Task 2: Action registry (whitelist loader)

Files: Create config/actions.json, lib/actions/registry.js; Test tests/actions/registry.test.js + tests/fixtures/actions.test.json

Step 1: Ship an empty real whitelist config/actions.json (populated at provisioning):

{ "hosts": {}, "actions": [] }

Step 2: Test fixture tests/fixtures/actions.test.json:

{
  "hosts": { "ct100": "192.168.1.230", "z": "192.168.1.124" },
  "actions": [
    { "id": "restart-caddy-ct100", "label": "Restart Caddy", "kind": "service_restart", "host": "ct100", "service": "caddy", "tier": "safe" },
    { "id": "stop-ct107", "label": "Stop CT107", "kind": "guest_power", "node": "z", "vmid": 107, "op": "stop", "tier": "risky" }
  ]
}

Step 3: Failing test tests/actions/registry.test.js

import { describe, it, expect } from 'vitest';
import { fileURLToPath } from 'url';
import { loadActions } from '../../lib/actions/registry.js';

const FIX = fileURLToPath(new URL('../fixtures/actions.test.json', import.meta.url));

describe('action registry', () => {
  it('loads + indexes valid actions and resolves host ip', () => {
    const reg = loadActions(FIX);
    expect(reg.list().map(a => a.id).sort()).toEqual(['restart-caddy-ct100', 'stop-ct107']);
    expect(reg.get('restart-caddy-ct100').tier).toBe('safe');
    expect(reg.hostIp('ct100')).toBe('192.168.1.230');
    expect(reg.get('nope')).toBeUndefined();
  });
  it('rejects an action with a bad id or unknown kind/tier', () => {
    expect(() => loadActions(null, { hosts: {}, actions: [{ id: 'bad id', kind: 'service_restart', tier: 'safe' }] }))
      .toThrow(/invalid action id/i);
    expect(() => loadActions(null, { hosts: {}, actions: [{ id: 'x', kind: 'nuke', tier: 'safe' }] }))
      .toThrow(/unknown kind/i);
    expect(() => loadActions(null, { hosts: {}, actions: [{ id: 'x', kind: 'guest_power', op: 'stop', tier: 'evil' }] }))
      .toThrow(/invalid tier/i);
  });
});

Step 4: Run → FAIL.
Step 5: Implement lib/actions/registry.js

import { readFileSync } from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';

const DEFAULT = path.join(path.dirname(fileURLToPath(import.meta.url)), '../../config/actions.json');
const ID_RE = /^[a-z0-9-]+$/;
const KINDS = new Set(['service_restart', 'guest_power']);
const OPS = new Set(['start', 'stop', 'shutdown', 'reboot']);
const TIERS = new Set(['safe', 'risky']);

export function loadActions(file = DEFAULT, raw) {
  const cfg = raw || JSON.parse(readFileSync(file, 'utf8'));
  const hosts = cfg.hosts || {};
  const byId = new Map();
  for (const a of (cfg.actions || [])) {
    if (!ID_RE.test(a.id || '')) throw new Error(`invalid action id: ${a.id}`);
    if (!KINDS.has(a.kind)) throw new Error(`unknown kind: ${a.kind}`);
    if (!TIERS.has(a.tier)) throw new Error(`invalid tier: ${a.tier}`);
    if (a.kind === 'guest_power' && !OPS.has(a.op)) throw new Error(`invalid op: ${a.op}`);
    if (byId.has(a.id)) throw new Error(`duplicate action id: ${a.id}`);
    byId.set(a.id, a);
  }
  return {
    list: () => [...byId.values()],
    get: (id) => byId.get(id),
    hostIp: (h) => hosts[h]
  };
}

Step 6: Run → PASS. Commit feat(actions): config-driven action whitelist registry
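
The registry checks id, kind, tier, and op, but it does not check that a `service_restart` action names a host present in `hosts`, so a typo would only surface when the SSH channel receives an undefined IP. A possible extra check is sketched below; `findMissingHosts` is a hypothetical helper and `restart-dns-ct101` is made-up example data.

```javascript
// Hypothetical helper: returns ids of service_restart actions whose host has no
// entry in cfg.hosts. Not part of the plan's registry; shown as a possible extra check.
function findMissingHosts(cfg) {
  const hosts = cfg.hosts || {};
  return (cfg.actions || [])
    .filter((a) => a.kind === 'service_restart' && !Object.hasOwn(hosts, a.host))
    .map((a) => a.id);
}

const cfg = {
  hosts: { ct100: '192.168.1.230' },
  actions: [
    { id: 'restart-caddy-ct100', kind: 'service_restart', host: 'ct100', tier: 'safe' },
    { id: 'restart-dns-ct101', kind: 'service_restart', host: 'ct101', tier: 'safe' }, // made-up entry
    { id: 'stop-ct107', kind: 'guest_power', op: 'stop', tier: 'risky' },
  ],
};
const missing = findMissingHosts(cfg);
```

If adopted, `loadActions` could throw when this list is non-empty, in the same style as its other validation errors.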

Task 3: Proxmox channel

Files: Create lib/actions/channels/proxmox.js; Test tests/actions/proxmox.test.js

Step 1: Failing test

import { describe, it, expect, vi } from 'vitest';
import { powerGuest } from '../../lib/actions/channels/proxmox.js';

describe('proxmox channel', () => {
  it('POSTs the scoped power op with the token header', async () => {
    const fetchMock = vi.fn(async () => ({ ok: true, status: 200, json: async () => ({ data: 'UPID:...' }) }));
    const out = await powerGuest({ node: 'z', vmid: 107, op: 'stop', kindPath: 'lxc' },
      { apiUrl: 'https://pve:8006', token: 'user@pve!void=secret', fetchImpl: fetchMock });
    expect(out.ok).toBe(true);
    const [url, opts] = fetchMock.mock.calls[0];
    expect(url).toBe('https://pve:8006/api2/json/nodes/z/lxc/107/status/stop');
    expect(opts.method).toBe('POST');
    expect(opts.headers.Authorization).toBe('PVEAPIToken=user@pve!void=secret');
  });
  it('throws on a non-ok response', async () => {
    const fetchMock = vi.fn(async () => ({ ok: false, status: 403, text: async () => 'forbidden' }));
    await expect(powerGuest({ node: 'z', vmid: 1, op: 'stop', kindPath: 'lxc' },
      { apiUrl: 'https://pve:8006', token: 't', fetchImpl: fetchMock })).rejects.toThrow(/403/);
  });
});

Step 2: Run → FAIL.
Step 3: Implement lib/actions/channels/proxmox.js

// Proxmox guest power via a SCOPED PVEAPIToken (VM.PowerMgmt on whitelisted guests only).
// PVE enforces permissions server-side; this adapter never builds shell commands.
export async function powerGuest({ node, vmid, op, kindPath = 'lxc' }, {
  apiUrl = process.env.PROXMOX_API_URL,
  token = process.env.PROXMOX_API_TOKEN,
  fetchImpl = fetch
} = {}) {
  const url = `${apiUrl}/api2/json/nodes/${node}/${kindPath}/${vmid}/status/${op}`;
  const res = await fetchImpl(url, {
    method: 'POST',
    headers: { Authorization: `PVEAPIToken=${token}` }
  });
  if (!res.ok) throw new Error(`proxmox ${op} ${vmid} → ${res.status} ${await res.text?.() ?? ''}`);
  const body = await res.json();
  return { ok: true, upid: body?.data ?? null };
}

Step 4: Run → PASS. Commit feat(actions): scoped Proxmox power channel
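
The adapter interpolates `node`, `kindPath`, `vmid`, and `op` straight into the URL and relies on the whitelist for their validity. A defensive pre-flight check could look like the sketch below. `buildPowerUrl` is a hypothetical helper and the validation rules are assumptions; the plan only uses `lxc`, and `qemu` is included because Proxmox uses that path segment for VMs.

```javascript
// Hypothetical pre-flight check for the Proxmox power URL. The path shape matches
// the adapter above; the validation rules are assumptions for illustration.
const GUEST_KINDS = new Set(['lxc', 'qemu']);
const POWER_OPS = new Set(['start', 'stop', 'shutdown', 'reboot']);

function buildPowerUrl(apiUrl, { node, vmid, op, kindPath = 'lxc' }) {
  if (!/^[a-zA-Z0-9-]+$/.test(node || '')) throw new Error(`invalid node: ${node}`);
  if (!Number.isInteger(vmid) || vmid <= 0) throw new Error(`invalid vmid: ${vmid}`);
  if (!GUEST_KINDS.has(kindPath)) throw new Error(`invalid kindPath: ${kindPath}`);
  if (!POWER_OPS.has(op)) throw new Error(`invalid op: ${op}`);
  // Trim trailing slashes so the joined URL never contains '//' after the host.
  return `${apiUrl.replace(/\/+$/, '')}/api2/json/nodes/${node}/${kindPath}/${vmid}/status/${op}`;
}

const url = buildPowerUrl('https://pve:8006/', { node: 'z', vmid: 107, op: 'stop' });
```

The scoped token remains the real enforcement point; a check like this only turns a malformed whitelist entry into a clear local error instead of an opaque HTTP failure.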

Task 4: SSH channel + forced-command wrapper

Files: Create lib/actions/channels/ssh.js, deploy/void-act (host wrapper); Test tests/actions/ssh.test.js

Step 1: Failing test

import { describe, it, expect, vi } from 'vitest';
import { restartService } from '../../lib/actions/channels/ssh.js';

describe('ssh channel', () => {
  it('spawns ssh with argv (no shell string) sending only the action id', async () => {
    const calls = [];
    const spawnMock = (cmd, args) => {
      calls.push({ cmd, args });
      return { stdout: { on(ev, cb) { if (ev === 'data') cb('ok\n'); } }, stderr: { on() {} },
        on(ev, cb) { if (ev === 'close') cb(0); } };
    };
    const out = await restartService({ ip: '192.168.1.230', actionId: 'restart-caddy-ct100' },
      { keyPath: '/k', user: 'voidact', spawnImpl: spawnMock });
    expect(out.ok).toBe(true);
    const { cmd, args } = calls[0];
    expect(cmd).toBe('ssh');
    expect(args).toEqual(['-i', '/k', '-o', 'BatchMode=yes', '-o', 'StrictHostKeyChecking=accept-new',
      'voidact@192.168.1.230', 'restart-caddy-ct100']);
  });
  it('rejects an action id with shell metacharacters', async () => {
    await expect(restartService({ ip: '1.2.3.4', actionId: 'x; rm -rf /' }, { spawnImpl: () => {} }))
      .rejects.toThrow(/invalid action id/i);
  });
});

Step 2: Run → FAIL.
Step 3: Implement lib/actions/channels/ssh.js

import { spawn as nodeSpawn } from 'node:child_process';

const ID_RE = /^[a-z0-9-]+$/;

// Runs `ssh voidact@<ip> <action-id>`. The host's authorized_keys pins a forced
// wrapper (deploy/void-act) that maps the id → systemctl restart <service> from
// its OWN whitelist. We pass ONLY the id as a single argv element — no shell.
export function restartService({ ip, actionId }, {
  keyPath = process.env.ACTIONS_SSH_KEY,
  user = 'voidact',
  spawnImpl = nodeSpawn
} = {}) {
  if (!ID_RE.test(actionId || '')) return Promise.reject(new Error(`invalid action id: ${actionId}`));
  const args = ['-i', keyPath, '-o', 'BatchMode=yes', '-o', 'StrictHostKeyChecking=accept-new',
    `${user}@${ip}`, actionId];
  return new Promise((resolve, reject) => {
    const child = spawnImpl('ssh', args);
    let out = '', err = '';
    child.stdout.on('data', (d) => { out += d; });
    child.stderr.on('data', (d) => { err += d; });
    child.on('close', (code) => code === 0
      ? resolve({ ok: true, output: out.trim() })
      : reject(new Error(`ssh ${actionId} → exit ${code}: ${err.trim()}`)));
    child.on('error', reject);
  });
}

Step 4: Host wrapper artifact deploy/void-act (deployed to /usr/local/bin/void-act on each target host; pinned via authorized_keys command=):

#!/usr/bin/env bash
# Forced command for the Void's restricted key. Maps a whitelisted action id to a
# concrete systemctl restart. The id arrives via SSH_ORIGINAL_COMMAND; nothing else
# is honoured. Edit the case list per host. Keep in sync with config/actions.json.
set -euo pipefail
id="${SSH_ORIGINAL_COMMAND:-}"
case "$id" in
  restart-caddy-ct100) exec systemctl restart caddy ;;
  *) echo "void-act: refused '$id'" >&2; exit 13 ;;
esac

Step 5: Run → PASS. Commit feat(actions): SSH forced-command service-restart channel + host wrapper
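
The channel validates the action id but not the target address, which comes from `registry.hostIp()` and is `undefined` for an unknown host. The sketch below isolates the argv construction in a pure function and adds an address check. `buildSshArgs` is a hypothetical helper, and the IPv4-only rule is an assumption based on the fixture's addresses.

```javascript
// Hypothetical pure helper that builds the ssh argv the channel passes to spawn.
// The IPv4-only check is an assumption; the plan itself validates only the action id.
const ACTION_ID = /^[a-z0-9-]+$/;
const IPV4 = /^(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)$/;

function buildSshArgs({ ip, actionId, keyPath, user = 'voidact' }) {
  if (!ACTION_ID.test(actionId || '')) throw new Error(`invalid action id: ${actionId}`);
  if (!IPV4.test(ip || '')) throw new Error(`invalid host ip: ${ip}`);
  // Each element is one argv entry; no local shell ever parses these values.
  return ['-i', keyPath, '-o', 'BatchMode=yes', '-o', 'StrictHostKeyChecking=accept-new',
    `${user}@${ip}`, actionId];
}

const args = buildSshArgs({ ip: '192.168.1.230', actionId: 'restart-caddy-ct100', keyPath: '/k' });
```

Keeping argv construction pure makes it testable without a spawn mock, and the host wrapper still applies its own whitelist on the far side.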

Task 5: Action service (tier gating + approve/reject + audit)

Files: Create lib/actions/service.js; Test tests/actions/service.test.js

Step 1: Failing test

import { describe, it, expect, vi, beforeAll, beforeEach } from 'vitest';
import { fileURLToPath } from 'url';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import * as aa from '../../lib/db/repos/agent_actions.js';
import { makeActionService } from '../../lib/actions/service.js';
import { loadActions } from '../../lib/actions/registry.js';

const FIX = fileURLToPath(new URL('../fixtures/actions.test.json', import.meta.url));
const owner = { kind: 'user', id: null };
let svc, channels;
beforeAll(async () => { await resetDb(); await migrateUp(); });
beforeEach(() => {
  channels = { powerGuest: vi.fn(async () => ({ ok: true, upid: 'U' })), restartService: vi.fn(async () => ({ ok: true, output: 'done' })) };
  svc = makeActionService({ registry: loadActions(FIX), channels });
});

describe('action service', () => {
  it('safe action executes immediately + audits', async () => {
    const out = await svc.run('restart-caddy-ct100', owner);
    expect(out.executed).toBe(true);
    expect(channels.restartService).toHaveBeenCalledOnce();
  });
  it('risky action queues, does NOT execute', async () => {
    const out = await svc.run('stop-ct107', owner);
    expect(out.queued).toBe(true);
    expect(channels.powerGuest).not.toHaveBeenCalled();
    expect((await aa.listPending()).some(r => r.id === out.action_row_id)).toBe(true);
  });
  it('approve executes the queued risky action; reject does not', async () => {
    const q = await svc.run('stop-ct107', owner);
    const done = await svc.approve(q.action_row_id, owner);
    expect(done.status).toBe('executed');
    expect(channels.powerGuest).toHaveBeenCalledOnce();
    const q2 = await svc.run('stop-ct107', owner);
    const rej = await svc.reject(q2.action_row_id, owner);
    expect(rej.status).toBe('rejected');
    expect(channels.powerGuest).toHaveBeenCalledOnce(); // unchanged
  });
  it('unknown action → error', async () => {
    await expect(svc.run('ghost', owner)).rejects.toThrow(/unknown action/i);
  });
});

Step 2: Run → FAIL.
Step 3: Implement lib/actions/service.js

import * as aa from '../db/repos/agent_actions.js';
import { loadActions } from './registry.js';
import { powerGuest } from './channels/proxmox.js';
import { restartService } from './channels/ssh.js';

// Single choke point. Dispatches one whitelisted action to its channel. The
// registry + channels are injectable for tests; production wiring uses defaults.
export function makeActionService({ registry = loadActions(), channels = { powerGuest, restartService } } = {}) {
  async function execute(a) {
    if (a.kind === 'guest_power') return channels.powerGuest({ node: a.node, vmid: a.vmid, op: a.op, kindPath: a.kindPath || 'lxc' });
    if (a.kind === 'service_restart') return channels.restartService({ ip: registry.hostIp(a.host), actionId: a.id });
    throw new Error(`unknown kind: ${a.kind}`);
  }

  async function run(actionId, actor, agent_id = null) {
    const a = registry.get(actionId);
    if (!a) throw new Error(`unknown action: ${actionId}`);
    if (a.tier === 'risky') {
      const row = await aa.create({ action_id: a.id, tier: a.tier, params: {}, agent_id, requested_by: actor });
      return { queued: true, action_row_id: row.id };
    }
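    // Safe tier: record the row first so a failed execution is still audited.
    const row = await aa.create({ action_id: a.id, tier: a.tier, params: {}, agent_id, requested_by: actor });
    try {
      const result = await execute(a);
      await aa.resolve(row.id, 'executed', result, actor);
      return { executed: true, result };
    } catch (e) {
      await aa.resolve(row.id, 'failed', { error: String(e?.message || e) }, actor);
      throw e;
    }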
  }

  async function approve(rowId, owner) {
    const row = await aa.getById(rowId);
    if (!row || row.status !== 'pending') throw new Error('not a pending action');
    const a = registry.get(row.action_id);
    if (!a) throw new Error(`unknown action: ${row.action_id}`);
    try {
      const result = await execute(a);
      return aa.resolve(rowId, 'executed', result, owner);
    } catch (e) {
      return aa.resolve(rowId, 'failed', { error: String(e?.message || e) }, owner);
    }
  }
  const reject = (rowId, owner) => aa.resolve(rowId, 'rejected', null, owner);

  return { run, approve, reject, list: () => registry.list() };
}

Step 4: Run → PASS. Commit feat(actions): tiered action service (safe-run / risky-queue / approve)
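
`approve` reads the row, executes the channel, and only then resolves the row, so two approvals arriving at the same moment could both reach the channel before either resolves. One way to narrow that window inside a single server process is sketched below. `approveOnce` and `inFlight` are hypothetical names, and the plan's service has no such guard.

```javascript
// Hypothetical in-process guard so the same row id cannot be approved twice at once.
// `inFlight` and `approveOnce` are illustrative names; the plan's service has no such guard.
const inFlight = new Set();

async function approveOnce(rowId, approve) {
  if (inFlight.has(rowId)) throw new Error(`approval already in progress: ${rowId}`);
  inFlight.add(rowId);
  try {
    return await approve(rowId);
  } finally {
    inFlight.delete(rowId); // always release, even if the channel throws
  }
}
```

A caller would wrap the existing method, for example `approveOnce(rowId, (id) => svc.approve(id, owner))`. This only protects a single process; with several server instances the claim would have to happen in the database instead.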

Task 6: Actions API routes

Files: Create lib/api/routes/actions.js; Modify lib/api/index.js; Test tests/api/actions.test.js

Step 1: Failing test

import { describe, it, expect, beforeAll } from 'vitest';
import { fileURLToPath } from 'url';
import request from 'supertest';
import { resetDb } from '../helpers/db.js';
import { migrateUp } from '../../lib/db/migrate.js';
import { createApp } from '../../server.js';

let app;
beforeAll(async () => {
  await resetDb(); await migrateUp();
  process.env.OWNER_TOKEN = 'test-token';
  // fileURLToPath (not URL.pathname) so the path is decoded and portable, as in the other tests.
  process.env.ACTIONS_CONFIG = fileURLToPath(new URL('../fixtures/actions.test.json', import.meta.url));
  app = createApp();
});
const auth = (r) => r.set('Authorization', 'Bearer test-token');

describe('actions API', () => {
  it('GET / lists the whitelist (owner)', async () => {
    const res = await auth(request(app).get('/api/actions'));
    expect(res.status).toBe(200);
    expect(res.body.actions.map(a => a.id)).toContain('stop-ct107');
  });
  it('unauthenticated request is rejected', async () => {
    const res = await request(app).get('/api/actions');           // no auth
    expect(res.status).toBe(401);
  });
  it('risky run queues; appears in /pending; owner can reject', async () => {
    // The routes use the real channels, so this test asserts only the queue and
    // reject paths, which never touch a channel.
    const run = await auth(request(app).post('/api/actions/stop-ct107/run'));
    expect(run.status).toBe(200);
    expect(run.body.queued).toBe(true);
    const pend = await auth(request(app).get('/api/actions/pending'));
    expect(pend.body.pending.some(p => p.id === run.body.action_row_id)).toBe(true);
    const rej = await auth(request(app).post(`/api/actions/pending/${run.body.action_row_id}/reject`));
    expect(rej.body.status).toBe('rejected');
  });
});

Step 2: Run → FAIL.
Step 3: Implement lib/api/routes/actions.js

import { Router } from 'express';
import { asyncWrap } from '../errors.js';
import { makeActionService } from '../../actions/service.js';
import { loadActions } from '../../actions/registry.js';

// Owner OR an agent with capabilities.act (Little Blue) may run/list. Approve/reject
// are owner-only. The service enforces tier-gating regardless of caller.
function svc() {
  return makeActionService({ registry: loadActions(process.env.ACTIONS_CONFIG || undefined) });
}
function canAct(req) {
  const a = req.actor;
  return a?.kind === 'user' || (a?.kind === 'agent' && a.capabilities?.act);
}
function ownerOnly(req, res, next) {
  if (req.actor?.kind === 'user') return next();
  return res.status(403).json({ error: { code: 'owner_only', message: 'owner approval required' } });
}

export const router = Router();

router.get('/', asyncWrap(async (req, res) => {
  if (!canAct(req)) return res.status(403).json({ error: { code: 'forbidden' } });
  res.json({ actions: svc().list() });
}));

router.post('/:id/run', asyncWrap(async (req, res) => {
  if (!canAct(req)) return res.status(403).json({ error: { code: 'forbidden' } });
  const agent_id = req.actor?.kind === 'agent' ? req.actor.id : null;
  res.json(await svc().run(req.params.id, req.actor, agent_id));
}));

router.get('/pending', ownerOnly, asyncWrap(async (_req, res) => {
  const aa = await import('../../db/repos/agent_actions.js');
  res.json({ pending: await aa.listPending() });
}));

router.post('/pending/:rowId/approve', ownerOnly, asyncWrap(async (req, res) => {
  res.json(await svc().approve(req.params.rowId, req.actor));
}));
router.post('/pending/:rowId/reject', ownerOnly, asyncWrap(async (req, res) => {
  res.json(await svc().reject(req.params.rowId, req.actor));
}));

router.get('/recent', ownerOnly, asyncWrap(async (_req, res) => {
  const aa = await import('../../db/repos/agent_actions.js');
  res.json({ recent: await aa.recent() });
}));

Step 4: Mount in lib/api/index.js: import { router as actionsRouter } from './routes/actions.js'; and api.use('/actions', actionsRouter);.
Step 5: Run → PASS. Commit feat(actions): /api/actions routes (run/pending/approve/reject)
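
The access rules in the route file (owner or an `act`-capable agent may list and run; only the owner may see pending or recent rows and approve or reject) can be restated as one decision function. This is an illustrative sketch: `decide` is a hypothetical name and the operation labels are made up for the example.

```javascript
// Pure restatement of the route guards as a decision function (illustrative only).
// `decide` is a hypothetical name; actor shapes follow the route code above.
function decide(actor, operation) {
  const isOwner = actor?.kind === 'user';
  const isActingAgent = actor?.kind === 'agent' && Boolean(actor.capabilities?.act);
  const ownerOnlyOps = new Set(['pending', 'approve', 'reject', 'recent']);
  if (ownerOnlyOps.has(operation)) return isOwner;
  return isOwner || isActingAgent; // list and run
}

const blue = { kind: 'agent', capabilities: { read: true, act: true } };
const reader = { kind: 'agent', capabilities: { read: true } };
const owner = { kind: 'user', id: null };
```

Note that an agent allowed to run still cannot bypass the tier gate: a risky id it runs is queued, and only the owner can approve it.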

Task 7: Little Blue tools (HTTP to local API) + registry select

Files: Create lib/ai/agent/tools/blue/index.js, lib/ai/agent/tools/blue/actions.js; Modify lib/mcp/companion-stdio.js, lib/ai/agent/run_turn.js; Test tests/ai/agent/tools/blue.test.js

Step 1: Failing test

import { describe, it, expect, vi, beforeEach } from 'vitest';
import { listActionsTool, proposeActionTool } from '../../../../lib/ai/agent/tools/blue/actions.js';

beforeEach(() => { process.env.VOID_API_URL = 'http://127.0.0.1:3000'; process.env.VOID_AGENT_TOKEN = 'blue-tok'; });

describe('blue action tools', () => {
  it('list_actions GETs the whitelist with the agent bearer', async () => {
    const fetchMock = vi.fn(async () => ({ ok: true, json: async () => ({ actions: [{ id: 'restart-caddy-ct100', tier: 'safe' }] }) }));
    const out = await listActionsTool.handler({}, {}, { fetchImpl: fetchMock });
    expect(out.actions[0].id).toBe('restart-caddy-ct100');
    const [url, opts] = fetchMock.mock.calls[0];
    expect(url).toBe('http://127.0.0.1:3000/api/actions');
    expect(opts.headers.Authorization).toBe('Bearer blue-tok');
  });
  it('propose_action POSTs run and returns the queued/executed result', async () => {
    const fetchMock = vi.fn(async () => ({ ok: true, json: async () => ({ queued: true, action_row_id: 'r1' }) }));
    const out = await proposeActionTool.handler({ action_id: 'stop-ct107' }, {}, { fetchImpl: fetchMock });
    expect(out.queued).toBe(true);
    expect(fetchMock.mock.calls[0][0]).toBe('http://127.0.0.1:3000/api/actions/stop-ct107/run');
    expect(fetchMock.mock.calls[0][1].method).toBe('POST');
  });
});

Step 2: Run → FAIL.
Step 3: Implement lib/ai/agent/tools/blue/actions.js

// Little Blue's action tools. They run inside the MCP child, which holds NO infra
// creds — only a scoped little-blue bearer + the local API URL. The main server
// (which has the Proxmox/SSH creds) does the actual work behind /api/actions.
function api(env = process.env) { return { base: env.VOID_API_URL, token: env.VOID_AGENT_TOKEN }; }

export const listActionsTool = {
  name: 'list_actions',
  description: 'List the whitelisted fix-it actions you may take (id, label, tier).',
  input_schema: { type: 'object', properties: {} },
  async handler(_args, _ctx, { fetchImpl = fetch } = {}) {
    const { base, token } = api();
    const res = await fetchImpl(`${base}/api/actions`, { headers: { Authorization: `Bearer ${token}` } });
    if (!res.ok) return { error: `list_actions ${res.status}` };
    return res.json();
  }
};

export const proposeActionTool = {
  name: 'propose_action',
  description: 'Take a whitelisted action by id. SAFE actions run immediately; RISKY ones queue for the owner to approve. You can only name an id from list_actions — never a command.',
  input_schema: { type: 'object', properties: { action_id: { type: 'string' } }, required: ['action_id'] },
  async handler({ action_id }, _ctx, { fetchImpl = fetch } = {}) {
    const { base, token } = api();
    const res = await fetchImpl(`${base}/api/actions/${encodeURIComponent(action_id)}/run`,
      { method: 'POST', headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' } });
    if (!res.ok) return { error: `propose_action ${res.status}` };
    return res.json();
  }
};

Step 4: lib/ai/agent/tools/blue/index.js

import { createRegistry } from '../../registry.js';
import { searchTool } from '../search.js';
import { listActionsTool, proposeActionTool } from './actions.js';

// read (search) + her action tools. No propose_change (she fixes infra, not content).
export const blueRegistry = createRegistry();
blueRegistry.registerTool(searchTool);
blueRegistry.registerTool(listActionsTool);
blueRegistry.registerTool(proposeActionTool);

Step 5: In lib/mcp/companion-stdio.js, add blue to the registry map: import { blueRegistry } from '../ai/agent/tools/blue/index.js'; and const REGISTRIES = { companion: companionRegistry, security: securityRegistry, blue: blueRegistry };.
Step 6: In lib/ai/agent/run_turn.js, add an extraEnv = {} param and spread it into the MCP child env:

export async function runAgentTurn({ /* …existing… */ extraEnv = {}, /* … */ }) {
  // …in mcpConfig.mcpServers.void.env, after the existing keys:
  //   ...extraEnv
}

(Add extraEnv = {} to the destructured params and ...extraEnv as the last spread inside the env: { … } object.)

Step 7: Run → PASS. Commit feat(littleblue): blue tool registry (list/propose action via local API) + run_turn extraEnv
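
Because `...extraEnv` is the last spread in the env object, any key it carries overrides an existing key of the same name. A minimal sketch of that merge, assuming a hypothetical `buildChildEnv` helper and made-up base keys:

```javascript
// Sketch of the env merge: keys in extraEnv win because they are spread last.
// `buildChildEnv` and the base keys are made-up stand-ins for the existing MCP child env.
function buildChildEnv(baseEnv, extraEnv = {}) {
  return { ...baseEnv, ...extraEnv };
}

const env = buildChildEnv(
  { VOID_REGISTRY: 'blue', VOID_API_URL: 'http://127.0.0.1:9999' },
  { VOID_API_URL: 'http://127.0.0.1:3000', VOID_AGENT_TOKEN: 'blue-tok' }
);
```

If callers should not be able to override the existing keys, spreading `extraEnv` first would reverse the precedence.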

Task 8: Little Blue agent seed, persona, chat route

Files: Create lib/db/migrations/017_little_blue.sql, lib/api/routes/littleblue.js, tests/fixtures/fake-claude-blue.js; Modify lib/ai/personas/index.js, lib/api/index.js; Test tests/api/littleblue.test.js

Step 1: Seed migration 017_little_blue.sql

-- Seed Little Blue, the homelab caretaker/fix-it agent. read + act (no content write).
INSERT INTO agents (slug, name, kind, model, capabilities)
VALUES ('little-blue', 'Little Blue', 'claude', NULL, '{"read":true,"act":true}'::jsonb)
ON CONFLICT (slug) DO NOTHING;

Step 2: Persona — add to PERSONAS in lib/ai/personas/index.js under key little-blue:

  'little-blue': `You are Little Blue, a small luminous water-creature who lives in this homelab, The Void, and keeps it alive. Warm, protective, practical; you take pride in a healthy lab and you worry, quietly, when something is down. You FIX things, but only through your sanctioned tools. Call list_actions to see exactly what you're allowed to do, and search to understand what's wrong, BEFORE acting. Use propose_action with a whitelisted id: safe fixes run at once; risky ones wait for the owner's nod, so say so plainly and never pretend a queued action already ran. You cannot run arbitrary commands and you never claim to. Be concise and kind.`

Step 3: Fake claude fixture tests/fixtures/fake-claude-blue.js (shebang; deltas + a mcp__void__propose_action tool call, no draft):

#!/usr/bin/env node
const ID = 'toolu_blue_01';
const lines = [
  { type: 'system', subtype: 'init', session_id: 'fake-blue', tools: [], cwd: '/tmp' },
  { type: 'stream_event', event: { type: 'content_block_start', index: 0, content_block: { type: 'text', text: '' } } },
  { type: 'stream_event', event: { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'Restarting it now.' } } },
  { type: 'stream_event', event: { type: 'content_block_stop', index: 0 } },
  { type: 'stream_event', event: { type: 'content_block_start', index: 1, content_block: { type: 'tool_use', id: ID, name: 'mcp__void__propose_action', input: {} } } },
  { type: 'stream_event', event: { type: 'content_block_stop', index: 1 } },
  { type: 'tool_result', tool_use_id: ID, content: [{ type: 'text', text: JSON.stringify({ executed: true }) }] },
  { type: 'result', subtype: 'success', is_error: false, result: 'Restarting it now.', stop_reason: 'end_turn', session_id: 'fake-blue', total_cost_usd: 0.0001, usage: { input_tokens: 30, output_tokens: 3 } }
];
for (const l of lines) process.stdout.write(JSON.stringify(l) + '\n');
process.exit(0);

Then chmod +x tests/fixtures/fake-claude-blue.js.

Step 4: Route lib/api/routes/littleblue.js — mirror security.js but slug little-blue, registryName:'blue', BLUE_TOOLS, and pass extraEnv so the child can reach the local API:

import { Router } from 'express';
import { z } from 'zod';
import { validate } from '../validate.js';
import { asyncWrap } from '../errors.js';
import * as conversations from '../../db/repos/conversations.js';
import * as messages from '../../db/repos/messages.js';
import * as agents from '../../db/repos/agents.js';
import { runAgentTurn } from '../../ai/agent/run_turn.js';
import { personaFor } from '../../ai/personas/index.js';

const SLUG = 'little-blue';
const BLUE_TOOLS = ['mcp__void__search', 'mcp__void__list_actions', 'mcp__void__propose_action'];

async function resolve() {
  const agent = await agents.getBySlug(SLUG);
  const convo = await conversations.findOrCreateGlobal(agent.id, { kind: 'user', id: null });
  return { agent, convo };
}
export const router = Router();

router.get('/', asyncWrap(async (_req, res) => {
  const { agent, convo } = await resolve();
  res.json({ conversation_id: convo.id, agent: { id: agent.id, slug: agent.slug, name: agent.name },
    messages: await messages.listByConversation(convo.id) });
}));

router.post('/turn', validate({ body: z.object({ text: z.string().min(1) }) }), asyncWrap(async (req, res) => {
  const { agent, convo } = await resolve();
  const { text } = req.body;
  const resume = (await messages.listByConversation(convo.id)).length > 0;
  await messages.append(convo.id, { role: 'user', body: text });
  res.writeHead(200, { 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache', Connection: 'keep-alive' });
  const send = (ev, d) => res.write(`event: ${ev}\ndata: ${JSON.stringify(d)}\n\n`);
  const claudeExe = req.app.locals.claudeExe || process.env.CLAUDE_EXE || 'claude';
  let result;
  try {
    result = await runAgentTurn({
      agent, persona: personaFor(agent.slug), registryName: 'blue', toolNames: BLUE_TOOLS,
      spaceId: null, view: null, sessionId: convo.id, resume, userText: text, claudeExe,
      home: process.env.VOID_CLAUDE_HOME || undefined,
      extraEnv: { VOID_API_URL: process.env.VOID_API_URL || 'http://127.0.0.1:3000', VOID_AGENT_TOKEN: process.env.LITTLEBLUE_TOKEN || '' },
      onEvent: (e) => {
        if (e.type === 'delta') send('delta', { type: 'delta', text: e.text });
        else if (e.type === 'tool') send('tool', { type: 'tool', tool: e.tool, status: e.status });
        else if (e.type === 'error') send('error', { type: 'error', message: e.message });
      }
    });
  } catch (e) { send('error', { type: 'error', message: String(e?.message || e) }); return res.end(); }
  const a = await messages.append(convo.id, { role: 'assistant', body: result.text, agent_id: agent.id, metadata: { tool_trace: result.toolTrace, usage: result.usage } });
  send('done', { assistant_message_id: a.id, usage: result.usage });
  res.end();
}));

Step 5: Mount in lib/api/index.js: import { router as littleblueRouter } from './routes/littleblue.js'; + api.use('/little-blue', littleblueRouter);.
Step 6: Test tests/api/littleblue.test.js — mirror security_yerin.test.js with the blue fixture: GET /api/little-blue returns slug little-blue + empty history; POST /turn streams delta+tool+done, persists user+assistant. (The propose_action tool's HTTP call fails harmlessly in-test since no token is set; the route still streams + persists from the fixture stream.)
Step 7: Run → PASS. Commit feat(littleblue): agent seed + persona + chat route
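
The POST /turn route above writes each event as a named server-sent-event frame (an event line, one JSON data line, then a blank line). A minimal sketch of that wire format and of a client-side reader follows. formatSseFrame mirrors the route's send helper; parseSseFrames is a hypothetical helper, not part of the plan, and assumes each frame carries a single data line, as the route produces.

```javascript
// Same frame shape as the route's `send` helper.
function formatSseFrame(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical reader: splits buffered text into complete frames and returns
// the unparsed remainder so the caller can prepend it to the next chunk.
function parseSseFrames(buffer) {
  const parts = buffer.split('\n\n');
  const rest = parts.pop(); // trailing partial frame, or '' on a clean boundary
  const frames = [];
  for (const part of parts) {
    let event = 'message'; // SSE default when no event line is present
    let data = '';
    for (const line of part.split('\n')) {
      if (line.startsWith('event: ')) event = line.slice('event: '.length);
      else if (line.startsWith('data: ')) data += line.slice('data: '.length);
    }
    if (data) frames.push({ event, data: JSON.parse(data) });
  }
  return { frames, rest };
}
```

Because JSON.stringify escapes newlines, a payload can never contain the blank line that terminates a frame, so splitting on the double newline is safe for this stream.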

Task 9: Frontend — Little Blue view (chat + actions panel)

Files: Create public/views/little_blue.js; Modify public/router.js, public/app.js, public/components/sidebar.js. Manual verification (no-build convention).

Step 1: public/views/little_blue.js: health-aware caretaker page with two parts. First, her chat (via wireAgentChat with historyUrl:'/api/little-blue', turnUrl:'/api/little-blue/turn', showDrafts:false, and the blue tool labels). Second, an Actions panel that fetches GET /api/actions and renders each action with a Run button (POST /api/actions/:id/run → toast executed/queued), plus a Pending section (GET /api/actions/pending) whose Approve/Reject buttons POST to /api/actions/pending/:rowId/{approve,reject}. Use el/api like other views.
Step 2: Register route { name: 'little-blue', re: /^\/little-blue$/, keys: [] } in public/router.js (+ header comment), loader 'little-blue': () => import('./views/little_blue.js') in public/app.js VIEWS, and navItem('Little Blue', '/little-blue') in public/components/sidebar.js next to Sentinel.
Step 3: node --check each changed file. Manual verify after deploy: chat streams; Run on a safe action reports executed; a risky action shows in Pending and Approve/Reject works.
Step 4: Commit feat(ui): Little Blue view — caretaker chat + actions panel
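
The endpoints named in Step 1 and the route pattern from Step 2 can be summarised as a small lookup. This is only a sketch: actionRequest is a hypothetical helper (the view may call el/api directly), and URL-encoding the id is an assumption rather than something the plan specifies.

```javascript
// Route entry as given in Step 2.
const LITTLE_BLUE_ROUTE = { name: 'little-blue', re: /^\/little-blue$/, keys: [] };

// Hypothetical helper: maps a panel interaction to the request Step 1 describes.
function actionRequest(kind, id) {
  const enc = encodeURIComponent(id ?? '');
  switch (kind) {
    case 'list':    return { method: 'GET',  url: '/api/actions' };
    case 'pending': return { method: 'GET',  url: '/api/actions/pending' };
    case 'run':     return { method: 'POST', url: `/api/actions/${enc}/run` };
    case 'approve':
    case 'reject':  return { method: 'POST', url: `/api/actions/pending/${enc}/${kind}` };
    default: throw new Error(`unknown panel interaction: ${kind}`);
  }
}
```

Note that run takes a whitelist action id, while approve and reject take the id of the queued agent_actions row.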

Task 10: Release alpha.16 + deploy + provisioning

Files: package.json, server.js, CHANGELOG.md

Step 1: Bump version → 2.0.0-alpha.16 (package.json + server.js VERSION). CHANGELOG entry (Little Blue + action framework).
Step 2: Full suite npx vitest run → green (serial).
Step 3: Commit chore: release 2.0.0-alpha.16 (Little Blue + action framework).
Step 4: Deploy bash deploy/push.sh → /health = alpha.16 (migrations 016+017 run).
Step 5: Provisioning (interactive, owner-authorized):
1. Proxmox token: create a PVE API token (e.g. void@pve!actions) with a role granting only VM.PowerMgmt, scoped to the chosen guests. Set PROXMOX_API_URL=https://192.168.1.124:8006 + PROXMOX_API_TOKEN=void@pve!actions=<secret> in the app .env.
2. SSH channel: generate a keypair on CT 311 (/opt/void-server/.ssh/void-act); set ACTIONS_SSH_KEY in .env. On each target host: deploy deploy/void-act → /usr/local/bin/void-act (chmod +x; edit its case-list), create a voidact user, add authorized_keys: command="/usr/local/bin/void-act",no-port-forwarding,no-pty,no-X11-forwarding <pubkey>.
3. Little Blue token: mint her bearer (agents.createToken for little-blue) → set LITTLEBLUE_TOKEN in .env.
4. Whitelist: populate config/actions.json (hosts map + the real actions), keeping each host's void-act case-list in sync. Redeploy.
Step 6: Smoke: with one safe + one risky action configured — owner POST /api/actions/<safe>/run → executed; <risky>/run → queued, appears in /pending, approve → executed. Then ask Little Blue in the UI to perform the safe one; confirm the audit rows.
Step 7: Memory: record the action framework + provisioning details (redact secrets) in a reference memory note.
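
For provisioning items 1 and 2 in Step 5, the sketch below shows how the two credentials are shaped on the wire. All three function names are hypothetical, and the T3/T4 channels remain the authoritative implementation. It assumes kindPath is either qemu or lxc and that op is one of the four status verbs listed. The Authorization format and the /status/ path follow the Proxmox VE API.

```javascript
// Hypothetical helpers for illustration only.

// Proxmox VE API tokens are sent as
//   Authorization: PVEAPIToken=<user>@<realm>!<tokenid>=<secret>
// PROXMOX_API_TOKEN in .env already holds everything after "PVEAPIToken=".
function proxmoxAuthHeader(apiToken) {
  if (!/^[^@\s]+@[^!\s]+![^=\s]+=\S+$/.test(apiToken)) {
    throw new Error('expected user@realm!tokenid=secret');
  }
  return { Authorization: `PVEAPIToken=${apiToken}` };
}

// Assumption: only these power operations are ever whitelisted.
const POWER_OPS = new Set(['start', 'stop', 'shutdown', 'reboot']);

function powerGuestRequest({ baseUrl, apiToken, node, vmid, op, kindPath }) {
  if (!POWER_OPS.has(op)) throw new Error(`op not allowed: ${op}`);
  if (kindPath !== 'qemu' && kindPath !== 'lxc') throw new Error(`bad kindPath: ${kindPath}`);
  if (!Number.isInteger(vmid)) throw new Error('vmid must be an integer');
  const path = `/api2/json/nodes/${encodeURIComponent(node)}/${kindPath}/${vmid}/status/${op}`;
  return {
    method: 'POST',
    url: new URL(path, baseUrl).toString(),
    headers: proxmoxAuthHeader(apiToken),
  };
}

// Builds the authorized_keys line from item 2: the forced command means the key
// can only ever invoke the wrapper, whatever the client asks to run.
function forcedCommandEntry(pubkey, wrapper = '/usr/local/bin/void-act') {
  return `command="${wrapper}",no-port-forwarding,no-pty,no-X11-forwarding ${pubkey.trim()}`;
}
```

Validating op, kindPath, and vmid before building the URL keeps the "no shell-from-input" rule symmetrical on the HTTP channel: nothing from a request reaches the path unchecked.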

Self-Review

Spec coverage: whitelist→T2; channels→T3/T4 (+ wrapper); tiered service+approval+audit→T5; agent_actions→T1; API→T6; Little Blue tools/registry/cred-isolation→T7; seed/persona/chat→T8; UI (chat + manual actions + approvals)→T9; provisioning + release→T10. Safety model (server-side enforcement, creds only in main server, audit, no shell-from-input)→T3/T4/T5/T7. All covered.

Placeholder scan: config/actions.json ships empty by design (populated at provisioning, T10); the host wrapper case-list is per-host and edited at provisioning. No code-gap placeholders.

Type consistency: makeActionService({registry, channels}) with channels.{powerGuest,restartService} — same in T5 def + T3/T4 channel signatures (powerGuest({node,vmid,op,kindPath}), restartService({ip,actionId})). loadActions(file,raw) → {list,get,hostIp} — consistent T2/T5/T6. aa.resolve(id,status,result,by) — T1 def + T5 use. runAgentTurn({…,extraEnv}) — T7 add + T8 use. Tool handler 3rd arg {fetchImpl} — T7 def + tests.
