feat(infra): commit live infra-audit/cluster work to reconcile git with prod

This work (network_hosts inventory + infra_audit MCP tool, /api/cluster +
Sacred Valley cluster card, topbar cluster-health pill + SW self-heal) was
built in an earlier session and DEPLOYED to CT 311 as alpha.24–26, but was
never committed to git — prod was running code absent from the repo. Commits
it as-is (already prod-validated) so git matches the live state, and restores
its alpha.24/25/26 CHANGELOG entries. Files are disjoint from the fold-in
work; both now ship together under alpha.27.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
root
2026-06-08 15:20:38 +10:00
parent ae2ea09f0c
commit b0b23ba05d
19 changed files with 606 additions and 4 deletions


@@ -9,6 +9,19 @@ Format: [Keep a Changelog](https://keepachangelog.com).
- feat: phuryn usage dashboard now reachable at aiusage.hynesy.com behind CF Access.
- feat: Sacred Valley AI Usage card opens the in-Void #/ai-usage route.
## 2.0.0-alpha.26 — Topbar cluster-health pill + always-fresh self-heal
- **Topbar cluster-health indicator** (`public/components/topbar.js`): a themed pill left of Inbox/Chat/Owner that polls `/api/cluster` every 30s and shows **healthy** (green) when quorate + all nodes online + HA clean, **HA issue / node down / no quorum** (amber/red) otherwise. Click → Sacred Valley. Reuses the `--ok/--warn/--bad` dot palette.
- **Always-fresh self-heal** (`public/index.html`): inline pre-module script unregisters any service worker and clears caches on every load. The legacy Void 1 caching SW (origin-scoped to `void.hynesy.com`) was serving stale assets that survived hard reloads; this removes it on the next load and prevents recurrence on every device. Assets are already served `no-cache`, so with no SW the app is always fresh.
## 2.0.0-alpha.25 — Cluster health Sacred Valley card
- **`GET /api/cluster`** (`lib/proxmox/cluster.js` + route, 10s-cached): read-only Proxmox cluster health — `quorate`, per-node online state, HA master/fencing, and HA service count + error count. Pure `normalizeCluster()` folds `/cluster/status` + `/cluster/ha/status/current`; unit-tested with injected fetch. Uses a **dedicated read-only PVE token** (`PROXMOX_RO_TOKEN`, user `void-ro@pve` with `PVEAuditor` on `/`) — never the power-action token.
- **Sacred Valley "Cluster · HZ" card** (`public/views/cards/cluster.js`, registered in `sacred_valley.js`): polls every 30s, shows the quorum badge, node up/down dots, master, and HA-service issues. Reuses the tile status palette (blackflame `--ok`/`--warn`/`--bad`).
## 2.0.0-alpha.24 — Infra sanity check + LAN host/MAC inventory
- **`network_hosts` inventory table** (`migration 023`, repo `lib/db/repos/network_hosts.js`): authoritative id→ip→MAC map of every cluster guest + PVE host + the Pi QDevice, seeded from a live capture. Source of truth for router DHCP reservations (the LAN pool is the whole `.2-.254`, so each pinned guest needs a static IP + a MAC reservation) and for the audit below. Idempotent seed (`ON CONFLICT DO UPDATE`).
- **`infra_audit` sanity check** (`lib/infra/audit.js`, `GET /api/infra/audit`, MCP tool `infra_audit` in `blueRegistry`): probes every `192.168.x.y:port` referenced in the Wiki **and** every enabled service URL, reports unreachable endpoints (stale/incorrect IPs or ports) grouped by source, plus inventory hosts missing a MAC. Read-only TCP connects; available to the owner or any authed agent (e.g. Little Blue) so agents can verify the docs/registry match reality.
- **Service registry IP fixes**: `magicmirror` → `192.168.1.224`, `obd2` → `192.168.1.225` (moved off contested DHCP-range addresses to static).
## 2.0.0-alpha.23 — Local/remote-aware service tiles
- **Optional `external` URL per service** (`migration 022`, `config/services.json`, repo + `/api/health/services` payload + `svcBody`): Little Blue health-band tiles previously linked to the single LAN `url`, so they opened dead private IPs when browsing remotely (e.g. Gramps `http://192.168.1.99`). Migration adds the column and **backfills** curated domains by id (the live instance is already seeded, so a column-add alone wouldn't populate them); also normalises `jellyfin`/`chaptarr` (which stored a domain in `url`) to LAN `url` + `external`.
- **Context-based tile target + one-click alt** (`public/views/service_url.js`, `public/components/service_tile.js`, `public/views/health_band.js`): the tile picks its primary URL from `location.hostname` — public host (e.g. `void.hynesy.com`) opens the domain, private IP/localhost/.local opens the LAN address — and always offers a `⇄` alt to the *other* URL (a reliable manual fallback; an auto-probe can't work because an HTTPS dashboard is blocked from probing `http://` LAN IPs by mixed-content). Services with no `external` are dimmed with a "LAN-only" badge when remote. Tile root is now a `div` with a stretched primary `<a>` + sibling alt `<a>` (no nested anchors). Health checker unchanged (still probes LAN `url` from CT 311).


@@ -10,7 +10,7 @@
{ "id": "gramps", "name": "Gramps Web", "category": "infrastructure", "host": "ct109", "url": "http://192.168.1.99", "external": "https://gramps.hynesy.com", "icon": "gramps" },
{ "id": "scanopy", "name": "Scanopy", "category": "infrastructure", "host": "ct100", "url": "http://192.168.1.230:60072", "icon": "scanopy" },
{ "id": "homelab", "name": "Homelable", "category": "infrastructure", "host": "ct100", "url": "http://192.168.1.230:3000", "icon": "" },
-{ "id": "obd2", "name": "OBD2", "category": "infrastructure", "host": "ct .28", "url": "http://192.168.1.28:8384", "icon": "" },
+{ "id": "obd2", "name": "OBD2", "category": "infrastructure", "host": "ct112 · .225", "url": "http://192.168.1.225:8384", "icon": "" },
{ "id": "pterodactyl", "name": "Pterodactyl", "category": "infrastructure", "host": "192.168.1.247", "url": "http://192.168.1.247", "icon": "pterodactyl" },
{ "id": "pve-z", "name": "Proxmox · z", "category": "infrastructure", "host": "z", "url": "https://192.168.1.124:8006", "icon": "proxmox", "check": { "type": "tcp" } },
{ "id": "pve-z3", "name": "Proxmox · Z3", "category": "infrastructure", "host": "z3", "url": "https://192.168.1.125:8006", "icon": "proxmox", "check": { "type": "tcp" } },
@@ -25,6 +25,6 @@
{ "id": "void1", "name": "The Void 1.x", "category": "other", "host": "ct301", "url": "http://192.168.1.11:2424", "icon": "void" },
{ "id": "farm-timelapse", "name": "Farm Timelapse", "category": "other", "host": "192.168.1.108", "url": "http://192.168.1.108:8000", "icon": "" },
-{ "id": "magicmirror", "name": "MagicMirror", "category": "other", "host": "192.168.1.27", "url": "http://192.168.1.27:8080", "icon": "magicmirror" },
+{ "id": "magicmirror", "name": "MagicMirror", "category": "other", "host": "ct111 · .224", "url": "http://192.168.1.224:8080", "icon": "magicmirror" },
{ "id": "claude-usage", "name": "Claude Usage", "category": "other", "host": "ct300", "url": "http://192.168.1.212:8080", "icon": "claude" }
]


@@ -1,9 +1,12 @@
import { createRegistry } from '../../registry.js';
import { searchTool } from '../search.js';
import { listActionsTool, proposeActionTool } from './actions.js';
import { infraAuditTool } from './infra_audit.js';
-// read (search) + her action tools. No propose_change (she fixes infra, not content).
+// read (search) + her action tools + infra sanity check. No propose_change
+// (she fixes infra, not content).
export const blueRegistry = createRegistry();
blueRegistry.registerTool(searchTool);
blueRegistry.registerTool(listActionsTool);
blueRegistry.registerTool(proposeActionTool);
blueRegistry.registerTool(infraAuditTool);


@@ -0,0 +1,17 @@
// Little Blue's infra sanity check. Runs in the MCP child (no infra creds) — it
// calls the main server's read-only /api/infra/audit, which probes wiki-referenced
// endpoints + registered service URLs and reports anything unreachable (e.g. a
// doc/registry pointing at a stale IP) plus inventory hosts missing a MAC.
function api(env = process.env) { return { base: env.VOID_API_URL, token: env.VOID_AGENT_TOKEN }; }
export const infraAuditTool = {
name: 'infra_audit',
description: 'Run a homelab sanity check: probe every IP:port the wiki references and every monitored service, and report unreachable endpoints (stale/incorrect IPs or ports) plus inventory hosts missing a MAC. Read-only — use to verify the docs/registry match reality.',
input_schema: { type: 'object', properties: {} },
async handler(_args, _ctx, { fetchImpl = fetch } = {}) {
const { base, token } = api();
const res = await fetchImpl(`${base}/api/infra/audit`, { headers: { Authorization: `Bearer ${token}` } });
if (!res.ok) return { error: `infra_audit ${res.status}` };
return res.json();
}
};
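
Because the handler takes its fetch implementation from its third argument, it can be exercised without a running server. The following is a minimal sketch under stated assumptions: `callInfraAudit` and `stubFetch` are hypothetical helpers that only mirror the handler's shape, and the URL, token, and payload are placeholder values rather than anything from this commit.

```javascript
// Sketch only: mirrors the shape of the handler above so it can run standalone.
// `callInfraAudit` and `stubFetch` are hypothetical names used for illustration.
async function callInfraAudit(env, fetchImpl) {
  const res = await fetchImpl(`${env.VOID_API_URL}/api/infra/audit`, {
    headers: { Authorization: `Bearer ${env.VOID_AGENT_TOKEN}` }
  });
  if (!res.ok) return { error: `infra_audit ${res.status}` };
  return res.json();
}

// Builds a fake fetch that records its calls and answers with a fixed status/payload.
function stubFetch(status, payload) {
  const calls = [];
  const fetchImpl = async (url, init) => {
    calls.push({ url, init });
    return { ok: status >= 200 && status < 300, status, json: async () => payload };
  };
  return { fetchImpl, calls };
}

// Placeholder environment; real values would come from VOID_API_URL / VOID_AGENT_TOKEN.
const demoEnv = { VOID_API_URL: 'http://void.example', VOID_AGENT_TOKEN: 'demo-token' };

async function demo() {
  const good = stubFetch(200, { ok: true });
  const bad = stubFetch(401, {});
  return {
    success: await callInfraAudit(demoEnv, good.fetchImpl),
    failure: await callInfraAudit(demoEnv, bad.fetchImpl),
    requestedUrl: good.calls[0].url
  };
}
```

A non-2xx response is returned as an `{ error }` object rather than thrown, so the agent receives a readable result either way.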


@@ -32,6 +32,8 @@ import { router as securityRouter } from './routes/security.js';
import { router as actionsRouter } from './routes/actions.js';
import { router as littleblueRouter } from './routes/littleblue.js';
import { router as aiUsageRouter } from './routes/ai_usage.js';
import { router as infraRouter } from './routes/infra.js';
import { router as clusterRouter } from './routes/cluster.js';
export function mountApi(app) {
const api = Router();
@@ -45,6 +47,8 @@ export function mountApi(app) {
api.use('/spaces/:space_id/companion', companionRouter);
api.use('/security', securityRouter);
api.use('/actions', actionsRouter);
api.use('/infra', infraRouter);
api.use('/cluster', clusterRouter);
api.use('/little-blue', littleblueRouter);
api.use('/ai-usage', aiUsageRouter);
api.use('/projects', projectsRouter);

lib/api/routes/cluster.js Normal file

@@ -0,0 +1,17 @@
import { Router } from 'express';
import { asyncWrap } from '../errors.js';
import { clusterHealth } from '../../proxmox/cluster.js';
// Read-only cluster health for the Sacred Valley card. Cached briefly so multiple
// polling clients coalesce into one PVE call. Owner or any authed agent.
export const router = Router();
let cache = { at: 0, data: null };
const TTL = 10_000;
router.get('/', asyncWrap(async (_req, res) => {
if (cache.data && Date.now() - cache.at < TTL) return res.json(cache.data);
const data = await clusterHealth();
cache = { at: Date.now(), data };
res.json(data);
}));
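
The route above caches completed responses, so requests that arrive while the first PVE call is still pending each start their own call. If that ever matters, one possible variation also shares the in-flight promise. This is a sketch of that idea, not what the commit ships; `coalescingCache` and its `load`, `ttlMs`, and `now` parameters are hypothetical names.

```javascript
// Sketch of a TTL cache that also shares one in-flight load between callers.
// `coalescingCache`, `load`, `ttlMs`, and `now` are illustrative names.
function coalescingCache(load, ttlMs, now = Date.now) {
  let at = 0;
  let data;
  let hasData = false;
  let inflight = null;
  return function get() {
    // Fresh cached value: answer immediately.
    if (hasData && now() - at < ttlMs) return Promise.resolve(data);
    // Otherwise start one load and hand the same promise to every caller.
    if (!inflight) {
      inflight = Promise.resolve()
        .then(load)
        .then((value) => { data = value; hasData = true; at = now(); return value; })
        .finally(() => { inflight = null; });
    }
    return inflight;
  };
}
```

In a route this would wrap the loader once, for example `const getCluster = coalescingCache(clusterHealth, 10_000)`, and the handler would reply with `res.json(await getCluster())`.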

lib/api/routes/infra.js Normal file

@@ -0,0 +1,26 @@
import { Router } from 'express';
import { asyncWrap } from '../errors.js';
import { pool } from '../../db/pool.js';
import * as monitored from '../../db/repos/monitored_services.js';
import * as networkHosts from '../../db/repos/network_hosts.js';
import { runAudit, tcpProbe } from '../../infra/audit.js';
// Read-only infra sanity check: probe every IP:port referenced in the wiki and
// every enabled service URL, and surface hosts missing a recorded MAC. Available
// to the owner or any authed agent (no mutations, just TCP connects).
export const router = Router();
const probe = (host, port) => tcpProbe(host, port, 1500);
router.get('/audit', asyncWrap(async (_req, res) => {
const { rows: pages } = await pool.query(
`SELECT p.title, p.body_md FROM pages p JOIN spaces s ON s.id = p.space_id WHERE s.slug = 'wiki'`);
const services = (await monitored.listEnabled()).filter(s => /^https?:\/\//.test(s.url || ''));
const report = await runAudit({ pages, services, probe });
const missingMac = (await networkHosts.missingMac()).map(h => h.id);
res.json({ ...report, inventory: { missing_mac: missingMac } });
}));
router.get('/hosts', asyncWrap(async (_req, res) => {
res.json({ hosts: await networkHosts.all() });
}));


@@ -0,0 +1,45 @@
-- 023_network_hosts.sql
-- Authoritative LAN inventory of cluster guests + hosts: id -> ip -> MAC.
-- Source of truth for router DHCP reservations and the infra_audit sanity check.
-- Pool is the whole .2-.254, so every pinned guest needs a static IP + a router
-- reservation on its MAC; this table is where we record the MAC<->IP mapping.
CREATE TABLE IF NOT EXISTS network_hosts (
id text PRIMARY KEY, -- e.g. ct100, vm200, pve-z, qdevice-pi
kind text NOT NULL, -- lxc | vm | pve-host | qdevice
name text NOT NULL,
node text, -- z | Z3 | won | -
ip text,
mac text, -- NULL when not yet captured (host down)
note text,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS idx_network_hosts_ip ON network_hosts(ip);
-- Seed the current inventory (captured 2026-06-08). Idempotent: re-running keeps
-- the row but refreshes ip/mac/note so a later edit-and-migrate stays correct.
INSERT INTO network_hosts (id, kind, name, node, ip, mac, note) VALUES
('ct100','lxc','mediastack','z','192.168.1.230','BC:24:11:D8:2B:7F','Docker media host'),
('ct102','lxc','ollama','z','192.168.1.185','BC:24:11:06:89:40','Ollama (GPU)'),
('ct103','lxc','openwebui','z','192.168.1.231','BC:24:11:98:28:A1','Open WebUI'),
('ct104','lxc','bookstack','z','192.168.1.213','BC:24:11:C3:F4:0A','BookStack mirror'),
('ct105','lxc','gitea','z','192.168.1.223','BC:24:11:AA:2B:4E','Gitea (static, was DHCP)'),
('ct106','lxc','pihole','z','192.168.1.140','BC:24:11:DB:2A:39','Pi-hole DNS adblock'),
('ct107','lxc','iventoy','z','192.168.1.150','BC:24:11:9B:01:10','PXE (parked, donatello-vm rootfs)'),
('ct108','lxc','tlcapture','z','192.168.1.108','BC:24:11:6D:97:27','Farm Timelapse'),
('ct109','lxc','gramps','z','192.168.1.99','BC:24:11:8E:D3:58','Gramps Web'),
('ct110','lxc','n8n','z','192.168.1.235','BC:24:11:28:70:30','n8n'),
('ct111','lxc','magicmirror','z','192.168.1.224','BC:24:11:6C:D4:E6','MagicMirror (static, was DHCP .27)'),
('ct112','lxc','obd2','z','192.168.1.225','BC:24:11:E7:D8:BF','OBD2 telemetry (static, was DHCP .28)'),
('ct300','lxc','claude','z','192.168.1.212','BC:24:11:9E:AA:73','Claude Code workspace'),
('ct301','lxc','void1','z','192.168.1.11','BC:24:11:4D:B7:CC','Void 1.x legacy'),
('ct310','lxc','void2-db','z','192.168.1.215','BC:24:11:49:C6:29','Void 2.0 Postgres'),
('ct311','lxc','void2-app','z','192.168.1.216','BC:24:11:9B:B7:3A','Void 2.0 app'),
('vm117','vm','Pterodactyl-Deb','z','192.168.1.247','BC:24:11:37:C1:F7','Game panel (static, in-guest)'),
('vm200','vm','OpenClaw','z','192.168.1.183','BC:24:11:29:84:B9','OpenClaw agent (static, in-guest)'),
('pve-z','pve-host','z','z','192.168.1.124','00:E0:4C:0F:36:00','Cluster node 1 (GPU)'),
('pve-z3','pve-host','Z3','Z3','192.168.1.125','6C:0B:5E:78:1C:93','Cluster node 2 (HA target)'),
('qdevice-pi','qdevice','retropie','-','192.168.1.254','D8:3A:DD:22:C4:21','QDevice corosync-qnetd — reserve this MAC to .254')
ON CONFLICT (id) DO UPDATE SET
kind = EXCLUDED.kind, name = EXCLUDED.name, node = EXCLUDED.node,
ip = EXCLUDED.ip, mac = EXCLUDED.mac, note = EXCLUDED.note, updated_at = now();
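
Since the table is described as the source for router DHCP reservations, rows from it can be turned into a reservation list. The sketch below assumes rows shaped like the `network_hosts` columns; `toReservations`, `ipToNumber`, the output shape, and the `ct999` row are illustrative, and the router's own import format is left out because the source does not say which one is used.

```javascript
// Sketch: turn network_hosts rows into a reservation list ordered by IP.
// `toReservations` and its output shape are assumptions for illustration.
function ipToNumber(ip) {
  return ip.split('.').reduce((n, octet) => n * 256 + Number(octet), 0);
}

function toReservations(rows) {
  const reservations = [];
  const skipped = [];
  for (const r of rows) {
    // A reservation needs both halves of the MAC <-> IP mapping.
    if (!r.ip || !r.mac) { skipped.push(r.id); continue; }
    reservations.push({ id: r.id, hostname: r.name, ip: r.ip, mac: r.mac.toUpperCase() });
  }
  reservations.sort((a, b) => ipToNumber(a.ip) - ipToNumber(b.ip));
  return { reservations, skipped };
}

// Two rows copied from the seed, plus a made-up row with no MAC captured yet.
const sample = [
  { id: 'ct100', name: 'mediastack', ip: '192.168.1.230', mac: 'BC:24:11:D8:2B:7F' },
  { id: 'ct109', name: 'gramps', ip: '192.168.1.99', mac: 'BC:24:11:8E:D3:58' },
  { id: 'ct999', name: 'example-pending', ip: '192.168.1.50', mac: null }
];
const plan = toReservations(sample);
```

Sorting numerically rather than as strings keeps `.99` ahead of `.230`, which matches how a router's reservation table is usually read.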


@@ -0,0 +1,28 @@
import { pool } from '../pool.js';
const COLS = 'id, kind, name, node, ip, mac, note, updated_at';
// Authoritative guest/host LAN inventory (id -> ip -> mac). Read-only here; the
// canonical seed lives in migration 023. Used by the infra_audit sanity check
// and as the source for router DHCP reservations.
export async function all() {
const { rows } = await pool.query(`SELECT ${COLS} FROM network_hosts ORDER BY id`);
return rows;
}
export async function get(id) {
const { rows: [r] } = await pool.query(`SELECT ${COLS} FROM network_hosts WHERE id=$1`, [id]);
return r || null;
}
// Hosts still missing a captured MAC (e.g. the Pi when it was down at seed time).
export async function missingMac() {
const { rows } = await pool.query(`SELECT ${COLS} FROM network_hosts WHERE mac IS NULL ORDER BY id`);
return rows;
}
export async function setMac(id, mac) {
const { rows: [r] } = await pool.query(
`UPDATE network_hosts SET mac=$2, updated_at=now() WHERE id=$1 RETURNING ${COLS}`, [id, mac]);
return r || null;
}
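
`setMac` stores whatever string it is given. A caller that takes MACs from mixed sources may want to canonicalise them first so they match the upper-case, colon-separated form used in the seed. A minimal sketch, assuming a hypothetical `normalizeMac` helper that is not part of this commit:

```javascript
// Sketch: canonicalise a MAC to the upper-case, colon-separated form used in the seed.
// `normalizeMac` is a hypothetical helper; it returns null unless the input is 12 hex digits.
function normalizeMac(input) {
  const hex = String(input ?? '').trim().replace(/[:.\-]/g, '');
  if (!/^[0-9a-fA-F]{12}$/.test(hex)) return null;
  return hex.toUpperCase().match(/.{2}/g).join(':');
}
```

A caller could then skip the update when `normalizeMac(raw)` returns `null` and pass the normalised value to `setMac(id, mac)` otherwise.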

lib/infra/audit.js Normal file

@@ -0,0 +1,86 @@
import net from 'node:net';
// Doc/infra sanity check. Pure functions with an injected `probe(host, port) ->
// Promise<bool>` so they're testable offline; the default tcpProbe is used in prod.
const LAN_RE = /(?<![\d.])(192\.168\.\d{1,3}\.\d{1,3})(?::(\d{1,5}))?(?![\d])/g;
// Pull unique LAN endpoints from free text. host-only refs come back with port:null.
export function extractEndpoints(text) {
const seen = new Map();
for (const m of String(text || '').matchAll(LAN_RE)) {
const host = m[1];
const port = m[2] ? Number(m[2]) : null;
const key = `${host}:${port ?? ''}`;
if (!seen.has(key)) seen.set(key, { host, port });
}
return [...seen.values()];
}
export function parseUrl(url) {
try {
const u = new URL(url);
const port = u.port ? Number(u.port) : (u.protocol === 'https:' ? 443 : 80);
return { host: u.hostname, port };
} catch { return null; }
}
// Default reachability probe: a TCP connect with a short timeout.
export function tcpProbe(host, port, timeoutMs = 2500) {
return new Promise((resolve) => {
const sock = new net.Socket();
let done = false;
const finish = (ok) => { if (done) return; done = true; sock.destroy(); resolve(ok); };
sock.setTimeout(timeoutMs);
sock.once('connect', () => finish(true));
sock.once('timeout', () => finish(false));
sock.once('error', () => finish(false));
sock.connect(port, host);
});
}
// Cross-check every IP:port referenced in the wiki against live reachability.
// Flags stale references (e.g. a CT that moved off an old IP) grouped by page.
export async function auditDocs({ pages, probe }) {
const map = new Map(); // host:port -> { host, port, pages:Set }
for (const p of pages || []) {
for (const ep of extractEndpoints(p.body_md)) {
const key = `${ep.host}:${ep.port ?? ''}`;
if (!map.has(key)) map.set(key, { host: ep.host, port: ep.port, pages: new Set() });
map.get(key).pages.add(p.title);
}
}
const all = [...map.values()];
const probable = all.filter(e => e.port != null);
const unprobed = all.filter(e => e.port == null).map(e => ({ host: e.host, port: null, pages: [...e.pages] }));
const unreachable = [];
for (const e of probable) {
if (!(await probe(e.host, e.port))) unreachable.push({ host: e.host, port: e.port, pages: [...e.pages] });
}
return {
ok: unreachable.length === 0,
summary: { endpoints: all.length, probed: probable.length, reachable: probable.length - unreachable.length, unreachable: unreachable.length },
unreachable,
unprobed
};
}
// Probe each registered service's LAN url; flag any that don't answer.
export async function auditServices({ services, probe }) {
let probed = 0;
const unreachable = [];
for (const s of services || []) {
const hp = parseUrl(s.url);
if (!hp) continue;
probed++;
if (!(await probe(hp.host, hp.port))) unreachable.push({ id: s.id, url: s.url, host: hp.host, port: hp.port });
}
return { ok: unreachable.length === 0, summary: { probed, unreachable: unreachable.length }, unreachable };
}
// Full sanity sweep used by the API route / MCP tool.
export async function runAudit({ pages = [], services = [], probe = tcpProbe }) {
const docs = await auditDocs({ pages, probe });
const svc = await auditServices({ services, probe });
return { ok: docs.ok && svc.ok, docs, services: svc };
}
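
To make the extraction rules concrete, the sketch below runs `extractEndpoints` over some made-up wiki text and then filters with a fake probe. `LAN_RE` and `extractEndpoints` are copied from the diff above so the block runs standalone; the sample text, `fakeProbe`, and `findUnreachable` are illustrative only.

```javascript
// Copied from the diff above so the example runs standalone.
const LAN_RE = /(?<![\d.])(192\.168\.\d{1,3}\.\d{1,3})(?::(\d{1,5}))?(?![\d])/g;
function extractEndpoints(text) {
  const seen = new Map();
  for (const m of String(text || '').matchAll(LAN_RE)) {
    const host = m[1];
    const port = m[2] ? Number(m[2]) : null;
    const key = `${host}:${port ?? ''}`;
    if (!seen.has(key)) seen.set(key, { host, port });
  }
  return [...seen.values()];
}

// Made-up wiki text for illustration.
const body = [
  'Gitea lives at 192.168.1.223:3000.',
  'MagicMirror used to be http://192.168.1.27:8080/ and is now 192.168.1.224:8080.',
  'Router is 192.168.1.1 (no port, so it is listed but not probed).',
  'Repeated mention: 192.168.1.223:3000. Not a LAN ref: 10.192.168.1.5'
].join('\n');

// Duplicates collapse; the address embedded in 10.192.168.1.5 is rejected by the lookbehind.
const endpoints = extractEndpoints(body);

// Fake probe: pretend only the old MagicMirror address is dead.
const fakeProbe = async (host, port) => !(host === '192.168.1.27' && port === 8080);

// Same idea as auditDocs: only endpoints with a port are probed.
async function findUnreachable(eps, probe) {
  const out = [];
  for (const e of eps.filter((x) => x.port != null)) {
    if (!(await probe(e.host, e.port))) out.push(`${e.host}:${e.port}`);
  }
  return out;
}
```

Injecting the probe this way is what lets the audit logic be tested offline, as the file's header comment notes.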

lib/proxmox/cluster.js Normal file

@@ -0,0 +1,76 @@
import { Agent } from 'undici';
// Read-only Proxmox cluster health for the Sacred Valley card. Uses a dedicated
// PVEAuditor token (PROXMOX_RO_TOKEN) — never the power-action token. PVE's REST
// API has no vote-count endpoint, so "quorum" here = the corosync `quorate` flag
// (from /cluster/status) plus the HA-manager quorum status (/cluster/ha/status).
let insecure;
function tlsDispatcher() {
if (process.env.PROXMOX_INSECURE_TLS !== '1') return undefined;
insecure ??= new Agent({ connect: { rejectUnauthorized: false } });
return insecure;
}
async function pveGet(path, { apiUrl, token, fetchImpl = fetch }) {
const res = await fetchImpl(`${apiUrl}/api2/json${path}`, {
headers: { Authorization: `PVEAPIToken=${token}` },
dispatcher: tlsDispatcher()
});
if (!res.ok) throw new Error(`pve ${path} -> ${res.status}`);
return (await res.json())?.data ?? [];
}
const SETTLED_STATES = new Set(['started', 'stopped', 'ignored', 'disabled']);
// Pure: fold /cluster/status + /cluster/ha/status/current into the card shape.
export function normalizeCluster(statusData = [], haData = []) {
const cluster = statusData.find(e => e.type === 'cluster') || {};
const nodes = statusData
.filter(e => e.type === 'node')
.map(n => ({ name: n.name, online: n.online === 1 || n.online === true, local: !!n.local, ip: n.ip || null }))
.sort((a, b) => a.name.localeCompare(b.name));
const quorum = haData.find(e => e.type === 'quorum') || {};
const master = haData.find(e => e.type === 'master') || {};
const fencing = haData.find(e => e.type === 'fencing') || {};
const services = haData
.filter(e => e.type === 'service')
.map(s => ({ sid: s.sid || (s.id || '').replace(/^service:/, ''), state: s.state || s.crm_state || 'unknown', node: s.node || null }))
.sort((a, b) => a.sid.localeCompare(b.sid));
const servicesError = services.filter(s => !SETTLED_STATES.has(s.state));
return {
name: cluster.name || null,
quorate: cluster.quorate === 1 || cluster.quorate === true,
nodes_total: cluster.nodes ?? nodes.length,
nodes_online: nodes.filter(n => n.online).length,
nodes,
ha: {
quorum_ok: quorum.quorate === 1 || quorum.status === 'OK',
master: master.node || null,
fencing: fencing['armed-state'] || (fencing.status ? 'armed' : null),
services_total: services.length,
services_error: servicesError.length,
services
}
};
}
export async function clusterHealth(opts = {}) {
const cfg = {
apiUrl: opts.apiUrl || process.env.PROXMOX_API_URL,
token: opts.token || process.env.PROXMOX_RO_TOKEN || process.env.PROXMOX_API_TOKEN,
fetchImpl: opts.fetchImpl || fetch
};
if (!cfg.apiUrl || !cfg.token) return { error: 'proxmox_not_configured', at: Date.now() };
try {
const [status, ha] = await Promise.all([
pveGet('/cluster/status', cfg),
pveGet('/cluster/ha/status/current', cfg).catch(() => []) // HA may be absent on a bare cluster
]);
return { ...normalizeCluster(status, ha), at: Date.now() };
} catch (e) {
return { error: String(e.message || e), at: Date.now() };
}
}


@@ -5,6 +5,29 @@ import { el, mount, clear } from '../dom.js';
import { navigate } from '../router.js';
import { on } from '../state.js';
import { toggleSidebar, toggleRail } from './chrome.js';
import { api } from '../api.js';
// Cluster health → topbar pill. Returns [status, label, title].
function classifyCluster(c) {
if (!c || c.error) return ['unknown', 'cluster ?', 'Cluster status unavailable'];
if (!c.quorate) return ['down', 'no quorum', 'Cluster has LOST quorum'];
if ((c.nodes_online ?? 0) < (c.nodes_total ?? 0)) return ['down', 'node down', `${c.nodes_online}/${c.nodes_total} nodes online`];
if (c.ha && c.ha.services_error > 0) return ['warn', 'HA issue', `${c.ha.services_error} HA service(s) in error`];
return ['ok', 'healthy', `Quorate · ${c.nodes_online}/${c.nodes_total} nodes · HA ok`];
}
function startClusterHealth(pill, labelEl) {
async function tick() {
let c = null;
try { c = await api.get('/api/cluster'); } catch { c = { error: 'fetch' }; }
const [status, label, title] = classifyCluster(c);
pill.className = 'icon-btn cluster-health status-' + status;
pill.title = title;
labelEl.textContent = label;
}
tick();
setInterval(tick, 30000);
}
function captureModal() {
const root = document.getElementById('modal-root');
@@ -37,17 +60,24 @@ export function renderTopbar(root) {
const bell = el('button', { class: 'icon-btn', onclick: () => navigate('/inbox') }, 'Inbox');
const chLabel = el('span', { class: 'ch-label' }, '…');
const clusterPill = el('button', { class: 'icon-btn cluster-health status-unknown', title: 'Cluster health', onclick: () => navigate('/sacred-valley') },
el('span', { class: 'dot' }), chLabel);
mount(root,
el('button', { class: 'chrome-toggle', title: 'Toggle menu', onclick: toggleSidebar }, '☰'),
el('div', { class: 'brand' }, 'VOID'),
el('button', { class: 'icon-btn', onclick: captureModal }, '+ Capture'),
el('div', { class: 'topbar-search' }, searchInput),
el('div', { class: 'topbar-spacer' }),
clusterPill,
bell,
el('button', { class: 'chrome-toggle', title: 'Toggle companion chat', onclick: toggleRail }, '◆'),
el('button', { class: 'icon-btn', onclick: () => alert('Agent-switching ships post-Plan-2.') }, 'Owner')
);
startClusterHealth(clusterPill, chLabel);
on('pending-count', (n) => {
const old = bell.querySelector('.badge');
if (old) old.remove();
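
The order of checks in `classifyCluster` decides which problem the pill reports when several apply at once. The sketch below copies the function from the diff above so it runs standalone and feeds it hypothetical `/api/cluster` payloads, trimmed to the fields the classifier reads; the sample objects are assumptions, not captured responses.

```javascript
// Copied from the diff above so the example runs standalone.
function classifyCluster(c) {
  if (!c || c.error) return ['unknown', 'cluster ?', 'Cluster status unavailable'];
  if (!c.quorate) return ['down', 'no quorum', 'Cluster has LOST quorum'];
  if ((c.nodes_online ?? 0) < (c.nodes_total ?? 0)) return ['down', 'node down', `${c.nodes_online}/${c.nodes_total} nodes online`];
  if (c.ha && c.ha.services_error > 0) return ['warn', 'HA issue', `${c.ha.services_error} HA service(s) in error`];
  return ['ok', 'healthy', `Quorate · ${c.nodes_online}/${c.nodes_total} nodes · HA ok`];
}

// Hypothetical payloads, trimmed to the fields the classifier reads.
const samples = {
  healthy: { quorate: true, nodes_online: 2, nodes_total: 2, ha: { services_error: 0 } },
  haIssue: { quorate: true, nodes_online: 2, nodes_total: 2, ha: { services_error: 1 } },
  nodeDown: { quorate: true, nodes_online: 1, nodes_total: 2, ha: { services_error: 0 } },
  noQuorum: { quorate: false, nodes_online: 1, nodes_total: 2, ha: { services_error: 3 } },
  failed: { error: 'fetch' }
};

// Map each sample to "status:label" for a quick side-by-side view.
const labels = Object.fromEntries(
  Object.entries(samples).map(([name, c]) => [name, classifyCluster(c).slice(0, 2).join(':')])
);
```

Lost quorum outranks a missing node, which in turn outranks HA service errors, so the `noQuorum` sample reports `no quorum` even though it also has a node offline and HA errors.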


@@ -4,6 +4,26 @@
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Void</title>
<script>
// Always-fresh: Void 2 ships NO service worker. Proactively unregister any
// worker (notably the legacy Void 1 caching SW that still controls the
// void.hynesy.com origin in returning browsers) and drop its caches on every
// load, so assets are never served stale. Runs before any module script.
(function () {
try {
if ('serviceWorker' in navigator) {
navigator.serviceWorker.getRegistrations()
.then(function (rs) { rs.forEach(function (r) { r.unregister(); }); })
.catch(function () {});
}
if (window.caches && caches.keys) {
caches.keys()
.then(function (ks) { ks.forEach(function (k) { caches.delete(k); }); })
.catch(function () {});
}
} catch (e) {}
})();
</script>
<!--
Cradle aesthetic: Cinzel for marquee headings (Sacred Valley, view titles),
Cormorant Garamond for body display in cards. System UI for chrome.


@@ -448,6 +448,25 @@ ul.plain li:last-child { border-bottom: none; }
.tile.status-warn .dot { background: var(--warn); box-shadow: 0 0 7px var(--warn); }
.tile.status-down .dot { background: var(--bad); box-shadow: 0 0 7px var(--bad); }
.tile.status-unknown .dot { background: var(--muted); }
/* cluster-health card — reuse the tile status palette */
.sv-cluster .dot { display: inline-block; width: 8px; height: 8px; border-radius: 50%; margin-right: 6px; vertical-align: middle; background: var(--muted); }
.sv-cluster .status-ok .dot { background: var(--ok); box-shadow: 0 0 7px var(--ok); }
.sv-cluster .status-down .dot { background: var(--bad); box-shadow: 0 0 7px var(--bad); }
.sv-cluster .warn { color: var(--warn); }
.sv-cluster .cl-badge { font-family: var(--font-mono); font-size: 11px; font-weight: 700; letter-spacing: .06em; padding: 1px 7px; border-radius: 4px; }
.sv-cluster .cl-badge.ok { color: var(--ok); border: 1px solid var(--accent-soft); }
.sv-cluster .cl-badge.bad { color: var(--bad); border: 1px solid var(--bad); background: var(--accent-soft); }
/* topbar cluster-health pill */
.cluster-health { display: inline-flex; align-items: center; gap: 6px; }
.cluster-health .dot { width: 8px; height: 8px; border-radius: 50%; background: var(--muted); flex: none; }
.cluster-health .ch-label { font-size: 12px; letter-spacing: .04em; text-transform: lowercase; }
.cluster-health.status-ok .dot { background: var(--ok); box-shadow: 0 0 7px var(--ok); }
.cluster-health.status-ok .ch-label { color: var(--ok); }
.cluster-health.status-warn .dot { background: var(--warn); box-shadow: 0 0 7px var(--warn); }
.cluster-health.status-warn .ch-label { color: var(--warn); }
.cluster-health.status-down { border-color: var(--bad); }
.cluster-health.status-down .dot { background: var(--bad); box-shadow: 0 0 7px var(--bad); }
.cluster-health.status-down .ch-label { color: var(--bad); }
.tile-go { color: var(--lb); font-size: 12px; opacity: 0; transition: opacity .25s; }
.tile:hover .tile-go { opacity: 1; }
/* Tile root is a div hosting a stretched primary link + a small alt control. */

@@ -0,0 +1,44 @@
// public/views/cards/cluster.js — Proxmox cluster health + quorum across nodes.
import { el, mount } from '../../dom.js';
import { api } from '../../api.js';

let body, timer;

function nodeRow(n, master) {
  const tags = [];
  if (n.local) tags.push('local');
  if (master === n.name) tags.push('master');
  return el('div', { class: 'sv-row status-' + (n.online ? 'ok' : 'down') },
    el('span', { class: 'k' }, el('span', { class: 'dot' }), n.name),
    el('span', {}, (n.online ? 'online' : 'OFFLINE') + (tags.length ? ' · ' + tags.join(' · ') : '')));
}

async function load() {
  if (!body) return;
  try {
    const c = await api.get('/api/cluster');
    if (c.error) { mount(body, el('span', { class: 'muted' }, 'Cluster: ' + c.error)); return; }
    const haIssues = c.ha?.services_error || 0;
    const rows = el('div', { class: 'sv-cluster' },
      el('div', { class: 'sv-row' },
        el('span', { class: 'k' }, 'Quorum'),
        el('span', { class: 'cl-badge ' + (c.quorate ? 'ok' : 'bad') }, c.quorate ? 'QUORATE' : 'NO QUORUM')),
      el('div', { class: 'sv-row' },
        el('span', { class: 'k' }, 'Nodes'),
        el('span', { class: c.nodes_online < c.nodes_total ? 'warn' : '' }, `${c.nodes_online}/${c.nodes_total} online`)),
      ...(c.nodes || []).map(n => nodeRow(n, c.ha?.master)),
      el('div', { class: 'sv-row' },
        el('span', { class: 'k' }, 'HA services'),
        el('span', { class: haIssues ? 'warn' : '' },
          haIssues ? `${c.ha.services_total} · ${haIssues}` : `${c.ha?.services_total ?? 0} · ok`))
    );
    mount(body, rows);
  } catch { mount(body, el('span', { class: 'muted' }, 'Cluster unavailable')); }
}

export default {
  id: 'cluster', title: 'Cluster · HZ', size: 'm',
  mount(e) { body = e; load(); },
  start() { timer = setInterval(load, 30000); },
  stop() { clearInterval(timer); body = null; }
};
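
The card above and the topbar pill both read the same `/api/cluster` payload. The changelog describes the pill as healthy when the cluster is quorate, all nodes are online, and HA is clean. A minimal sketch of that reduction follows. `pillState` is a hypothetical name, and the severity mapping (no quorum is red, a node down or an HA error is amber) is an assumption; `public/components/topbar.js` is not shown in this diff and may differ.

```javascript
// Hypothetical helper: collapse an /api/cluster payload into one pill state.
// Labels come from the alpha.26 changelog; which condition maps to amber
// versus red is an assumption, not taken from topbar.js.
function pillState(c) {
  // No payload or an error payload: fall back to the muted default dot.
  if (!c || c.error) return { status: 'unknown', label: 'cluster n/a' };
  if (!c.quorate) return { status: 'down', label: 'no quorum' };
  if (c.nodes_online < c.nodes_total) return { status: 'warn', label: 'node down' };
  if ((c.ha?.services_error || 0) > 0) return { status: 'warn', label: 'HA issue' };
  return { status: 'ok', label: 'healthy' };
}
```

The returned `status` would pair with the `.cluster-health.status-ok`, `.status-warn`, and `.status-down` rules in the stylesheet hunk; `unknown` has no dedicated rule and would leave the dot at its muted default.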

@@ -13,8 +13,9 @@ import inbox from './cards/inbox.js';
import search from './cards/search.js';
import speedtest from './cards/speedtest.js';
import aiUsage from './cards/ai_usage.js';
import cluster from './cards/cluster.js';

- const CARD_MODULES = [clock, weather, hostPerf, jobs, inbox, search, speedtest, aiUsage];
+ const CARD_MODULES = [clock, weather, hostPerf, cluster, jobs, inbox, search, speedtest, aiUsage];
const BY_ID = new Map(CARD_MODULES.map(d => [d.id, d]));
let active = []; // mounted cards needing stop()

@@ -1,5 +1,6 @@
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { listActionsTool, proposeActionTool } from '../../../../lib/ai/agent/tools/blue/actions.js';
import { infraAuditTool } from '../../../../lib/ai/agent/tools/blue/infra_audit.js';
beforeEach(() => { process.env.VOID_API_URL = 'http://127.0.0.1:3000'; process.env.VOID_AGENT_TOKEN = 'blue-tok'; });
@@ -19,4 +20,13 @@ describe('blue action tools', () => {
    expect(fetchMock.mock.calls[0][0]).toBe('http://127.0.0.1:3000/api/actions/stop-ct107/run');
    expect(fetchMock.mock.calls[0][1].method).toBe('POST');
  });

  it('infra_audit GETs the read-only audit with the agent bearer', async () => {
    const fetchMock = vi.fn(async () => ({ ok: true, json: async () => ({ ok: false, docs: { summary: { unreachable: 1 } } }) }));
    const out = await infraAuditTool.handler({}, {}, { fetchImpl: fetchMock });
    expect(out.ok).toBe(false);
    const [url, opts] = fetchMock.mock.calls[0];
    expect(url).toBe('http://127.0.0.1:3000/api/infra/audit');
    expect(opts.headers.Authorization).toBe('Bearer blue-tok');
  });
});

tests/infra/audit.test.js (new file, 100 additions)

@@ -0,0 +1,100 @@
import { describe, it, expect } from 'vitest';
import { extractEndpoints, auditDocs, auditServices, parseUrl, runAudit } from '../../lib/infra/audit.js';

// Doc-drift sanity check: pull every LAN endpoint referenced in the wiki and
// confirm it's actually live. Catches stale IPs/ports (e.g. a CT that moved
// off 192.168.1.13 but is still documented there). Pure logic, injected probe.

describe('extractEndpoints', () => {
  it('pulls host:port and host-only LAN refs, deduped', () => {
    const eps = extractEndpoints('see http://192.168.1.27:8080 and 192.168.1.27:8080 plus 192.168.1.99 alone');
    expect(eps).toContainEqual({ host: '192.168.1.27', port: 8080 });
    expect(eps).toContainEqual({ host: '192.168.1.99', port: null });
    // deduped: the repeated .27:8080 appears once
    expect(eps.filter(e => e.host === '192.168.1.27' && e.port === 8080)).toHaveLength(1);
  });

  it('ignores non-LAN addresses and bare version-like numbers', () => {
    const eps = extractEndpoints('cloudflare 49.185.140.110:8006 and version 7.0.2 and 10.0.0.1');
    expect(eps).toHaveLength(0);
  });

  it('does not treat a CIDR mask as a port', () => {
    const eps = extractEndpoints('192.168.1.230/24 is the host');
    expect(eps).toContainEqual({ host: '192.168.1.230', port: null });
  });
});

describe('auditDocs', () => {
  const pages = [
    { title: 'Network map', body_md: 'magicmirror 192.168.1.13:8080' },
    { title: 'Overview', body_md: 'magicmirror 192.168.1.13:8080 and ollama 192.168.1.185:11434' },
    { title: 'Host', body_md: 'gramps 192.168.1.99 alone' }
  ];

  it('flags doc-referenced endpoints that are unreachable, grouped by citing page', async () => {
    const probe = async (host, port) => !(host === '192.168.1.13' && port === 8080); // .13:8080 is dead
    const report = await auditDocs({ pages, probe });
    expect(report.summary.probed).toBe(2); // .13:8080 and .185:11434 (host-only .99 not probed)
    expect(report.summary.unreachable).toBe(1);
    const dead = report.unreachable.find(u => u.host === '192.168.1.13' && u.port === 8080);
    expect(dead).toBeTruthy();
    expect(dead.pages.sort()).toEqual(['Network map', 'Overview']);
  });

  it('reports a clean bill when everything resolves', async () => {
    const report = await auditDocs({ pages, probe: async () => true });
    expect(report.summary.unreachable).toBe(0);
    expect(report.unreachable).toEqual([]);
    expect(report.ok).toBe(true);
  });

  it('lists host-only references separately as not-probed', async () => {
    const report = await auditDocs({ pages, probe: async () => true });
    expect(report.unprobed).toContainEqual(
      expect.objectContaining({ host: '192.168.1.99', port: null })
    );
  });
});

describe('parseUrl', () => {
  it('extracts host + explicit port', () => {
    expect(parseUrl('http://192.168.1.225:8384')).toEqual({ host: '192.168.1.225', port: 8384 });
  });

  it('defaults port by scheme', () => {
    expect(parseUrl('https://gramps.hynesy.com')).toEqual({ host: 'gramps.hynesy.com', port: 443 });
    expect(parseUrl('http://192.168.1.99')).toEqual({ host: '192.168.1.99', port: 80 });
  });

  it('returns null for junk', () => { expect(parseUrl('not a url')).toBeNull(); });
});

describe('auditServices', () => {
  const services = [
    { id: 'gitea', url: 'http://192.168.1.223:3000' },
    { id: 'magicmirror', url: 'http://192.168.1.27:8080' } // moved away — should be unreachable
  ];

  it('flags services whose url does not answer', async () => {
    const probe = async (host) => host !== '192.168.1.27';
    const r = await auditServices({ services, probe });
    expect(r.summary.probed).toBe(2);
    expect(r.ok).toBe(false);
    expect(r.unreachable).toEqual([
      expect.objectContaining({ id: 'magicmirror', host: '192.168.1.27', port: 8080 })
    ]);
  });
});

describe('runAudit', () => {
  it('is ok only when both docs and services are clean', async () => {
    const r = await runAudit({
      pages: [{ title: 'P', body_md: '192.168.1.223:3000' }],
      services: [{ id: 'gitea', url: 'http://192.168.1.223:3000' }],
      probe: async () => true
    });
    expect(r.ok).toBe(true);
    expect(r.docs.ok).toBe(true);
    expect(r.services.ok).toBe(true);
  });
});
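
The tests above pin down the behaviour of `extractEndpoints` and `parseUrl`, but `lib/infra/audit.js` itself is not part of this view. The sketch below is one possible implementation that would satisfy those two `describe` blocks, not the committed code. It assumes "LAN" means `192.168.x.x` only (the tests expect `10.0.0.1` to be ignored), and `LAN_RE` and `DEFAULT_PORTS` are hypothetical names.

```javascript
// Assumption: only 192.168.x.x counts as LAN, since the tests ignore 10.0.0.1.
// The lookbehind keeps the pattern from matching inside a longer dotted number;
// a port is captured only after ':' so a CIDR mask ('/24') is never read as one.
const LAN_RE = /(?<![\d.])(192\.168\.\d{1,3}\.\d{1,3})(?!\d)(?::(\d{1,5}))?/g;

function extractEndpoints(text) {
  const seen = new Map(); // keyed by "host:port" to dedupe repeated references
  for (const m of String(text).matchAll(LAN_RE)) {
    const host = m[1];
    if (host.split('.').some(octet => Number(octet) > 255)) continue;
    const port = m[2] ? Number(m[2]) : null;
    if (port !== null && (port < 1 || port > 65535)) continue;
    seen.set(`${host}:${port}`, { host, port });
  }
  return [...seen.values()];
}

// Hypothetical scheme-to-port table used when the URL has no explicit port.
const DEFAULT_PORTS = { 'http:': 80, 'https:': 443 };

function parseUrl(raw) {
  let u;
  try { u = new URL(raw); } catch { return null; } // junk input => null
  const port = u.port ? Number(u.port) : DEFAULT_PORTS[u.protocol];
  if (!port) return null; // unknown scheme with no explicit port
  return { host: u.hostname, port };
}
```

Keeping both functions pure (string in, plain objects out) is what lets the audit tests inject a fake `probe` and never touch the network.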

@@ -0,0 +1,63 @@
import { describe, it, expect } from 'vitest';
import { normalizeCluster, clusterHealth } from '../../lib/proxmox/cluster.js';

// Fixtures mirror the real PVE payload shapes from this cluster.
const STATUS = [
  { type: 'cluster', name: 'HZ-cluster', quorate: 1, nodes: 2 },
  { type: 'node', name: 'z', online: 1, local: 1, ip: '192.168.1.124' },
  { type: 'node', name: 'Z3', online: 1, local: 0, ip: '192.168.1.125' }
];
const HA = [
  { type: 'quorum', id: 'quorum', quorate: 1, status: 'OK' },
  { type: 'master', id: 'master', node: 'Z3', status: 'Z3 (active, ...)' },
  { type: 'fencing', id: 'fencing', 'armed-state': 'armed' },
  { type: 'lrm', id: 'lrm:z', node: 'z' },
  { type: 'service', id: 'service:ct:104', sid: 'ct:104', state: 'started', node: 'z' },
  { type: 'service', id: 'service:ct:111', sid: 'ct:111', state: 'error', node: 'z' }
];

describe('normalizeCluster', () => {
  it('reports quorate, node online counts, master and HA service errors', () => {
    const r = normalizeCluster(STATUS, HA);
    expect(r.name).toBe('HZ-cluster');
    expect(r.quorate).toBe(true);
    expect(r.nodes_total).toBe(2);
    expect(r.nodes_online).toBe(2);
    expect(r.nodes.map(n => n.name).sort()).toEqual(['Z3', 'z']); // both nodes present
    expect(r.ha.quorum_ok).toBe(true);
    expect(r.ha.master).toBe('Z3');
    expect(r.ha.fencing).toBe('armed');
    expect(r.ha.services_total).toBe(2);
    expect(r.ha.services_error).toBe(1); // the ct:111 'error'
  });

  it('flags loss of quorum and an offline node', () => {
    const r = normalizeCluster(
      [{ type: 'cluster', name: 'HZ-cluster', quorate: 0, nodes: 2 },
       { type: 'node', name: 'z', online: 0 }, { type: 'node', name: 'Z3', online: 1 }],
      [{ type: 'quorum', quorate: 0, status: 'No quorum!' }]
    );
    expect(r.quorate).toBe(false);
    expect(r.nodes_online).toBe(1);
    expect(r.ha.quorum_ok).toBe(false);
  });
});

describe('clusterHealth', () => {
  it('returns proxmox_not_configured without a token', async () => {
    const r = await clusterHealth({ apiUrl: '', token: '' });
    expect(r.error).toBe('proxmox_not_configured');
  });

  it('fetches + normalizes via injected fetch', async () => {
    const fetchImpl = async (url) => ({
      ok: true,
      json: async () => ({ data: url.includes('ha/status') ? HA : STATUS })
    });
    const r = await clusterHealth({ apiUrl: 'https://pve:8006', token: 'tok', fetchImpl });
    expect(r.quorate).toBe(true);
    expect(r.nodes_online).toBe(2);
    expect(r.ha.master).toBe('Z3');
    expect(typeof r.at).toBe('number');
  });
});
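
For orientation, here is a minimal sketch of a `normalizeCluster` that would pass the assertions above. It is inferred only from the fixtures and expectations in this test file; the real `lib/proxmox/cluster.js` is not shown here and may handle more entry types or fields. Treating any truthy `online`/`quorate` flag as true and defaulting missing values to `null` are assumptions.

```javascript
// Sketch only: reduces the two PVE arrays (cluster status, HA status) used by
// the fixtures into the flat shape the tests read. Field handling beyond what
// the tests assert is assumed, not taken from the committed module.
function normalizeCluster(status = [], ha = []) {
  const cluster = status.find(e => e.type === 'cluster') || {};
  const nodes = status
    .filter(e => e.type === 'node')
    .map(n => ({ name: n.name, online: Boolean(n.online), local: Boolean(n.local), ip: n.ip ?? null }));

  const quorum = ha.find(e => e.type === 'quorum');
  const master = ha.find(e => e.type === 'master');
  const fencing = ha.find(e => e.type === 'fencing');
  const services = ha.filter(e => e.type === 'service');

  return {
    name: cluster.name ?? null,
    quorate: Boolean(cluster.quorate),
    nodes_total: nodes.length,
    nodes_online: nodes.filter(n => n.online).length,
    nodes,
    ha: {
      quorum_ok: Boolean(quorum?.quorate),
      master: master?.node ?? null,
      fencing: fencing?.['armed-state'] ?? null,
      services_total: services.length,
      services_error: services.filter(s => s.state === 'error').length
    }
  };
}
```

The output field names match what the cluster card reads (`quorate`, `nodes_online`, `nodes_total`, `ha.master`, `ha.services_total`, `ha.services_error`), so the card never needs to know the raw PVE entry types.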