feat(infra): commit live infra-audit/cluster work to reconcile git with prod
This work (network_hosts inventory + infra_audit MCP tool, /api/cluster + Sacred Valley cluster card, topbar cluster-health pill + SW self-heal) was built in an earlier session and DEPLOYED to CT 311 as alpha.24–26, but was never committed to git, so prod was running code absent from the repo.

This commit adds it as-is (already prod-validated) so git matches the live state, and restores its alpha.24/25/26 CHANGELOG entries. Files are disjoint from the fold-in work; both now ship together under alpha.27.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
13
CHANGELOG.md
@@ -9,6 +9,19 @@ Format: [Keep a Changelog](https://keepachangelog.com).
- feat: phuryn usage dashboard now reachable at aiusage.hynesy.com behind CF Access.
- feat: Sacred Valley AI Usage card opens the in-Void #/ai-usage route.
## 2.0.0-alpha.26 — Topbar cluster-health pill + always-fresh self-heal
- **Topbar cluster-health indicator** (`public/components/topbar.js`): a themed pill left of Inbox/Chat/Owner that polls `/api/cluster` every 30s and shows **healthy** (green) when quorate + all nodes online + HA clean, **HA issue / node down / no quorum** (amber/red) otherwise. Click → Sacred Valley. Reuses the `--ok/--warn/--bad` dot palette.
- **Always-fresh self-heal** (`public/index.html`): inline pre-module script unregisters any service worker and clears caches on every load. The legacy Void 1 caching SW (origin-scoped to `void.hynesy.com`) was serving stale assets that survived hard reloads; this removes it on the next load and prevents recurrence on every device. Assets are already served `no-cache`, so with no SW the app is always fresh.
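
The pill's state rules above can be sketched as a small pure function. This is only an illustration, not the code in `public/components/topbar.js`: the function name `pillState`, the input shape, the precedence of the checks, and the split between amber (`--warn`) and red (`--bad`) are all assumptions, since the entry lists the labels without saying which colour each one gets.

```javascript
// Hypothetical input shape (not confirmed by the changelog):
//   { quorate: boolean, nodes: [{ name, online }], ha: { errors: number } }
// Assumed precedence: quorum loss first, then node state, then HA errors.
// Assumed colours: "no quorum" is red, "node down" and "HA issue" are amber.
function pillState(cluster) {
  // A missing payload is treated like lost quorum here (an assumption).
  if (cluster?.quorate !== true) {
    return { label: 'no quorum', tone: '--bad' };
  }
  const nodes = cluster.nodes ?? [];
  if (nodes.some((node) => !node.online)) {
    return { label: 'node down', tone: '--warn' };
  }
  if ((cluster.ha?.errors ?? 0) > 0) {
    return { label: 'HA issue', tone: '--warn' };
  }
  // Quorate, every node online, and no HA errors.
  return { label: 'healthy', tone: '--ok' };
}
```

Keeping the classification separate from the 30s polling loop means it can be checked with plain fixtures, without a DOM or a timer.
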
## 2.0.0-alpha.25 — Cluster health Sacred Valley card
- **`GET /api/cluster`** (`lib/proxmox/cluster.js` + route, 10s-cached): read-only Proxmox cluster health — `quorate`, per-node online state, HA master/fencing, and HA service count + error count. Pure `normalizeCluster()` folds `/cluster/status` + `/cluster/ha/status/current`; unit-tested with injected fetch. Uses a **dedicated read-only PVE token** (`PROXMOX_RO_TOKEN`, user `void-ro@pve` with `PVEAuditor` on `/`) — never the power-action token.
- **Sacred Valley "Cluster · HZ" card** (`public/views/cards/cluster.js`, registered in `sacred_valley.js`): polls every 30s, shows the quorum badge, node up/down dots, master, and HA-service issues. Reuses the tile status palette (blackflame `--ok`/`--warn`/`--bad`).
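
A pure normalizer of the kind described above might look like the sketch below. It is not the contents of `lib/proxmox/cluster.js`: the output shape is hypothetical, and the input field names (`type`, `quorate`, `online`, `node`, `state`) are assumed from the Proxmox VE API responses for `/cluster/status` and `/cluster/ha/status/current`, so they should be verified against a live node. Fencing state is left out.

```javascript
// Hypothetical sketch: fold the two already-fetched response arrays into
// one summary object. No network access happens here, so fixtures can be
// passed in directly.
function normalizeCluster(status = [], haCurrent = []) {
  const isTrue = (value) => value === 1 || value === true;

  // /cluster/status is assumed to mix one "cluster" entry with "node" entries.
  const clusterEntry = status.find((entry) => entry.type === 'cluster');
  const nodes = status
    .filter((entry) => entry.type === 'node')
    .map((entry) => ({ name: entry.name, online: isTrue(entry.online) }));

  // /cluster/ha/status/current is assumed to list "master" and "service" rows.
  const master = haCurrent.find((entry) => entry.type === 'master');
  const services = haCurrent.filter((entry) => entry.type === 'service');

  return {
    quorate: clusterEntry ? isTrue(clusterEntry.quorate) : false,
    nodes,
    ha: {
      master: master?.node ?? null,
      services: services.length,
      errors: services.filter((svc) => svc.state === 'error').length,
    },
  };
}
```

Because the function only takes plain arrays, a route handler can fetch both endpoints with the read-only token, pass the results in, and cache the returned object.
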
## 2.0.0-alpha.24 — Infra sanity check + LAN host/MAC inventory
- **`network_hosts` inventory table** (`migration 023`, repo `lib/db/repos/network_hosts.js`): authoritative id→ip→MAC map of every cluster guest + PVE host + the Pi QDevice, seeded from a live capture. Source of truth for router DHCP reservations (the LAN pool is the whole `.2–.254`, so each pinned guest needs a static IP + a MAC reservation) and for the audit below. Idempotent seed (`ON CONFLICT DO UPDATE`).
- **`infra_audit` sanity check** (`lib/infra/audit.js`, `GET /api/infra/audit`, MCP tool `infra_audit` in `blueRegistry`): probes every `192.168.x.y:port` referenced in the Wiki **and** every enabled service URL, reports unreachable endpoints (stale/incorrect IPs or ports) grouped by source, plus inventory hosts missing a MAC. Read-only TCP connects; available to the owner or any authed agent (e.g. Little Blue) so agents can verify the docs/registry match reality.
- **Service registry IP fixes**: `magicmirror` → `192.168.1.224`, `obd2` → `192.168.1.225` (moved off contested DHCP-range addresses to static).
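
The audit's core idea (pull every `192.168.x.y:port` out of some text, then try a read-only TCP connect to each) can be sketched with Node's built-in `net` module. This is a minimal illustration, not the code in `lib/infra/audit.js`: the function names `extractEndpoints`, `probe`, and `findUnreachable`, the regular expression, and the 1500 ms default timeout are all assumptions.

```javascript
const net = require('node:net');

// Extract unique "192.168.x.y:port" endpoints from free text, skipping
// matches with an octet above 255 or a port outside 1..65535.
function extractEndpoints(text) {
  const pattern = /(?<![\d.])(192\.168\.\d{1,3}\.\d{1,3}):(\d{1,5})\b/g;
  const seen = new Set();
  const endpoints = [];
  for (const [, host, portText] of text.matchAll(pattern)) {
    const port = Number(portText);
    const octetsOk = host.split('.').every((octet) => Number(octet) <= 255);
    const key = `${host}:${port}`;
    if (octetsOk && port >= 1 && port <= 65535 && !seen.has(key)) {
      seen.add(key);
      endpoints.push({ host, port });
    }
  }
  return endpoints;
}

// Read-only TCP connect: resolves true if the port accepts a connection
// within timeoutMs, false on error or timeout. No data is sent.
function probe({ host, port }, timeoutMs = 1500) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false));
    socket.once('error', () => finish(false));
  });
}

// Return the endpoints found in `text` that did not accept a connection.
async function findUnreachable(text, timeoutMs) {
  const endpoints = extractEndpoints(text);
  const results = await Promise.all(endpoints.map((e) => probe(e, timeoutMs)));
  return endpoints.filter((_, index) => !results[index]);
}
```

A caller could run `findUnreachable` once per source (each Wiki page, each enabled service URL) to get the "grouped by source" report the entry describes.
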
## 2.0.0-alpha.23 — Local/remote-aware service tiles
- **Optional `external` URL per service** (`migration 022`, `config/services.json`, repo + `/api/health/services` payload + `svcBody`): Little Blue health-band tiles previously linked to the single LAN `url`, so they opened dead private IPs when browsing remotely (e.g. Gramps `http://192.168.1.99`). Migration adds the column and **backfills** curated domains by id (the live instance is already seeded, so a column-add alone wouldn't populate them); also normalises `jellyfin`/`chaptarr` (which stored a domain in `url`) to LAN `url` + `external`.
- **Context-based tile target + one-click alt** (`public/views/service_url.js`, `public/components/service_tile.js`, `public/views/health_band.js`): the tile picks its primary URL from `location.hostname` — public host (e.g. `void.hynesy.com`) opens the domain, private IP/localhost/.local opens the LAN address — and always offers a `⇄` alt to the *other* URL (a reliable manual fallback; an auto-probe can't work because an HTTPS dashboard is blocked from probing `http://` LAN IPs by mixed-content). Services with no `external` are dimmed with a "LAN-only" badge when remote. Tile root is now a `div` with a stretched primary `<a>` + sibling alt `<a>` (no nested anchors). Health checker unchanged (still probes LAN `url` from CT 311).
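
The hostname-based choice described above can be sketched as two small functions. This is an illustration rather than the code in `public/views/service_url.js`: the names `isLocalHost` and `tileTargets`, the returned object shape, and the exact private ranges checked are assumptions, and `https://gramps.example.com` in the usage line is a placeholder domain.

```javascript
// True for hosts treated as "local": localhost, .local names, and private
// or loopback IPv4 addresses. The ranges checked here are an assumption.
function isLocalHost(hostname) {
  if (hostname === 'localhost' || hostname.endsWith('.local')) return true;
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(hostname);
  if (!match) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 10 ||
    a === 127 ||
    (a === 192 && b === 168) ||
    (a === 172 && b >= 16 && b <= 31)
  );
}

// Pick the primary link and the "other" link for a service tile.
// `service` is assumed to look like { url, external? }.
function tileTargets(service, hostname) {
  const lan = service.url;
  const external = service.external ?? null;
  const local = isLocalHost(hostname);
  if (local || !external) {
    // Browsing locally, or no public URL exists: the LAN address is primary.
    // lanOnly marks the "dimmed when remote" case.
    return { primary: lan, alt: external, lanOnly: !external && !local };
  }
  // Browsing from a public host: the domain is primary, the LAN URL is the alt.
  return { primary: external, alt: lan, lanOnly: false };
}
```

For example, `tileTargets({ url: 'http://192.168.1.99', external: 'https://gramps.example.com' }, location.hostname)` would give the domain as `primary` when the dashboard is opened at `void.hynesy.com`, and the LAN address when it is opened from a private IP.
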