Void-Homelab/CHANGELOG.md

# Changelog

All notable changes to Void 2.0 are documented here.
Format: [Keep a Changelog](https://keepachangelog.com).

## 2.0.0-alpha.20 — Page ordering + sectioned space view
- **Explicit page ordering** (`migration 020`, `lib/db/repos/pages.js`): pages gain a `position integer` column; `listBySpace` now orders by `position, title` instead of by title alone, backed by a covering index on `(space_id, position, title)`. `position` is patchable via `PUT /api/pages/:id`. The migration backfills all rows to `0`, which preserves the prior title order until positions are set.
- **Sectioned page tree** (`public/views/space.js`): the flat pages table is replaced by a `parent_id`-grouped tree in which top-level pages render as section headers with their children and grandchildren nested beneath them. The view stays backward-compatible with flat (un-nested) spaces. This lets the Wiki read as ordered, sectioned documentation rather than an alphabetical dump.
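
The ordering and grouping described above can be sketched roughly as follows. This is an illustration only, not the code in `lib/db/repos/pages.js` or `public/views/space.js`: the row shape and the names `comparePages` and `buildTree` are assumptions, and only `position`, `title`, and `parent_id` come from the changelog.

```javascript
// Order by position first, then by title (mirrors ORDER BY position, title).
function comparePages(a, b) {
  return a.position - b.position || a.title.localeCompare(b.title);
}

// Group pages by parent_id and nest each page's children under it.
// Pages with no parent_id become the top-level "section" entries.
function buildTree(pages) {
  const byParent = new Map();
  for (const page of pages) {
    const key = page.parent_id ?? null;
    if (!byParent.has(key)) byParent.set(key, []);
    byParent.get(key).push(page);
  }
  const attach = (parentId) =>
    (byParent.get(parentId) ?? [])
      .sort(comparePages)
      .map((page) => ({ ...page, children: attach(page.id) }));
  return attach(null);
}

// Hypothetical rows: every position is 0 after the backfill unless set explicitly.
const tree = buildTree([
  { id: 1, title: "Setup", position: 0, parent_id: null },
  { id: 2, title: "Intro", position: 0, parent_id: null },
  { id: 3, title: "Zeta", position: 1, parent_id: 1 },
  { id: 4, title: "Beta", position: 2, parent_id: 1 },
]);
```

With every `position` at `0` the comparator falls through to the title, which is why a backfilled space keeps its old alphabetical order. A flat space is simply a tree whose top-level entries have no children.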

## 2.0.0-alpha.19 — Whisper GPU sharing + mobile chat Send button + registry
- **Whisper on GPU with graceful CPU fallback** (`workers/void_workers/model.py`): the STT worker uses the in-container NVIDIA driver on the GPU node, and **falls back to CPU on any load failure** (e.g. shared-card VRAM exhaustion) so a transcription never hard-fails. (Passthrough alone gave device nodes but no `libcuda` — the matching userspace driver was installed inside CT 311; see [[gpu-cpu-fallback-for-ha]].)
- **Cooperative GPU sharing with Ollama** (`workers/void_workers/gpu.py`): before loading Whisper on CUDA, the worker asks the co-resident Ollama (CT 102, same A2000) to unload its models (`GET /api/ps` + `POST /api/generate keep_alive:0`) and waits for the card to clear; Ollama reloads on its next request. Best-effort, stdlib-only; toggle `OLLAMA_FREE_BEFORE_STT`, endpoint `OLLAMA_URL`.
- **Mobile chat Send button**: the agent composers (Companion, Yerin, Little Blue) gained a themed Send button, because mobile soft keyboards have no reliable Enter-to-send. It is wired via `wireAgentChat`'s `sendBtnEl`; Enter-to-send is kept for desktop.
- **Service registry**: added **Chaptarr** (Readarr fork, ebooks + audiobooks; mediastack `chaptarr.hynesy.com`) to the homelab health band.

## 2.0.0-alpha.18 — Plan 8b cutover: `void.hynesy.com` now serves Void 2
- **Go-live.** `void.hynesy.com`, which previously routed to Void 1 on CT 301, is repointed at **Void 2** (CT 311, `.216:3000`) at the Traefik edge. Void 1 is now **legacy**: CT 301 stays running untouched as an instant-rollback fallback, and nothing is retired or renamed yet. The `-alpha` tag is intentionally **kept** pending owner sign-off.
- **CF Access multi-aud** (`lib/auth/cf_access.js`): `CF_ACCESS_AUD` now accepts a **comma-separated allow-list** so a request through *either* CF Access app — `void.hynesy.com` (aud `0e7190f4…`) or `void2-app.hynesy.com` (aud `a381f270…`) — is honoured as owner. Still fails closed; an unlisted aud is rejected. Prod env updated to carry both auds.
- The cutover is fully reversible: revert the Traefik `void` service URL to `http://192.168.1.11:2424` and run `docker restart traefik`.
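
A comma-separated audience allow-list of the kind described for `CF_ACCESS_AUD` might be checked along these lines. This is a minimal sketch, not the contents of `lib/auth/cf_access.js`: the function names are hypothetical, the audience values are placeholders, and signature verification against the team JWKS is assumed to have happened before this check.

```javascript
// Parse a comma-separated CF_ACCESS_AUD value into a set of allowed audiences.
function parseAudAllowList(raw) {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((entry) => entry.trim())
      .filter(Boolean)
  );
}

// A JWT `aud` claim may be a single string or an array of strings (RFC 7519).
function isAudAllowed(audClaim, allowList) {
  if (allowList.size === 0) return false; // fail closed when nothing is configured
  const auds = Array.isArray(audClaim) ? audClaim : [audClaim];
  return auds.some((aud) => typeof aud === "string" && allowList.has(aud));
}

// Placeholder audiences; the real values are elided in the changelog.
const allowList = parseAudAllowList(" aud-one , aud-two ,");
```

Returning `false` for an empty allow-list keeps the fail-closed behaviour: a missing or blank variable can never turn into "accept any audience".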

## 2.0.0-alpha.17 — Settings, project management, terminal, AI Usage, "The Void" space + UI polish
- **Settings** (`#/settings`): API tokens (mint/list/revoke), Agents list with an expandable **profile viewer** (persona/"soul" + capabilities/scopes via `GET /api/agents/:id/profile`), Orthos Mode placeholder.
- **Per-space project management**: Void-1-style expandable cards with inline **status**, **Details**, **Tasks**, **Linked references**, **↻ Research** (Eithan stub → `POST /api/projects/:id/research`), Edit/New modal, Delete-with-confirm. Migration 019 adds research fields; `GET /api/projects/:id/links` resolves linked pages/refs.
- **Terminal tab** (`#/terminal`): embedded blackflame `ttyd` → persistent `tmux`/`claude` on CT 300; works via Traefik (CF-Access) **and** the LAN IP (app proxies `/terminal` + its WebSocket to ttyd).
- **AI Usage** Sacred Valley card + `GET /api/ai-usage` — summarises the Homelab Monitor (Claude tokens + local OpenClaw/Ollama p50/p95).
- **"The Void" space**: Void 1.x / Void 2.0 / Void 3.0 as projects (tasks + linked references), charting the project's evolution.
- **Migration**: BookStack re-imported with Book › Chapter › Page hierarchy; Void 1 project `research_notes` backfilled.
- **UI**: page header actions (Edit/Revisions/Export), breadcrumbs, themed markdown tables, `Cache-Control: no-cache`, live sidebar active-sync, **hybrid sidebar** (Spaces/Agents/Navigate + active pill + agent dots), themed scrollbars + topbar, **+1 font bump**, Sentinel → **Yerin** (red).

## 2.0.0-alpha.16 — Little Blue + action framework (Agent Layer brick 2)
- **Little Blue**, the caretaker fix-it agent, is online at `#/little-blue`: chat + a manual Actions panel. She can **restart whitelisted services** and **power-manage Proxmox guests** — `safe` actions run on her word, `risky` ones queue for your approval.
- **Least-privilege action framework:** a version-controlled whitelist (`config/actions.json`), two server-side-enforced channels (scoped **Proxmox API token** + **SSH forced-command wrapper**), tiered approval, and a full `agent_actions` audit trail. Infra creds live ONLY in the main server; Little Blue's MCP child proposes actions via the local API with a scoped token — it can only name a whitelisted id, never a command.
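
The "name a whitelisted id, never a command" rule can be illustrated with a small sketch. The whitelist shape, the action ids, and `proposeAction` are all hypothetical, since the real `config/actions.json` schema is not shown here. Only the `safe` and `risky` tiers and the two channels come from the entry above.

```javascript
// Hypothetical whitelist; in the real system this would be loaded from config/actions.json.
const ACTIONS = {
  "restart-example-service": { tier: "safe", channel: "ssh" },
  "stop-example-guest": { tier: "risky", channel: "proxmox" },
};

// The caller supplies only an id. Anything that is not an own key of the
// whitelist is rejected, so free-form commands never reach an executor.
function proposeAction(actionId, whitelist = ACTIONS) {
  if (
    typeof actionId !== "string" ||
    !Object.prototype.hasOwnProperty.call(whitelist, actionId)
  ) {
    return { status: "rejected", reason: "unknown action id" };
  }
  const action = whitelist[actionId];
  return action.tier === "safe"
    ? { status: "run", actionId, channel: action.channel }
    : { status: "pending_approval", actionId, channel: action.channel };
}
```

Using an own-property check rather than a plain lookup means inherited keys such as `__proto__` or `toString` are rejected like any other unknown id.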

## 2.0.0-alpha.15 — Yerin online (Agent Layer brick 1)
- **Yerin**, the read-only security agent, is now a usable agent: a global `#/sentinel` chat surface backed by her 5 security tools (audit/agents/pending/exposure/tokens). She investigates + reports; she never acts.
- Extracted the **shared agent-chat foundation** — `runAgentTurn` (backend) + `agent_chat` (frontend) — now used by both Dross and Yerin. Personas live in `lib/ai/personas/`.

## 2.0.0-alpha.14 — MCP HTTP transport for external agents
- **MCP Streamable HTTP** at `/mcp`: external agents can connect over the network, authenticated by a Space-scoped Void agent bearer (owner / CF-Access identities are rejected here — external agents never inherit owner powers; CF Access service tokens gate the hostname at the edge).
- **Read + suggest-only:** a dedicated external registry exposes `search` / `read` / `context` + `propose_change` (which always routes to the pending-changes inbox, `applied:false`). Kept separate from Dross's registry so future companion tools never auto-leak.
- The `read` tool now **enforces Space membership** for bound callers; reads are hard-scoped to the agent's bound Space (client-supplied space args are ignored). Per-token rate limit + audit on every external tool call.

## 2.0.0-alpha.13 — Finer Sacred Valley tile scaling
- Cards now sit on a 12-column grid with a per-card width **−/+ stepper** (span 1–12) in edit mode, replacing the coarse S/M/L. "Small" defaults to 1/6 width (half its previous size) so clock/weather aren't oversized.
- Layout `sizes` now store an integer column span (legacy 's'/'m'/'l' still accepted).
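
Accepting both the new integer spans and the legacy letters could look something like the sketch below. The mapping is partly assumed: `s` as 2 of 12 columns follows from the "1/6 width" default above, while the values for `m` and `l` and the name `toColumnSpan` are illustrative guesses.

```javascript
const GRID_COLUMNS = 12;

// s = 1/6 of 12 columns per the changelog; m and l are assumed values.
const LEGACY_SPANS = { s: 2, m: 4, l: 6 };

// Normalise a stored size (legacy letter or integer) to a span in 1..12.
function toColumnSpan(size, fallback = LEGACY_SPANS.s) {
  if (
    typeof size === "string" &&
    Object.prototype.hasOwnProperty.call(LEGACY_SPANS, size)
  ) {
    return LEGACY_SPANS[size];
  }
  if (Number.isInteger(size)) {
    return Math.min(GRID_COLUMNS, Math.max(1, size));
  }
  return fallback; // unknown or missing values fall back to the small default
}
```

Clamping on read means a stale or hand-edited layout cannot push a card outside the grid.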

## 2.0.0-alpha.12 — Editable Sacred Valley layout
- "Edit layout" mode on the dashboard: per-card **resize** (S/M/L column span), **show/hide** (with a hidden-cards tray to re-add), clearer **drag-to-reorder** via a dedicated grip handle, and a **Reset** to defaults.
- All changes persist through the existing `/api/dashboard/layout` (order/sizes/hidden) — no backend changes.

## 2.0.0-alpha.11 — DB-backed service registry + LAN auto-discovery
- The health-band registry now lives in Postgres (`monitored_services`, migration 015) instead of the hand-edited `config/services.json`, which becomes a one-time boot seed (used to populate the table automatically if it is empty).
- Owner CRUD over the registry: `POST/PATCH/DELETE /api/health/services` (add/edit/enable/disable/remove); `GET /api/health/services` is now DB-backed.
- LAN auto-discovery: `discover.lan` pg-boss worker (pure-Node TCP sweep + HTTP-title probe, no nmap) + `POST /api/health/discover`. Found host:ports become **disabled `discovered` candidates** that never clobber curated entries; `GET /api/health/services/discovered` lists them.
- Dashboard: a "Scan" button + a "Discovered (N new)" section in Little Blue's band, with one-click promote.

## 2.0.0-alpha.10 — Cloudflare Access SSO as owner auth
- Browser requests through the CF tunnel no longer need the owner token copied onto each device: a cryptographically-verified Cloudflare Access JWT (`Cf-Access-Jwt-Assertion`) for an allow-listed email now counts as the owner (`lib/auth/cf_access.js`, wired into `agentOrOwner`).
- Security: verifies signature against the team JWKS + audience (app AUD) + email allow-list; the plain email header is never trusted alone. Fails closed → falls back to the owner token (LAN-direct `:3000` path and dev/tests unaffected).
- Opt-in via env: `CF_ACCESS_TEAM_DOMAIN`, `CF_ACCESS_AUD`, `CF_ACCESS_OWNER_EMAILS` (absent → feature disabled).

## 2.0.0-alpha.9 — Hardening pass (Void 3.0 quick wins)
- Security: SUPERUSER revoked from the prod `void` DB role (CT 310; `vector` marked trusted so the test harness can still create it as a non-superuser). An app-process compromise no longer escalates to full-cluster compromise.
- Security: the `claude` companion subprocess now gets an explicit env allow-list (`buildChildEnv`) instead of the full `process.env` — `OWNER_TOKEN`/`DATABASE_URL`/Karakeep/ANTHROPIC secrets no longer reach the CLI. MCP tools are unaffected (they get DB env via the explicit `--mcp-config`).
- Correctness: pending-change **approve** now claims the change (atomic `WHERE status='pending'`) *before* applying, and reopens it on apply failure — eliminates the re-approvable duplicate after a crash.
- Hardening: `/api/capture/upload` validates `space_id` (UUID + existence); pg pool gets a 30s `statement_timeout`.
- Ops: disabled the failing `syncoid-donatello` timer on Z (pools out pending parts).
- Deferred: per-space tag uniqueness needs a `space_id` column on `tags` → folded into the polymorphic-`space_id` project.
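
The env allow-list idea behind `buildChildEnv` can be sketched as below. This is an assumption-laden illustration rather than the shipped function: the allow-list contents are hypothetical, and only the principle (copy named variables, drop everything else) comes from the entry above.

```javascript
// Hypothetical allow-list; the real list is not given in this changelog.
const CHILD_ENV_ALLOW = ["PATH", "HOME", "LANG", "TERM"];

// Build the child's environment from an explicit allow-list instead of
// passing process.env through wholesale.
function buildChildEnv(parentEnv = process.env, allow = CHILD_ENV_ALLOW) {
  const childEnv = {};
  for (const key of allow) {
    if (parentEnv[key] !== undefined) childEnv[key] = parentEnv[key];
  }
  return childEnv;
}

// Example with a fake parent environment: the secrets are not copied.
const exampleEnv = buildChildEnv({
  PATH: "/usr/bin",
  HOME: "/var/lib/void",
  OWNER_TOKEN: "secret",
  DATABASE_URL: "postgres://example",
});
```

The result would then be passed as the `env` option of `child_process.spawn`. An allow-list fails safe: a secret added to the parent environment later stays out of the child unless someone opts it in.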

## 2.0.0-alpha.8 — Sacred Valley (Plan 6)
- Two-band `#/sacred-valley` dashboard: draggable data cards (clock, weather, host-perf, speedtest, jobs, inbox, search) with server-persisted layout (custom CSS-grid reorder, no resize).
- Little Blue Health band: config service registry, 60s pg-boss health checks, grouped status tiles, locally-cached service icons (no CDN leak).
- New endpoints: `/api/dashboard/layout`, `/api/weather`, `/api/host`, `/api/speedtest/{history,run}`, `/api/health/{services,check}`, `/api/icons/:slug.png`.
- Migrations 012 (`dashboard_layout`), 013 (`speedtest_results`), 014 (`service_status`).

## [2.0.0-alpha.7] — 2026-06-02

### Security & hardening

- **`pending_changes.action` CHECK fix** (migration 009): `upsert` is now a valid
  suggestion action (approval dispatches to `refsRepo.upsertByExternal`); resource
  dependency mutations (`add_dependency`/`remove_dependency`) are now owner-only.
- **Constant-time owner-token comparison** (`lib/auth/safe_compare.js`) — replaces
  `===`, closing a timing side-channel on `OWNER_TOKEN`.
- **O(1) token verification** (migration 010): selector+verifier split replaces the
  O(n) bcrypt scan over all tokens. New format `vk_<selector>.<verifier>`; legacy
  tokens still verify. Dropped the useless `idx_agent_tokens_hash`.
- **`pool.js` error handler** — an idle pooled-client error no longer crashes the
  process.
- **`context` tool** projects a safe column allow-list for resources (no
  `monitoring`/`metadata` blobs); now also handles `resource` views.
- **`applyPendingChange`** guards the `upsert` arm (clear `ValidationError`).
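
For the constant-time owner-token comparison, one common approach in Node is to hash both inputs to a fixed length and compare the digests with `crypto.timingSafeEqual`. The sketch below shows that approach; it is not necessarily what `lib/auth/safe_compare.js` does, and the name `safeCompare` is assumed.

```javascript
const { createHash, timingSafeEqual } = require("node:crypto");

// timingSafeEqual throws if its inputs differ in length, so both strings are
// hashed to 32-byte SHA-256 digests first. This also avoids leaking the
// secret's length through an early length check.
function safeCompare(a, b) {
  if (typeof a !== "string" || typeof b !== "string") return false;
  const digestA = createHash("sha256").update(a, "utf8").digest();
  const digestB = createHash("sha256").update(b, "utf8").digest();
  return timingSafeEqual(digestA, digestB);
}
```

A plain `===` on strings can return as soon as the first differing character is found, which is the timing side-channel the entry above refers to.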

### Added (Yerin — security agent)

- Read-only `securityRegistry` (`lib/ai/agent/tools/security/`) with five tools:
  `audit_log`, `agent_inventory`, `pending_review`, `resource_exposure`,
  `token_audit` — no secret material in any output.
- Migration 011 seeds the read-only `yerin` agent.
- The stdio MCP server selects its toolset via `VOID_TOOL_REGISTRY`
  (`security` → Yerin's tools; default → Dross's companion tools).

## [2.0.0-alpha.6] — 2026-06-01

### Changed (Plan 5b: companion backend → Claude CLI subprocess)

- **Companion model backend switched from the Anthropic API to the `claude`
  CLI subprocess**, authenticated by the owner's **Claude Max subscription**
  (no API key — the Agent SDK can't use subscription auth headlessly, and Max
  doesn't issue API keys). Mirrors Void 1.0's `lib/agent.js`: spawn `claude`
  with `ANTHROPIC_API_KEY`/`ANTHROPIC_AUTH_TOKEN` stripped so it uses the
  logged-in subscription. The CLI owns the agentic loop; the four companion
  tools are exposed to it via a local **stdio MCP server** (`lib/mcp/`).
- `lib/ai/claude_cli.js` — spawns `claude --print --output-format stream-json
  --include-partial-messages --append-system-prompt … (--session-id | --resume)
  --mcp-config … --strict-mcp-config --tools … --allowedTools …`, maps stream-json
  → `{delta,tool,tool_result,result,error}`. Prompt fed via **stdin** (variadic
  `--tools` would eat a positional). Multi-turn continuity via `--resume`.
- `lib/mcp/companion-stdio.js` — stdio MCP server re-exposing `companionRegistry`;
  per-turn Space/agent context passed via env in the `--mcp-config`.
- `propose_change` now stamps the current Space onto created space-scoped
  entities (model can't know the Space uuid).
- CT 311 runs the `claude` CLI (logged in as `void`, `HOME=/var/lib/void`).
- Built-in CLI tools (Bash/Read/Write/…) disabled via `--tools`; the companion
  has only the four `mcp__void__*` tools.
- The old `@anthropic-ai/sdk` API-key path (`lib/ai/anthropic.js`, `runTurn`)
  is retained in-tree but no longer the companion's execution path.

## [2.0.0-alpha.5] — 2026-06-01

### Added (Plan 5: Companion chat)

- **Right-rail companion chat** — an always-visible, per-Space AI assistant.
  Label-led turns (YOU / Companion) with left/right alignment, live
  tool-activity chips, streamed answers (markdown via DOMPurify), and inline
  approve/reject draft cards. Loads its space's history on first paint via the
  `space-active` state event.
- **Lean agent runtime** (`lib/ai/agent/runtime.js`) on the Anthropic SDK
  directly — no Mastra. `runTurn` drives a tool-use loop (max-iteration
  guarded), streams text deltas, and emits `tool` / `delta` / `draft` events.
  `callModel` is injectable (the SSE endpoint takes a fake in tests, so the
  suite never hits the network).
- **Extensible shared tool registry** (`lib/ai/agent/registry.js`) with four
  v1 tools: `search` (hybrid FTS), `read`, `context` (resolves the active
  view), and `propose_change`. Adding a tool is a one-line `registerTool`;
  a future MCP server re-exposes the same defs.
- **`propose_change` never applies** — it only writes a `pending_changes` row,
  capability-gated via `canAct` (default `suggest`). Prompt-injection
  containment is structural: a poisoned document can at most produce a draft
  the owner must approve. Drafts render inline in chat AND in the Inbox (same
  row; approving from either flips it).
- **Companion API** — `GET /api/spaces/:id/companion` (history) and
  `POST /api/spaces/:id/companion/turn` (SSE). One ambient conversation per
  Space (`conversations.space_id` via migration 007); one assistant message
  per turn with the tool trace + draft ids in `metadata`.
- **`@anthropic-ai/sdk`** dependency; key resolved via the `env:`/`file:`
  `vault_path` resolver (`lib/ai/secret.js`) — Vaultwarden swap still deferred.
- Default model `claude-sonnet-4-6`, overridable per-agent (`agents.model`)
  and via `ANTHROPIC_MODEL` — the seam for scope-C local personas.
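
A shared tool registry of the kind described above might be shaped roughly like this. Only the name `registerTool` appears in the changelog; the definition fields, `createRegistry`, `listDefs`, and `call` are hypothetical, and the real `lib/ai/agent/registry.js` may differ.

```javascript
// Minimal in-memory registry: tools are registered once and looked up by name.
function createRegistry() {
  const tools = new Map();
  return {
    registerTool(def) {
      if (!def || typeof def.name !== "string" || typeof def.handler !== "function") {
        throw new TypeError("a tool needs a string name and a handler function");
      }
      if (tools.has(def.name)) {
        throw new Error(`tool already registered: ${def.name}`);
      }
      tools.set(def.name, def);
    },
    // Definitions without handlers, e.g. to advertise to a model or an MCP server.
    listDefs() {
      return [...tools.values()].map(({ name, description }) => ({ name, description }));
    },
    // Dispatch a tool call by name; unknown names are an error, never a no-op.
    call(name, input, ctx) {
      const tool = tools.get(name);
      if (!tool) throw new Error(`unknown tool: ${name}`);
      return tool.handler(input, ctx);
    },
  };
}

const registry = createRegistry();
registry.registerTool({
  name: "echo",
  description: "Returns its input unchanged",
  handler: (input) => input,
});
```

Keeping definitions separate from handlers is what would let a second transport, such as an MCP server, re-expose the same tools without duplicating them.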

## [2.0.0-alpha.4] — 2026-06-01

### Added (Plan 4: Python void-workers)

- **`void-workers.service`** — Python 3.13 service alongside `void-server`
  on CT 311. psycopg-based pg-boss client matches Node's claim/finish
  semantics via `SELECT ... FOR UPDATE SKIP LOCKED`. Forces
  `client_encoding=UTF8` on every connection (void2-db cluster is
  SQL_ASCII).
- **`extract.pdf`** — `pdftotext -layout` first; per-page `pdftoppm`
  rasterization + Tesseract OCR fallback when extraction yields
  < 200 chars.
- **`extract.image`** — Tesseract OCR (English) for images stored in
  the blob store.
- **`ingest.video`** — `yt-dlp` metadata + audio extract + faster-whisper
  (`small.en` default). Uses CUDA at startup and falls back to CPU when HA
  fails over to Z3 (no GPU). URLs are validated as http(s) and a `--`
  separator is passed to yt-dlp to defeat argv smuggling.
- **`sync.source_doc`** — fetches `upstream_url` via Python `safe_fetch`
  (port of the Node helper) + sha256-diffs against the prior body_sha
  in metadata; updates body_text only when content changed.
- **Node `blob.js`** fans out to `extract.pdf` / `extract.image` after
  creating PDF / image refs.
- **Node `capture.js`** routes `youtube.com` / `youtu.be` / `vimeo.com`
  URLs to `ingest.video` instead of `ingest.url`.
- **Daily cron** (`lib/cron/sync_source_docs.js`) enqueues
  `sync.source_doc` jobs at 03:00 local for every `source_docs` row
  with `sync_source='url'`.
- **CT 311 infrastructure**: resized to 6 cores / 8 GB RAM, NVIDIA
  RTX A2000 device-nodes passed through (shared with CT 102's Ollama).
- **`deploy/push-workers.sh`** + `deploy/void-workers.service` — push
  the workers package, chown to `voidworkers`, recreate the venv, install
  deps under `su voidworkers -c`, restart the unit.
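
The URL validation and `--` separator used by `ingest.video` can be illustrated as follows. The worker itself is Python, so this JavaScript version is only a sketch of the technique. The function name is hypothetical and the particular yt-dlp flags (`--dump-json`, `--no-playlist`) are an assumption about what such a call might pass.

```javascript
// Build a yt-dlp argument vector for a user-supplied URL.
function buildYtDlpArgs(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    throw new Error("invalid URL");
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error("only http(s) URLs are accepted");
  }
  // Everything after `--` is treated as a positional argument, so a value
  // that starts with a dash cannot be interpreted as an option.
  return ["--dump-json", "--no-playlist", "--", url.href];
}
```

The two checks are complementary: scheme validation rejects inputs such as `file:` URLs, while the `--` terminator is a second line of defence if a dash-prefixed string ever reaches the argument list. Passing the array to a spawn-style API, rather than building a shell string, keeps shell metacharacters inert as well.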

## [2.0.0-alpha.3] — 2026-06-01

### Added (Plan 3: Capture pipeline + hybrid search)
- **pg-boss job queue** embedded in void-server (Node). Queue tables live
  alongside Void's in the shared void2-db. Tests manage their own boss
  lifecycle via `stopBoss()` / `waitForJob()` helpers.
- `/api/jobs` (owner-only) — list / get / retry / delete with state and
  name filters. Minimal `#/jobs` SPA view fronts it, polling every 10 s.
- **`/api/capture`** POST — URL → `ingest.url` job. Idempotent by
  `sha256(space_id + url)` stored as `refs.external_id`; duplicate POST
  returns the existing `ref_id`.
- **`/api/capture/upload`** — multipart file → `ingest.blob` job →
  content-addressed `/var/lib/void/blobs/<sha-prefix>/<sha>` →
  `refs` row. Drag-drop in the SPA wired to the main panel; `space_id`
  pre-filled from the last-viewed space.
- **`ingest.url` worker** — `@mozilla/readability` + `jsdom` extract;
  fetch protected by `lib/ingest/safe_fetch.js` (SSRF mitigations:
  http(s) only; DNS-resolved hostnames checked against loopback /
  RFC1918 / link-local / CGNAT / metadata; resolved IP pinned via an
  undici dispatcher to defeat DNS rebinding; redirects re-validated).
- **`ingest.blob` worker** — content-addressed storage,
  image/pdf/file kind classification.
- **`embed.text` worker** — Ollama `nomic-embed-text` (768 dims) padded
  to `vector(1024)`; emits a `worker`-actor audit log entry.
- **Repo-level triggers** — pages/refs/source_docs `create` and
  `update` enqueue an `embed.text` job with a singleton key so rapid
  edits coalesce. No-op when the queue is not running (tests).
- **Hybrid `/api/search`** — FTS + pgvector ANN unioned with reciprocal
  rank fusion (k=60). Vector branch silently skipped when Ollama times
  out, leaving FTS-only results — graceful degrade.
- **`/api/ingest/karakeep`** — HMAC-verified webhook. Enqueues
  `ingest.karakeep` for `bookmark.created`; worker fetches the bookmark
  via Karakeep's API, normalizes to a `refs` row tagged
  `source_kind='karakeep'`.
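
Reciprocal rank fusion, as used by the hybrid `/api/search` entry above, scores each document as the sum of `1 / (k + rank)` over every ranked list it appears in. The sketch below shows the standard formula with `k = 60`. The function name and the input shape (arrays of ids, best first) are assumptions, not the route's actual code.

```javascript
// Fuse several ranked id lists into one list ordered by RRF score.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical result ids from the FTS branch and the vector branch.
const fused = reciprocalRankFusion([
  ["a", "b", "c"],
  ["b", "d"],
]);

// If the vector branch is skipped, the FTS order is returned unchanged.
const ftsOnly = reciprocalRankFusion([["a", "b", "c"]]);
```

Here `b` ranks first because it appears in both lists (`1/62 + 1/61`), ahead of `a` (`1/61`), `d` (`1/62`), and `c` (`1/63`). Because the scores depend only on ranks, the two branches need no score normalisation, and dropping one branch degrades to the other's ordering.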

### Deferred (Plan 4+)

- Python `void-workers` service for Whisper / Tesseract OCR / yt-dlp
  (heavy ML).
- AI Space/Project suggestion on capture.
- Embedding chunks table (whole-doc embedding only in Plan 3).
- pdftotext for born-digital PDFs.
- `pg LISTEN/NOTIFY` real-time Jobs UI.

## [2.0.0-alpha.2] — 2026-06-01

### Added (Plan 2: API surface + UI shell)
- REST routes for the full entity tree:
  - `/api/spaces`, `/api/projects`, `/api/tasks` (with project + space scoping)
  - `/api/pages` + page revisions + `/api/pages/:id/backlinks`
  - `/api/refs` + `/api/refs/upsert`
  - `/api/resources` + dependencies + change history
  - `/api/resources/:id/source-docs` + `/api/source-docs/:id/resync` (gated by `ENABLE_RESYNC`)
  - `/api/agents` (owner-only) + agent token mint/revoke
  - `/api/conversations` + nested `/messages`
  - `/api/tags` + entity-scoped attach/detach via `/api/:entity_type/:entity_id/tags`
  - `/api/links` (POST/GET from|to/DELETE) for polymorphic entity links
  - `/api/pending-changes` + approve/reject with dispatch table covering
    page/project/task/ref/resource/source_doc × create/update/delete
  - `/api/audit/entity/:type/:id` + `/api/audit/actor`
  - `/api/search` unified FTS across pages, refs, source docs, messages
- Agent bearer auth middleware + capability tiering: owner allow, agent
  `write+scope` → allow, agent `suggest` → 202 + pending row, else 403.
- Approve and reject emit explicit `approve` / `reject` entries in the
  audit log with the original agent id preserved in the diff.
- Static SPA shell served from `public/`:
  - Three-column Cradle aesthetic (blackflame palette, Cinzel display
    headings, Cormorant Garamond body)
  - Hash-based router with views for home / space / project / page /
    reference / resource / search / inbox / sacred valley
  - `dom.js` safe builders — no `innerHTML` on API data anywhere; the
    explicit `html:` opt-in is used only by the markdown editor's
    preview pane, which sanitizes with DOMPurify
  - Sidebar Spaces tree with lazy project expansion, bottom Navigate
    section, pending-count badge shared with the topbar bell via a tiny
    `state.js` event bus
  - Topbar: brand, capture modal stub, global search (Enter →
    `#/search?q=`), pending bell, owner toggle
  - Page editor: split-pane markdown via marked + DOMPurify, save
    PATCHes `/api/pages/:id`, backlinks card
  - Reference detail: media block (image / YouTube embed / link),
    summary, metadata table, tag attach/detach, linked-from list
  - Resource detail: status header, dependencies + source docs +
    runbook pages columns, change history
  - Inbox: pending changes grouped by agent, approve → navigate to the
    resulting entity
- Test coverage: 185 tests across 43 files (113 new for Plan 2 routes +
  search + GET / shell smoke).
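
The capability tiering described above (owner allow, agent `write+scope` allow, agent `suggest` yields 202 plus a pending row, otherwise 403) reduces to a small decision function. The actor shape and the name `decide` are hypothetical; only the four outcomes come from the changelog.

```javascript
// Decide what happens to a mutating request.
// actor: { kind: "owner" } or { kind: "agent", capability, scopes } (assumed shape).
function decide(actor, requiredScope) {
  if (actor && actor.kind === "owner") {
    return { outcome: "allow" };
  }
  if (actor && actor.kind === "agent") {
    const scopes = Array.isArray(actor.scopes) ? actor.scopes : [];
    if (actor.capability === "write" && scopes.includes(requiredScope)) {
      return { outcome: "allow" };
    }
    if (actor.capability === "suggest") {
      // The change is diverted to pending_changes for owner review.
      return { outcome: "pending", status: 202 };
    }
  }
  return { outcome: "deny", status: 403 };
}
```

Anything that does not match an explicit allow or suggest rule falls through to 403, so a write-capable agent acting outside its scope is denied rather than downgraded to a suggestion.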

### Security follow-ups (deferred)
- Polymorphic IDOR risk on entity_links / entity_tags / attachments —
  acceptable today since the entire API is owner-token gated and there
  is one tenant; see `docs/security-followups.md` for the tighten-now
  vs defer decision.
- `pending_changes.action` CHECK constraint blocks `'upsert'` /
  `'add_dependency'` / `'remove_dependency'` actions emitted by some
  routes' `divertToPending` paths. Latent — only fires when an agent at
  suggest tier hits those specific endpoints. Mitigation options
  documented in `docs/security-followups.md`.

## [Unreleased]

### Added
- Initial repo scaffolding

### Added (Plan 1: Foundation)
- LXC provisioning for `void2-db` (Postgres 16 + pgvector) and `void2-app`
- Schema migrations 001-006 covering core, knowledge, resources, agents, cross-cutting, audit
- Repos with capability-checked `actor` parameter and audit trail
- Real audit log with redaction of sensitive keys (token, password, api_key, etc.)
- `pending_changes` table for agent suggestions awaiting owner approval
- Capability check module (allow / suggest / deny) for user vs agent actors
- Owner-token bearer auth
- Express server with `/health` and smoke `/api/spaces`
- Test coverage: 72 tests across migrations, repos, capability, owner middleware, server
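
Redaction of sensitive keys in audit entries could be sketched as below. The key list contains only the three names given above (the real list is longer, per the "etc."). Substring matching, the `[REDACTED]` placeholder, and the function names are assumptions.

```javascript
// Only the keys named in the changelog; the real list is longer.
const SENSITIVE_KEYS = ["token", "password", "api_key"];

function isSensitive(key) {
  const lower = key.toLowerCase();
  return SENSITIVE_KEYS.some((needle) => lower.includes(needle));
}

// Return a deep copy of a plain JSON-like value with sensitive fields masked.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [
        key,
        isSensitive(key) ? "[REDACTED]" : redact(inner),
      ])
    );
  }
  return value;
}

const redacted = redact({
  name: "example",
  owner_token: "abc",
  nested: { password: "p", list: [{ api_key: "k", ok: 1 }] },
});
```

The sketch assumes JSON-like input (plain objects, arrays, primitives); values such as `Date` instances would need special handling. Redacting before the entry is written means the audit table never holds the secret at all.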