# Changelog

All notable changes to Void 2.0 are documented here.
Format: [Keep a Changelog](https://keepachangelog.com).

## [2.0.0-alpha.7] — 2026-06-02

### Security & hardening

- **`pending_changes.action` CHECK fix** (migration 009): `upsert` is now a valid
  suggestion action (approval dispatches to `refsRepo.upsertByExternal`); resource
  dependency mutations (`add_dependency`/`remove_dependency`) are now owner-only.
- **Constant-time owner-token comparison** (`lib/auth/safe_compare.js`) — replaces
  `===`, closing a timing side-channel on `OWNER_TOKEN`.
- **O(1) token verification** (migration 010): selector+verifier split replaces the
  O(n) bcrypt scan over all tokens. New format `vk_<selector>.<verifier>`; legacy
  tokens still verify. Dropped the useless `idx_agent_tokens_hash` index.
- **`pool.js` error handler** — an idle pooled-client error no longer crashes the
  process.
- **`context` tool** projects a safe column allow-list for resources (no
  `monitoring`/`metadata` blobs); now also handles `resource` views.
- **`applyPendingChange`** guards the `upsert` arm and raises a clear
  `ValidationError` when the guard fails.
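
The constant-time comparison and the selector+verifier lookup above can be sketched roughly as follows. This is a minimal illustration, not the code in `lib/auth/safe_compare.js` or migration 010: `safeCompare`, `parseToken`, `verifyToken`, the in-memory `store`, and the use of SHA-256 for the stored verifier hash are all assumptions made for the example, and the legacy-token path is not shown.

```javascript
const crypto = require('node:crypto');

// Constant-time string comparison. Hashing both inputs first yields
// equal-length buffers, which crypto.timingSafeEqual requires.
function safeCompare(a, b) {
  const ha = crypto.createHash('sha256').update(String(a)).digest();
  const hb = crypto.createHash('sha256').update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Hypothetical parser for the `vk_<selector>.<verifier>` format.
function parseToken(token) {
  const match = /^vk_([^.]+)\.(.+)$/.exec(token);
  return match ? { selector: match[1], verifier: match[2] } : null;
}

// Hypothetical O(1) verification: fetch one row by selector, then check the
// verifier. `store` is a Map standing in for an indexed table; storing a
// SHA-256 digest of the verifier is an assumption of this sketch.
function verifyToken(token, store) {
  const parsed = parseToken(token);
  if (!parsed) return false; // legacy tokens would take a separate path (not shown)
  const row = store.get(parsed.selector);
  if (!row) return false;
  const digest = crypto.createHash('sha256').update(parsed.verifier).digest('hex');
  return safeCompare(digest, row.verifierHash);
}
```

The selector narrows the search to a single row, so only one hash comparison runs per request regardless of how many tokens exist.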

### Added (Yerin — security agent)

- Read-only `securityRegistry` (`lib/ai/agent/tools/security/`) with five tools:
  `audit_log`, `agent_inventory`, `pending_review`, `resource_exposure`,
  `token_audit` — no secret material in any output.
- Migration 011 seeds the read-only `yerin` agent.
- The stdio MCP server selects its toolset via `VOID_TOOL_REGISTRY`
  (`security` → Yerin's tools; default → Dross's companion tools).
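
The `VOID_TOOL_REGISTRY` switch described above might look something like this. It is a sketch only: `selectRegistry` and the plain-object registries are hypothetical stand-ins, and the real registries hold full tool definitions rather than names.

```javascript
// Hypothetical stand-ins for the two registries named in this changelog.
const securityRegistry = {
  name: 'security',
  tools: ['audit_log', 'agent_inventory', 'pending_review', 'resource_exposure', 'token_audit'],
};
const companionRegistry = {
  name: 'companion',
  tools: ['search', 'read', 'context', 'propose_change'],
};

// `security` selects Yerin's read-only tools; any other value (or none)
// falls back to the companion tools.
function selectRegistry(env = process.env) {
  return env.VOID_TOOL_REGISTRY === 'security' ? securityRegistry : companionRegistry;
}
```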

## [2.0.0-alpha.6] — 2026-06-01

### Changed (Plan 5b: companion backend → Claude CLI subprocess)

- **Companion model backend switched from the Anthropic API to the `claude`
  CLI subprocess**, authenticated by the owner's **Claude Max subscription**
  (no API key — the Agent SDK can't use subscription auth headlessly, and Max
  doesn't issue API keys). Mirrors Void 1.0's `lib/agent.js`: spawn `claude`
  with `ANTHROPIC_API_KEY`/`ANTHROPIC_AUTH_TOKEN` stripped so it uses the
  logged-in subscription. The CLI owns the agentic loop; the four companion
  tools are exposed to it via a local **stdio MCP server** (`lib/mcp/`).
- `lib/ai/claude_cli.js` spawns `claude --print --output-format stream-json
  --include-partial-messages --append-system-prompt … (--session-id | --resume)
  --mcp-config … --strict-mcp-config --tools … --allowedTools …` and maps stream-json
  events to `{delta,tool,tool_result,result,error}`. The prompt is fed via **stdin**
  because the variadic `--tools` flag would otherwise consume a positional prompt
  argument. Multi-turn continuity comes from `--resume`.
- `lib/mcp/companion-stdio.js` — stdio MCP server re-exposing `companionRegistry`;
  per-turn Space/agent context passed via env in the `--mcp-config`.
- `propose_change` now stamps the current Space onto created space-scoped
  entities (model can't know the Space uuid).
- CT 311 runs the `claude` CLI (logged in as `void`, `HOME=/var/lib/void`).
- Built-in CLI tools (Bash/Read/Write/…) disabled via `--tools`; the companion
  has only the four `mcp__void__*` tools.
- The old `@anthropic-ai/sdk` API-key path (`lib/ai/anthropic.js`, `runTurn`)
  is retained in-tree but is no longer the companion's execution path.
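
The credential stripping and stdin prompt delivery could be sketched as below. This is not the contents of `lib/ai/claude_cli.js`: `buildClaudeEnv` and `runClaude` are hypothetical names, and the CLI arguments are passed in by the caller rather than spelled out here.

```javascript
const { spawn } = require('node:child_process');

// Copy the environment without the API credentials so the CLI falls back to
// the logged-in subscription.
function buildClaudeEnv(env = process.env) {
  const { ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN, ...rest } = env;
  return rest;
}

// Hypothetical launcher: `args` is whatever flag list the caller assembles.
function runClaude(prompt, args, { command = 'claude', env = process.env } = {}) {
  const child = spawn(command, args, {
    env: buildClaudeEnv(env),
    stdio: ['pipe', 'pipe', 'inherit'],
  });
  // The prompt goes via stdin, not as a positional argument.
  child.stdin.end(prompt);
  return child;
}
```

Stripping the variables in a copy, rather than deleting them from `process.env`, leaves the server's own environment untouched.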

## [2.0.0-alpha.5] — 2026-06-01

### Added (Plan 5: Companion chat)

- **Right-rail companion chat** — an always-visible, per-Space AI assistant.
  Label-led turns (YOU / Companion) with left/right alignment, live
  tool-activity chips, streamed answers (markdown via DOMPurify), and inline
  approve/reject draft cards. Loads its space's history on first paint via the
  `space-active` state event.
- **Lean agent runtime** (`lib/ai/agent/runtime.js`) on the Anthropic SDK
  directly — no Mastra. `runTurn` drives a tool-use loop (max-iteration
  guarded), streams text deltas, and emits `tool` / `delta` / `draft` events.
  `callModel` is injectable (the SSE endpoint takes a fake in tests, so the
  suite never hits the network).
- **Extensible shared tool registry** (`lib/ai/agent/registry.js`) with four
  v1 tools: `search` (hybrid FTS), `read`, `context` (resolves the active
  view), and `propose_change`. Adding a tool is a one-line `registerTool` call;
  a future MCP server re-exposes the same definitions.
- **`propose_change` never applies** — it only writes a `pending_changes` row,
  capability-gated via `canAct` (default `suggest`). Prompt-injection
  containment is structural: a poisoned document can at most produce a draft
  the owner must approve. Drafts render inline in chat AND in the Inbox (same
  row; approving from either flips it).
- **Companion API** — `GET /api/spaces/:id/companion` (history) and
  `POST /api/spaces/:id/companion/turn` (SSE). One ambient conversation per
  Space (`conversations.space_id` via migration 007); one assistant message
  per turn with the tool trace + draft ids in `metadata`.
- **`@anthropic-ai/sdk`** dependency; key resolved via the `env:`/`file:`
  `vault_path` resolver (`lib/ai/secret.js`) — Vaultwarden swap still deferred.
- Default model `claude-sonnet-4-6`, overridable per-agent (`agents.model`)
  and via `ANTHROPIC_MODEL` — the seam for scope-C local personas.
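
To illustrate the registry and the "never applies" rule for `propose_change`, here is a rough sketch. The `registerTool` signature, the handler shape, the `callTool` helper, and the in-memory `pendingChanges` array are all assumptions; the real registry in `lib/ai/agent/registry.js` may differ.

```javascript
// Hypothetical registry keyed by tool name.
const tools = new Map();

function registerTool(def) {
  if (tools.has(def.name)) throw new Error(`duplicate tool: ${def.name}`);
  tools.set(def.name, def);
}

// In-memory stand-in for the `pending_changes` table.
const pendingChanges = [];

registerTool({
  name: 'propose_change',
  description: 'Draft a change for owner approval; never applies it.',
  // The handler only records a draft row; the target entity is not touched.
  handler: ({ entity_type, action, payload }, ctx) => {
    const row = {
      id: pendingChanges.length + 1,
      agent_id: ctx.agentId,
      entity_type,
      action,
      payload,
      status: 'pending',
    };
    pendingChanges.push(row);
    return { draft_id: row.id };
  },
});

// Look up a tool by name and run it with the per-turn context.
function callTool(name, input, ctx) {
  const tool = tools.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool.handler(input, ctx);
}
```

Because the only write path is a draft row, a poisoned document can influence what is proposed but not what is applied.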

## [2.0.0-alpha.4] — 2026-06-01

### Added (Plan 4: Python void-workers)

- **`void-workers.service`** — Python 3.13 service alongside `void-server`
  on CT 311. psycopg-based pg-boss client matches Node's claim/finish
  semantics via `SELECT ... FOR UPDATE SKIP LOCKED`. Forces
  `client_encoding=UTF8` on every connection (void2-db cluster is
  SQL_ASCII).
- **`extract.pdf`** — `pdftotext -layout` first; per-page `pdftoppm`
  rasterization + Tesseract OCR fallback when extraction yields
  < 200 chars.
- **`extract.image`** — Tesseract OCR (English) for images stored in
  the blob store.
- **`ingest.video`** — `yt-dlp` metadata + audio extract + faster-whisper
  (`small.en` default). Uses CUDA at startup and falls back to CPU after an HA
  failover to Z3 (no GPU). URLs are validated as http(s), and a `--` separator
  is passed to yt-dlp to defeat argv smuggling.
- **`sync.source_doc`** — fetches `upstream_url` via Python `safe_fetch`
  (port of the Node helper), sha256-diffs the result against the prior `body_sha`
  in metadata, and updates `body_text` only when the content changed.
- **Node `blob.js`** fans out to `extract.pdf` / `extract.image` after
  creating PDF / image refs.
- **Node `capture.js`** routes `youtube.com` / `youtu.be` / `vimeo.com`
  URLs to `ingest.video` instead of `ingest.url`.
- **Daily cron** (`lib/cron/sync_source_docs.js`) enqueues
  `sync.source_doc` jobs at 03:00 local for every `source_docs` row
  with `sync_source='url'`.
- **CT 311 infrastructure**: resized to 6 cores / 8 GB RAM, NVIDIA
  RTX A2000 device-nodes passed through (shared with CT 102's Ollama).
- **`deploy/push-workers.sh`** + `deploy/void-workers.service` — push
  the workers package, chown to `voidworkers`, recreate the venv, install
  deps under `su voidworkers -c`, restart the unit.
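
The `capture.js` routing rule, combined with the http(s) check, might be approximated like this. `pickIngestJob` is a hypothetical helper, and treating subdomains such as `www.youtube.com` as video hosts is an assumption of the sketch.

```javascript
const VIDEO_HOSTS = ['youtube.com', 'youtu.be', 'vimeo.com'];

// Returns the job name for a captured URL, or null when the URL is not
// a valid http(s) URL.
function pickIngestJob(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return null;
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;

  // Match the bare host or any subdomain of it (assumption).
  const host = url.hostname;
  const isVideo = VIDEO_HOSTS.some((h) => host === h || host.endsWith(`.${h}`));
  return isVideo ? 'ingest.video' : 'ingest.url';
}
```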

## [2.0.0-alpha.3] — 2026-06-01

### Added (Plan 3: Capture pipeline + hybrid search)
- **pg-boss job queue** embedded in void-server (Node). Queue tables live
  alongside Void's in the shared void2-db. Tests manage their own boss
  lifecycle via `stopBoss()` / `waitForJob()` helpers.
- `/api/jobs` (owner-only) — list / get / retry / delete with state and
  name filters. Minimal `#/jobs` SPA view fronts it, polling every 10 s.
- **`/api/capture`** POST — URL → `ingest.url` job. Idempotent by
  `sha256(space_id + url)` stored as `refs.external_id`; duplicate POST
  returns the existing `ref_id`.
- **`/api/capture/upload`** — multipart file → `ingest.blob` job →
  content-addressed `/var/lib/void/blobs/<sha-prefix>/<sha>` →
  `refs` row. Drag-drop in the SPA wired to the main panel; `space_id`
  pre-filled from the last-viewed space.
- **`ingest.url` worker** — `@mozilla/readability` + `jsdom` extract;
  fetch protected by `lib/ingest/safe_fetch.js` (SSRF mitigations:
  http(s) only; DNS-resolved hostnames checked against loopback /
  RFC1918 / link-local / CGNAT / metadata; resolved IP pinned via an
  undici dispatcher to defeat DNS rebinding; redirects re-validated).
- **`ingest.blob` worker** — content-addressed storage,
  image/pdf/file kind classification.
- **`embed.text` worker** — Ollama `nomic-embed-text` (768 dims) padded
  to `vector(1024)`; emits a `worker`-actor audit log entry.
- **Repo-level triggers** — pages/refs/source_docs `create` and
  `update` enqueue an `embed.text` job with a singleton key so rapid
  edits coalesce. No-op when the queue is not running (tests).
- **Hybrid `/api/search`** — FTS + pgvector ANN unioned with reciprocal
  rank fusion (k=60). The vector branch is silently skipped when Ollama times
  out, degrading gracefully to FTS-only results.
- **`/api/ingest/karakeep`** — HMAC-verified webhook. Enqueues
  `ingest.karakeep` for `bookmark.created`; worker fetches the bookmark
  via Karakeep's API, normalizes to a `refs` row tagged
  `source_kind='karakeep'`.
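
For reference, reciprocal rank fusion with k=60 can be sketched in a few lines. This shows the general technique rather than the query behind `/api/search`; `reciprocalRankFusion` and the list-of-ids input shape are assumptions.

```javascript
// Reciprocal rank fusion: each document scores the sum of 1 / (k + rank)
// over every ranked list it appears in, with rank starting at 1.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      scores.set(id, (scores.get(id) || 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Example: `b` appears in both lists, so it outranks `a`.
const fused = reciprocalRankFusion([['a', 'b', 'c'], ['b', 'd']]);
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

Passing an empty list for the vector branch leaves the FTS order unchanged, which matches the degrade behaviour described above.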

### Deferred (Plan 4+)

- Python `void-workers` service for Whisper / Tesseract OCR / yt-dlp
  (heavy ML).
- AI Space/Project suggestion on capture.
- Embedding chunks table (whole-doc embedding only in Plan 3).
- pdftotext for born-digital PDFs.
- `pg LISTEN/NOTIFY` real-time Jobs UI.

## [2.0.0-alpha.2] — 2026-06-01

### Added (Plan 2: API surface + UI shell)
- REST routes for the full entity tree:
  - `/api/spaces`, `/api/projects`, `/api/tasks` (with project + space scoping)
  - `/api/pages` + page revisions + `/api/pages/:id/backlinks`
  - `/api/refs` + `/api/refs/upsert`
  - `/api/resources` + dependencies + change history
  - `/api/resources/:id/source-docs` + `/api/source-docs/:id/resync` (gated by `ENABLE_RESYNC`)
  - `/api/agents` (owner-only) + agent token mint/revoke
  - `/api/conversations` + nested `/messages`
  - `/api/tags` + entity-scoped attach/detach via `/api/:entity_type/:entity_id/tags`
  - `/api/links` (POST, GET by `from`/`to`, DELETE) for polymorphic entity links
  - `/api/pending-changes` + approve/reject with dispatch table covering
    page/project/task/ref/resource/source_doc × create/update/delete
  - `/api/audit/entity/:type/:id` + `/api/audit/actor`
  - `/api/search` unified FTS across pages, refs, source docs, messages
- Agent bearer auth middleware + capability tiering: owner → allow, agent
  `write+scope` → allow, agent `suggest` → 202 + pending row, anything else → 403.
- Approve and reject emit explicit `approve` / `reject` entries in the
  audit log with the original agent id preserved in the diff.
- Static SPA shell served from `public/`:
  - Three-column Cradle aesthetic (blackflame palette, Cinzel display
    headings, Cormorant Garamond body)
  - Hash-based router with views for home / space / project / page /
    reference / resource / search / inbox / sacred valley
  - `dom.js` safe builders — no `innerHTML` on API data anywhere; the
    explicit `html:` opt-in is used only by the markdown editor's
    preview pane, which sanitizes with DOMPurify
  - Sidebar Spaces tree with lazy project expansion, bottom Navigate
    section, pending-count badge shared with the topbar bell via a tiny
    `state.js` event bus
  - Topbar: brand, capture modal stub, global search (Enter →
    `#/search?q=`), pending bell, owner toggle
  - Page editor: split-pane markdown via marked + DOMPurify, save
    PATCHes `/api/pages/:id`, backlinks card
  - Reference detail: media block (image / YouTube embed / link),
    summary, metadata table, tag attach/detach, linked-from list
  - Resource detail: status header, dependencies + source docs +
    runbook pages columns, change history
  - Inbox: pending changes grouped by agent, approve → navigate to the
    resulting entity
- Test coverage: 185 tests across 43 files (113 new for Plan 2 routes,
  search, and the `GET /` shell smoke test).
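
The capability tiering in the auth middleware can be read as a small decision function. The sketch below is illustrative only: `decideAccess`, the `actor` shape, and the reading of `write+scope` as "write capability plus a matching scope" are assumptions.

```javascript
// Maps an actor and the requested scope to an HTTP outcome.
function decideAccess(actor, scope) {
  if (actor.kind === 'owner') return { status: 200, outcome: 'allow' };

  if (actor.kind === 'agent') {
    const inScope = Array.isArray(actor.scopes) && actor.scopes.includes(scope);
    // Write agents act directly, but only inside their granted scope.
    if (actor.capability === 'write' && inScope) return { status: 200, outcome: 'allow' };
    // Suggest-tier agents get a pending row instead of a direct write.
    if (actor.capability === 'suggest') return { status: 202, outcome: 'pending' };
  }

  return { status: 403, outcome: 'deny' };
}
```

In this sketch a write-tier agent acting outside its scope is denied rather than diverted to a pending row; the actual middleware may choose differently.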

### Security follow-ups (deferred)
- Polymorphic IDOR risk on entity_links / entity_tags / attachments —
  acceptable today since the entire API is owner-token gated and there
  is one tenant; see `docs/security-followups.md` for the tighten-now
  vs defer decision.
- `pending_changes.action` CHECK constraint blocks `'upsert'` /
  `'add_dependency'` / `'remove_dependency'` actions emitted by some
  routes' `divertToPending` paths. Latent — only fires when an agent at
  suggest tier hits those specific endpoints. Mitigation options
  documented in `docs/security-followups.md`.

## [Unreleased]

### Added
- Initial repo scaffolding

### Added (Plan 1: Foundation)
- LXC provisioning for `void2-db` (Postgres 16 + pgvector) and `void2-app`
- Schema migrations 001-006 covering core, knowledge, resources, agents, cross-cutting, audit
- Repos with capability-checked `actor` parameter and audit trail
- Real audit log with redaction of sensitive keys (token, password, api_key, etc.)
- `pending_changes` table for agent suggestions awaiting owner approval
- Capability check module (allow / suggest / deny) for user vs agent actors
- Owner-token bearer auth
- Express server with `/health` and smoke `/api/spaces`
- Test coverage: 72 tests across migrations, repos, capability, owner middleware, server
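
Audit-log redaction of sensitive keys could work along these lines. This is a sketch under assumptions: `redact`, the substring match on key names, and the `[REDACTED]` placeholder are hypothetical, and the key list covers only the three names mentioned above. It handles plain JSON-like values only.

```javascript
// Only the keys named in this changelog; the real list is longer.
const SENSITIVE_KEYS = ['token', 'password', 'api_key'];

function isSensitive(key) {
  const lower = key.toLowerCase();
  return SENSITIVE_KEYS.some((s) => lower.includes(s));
}

// Recursively copies a JSON-like value, masking values under sensitive keys.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, isSensitive(k) ? '[REDACTED]' : redact(v)]),
    );
  }
  return value;
}
```

Redacting a copy before the row is written keeps secrets out of the log without mutating the caller's object.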