# Changelog
All notable changes to Void 2.0 are documented here.
Format: [Keep a Changelog](https://keepachangelog.com).
## [2.0.0-alpha.7] — 2026-06-02
### Security & hardening
- **`pending_changes.action` CHECK fix** (migration 009): `upsert` is now a valid
suggestion action (approval dispatches to `refsRepo.upsertByExternal`); resource
dependency mutations (`add_dependency`/`remove_dependency`) are now owner-only.
- **Constant-time owner-token comparison** (`lib/auth/safe_compare.js`) — replaces
`===`, closing a timing side-channel on `OWNER_TOKEN`.
- **O(1) token verification** (migration 010): selector+verifier split replaces the
O(n) bcrypt scan over all tokens. New format `vk_<selector>.<verifier>`; legacy
tokens still verify. Dropped the useless `idx_agent_tokens_hash`.
- **`pool.js` error handler** — an idle pooled-client error no longer crashes the
process.
- **`context` tool** projects a safe column allow-list for resources (no
`monitoring`/`metadata` blobs); now also handles `resource` views.
- **`applyPendingChange`** guards the `upsert` arm (clear `ValidationError`).
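
The selector+verifier split and the constant-time comparison above can be sketched together. This is a minimal illustration, not the code in `lib/auth/safe_compare.js` or migration 010: the in-memory `Map`, the function names, the random lengths, and the use of SHA-256 for the verifier hash are all assumptions made for the example.

```javascript
const crypto = require('node:crypto');

// Hypothetical stand-in for the token table, keyed by selector.
const tokensBySelector = new Map();

// Assumption: SHA-256 is used here only to keep the sketch short.
function hashVerifier(verifier) {
  return crypto.createHash('sha256').update(verifier, 'utf8').digest();
}

// Constant-time string comparison. Both sides are hashed first so the
// buffers have equal length, which crypto.timingSafeEqual requires.
function safeCompare(a, b) {
  const left = crypto.createHash('sha256').update(String(a), 'utf8').digest();
  const right = crypto.createHash('sha256').update(String(b), 'utf8').digest();
  return crypto.timingSafeEqual(left, right);
}

// Mint a token in the `vk_<selector>.<verifier>` format.
function mintToken(agentId) {
  const selector = crypto.randomBytes(8).toString('hex');
  const verifier = crypto.randomBytes(24).toString('hex');
  tokensBySelector.set(selector, { agentId, verifierHash: hashVerifier(verifier) });
  return `vk_${selector}.${verifier}`;
}

// One keyed lookup by selector, then one constant-time verifier check.
function verifyToken(token) {
  const match = /^vk_([0-9a-f]+)\.([0-9a-f]+)$/.exec(String(token));
  if (!match) return null;
  const row = tokensBySelector.get(match[1]);
  if (!row) return null;
  const ok = crypto.timingSafeEqual(row.verifierHash, hashVerifier(match[2]));
  return ok ? row.agentId : null;
}
```

The selector identifies the row without revealing anything secret, so the expensive or sensitive comparison runs at most once per request instead of once per stored token.
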
### Added (Yerin — security agent)
- Read-only `securityRegistry` (`lib/ai/agent/tools/security/`) with five tools:
`audit_log`, `agent_inventory`, `pending_review`, `resource_exposure`,
`token_audit` — no secret material in any output.
- Migration 011 seeds the read-only `yerin` agent.
- The stdio MCP server selects its toolset via `VOID_TOOL_REGISTRY`
(`security` → Yerin's tools; default → Dross's companion tools).
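
As a rough sketch of the registry switch, the selection could look like the following. The tool names come from this changelog; `selectToolNames` and the idea of returning plain name lists are hypothetical simplifications of whatever the stdio server actually does.

```javascript
// Tool names as listed in this changelog; handlers are omitted.
const securityTools = [
  'audit_log',
  'agent_inventory',
  'pending_review',
  'resource_exposure',
  'token_audit',
];
const companionTools = ['search', 'read', 'context', 'propose_change'];

// Hypothetical helper: any value other than 'security' falls back to the
// companion toolset, matching the "default" wording above.
function selectToolNames(env) {
  return env.VOID_TOOL_REGISTRY === 'security' ? securityTools : companionTools;
}
```
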
## [2.0.0-alpha.6] — 2026-06-01
### Changed (Plan 5b: companion backend → Claude CLI subprocess)
- **Companion model backend switched from the Anthropic API to the `claude`
CLI subprocess**, authenticated by the owner's **Claude Max subscription**
(no API key — the Agent SDK can't use subscription auth headlessly, and Max
doesn't issue API keys). Mirrors Void 1.0's `lib/agent.js`: spawn `claude`
with `ANTHROPIC_API_KEY`/`ANTHROPIC_AUTH_TOKEN` stripped so it uses the
logged-in subscription. The CLI owns the agentic loop; the four companion
tools are exposed to it via a local **stdio MCP server** (`lib/mcp/`).
- `lib/ai/claude_cli.js` — spawns `claude --print --output-format stream-json
--include-partial-messages --append-system-prompt … (--session-id | --resume)
--mcp-config … --strict-mcp-config --tools … --allowedTools …`, maps stream-json
→ `{delta,tool,tool_result,result,error}`. The prompt is fed via **stdin**,
because the variadic `--tools` flag would otherwise consume a positional prompt
argument. Multi-turn continuity comes from `--resume`.
- `lib/mcp/companion-stdio.js` — stdio MCP server re-exposing `companionRegistry`;
per-turn Space/agent context passed via env in the `--mcp-config`.
- `propose_change` now stamps the current Space onto created space-scoped
entities (the model cannot know the Space UUID).
- CT 311 runs the `claude` CLI (logged in as `void`, `HOME=/var/lib/void`).
- Built-in CLI tools (Bash/Read/Write/…) disabled via `--tools`; the companion
has only the four `mcp__void__*` tools.
- The old `@anthropic-ai/sdk` API-key path (`lib/ai/anthropic.js`, `runTurn`)
is retained in-tree but is no longer the companion's execution path.
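
The environment stripping described above might be sketched as follows. `subscriptionEnv` and `spawnClaude` are hypothetical names, and the CLI arguments are left to the caller rather than reproduced here; only the two stripped variable names and the stdin prompt come from this changelog.

```javascript
const { spawn } = require('node:child_process');

// Variables removed so the CLI uses the logged-in subscription.
const STRIPPED_KEYS = ['ANTHROPIC_API_KEY', 'ANTHROPIC_AUTH_TOKEN'];

// Return a copy of the parent environment without the API credentials.
// The input object is not mutated.
function subscriptionEnv(parentEnv) {
  const env = { ...parentEnv };
  for (const key of STRIPPED_KEYS) delete env[key];
  return env;
}

// Hypothetical wrapper: `args` is supplied by the caller, and the prompt
// is written to stdin instead of being passed as a positional argument.
function spawnClaude(args, prompt, parentEnv = process.env) {
  const child = spawn('claude', args, {
    env: subscriptionEnv(parentEnv),
    stdio: ['pipe', 'pipe', 'pipe'],
  });
  child.stdin.end(prompt);
  return child;
}
```

A caller would normally attach an `error` listener to the returned child, since a missing `claude` binary surfaces as an asynchronous `error` event.
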
## [2.0.0-alpha.5] — 2026-06-01
### Added (Plan 5: Companion chat)
- **Right-rail companion chat** — an always-visible, per-Space AI assistant.
Label-led turns (YOU / Companion) with left/right alignment, live
tool-activity chips, streamed answers (markdown via DOMPurify), and inline
approve/reject draft cards. Loads its space's history on first paint via the
`space-active` state event.
- **Lean agent runtime** (`lib/ai/agent/runtime.js`) on the Anthropic SDK
directly — no Mastra. `runTurn` drives a tool-use loop (max-iteration
guarded), streams text deltas, and emits `tool` / `delta` / `draft` events.
`callModel` is injectable (the SSE endpoint takes a fake in tests, so the
suite never hits the network).
- **Extensible shared tool registry** (`lib/ai/agent/registry.js`) with four
v1 tools: `search` (hybrid FTS), `read`, `context` (resolves the active
view), and `propose_change`. Adding a tool is a one-line `registerTool` call;
a future MCP server can re-expose the same definitions.
- **`propose_change` never applies** — it only writes a `pending_changes` row,
capability-gated via `canAct` (default `suggest`). Prompt-injection
containment is structural: a poisoned document can at most produce a draft
the owner must approve. Drafts render inline in chat AND in the Inbox (same
row; approving from either flips it).
- **Companion API** — `GET /api/spaces/:id/companion` (history) and
`POST /api/spaces/:id/companion/turn` (SSE). One ambient conversation per
Space (`conversations.space_id` via migration 007); one assistant message
per turn with the tool trace + draft ids in `metadata`.
- **`@anthropic-ai/sdk`** dependency; key resolved via the `env:`/`file:`
`vault_path` resolver (`lib/ai/secret.js`) — Vaultwarden swap still deferred.
- Default model `claude-sonnet-4-6`, overridable per-agent (`agents.model`)
and via `ANTHROPIC_MODEL` — the seam for scope-C local personas.
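
To illustrate the "never applies" property of `propose_change`, here is a small sketch under stated assumptions. The in-memory `pendingChanges` array stands in for the `pending_changes` table, and the shapes of `registerTool`, `canAct`, and the tool definition are hypothetical; only the names and the default `suggest` tier come from this changelog.

```javascript
const tools = new Map();
const pendingChanges = []; // stand-in for the pending_changes table

// Register a tool definition under its name.
function registerTool(def) {
  tools.set(def.name, def);
}

// Hypothetical capability check: 'allow', 'suggest', or 'deny'.
// Agents without an explicit capability default to 'suggest'.
function canAct(agent) {
  return agent.capability || 'suggest';
}

registerTool({
  name: 'propose_change',
  run(agent, change) {
    if (canAct(agent) === 'deny') {
      throw new Error('agent may not propose changes');
    }
    // Only a draft row is recorded; no entity is modified here.
    const draft = {
      id: pendingChanges.length + 1,
      agentId: agent.id,
      status: 'pending',
      change,
    };
    pendingChanges.push(draft);
    return { draftId: draft.id };
  },
});
```

Because the tool's only side effect is a pending row, a poisoned document that steers the model can produce a draft at most, and the owner still decides whether it is applied.
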
## [2.0.0-alpha.4] — 2026-06-01
### Added (Plan 4: Python void-workers)
- **`void-workers.service`** — Python 3.13 service alongside `void-server`
on CT 311. psycopg-based pg-boss client matches Node's claim/finish
semantics via `SELECT ... FOR UPDATE SKIP LOCKED`. Forces
`client_encoding=UTF8` on every connection (void2-db cluster is
SQL_ASCII).
- **`extract.pdf`** — `pdftotext -layout` first; per-page `pdftoppm`
rasterization + Tesseract OCR fallback when extraction yields
fewer than 200 characters.
- **`extract.image`** — Tesseract OCR (English) for images stored in
the blob store.
- **`ingest.video`** — `yt-dlp` metadata + audio extract + faster-whisper
(`small.en` default). Uses CUDA at startup and falls back to CPU when HA
fails over to Z3 (no GPU). URLs are validated as http(s), and a `--`
separator is passed to yt-dlp to defeat argv smuggling.
- **`sync.source_doc`** — fetches `upstream_url` via Python `safe_fetch`
(a port of the Node helper), compares its sha256 against the prior `body_sha`
in metadata, and updates `body_text` only when the content changed.
- **Node `blob.js`** fans out to `extract.pdf` / `extract.image` after
creating PDF / image refs.
- **Node `capture.js`** routes `youtube.com` / `youtu.be` / `vimeo.com`
URLs to `ingest.video` instead of `ingest.url`.
- **Daily cron** (`lib/cron/sync_source_docs.js`) enqueues
`sync.source_doc` jobs at 03:00 local for every `source_docs` row
with `sync_source='url'`.
- **CT 311 infrastructure**: resized to 6 cores / 8 GB RAM, NVIDIA
RTX A2000 device-nodes passed through (shared with CT 102's Ollama).
- **`deploy/push-workers.sh`** + `deploy/void-workers.service` — push
the workers package, chown to `voidworkers`, recreate the venv, install
deps under `su voidworkers -c`, restart the unit.
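
The capture routing between `ingest.url` and `ingest.video` could be sketched like this. `jobNameForUrl` is a hypothetical name, and stripping a leading `www.` or `m.` before matching is an assumption; the three hosts and the two job names come from this changelog.

```javascript
const VIDEO_HOSTS = new Set(['youtube.com', 'youtu.be', 'vimeo.com']);

// Pick the job name for a captured URL.
function jobNameForUrl(rawUrl) {
  const url = new URL(rawUrl); // throws on malformed input
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error('only http(s) URLs are accepted');
  }
  // Assumption: common mobile and www subdomains count as the same host.
  const host = url.hostname.replace(/^(www|m)\./, '');
  return VIDEO_HOSTS.has(host) ? 'ingest.video' : 'ingest.url';
}
```

Matching on the parsed hostname rather than on a substring of the raw URL keeps a host such as `notyoutube.com` on the `ingest.url` path.
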
## [2.0.0-alpha.3] — 2026-06-01
### Added (Plan 3: Capture pipeline + hybrid search)
- **pg-boss job queue** embedded in void-server (Node). Queue tables live
alongside Void's in the shared void2-db. Tests manage their own boss
lifecycle via `stopBoss()` / `waitForJob()` helpers.
- `/api/jobs` (owner-only) — list / get / retry / delete with state and
name filters. Minimal `#/jobs` SPA view fronts it, polling every 10 s.
- **`/api/capture`** POST — URL → `ingest.url` job. Idempotent by
`sha256(space_id + url)` stored as `refs.external_id`; duplicate POST
returns the existing `ref_id`.
- **`/api/capture/upload`** — multipart file → `ingest.blob` job →
content-addressed `/var/lib/void/blobs/<sha-prefix>/<sha>` →
`refs` row. Drag-drop in the SPA wired to the main panel; `space_id`
pre-filled from the last-viewed space.
- **`ingest.url` worker** — `@mozilla/readability` + `jsdom` extract;
fetch protected by `lib/ingest/safe_fetch.js` (SSRF mitigations:
http(s) only; DNS-resolved hostnames checked against loopback /
RFC1918 / link-local / CGNAT / metadata; resolved IP pinned via an
undici dispatcher to defeat DNS rebinding; redirects re-validated).
- **`ingest.blob` worker** — content-addressed storage,
image/pdf/file kind classification.
- **`embed.text` worker** — Ollama `nomic-embed-text` (768 dims) padded
to `vector(1024)`; emits a `worker`-actor audit log entry.
- **Repo-level triggers** — pages/refs/source_docs `create` and
`update` enqueue an `embed.text` job with a singleton key so rapid
edits coalesce. No-op when the queue is not running (tests).
- **Hybrid `/api/search`** — FTS + pgvector ANN unioned with reciprocal
rank fusion (k=60). The vector branch is skipped silently when Ollama
times out, so search degrades gracefully to FTS-only results.
- **`/api/ingest/karakeep`** — HMAC-verified webhook. Enqueues
`ingest.karakeep` for `bookmark.created`; worker fetches the bookmark
via Karakeep's API, normalizes to a `refs` row tagged
`source_kind='karakeep'`.
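
Reciprocal rank fusion with k=60, as used by the hybrid search entry above, can be sketched in a few lines. The function name and the input shape (arrays of ids ordered best-first) are assumptions for illustration; the real query likely fuses inside SQL or over richer rows.

```javascript
// Fuse several best-first rankings. Each id scores 1 / (k + rank) per list
// it appears in, with rank starting at 1, and the scores are summed.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Example: an id found by both branches outranks ids found by only one.
const fts = ['a', 'b', 'c'];
const vector = ['b', 'd'];
const fused = reciprocalRankFusion([fts, vector]).map((row) => row.id);
```

Passing only the FTS ranking returns it in its original order, which is the degraded mode described for an Ollama timeout.
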
### Deferred (Plan 4+)
- Python `void-workers` service for Whisper / Tesseract OCR / yt-dlp
(heavy ML).
- AI Space/Project suggestion on capture.
- Embedding chunks table (whole-doc embedding only in Plan 3).
- pdftotext for born-digital PDFs.
- `pg LISTEN/NOTIFY` real-time Jobs UI.
## [2.0.0-alpha.2] — 2026-06-01
### Added (Plan 2: API surface + UI shell)
- REST routes for the full entity tree:
  - `/api/spaces`, `/api/projects`, `/api/tasks` (with project + space scoping)
  - `/api/pages` + page revisions + `/api/pages/:id/backlinks`
  - `/api/refs` + `/api/refs/upsert`
  - `/api/resources` + dependencies + change history
  - `/api/resources/:id/source-docs` + `/api/source-docs/:id/resync` (gated by `ENABLE_RESYNC`)
  - `/api/agents` (owner-only) + agent token mint/revoke
  - `/api/conversations` + nested `/messages`
  - `/api/tags` + entity-scoped attach/detach via `/api/:entity_type/:entity_id/tags`
  - `/api/links` (POST/GET from|to/DELETE) for polymorphic entity links
  - `/api/pending-changes` + approve/reject with dispatch table covering
    page/project/task/ref/resource/source_doc × create/update/delete
  - `/api/audit/entity/:type/:id` + `/api/audit/actor`
  - `/api/search` unified FTS across pages, refs, source docs, messages
- Agent bearer auth middleware + capability tiering: owner allow, agent
`write+scope` → allow, agent `suggest` → 202 + pending row, else 403.
- Approve and reject emit explicit `approve` / `reject` entries in the
audit log with the original agent id preserved in the diff.
- Static SPA shell served from `public/`:
  - Three-column Cradle aesthetic (blackflame palette, Cinzel display
    headings, Cormorant Garamond body)
  - Hash-based router with views for home / space / project / page /
    reference / resource / search / inbox / sacred valley
  - `dom.js` safe builders: no `innerHTML` on API data anywhere; the
    explicit `html:` opt-in is used only by the markdown editor's
    preview pane, which sanitizes with DOMPurify
  - Sidebar Spaces tree with lazy project expansion, bottom Navigate
    section, pending-count badge shared with the topbar bell via a tiny
    `state.js` event bus
  - Topbar: brand, capture modal stub, global search (Enter →
    `#/search?q=`), pending bell, owner toggle
  - Page editor: split-pane markdown via marked + DOMPurify, save
    PATCHes `/api/pages/:id`, backlinks card
  - Reference detail: media block (image / YouTube embed / link),
    summary, metadata table, tag attach/detach, linked-from list
  - Resource detail: status header, dependencies + source docs +
    runbook pages columns, change history
  - Inbox: pending changes grouped by agent, approve → navigate to the
    resulting entity
- Test coverage: 185 tests across 43 files (113 new for Plan 2 routes +
search + GET / shell smoke).
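
The capability tiering listed above (owner allow, agent `write+scope` allow, agent `suggest` to 202 plus a pending row, otherwise 403) might reduce to a decision function like this. The actor shape (`kind`, `capability`, `scopes`) and the function name are hypothetical; only the outcomes and the 202/403 status codes come from this changelog.

```javascript
// Decide what a mutating request should do for a given actor and scope.
function decide(actor, scope) {
  if (actor.kind === 'owner') {
    return { outcome: 'allow' };
  }
  if (actor.kind === 'agent') {
    const scopes = actor.scopes || [];
    if (actor.capability === 'write' && scopes.includes(scope)) {
      return { outcome: 'allow' };
    }
    if (actor.capability === 'suggest') {
      // The route would write a pending_changes row and answer 202.
      return { outcome: 'pending', status: 202 };
    }
  }
  return { outcome: 'deny', status: 403 };
}
```

In this reading, a `write` agent acting outside its scope is denied rather than downgraded to a suggestion, which follows the "else 403" wording literally.
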
### Security follow-ups (deferred)
- Polymorphic IDOR risk on entity_links / entity_tags / attachments —
acceptable today since the entire API is owner-token gated and there
is one tenant; see `docs/security-followups.md` for the tighten-now
vs defer decision.
- `pending_changes.action` CHECK constraint blocks `'upsert'` /
`'add_dependency'` / `'remove_dependency'` actions emitted by some
routes' `divertToPending` paths. Latent — only fires when an agent at
suggest tier hits those specific endpoints. Mitigation options
documented in `docs/security-followups.md`.
## [Unreleased]
### Added
- Initial repo scaffolding
### Added (Plan 1: Foundation)
- LXC provisioning for `void2-db` (Postgres 16 + pgvector) and `void2-app`
- Schema migrations 001-006 covering core, knowledge, resources, agents, cross-cutting, audit
- Repos with capability-checked `actor` parameter and audit trail
- Real audit log with redaction of sensitive keys (token, password, api_key, etc.)
- `pending_changes` table for agent suggestions awaiting owner approval
- Capability check module (allow / suggest / deny) for user vs agent actors
- Owner-token bearer auth
- Express server with `/health` and smoke `/api/spaces`
- Test coverage: 72 tests across migrations, repos, capability, owner middleware, server
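
Audit-log redaction of sensitive keys could be sketched as a recursive copy. The key list below contains only the three names given above; the substring matching, the `[REDACTED]` placeholder, and the function names are assumptions for this example.

```javascript
// Key fragments treated as sensitive (from the list in this changelog).
const SENSITIVE_KEYS = ['token', 'password', 'api_key'];

// Assumption: a key is sensitive if it contains any fragment, ignoring case.
function isSensitive(key) {
  const lower = key.toLowerCase();
  return SENSITIVE_KEYS.some((name) => lower.includes(name));
}

// Return a deep copy with sensitive values replaced by a placeholder.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = isSensitive(key) ? '[REDACTED]' : redact(inner);
    }
    return out;
  }
  return value;
}
```
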