Changelog
All notable changes to Void 2.0 are documented here. Format: Keep a Changelog.
[2.0.0-alpha.4] — 2026-06-01
Added (Plan 4: Python void-workers)
- `void-workers.service`: Python 3.13 service alongside `void-server` on CT 311. psycopg-based pg-boss client matches Node's claim/finish semantics via `SELECT ... FOR UPDATE SKIP LOCKED`. Forces `client_encoding=UTF8` on every connection (void2-db cluster is SQL_ASCII).
- `extract.pdf`: `pdftotext -layout` first; per-page `pdftoppm` rasterization + Tesseract OCR fallback when extraction yields < 200 chars.
- `extract.image`: Tesseract OCR (English) for images stored in the blob store.
- `ingest.video`: `yt-dlp` metadata + audio extract + faster-whisper (`small.en` default). CUDA at startup; CPU fallback when an HA failover to Z3 (no GPU) happens. URLs validated as http(s) and a `--` separator passed to yt-dlp to defeat argv smuggling.
- `sync.source_doc`: fetches `upstream_url` via Python `safe_fetch` (port of the Node helper) + sha256-diffs against the prior body_sha in metadata; updates body_text only when content changed.
- Node `blob.js` fans out to `extract.pdf` / `extract.image` after creating PDF / image refs.
- Node `capture.js` routes `youtube.com` / `youtu.be` / `vimeo.com` URLs to `ingest.video` instead of `ingest.url`.
- Daily cron (`lib/cron/sync_source_docs.js`) enqueues `sync.source_doc` jobs at 03:00 local for every `source_docs` row with `sync_source='url'`.
- CT 311 infrastructure: resized to 6 cores / 8 GB RAM, NVIDIA RTX A2000 device-nodes passed through (shared with CT 102's Ollama).
- `deploy/push-workers.sh` + `deploy/void-workers.service`: push the workers package, chown to `voidworkers`, recreate the venv, install deps under `su voidworkers -c`, restart the unit.
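
The argv-smuggling defence noted for `ingest.video` can be illustrated with a minimal Python sketch. `build_ytdlp_argv` and `ALLOWED_SCHEMES` are hypothetical names, not taken from the void-workers source, and the `--dump-json` / `--no-playlist` flags are an assumed example invocation; only the http(s) check and the `--` separator come from the entry above.

```python
from urllib.parse import urlsplit

# Assumption: only these schemes are accepted, per "URLs validated as http(s)".
ALLOWED_SCHEMES = {"http", "https"}


def build_ytdlp_argv(url: str) -> list[str]:
    """Return an argv list for a yt-dlp metadata probe of `url`.

    Raises ValueError for anything that is not an http(s) URL with a host.
    """
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        raise ValueError(f"refusing non-http(s) URL: {url!r}")
    # Everything after "--" is positional, so a value that starts with "-"
    # can never be parsed as a yt-dlp option.
    return ["yt-dlp", "--dump-json", "--no-playlist", "--", url]


if __name__ == "__main__":
    print(build_ytdlp_argv("https://youtu.be/abc123"))
```

Passing the resulting list to `subprocess.run` without `shell=True` keeps the URL a single argv element, so the two checks complement each other.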
[2.0.0-alpha.3] — 2026-06-01
Added (Plan 3: Capture pipeline + hybrid search)
- pg-boss job queue embedded in void-server (Node). Queue tables live alongside Void's in the shared void2-db. Tests manage their own boss lifecycle via `stopBoss()` / `waitForJob()` helpers.
- `/api/jobs` (owner-only): list / get / retry / delete with state and name filters. Minimal `#/jobs` SPA view fronts it, polling every 10 s.
- `/api/capture` POST: URL → `ingest.url` job. Idempotent by `sha256(space_id + url)` stored as `refs.external_id`; duplicate POST returns the existing `ref_id`.
- `/api/capture/upload`: multipart file → `ingest.blob` job → content-addressed `/var/lib/void/blobs/<sha-prefix>/<sha>` → `refs` row. Drag-drop in the SPA wired to the main panel; `space_id` pre-filled from the last-viewed space.
- `ingest.url` worker: `@mozilla/readability` + `jsdom` extract; fetch protected by `lib/ingest/safe_fetch.js` (SSRF mitigations: http(s) only; DNS-resolved hostnames checked against loopback / RFC1918 / link-local / CGNAT / metadata; resolved IP pinned via an undici dispatcher to defeat DNS rebinding; redirects re-validated).
- `ingest.blob` worker: content-addressed storage, image/pdf/file kind classification.
- `embed.text` worker: Ollama `nomic-embed-text` (768 dims) padded to `vector(1024)`; emits a `worker`-actor audit log entry.
- Repo-level triggers: pages/refs/source_docs `create` and `update` enqueue an `embed.text` job with a singleton key so rapid edits coalesce. No-op when the queue is not running (tests).
- Hybrid `/api/search`: FTS + pgvector ANN unioned with reciprocal rank fusion (k=60). Vector branch silently skipped when Ollama times out, leaving FTS-only results (graceful degrade).
- `/api/ingest/karakeep`: HMAC-verified webhook. Enqueues `ingest.karakeep` for `bookmark.created`; worker fetches the bookmark via Karakeep's API, normalizes to a `refs` row tagged `source_kind='karakeep'`.
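
For readers unfamiliar with reciprocal rank fusion, the hybrid search merge can be sketched as follows. This is an illustration only: it assumes 1-based ranks and the usual RRF score of `1 / (k + rank)` summed over the lists a document appears in. The function name and sample IDs are hypothetical, and the real endpoint may well do the fusion in SQL.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists; each list contributes 1 / (k + rank) per ID."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    fts_hits = ["a", "b", "c"]      # full-text branch, best match first
    vector_hits = ["b", "d", "a"]   # ANN branch, best match first
    print([doc for doc, _ in reciprocal_rank_fusion([fts_hits, vector_hits])])
    # Degraded mode: the vector branch timed out, so only FTS is fused.
    print([doc for doc, _ in reciprocal_rank_fusion([fts_hits])])
```

With a single input list the fused order equals the FTS order, which is why skipping the vector branch degrades cleanly.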
Deferred (Plan 4+)
- Python `void-workers` service for Whisper / Tesseract OCR / yt-dlp (heavy ML).
- AI Space/Project suggestion on capture.
- Embedding chunks table (whole-doc embedding only in Plan 3).
- pdftotext for born-digital PDFs.
- `pg LISTEN/NOTIFY` real-time Jobs UI.
[2.0.0-alpha.2] — 2026-06-01
Added (Plan 2: API surface + UI shell)
- REST routes for the full entity tree:
  - `/api/spaces`, `/api/projects`, `/api/tasks` (with project + space scoping)
  - `/api/pages` + page revisions + `/api/pages/:id/backlinks`
  - `/api/refs` + `/api/refs/upsert`
  - `/api/resources` + dependencies + change history
  - `/api/resources/:id/source-docs` + `/api/source-docs/:id/resync` (gated by `ENABLE_RESYNC`)
  - `/api/agents` (owner-only) + agent token mint/revoke
  - `/api/conversations` + nested `/messages`
  - `/api/tags` + entity-scoped attach/detach via `/api/:entity_type/:entity_id/tags`
  - `/api/links` (POST/GET from|to/DELETE) for polymorphic entity links
  - `/api/pending-changes` + approve/reject with dispatch table covering page/project/task/ref/resource/source_doc × create/update/delete
  - `/api/audit/entity/:type/:id` + `/api/audit/actor`
  - `/api/search` unified FTS across pages, refs, source docs, messages
- Agent bearer auth middleware + capability tiering: owner allow, agent `write+scope` → allow, agent `suggest` → 202 + pending row, else 403.
- Approve and reject emit explicit `approve` / `reject` entries in the audit log with the original agent id preserved in the diff.
- Static SPA shell served from `public/`:
  - Three-column Cradle aesthetic (blackflame palette, Cinzel display headings, Cormorant Garamond body)
  - Hash-based router with views for home / space / project / page / reference / resource / search / inbox / sacred valley
  - `dom.js` safe builders: no `innerHTML` on API data anywhere; the explicit `html:` opt-in is used only by the markdown editor's preview pane, which sanitizes with DOMPurify
  - Sidebar Spaces tree with lazy project expansion, bottom Navigate section, pending-count badge shared with the topbar bell via a tiny `state.js` event bus
  - Topbar: brand, capture modal stub, global search (Enter → `#/search?q=`), pending bell, owner toggle
  - Page editor: split-pane markdown via marked + DOMPurify, save PATCHes `/api/pages/:id`, backlinks card
  - Reference detail: media block (image / YouTube embed / link), summary, metadata table, tag attach/detach, linked-from list
  - Resource detail: status header, dependencies + source docs + runbook pages columns, change history
  - Inbox: pending changes grouped by agent, approve → navigate to the resulting entity
- Test coverage: 185 tests across 43 files (113 new for Plan 2 routes + search + GET / shell smoke).
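
The capability tiering rule for agents can be read as a small decision function. The sketch below is an assumption-laden illustration, not the middleware itself: `Actor`, `decide`, and the scope names are hypothetical, and "write+scope" is interpreted as a write-tier agent holding the scope the route requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Actor:
    kind: str                          # "owner" or "agent"
    tier: str = ""                     # agents only: "write" or "suggest"
    scopes: frozenset = frozenset()    # scopes granted to a write-tier agent


def decide(actor: Actor, required_scope: str) -> tuple[str, Optional[int]]:
    """Return (decision, forced HTTP status); status is None when allowed."""
    if actor.kind == "owner":
        return ("allow", None)
    if actor.kind == "agent":
        if actor.tier == "write" and required_scope in actor.scopes:
            return ("allow", None)
        if actor.tier == "suggest":
            # The change is diverted to a pending row for owner approval.
            return ("pending", 202)
    return ("deny", 403)


if __name__ == "__main__":
    print(decide(Actor("agent", "suggest"), "pages"))
    print(decide(Actor("agent", "write", frozenset({"refs"})), "pages"))
```

Keeping the decision separate from the HTTP layer lets the same check be reused by repos that take an `actor` parameter.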
Security follow-ups (deferred)
- Polymorphic IDOR risk on entity_links / entity_tags / attachments: acceptable today since the entire API is owner-token gated and there is one tenant; see `docs/security-followups.md` for the tighten-now vs defer decision.
- `pending_changes.action` CHECK constraint blocks `'upsert'` / `'add_dependency'` / `'remove_dependency'` actions emitted by some routes' `divertToPending` paths. Latent: only fires when an agent at suggest tier hits those specific endpoints. Mitigation options documented in `docs/security-followups.md`.
[Unreleased]
Added
- Initial repo scaffolding
Added (Plan 1: Foundation)
- LXC provisioning for `void2-db` (Postgres 16 + pgvector) and `void2-app`
- Schema migrations 001-006 covering core, knowledge, resources, agents, cross-cutting, audit
- Repos with capability-checked `actor` parameter and audit trail
- Real audit log with redaction of sensitive keys (token, password, api_key, etc.)
- `pending_changes` table for agent suggestions awaiting owner approval
- Capability check module (allow / suggest / deny) for user vs agent actors
- Owner-token bearer auth
- Express server with `/health` and smoke `/api/spaces`
- Test coverage: 72 tests across migrations, repos, capability, owner middleware, server
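
Audit-log redaction of sensitive keys might look roughly like the Python sketch below. It is a guess at the shape, not the shipped code: the key set contains only the three keys named above (the actual list is longer, per the "etc."), and the `[REDACTED]` placeholder, case-insensitive exact matching, and the `redact` name are all assumptions.

```python
# Assumption: only the three keys named in the changelog entry are listed here.
SENSITIVE_KEYS = {"token", "password", "api_key"}
REDACTED = "[REDACTED]"  # hypothetical placeholder value


def redact(value):
    """Return a copy of `value` with sensitive dict keys masked at any depth."""
    if isinstance(value, dict):
        return {
            key: REDACTED if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # scalars pass through unchanged


if __name__ == "__main__":
    diff = {"name": "gateway", "token": "abc", "nested": {"Password": "hunter2"}}
    print(redact(diff))
```

Because the function builds new containers instead of mutating its input, the original diff stays usable by the caller after the audit entry is written.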