Void-Homelab/docs/plan-3-complete.md
root 837bf2a5b4 docs: Plan 3 completion summary
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 04:01:12 +10:00

Plan 3 — Complete

  • Date: 2026-06-01
  • Version: 2.0.0-alpha.3
  • Tests: 241 passing + 1 gated-skipped across 60 files
  • Commits on main: ~30 covering Plan 3 (fa47419 … head)
  • Snapshots: plan3_pre_phaseA_20260601_0322, plan3_phase_c_20260601_0353, plan3_complete_20260601_0400 on CT 310 + 311.

Scope delivered

Phase A — Queue harness + Jobs API

  • lib/jobs/queue.js singleton over pg-boss v10 with per-name createQueue dedup (deadlock fix).
  • lib/jobs/index.js worker registry; trivial echo worker proved the harness end-to-end.
  • lib/db/repos/jobs.js unifies pgboss.job (partitioned per queue) and pgboss.archive for list/get views; provides retry() and remove().
  • /api/jobs (owner-only) — list, get, retry, delete.
  • #/jobs SPA view stub + sidebar entry.
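
The per-name createQueue dedup mentioned above could look roughly like the following. This is a minimal sketch, not the code in lib/jobs/queue.js: `ensureQueue` and the stub `boss` object are hypothetical, and the only assumption about pg-boss is that `createQueue(name)` returns a promise.

```javascript
// Sketch: share one in-flight createQueue() promise per queue name.
// `boss` is any object exposing createQueue(name) that returns a promise;
// `ensureQueue` is a hypothetical helper name, not taken from the repo.
const pending = new Map();

function ensureQueue(boss, name) {
  let p = pending.get(name);
  if (!p) {
    // Store the promise before anything awaits it, so concurrent callers
    // for the same name reuse it instead of issuing a second createQueue().
    p = Promise.resolve(boss.createQueue(name)).catch((err) => {
      pending.delete(name); // allow a later retry after a failure
      throw err;
    });
    pending.set(name, p);
  }
  return p;
}

// Usage with a stub standing in for a started pg-boss instance.
const calls = [];
const stubBoss = { createQueue: async (name) => { calls.push(name); } };
const first = ensureQueue(stubBoss, 'ingest.url');
const second = ensureQueue(stubBoss, 'ingest.url');
ensureQueue(stubBoss, 'embed.text');
```

Here the two calls for `ingest.url` share one promise, so the stub's `createQueue` runs once per distinct queue name.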

Phase B — Capture API + URL/blob workers

  • lib/ingest/readability.js — @mozilla/readability + jsdom wrapper.
  • lib/ingest/blob_store.js — sha256 content-addressed write, sharded <sha-prefix>/<sha>.
  • lib/ingest/safe_fetch.js — SSRF mitigations: http/https only; DNS resolution; loopback/RFC1918/link-local/CGNAT/metadata blocked; pinned-DNS undici dispatcher to defeat TOCTOU rebinding; per-hop re-validation on redirects.
  • lib/jobs/workers/url.js — fetch + readability extract → refs row; idempotent by sha256(space_id + url) stored as refs.external_id.
  • lib/jobs/workers/blob.js — content-addressed storage + image/pdf/file classification.
  • POST /api/capture + POST /api/capture/upload.

Phase C — Embeddings + hybrid search

  • lib/ai/ollama.js — thin embedText() wrapper with padTo() helper.
  • lib/jobs/workers/embed.js — embeds pages/refs/source_docs/conversations, pads 768 → 1024, writes embedding, emits worker-actor audit.
  • lib/jobs/triggers.js + repo additions — pages/refs/source_docs create/update fire triggerEmbed with a singleton key.
  • lib/db/repos/search.js rewritten to hybrid (FTS + pgvector ANN + RRF k=60) with graceful FTS-only fallback when Ollama is unreachable.
  • tests/integration/embed_live.test.js — gated end-to-end test (skip if Ollama down).
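
As an illustration of the RRF step in the hybrid search, here is a small sketch of reciprocal rank fusion with k=60. It is written in plain JavaScript for readability; the actual lib/db/repos/search.js may well do the fusion in SQL. The function name `rrfFuse` and the sample ids are hypothetical.

```javascript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank),
// with rank starting at 1. Inputs are id arrays already sorted best-first.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      scores.set(id, (scores.get(id) || 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical result ids from the FTS query and the vector (ANN) query.
const ftsIds = ['p1', 'p2', 'p3'];
const annIds = ['p3', 'p1', 'p4'];
const fused = rrfFuse([ftsIds, annIds]);

// FTS-only fallback: fusing a single list preserves its original order.
const ftsOnly = rrfFuse([ftsIds]);
```

Items ranked well by both lists rise to the top, and passing only the FTS list mirrors the fallback used when Ollama is unreachable.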

Phase D — Karakeep webhook + drag-drop UI + Jobs UI

  • lib/karakeep/client.js — thin bearer-token bookmark fetch.
  • lib/jobs/workers/karakeep.js — fetches the bookmark, normalizes to a refs row tagged source_kind='karakeep', idempotent by sha256(space_id + 'karakeep:' + bookmark_id).
  • POST /api/ingest/karakeep — HMAC-verified webhook; bypasses agentOrOwner. Raw body captured via express.json({ verify }).
  • public/components/dropzone.js + wiring on #main.
  • Full Jobs panel with state-grouped rows + retry/delete buttons + 10 s polling.
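
The idempotency keys described for the URL and Karakeep workers can be sketched as below. The helper names are hypothetical, and hex encoding of the digest is an assumption; the source states only the hashed inputs.

```javascript
const { createHash } = require('node:crypto');

// Hex-encoded SHA-256 of a UTF-8 string (hex output is an assumption).
function sha256Hex(input) {
  return createHash('sha256').update(input, 'utf8').digest('hex');
}

// URL worker key: sha256(space_id + url), stored as refs.external_id.
function urlExternalId(spaceId, url) {
  return sha256Hex(String(spaceId) + url);
}

// Karakeep worker key: sha256(space_id + 'karakeep:' + bookmark_id).
function karakeepExternalId(spaceId, bookmarkId) {
  return sha256Hex(String(spaceId) + 'karakeep:' + bookmarkId);
}

// Hypothetical ids, for illustration only.
const keyA = urlExternalId('space-1', 'https://example.com/a');
const keyB = urlExternalId('space-1', 'https://example.com/a');
const keyC = karakeepExternalId('space-1', 'bm-42');
```

Re-capturing the same URL in the same Space yields the same key, which is what lets a worker detect an existing refs row instead of creating a duplicate.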

Security findings handled

Finding | Source | Resolution
SSRF on ingest.url worker (fetch arbitrary URLs) | reviewer | lib/ingest/safe_fetch.js with IP-range blocklist + per-hop re-validation
DNS-rebinding TOCTOU in safe_fetch | reviewer | Pinned undici dispatcher whose lookup() returns the validated IP

The review raised no other findings against Plan 3, so nothing was deferred.
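
The IP-range blocklist part of the SSRF mitigation can be sketched with Node's built-in `net.BlockList`. This is an illustration under assumptions, not the contents of lib/ingest/safe_fetch.js: the exact ranges, the IPv6 entries, and the helper names are guesses, and the pinned-DNS dispatcher and per-hop redirect re-validation are not shown.

```javascript
const net = require('node:net');

// Ranges named in the finding; the exact list in the repo may differ.
const blocked = new net.BlockList();
blocked.addSubnet('127.0.0.0', 8, 'ipv4');    // loopback
blocked.addSubnet('10.0.0.0', 8, 'ipv4');     // RFC 1918
blocked.addSubnet('172.16.0.0', 12, 'ipv4');  // RFC 1918
blocked.addSubnet('192.168.0.0', 16, 'ipv4'); // RFC 1918
blocked.addSubnet('169.254.0.0', 16, 'ipv4'); // link-local, includes 169.254.169.254 metadata
blocked.addSubnet('100.64.0.0', 10, 'ipv4');  // CGNAT (RFC 6598)
blocked.addSubnet('::1', 128, 'ipv6');        // IPv6 loopback (assumed entry)
blocked.addSubnet('fe80::', 10, 'ipv6');      // IPv6 link-local (assumed entry)

// True when the resolved address must not be fetched.
function isBlockedAddress(ip) {
  const family = net.isIP(ip); // 4, 6, or 0 when not an IP literal
  if (family === 0) return true; // refuse anything that is not a resolved IP
  return blocked.check(ip, family === 4 ? 'ipv4' : 'ipv6');
}

// Only http and https URLs are allowed through.
function hasAllowedScheme(urlString) {
  const { protocol } = new URL(urlString);
  return protocol === 'http:' || protocol === 'https:';
}
```

A check like this has to run on the address the connection will actually use, which is why the second finding pins the validated IP in the dispatcher's `lookup()`.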

UI smoke test (manual walkthrough)

  • Drop a file onto the SPA's main panel after navigating to a Space → upload posts; Jobs view shows the new ingest.blob job, then embed.text arrives.
  • POST /api/capture from curl → response carries job_id and idempotency_key; the SPA's Jobs view picks it up.
  • Karakeep webhook (X-Karakeep-Signature valid) → 202 + job_id. Bad signature → 401.
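
The 202/401 behaviour of the webhook can be sketched as a signature check over the raw request body. The algorithm (HMAC-SHA256) and the hex encoding of X-Karakeep-Signature are assumptions, as are the function names; the source only says the webhook is HMAC-verified against KARAKEEP_WEBHOOK_SECRET.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Assumed scheme: hex-encoded HMAC-SHA256 of the raw request body.
function signBody(secret, rawBody) {
  return createHmac('sha256', secret).update(rawBody).digest('hex');
}

function isValidSignature(secret, rawBody, headerValue) {
  if (typeof headerValue !== 'string') return false;
  const expected = Buffer.from(signBody(secret, rawBody), 'hex');
  const received = Buffer.from(headerValue, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(expected, received);
}

// Status the route would answer with: 202 when accepted, 401 otherwise.
function webhookStatus(secret, rawBody, headerValue) {
  return isValidSignature(secret, rawBody, headerValue) ? 202 : 401;
}

// Hypothetical secret and payload, for illustration only.
const secret = 'test-secret';
const body = Buffer.from('{"bookmark_id":"abc"}');
const goodSignature = signBody(secret, body);
```

The check needs the exact bytes that were sent, which is why the route captures the raw body, for example with `express.json({ verify: (req, res, buf) => { req.rawBody = buf; } })`, where `rawBody` is a hypothetical property name.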

Open items for the user

  • KARAKEEP_API_TOKEN + KARAKEEP_DEFAULT_SPACE_ID — needed in /opt/void-server/.env on CT 311 before Karakeep webhooks do anything useful. KARAKEEP_WEBHOOK_SECRET likewise must match Karakeep's webhook config.
  • BLOB_ROOT=/var/lib/void/blobs on CT 311 — create with mkdir -p /var/lib/void/blobs && chown void:void /var/lib/void/blobs && chmod 750 /var/lib/void/blobs. Add to the deploy README's bootstrap.
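  • UPLOAD_TMP=/var/lib/void/uploads-tmp on CT 311 — likewise: create it with the same ownership and mode, and add it to the deploy README's bootstrap.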
  • alpha-3 is not yet deployed to CT 311; alpha-2 is still serving. deploy/push.sh works as-is once the env additions are in place.
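
For orientation, the layout to expect under BLOB_ROOT follows from the sharded <sha-prefix>/<sha> scheme used by the blob store. The sketch below assumes a two-character hex prefix and a hex-encoded digest; neither is stated in this document, and `blobPath` is a hypothetical helper.

```javascript
const { createHash } = require('node:crypto');
const path = require('node:path');

// Compute where a blob would land: <blobRoot>/<sha-prefix>/<sha>.
// prefixLength = 2 is an assumption; the real shard width may differ.
function blobPath(blobRoot, contents, prefixLength = 2) {
  const sha = createHash('sha256').update(contents).digest('hex');
  // path.posix keeps the result stable regardless of the host OS.
  const file = path.posix.join(blobRoot, sha.slice(0, prefixLength), sha);
  return { sha, file };
}

const example = blobPath('/var/lib/void/blobs', Buffer.from('hello'));
```

Because the path is derived from the content hash, uploading identical bytes twice resolves to the same file.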

What's left after Plan 3

  • Plan 4 — heavy ingest (Whisper transcription, Tesseract OCR, yt-dlp, pdftotext) via the Python void-workers service.
  • Plan 5 — companion chat in the right rail.
  • Plan 6 — Sacred Valley widgets ported from Void 1.x.