# Plan 4 — Complete **Date:** 2026-06-01 **Version:** 2.0.0-alpha.4 **Tests:** 17 Python + 247 Node + 1 gated-skipped (full suite green when run cleanly) **Snapshots:** `plan4_pre_resize_`, `plan4_phase_c_`, `plan4_complete_` on CT 310 + 311. ## Scope delivered ### Phase A — Python harness - `workers/` package with pyproject.toml (Python ≥3.12; CT 311 runs 3.13). - `boss.py` — `SELECT ... FOR UPDATE SKIP LOCKED LIMIT 1` claim, atomic complete/fail, retry semantics matching pg-boss v10 (`retry_count`, `retry_delay`, `retry_backoff`). Forces `client_encoding=UTF8` because void2-db is SQL_ASCII. - `runner.py` — `ThreadPoolExecutor` per registered handler, signal handling, `once=True` mode for tests. - `echo` handler proved the harness end-to-end (Node enqueue → Python claim → output back). - `deploy/void-workers.service` (systemd, `MemoryMax=6G`, runs as `voidworkers`). - `deploy/push-workers.sh` — rsync, chown to `voidworkers`, venv create + `pip install -e ".[all]"` under `su voidworkers -c`, restart unit. Excludes `.env`, `.gitignore`, `.pytest_cache`, `tests/` so deploys are idempotent. ### Phase B — PDF + image OCR - `lib/jobs/workers/blob.js` (Node) — after creating a PDF/image ref, enqueues `extract.pdf` or `extract.image` with `{ref_id, blob_path}`. - `extract.pdf` — `pdftotext -layout` first; per-page `pdftoppm` rasterize + Tesseract OCR fallback when extraction < 200 chars. - `extract.image` — Tesseract OCR via `pytesseract.image_to_string` with English data. - `repo.update_ref` — UPDATE refs + emit `audit_log` row with `actor_kind='worker'`. ### Phase C — Whisper + yt-dlp + GPU - **CT 311 resized** from 4 cores / 4 GB to **6 cores / 8 GB**. - **GPU passthrough** — `/dev/nvidia0`, `/dev/nvidiactl`, `/dev/nvidia-uvm`, `/dev/nvidia-uvm-tools`, `/dev/nvidia-caps/nvidia-cap1` bind-mounted into CT 311 (shared with CT 102's Ollama). - `model.py` — faster-whisper loader. `cuda_available()` probes `ctranslate2.get_cuda_device_count()`; uses CUDA + `float16` when present, CPU + `int8` otherwise. Model cache at `/var/lib/void/whisper-models`. - `ingest.video` — `yt-dlp -J` for metadata + `yt-dlp -x --audio-format opus` for audio. faster-whisper transcribes; audio file deleted. Creates a `refs` row (`kind='video'`, `source_kind='youtube'` or `'video'`) idempotent on `sha256(space_id + url)`. - `lib/api/routes/capture.js` (Node) — detects `youtube.com / youtu.be / vimeo.com` URLs and enqueues `ingest.video` instead of `ingest.url`. ### Phase D — Source-doc sync + alpha-4 - `safe_fetch.py` — Python port of `lib/ingest/safe_fetch.js` (scheme check, IP-range blocklist, redirect re-validation, `VOID_INGEST_ALLOW_PRIVATE` gate). - `sync.source_doc` — `safe_fetch` upstream + sha256 diff against prior `body_sha` in metadata; updates `body_text` only on change. - `lib/cron/sync_source_docs.js` + `lib/cron/index.js` (Node) — `node-cron` schedules `runSync` at 03:00 local time, enqueueing `sync.source_doc` for every row with `sync_source='url'`. - Version bumped to `2.0.0-alpha.4` in `package.json`, `server.js`, and the `/health` test assertion. CHANGELOG appended. ## Security findings handled inline | Finding | Source | Resolution | |---|---|---| | yt-dlp argv flag smuggling in `video.py` | reviewer | `_validate_url` checks scheme is http(s); `--` passed before positional URL to stop flag parsing. | ## UI smoke Plan 4 ships no SPA changes. The existing Plan 3 Jobs view shows extract.pdf / extract.image / ingest.video jobs alongside Node-side ones — both sides write to the same `pgboss.job` rows. ## Open items for the user - **alpha-4 deploy.** Standing rule per Plans 2/3: won't deploy without your explicit OK. alpha-3 stays live until then. - **`WHISPER_MODEL`** default is `small.en`. Bump to `medium.en` once you've stress-tested transcription quality. - **yt-dlp cookies** for age-gated content — add `YT_DLP_COOKIES_FILE` env when wanted (small handler tweak). - **Tesseract languages** beyond English — install via `tesseract-ocr-` packages on CT 311 and pass `lang="..."` to `image_to_string`. ## What's left after Plan 4 - **Plan 5** — Companion chat in right rail. - **Plan 6** — Sacred Valley widgets ported from Void 1.x.