Plan 4 — Complete
Date: 2026-06-01
Version: 2.0.0-alpha.4
Tests: 17 Python + 247 Node + 1 gated-skipped (full suite green when run cleanly)
Snapshots: plan4_pre_resize_<ts>, plan4_phase_c_<ts>, plan4_complete_<ts> on CT 310 + 311.
Scope delivered
Phase A — Python harness
- `workers/` package with `pyproject.toml` (Python ≥3.12; CT 311 runs 3.13).
- `boss.py`: `SELECT ... FOR UPDATE SKIP LOCKED LIMIT 1` claim, atomic complete/fail, retry semantics matching pg-boss v10 (`retry_count`, `retry_delay`, `retry_backoff`). Forces `client_encoding=UTF8` because void2-db is SQL_ASCII.
- `runner.py`: `ThreadPoolExecutor` per registered handler, signal handling, `once=True` mode for tests.
- `echo` handler proved the harness end-to-end (Node enqueue → Python claim → output back).
- `deploy/void-workers.service` (systemd, `MemoryMax=6G`, runs as `voidworkers`).
- `deploy/push-workers.sh`: rsync, chown to `voidworkers`, venv create + `pip install -e ".[all]"` under `su voidworkers -c`, restart unit. Excludes `.env`, `.gitignore`, `.pytest_cache`, `tests/` so deploys are idempotent.
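A minimal sketch of the runner pattern described above: one `ThreadPoolExecutor` per registered handler, a stop flag set from signal handlers, and a `once` mode that makes a single pass. The `claim`, `complete`, and `fail` callables are hypothetical stand-ins for the `boss.py` calls, and the job shape (`id`, `data`) is an assumption, not the actual `runner.py` API.

```python
import signal
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional

Job = dict[str, Any]  # assumed shape: {"id": ..., "data": ...}


class Runner:
    """Polls a queue and dispatches jobs to per-handler thread pools."""

    def __init__(
        self,
        claim: Callable[[str], Optional[Job]],
        complete: Callable[[str, Any], None],
        fail: Callable[[str, str], None],
    ) -> None:
        # Hypothetical stand-ins for the queue calls in boss.py.
        self._claim, self._complete, self._fail = claim, complete, fail
        self._handlers: dict[str, Callable[[Any], Any]] = {}
        self._pools: dict[str, ThreadPoolExecutor] = {}
        self._stop = threading.Event()

    def register(self, name: str, handler: Callable[[Any], Any], workers: int = 2) -> None:
        self._handlers[name] = handler
        self._pools[name] = ThreadPoolExecutor(max_workers=workers, thread_name_prefix=name)

    def stop(self) -> None:
        self._stop.set()

    def _run_job(self, name: str, job: Job) -> None:
        try:
            output = self._handlers[name](job["data"])
        except Exception as exc:  # a failing handler marks the job failed, nothing more
            self._fail(job["id"], repr(exc))
        else:
            self._complete(job["id"], output)

    def run(self, once: bool = False, poll_seconds: float = 1.0) -> None:
        try:
            while not self._stop.is_set():
                futures = []
                for name, pool in self._pools.items():
                    job = self._claim(name)
                    if job is not None:
                        futures.append(pool.submit(self._run_job, name, job))
                for future in futures:
                    future.result()  # wait for this pass to finish
                if once:
                    break  # test mode: a single pass, then return
                if not futures:
                    self._stop.wait(poll_seconds)  # idle back-off, wakes early on stop()
        finally:
            for pool in self._pools.values():
                pool.shutdown(wait=True)


def install_signal_handlers(runner: Runner) -> None:
    """Stop the loop on SIGINT/SIGTERM. Must be called from the main thread."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, lambda signum, frame: runner.stop())
```

In a test, `claim` can pop from an in-memory list and `run(once=True)` returns after the claimed jobs finish, which is roughly what the `echo` handler check needs.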
Phase B — PDF + image OCR
- `lib/jobs/workers/blob.js` (Node): after creating a PDF/image ref, enqueues `extract.pdf` or `extract.image` with `{ref_id, blob_path}`.
- `extract.pdf`: `pdftotext -layout` first; per-page `pdftoppm` rasterize + Tesseract OCR fallback when extraction < 200 chars.
- `extract.image`: Tesseract OCR via `pytesseract.image_to_string` with English data.
- `repo.update_ref`: UPDATE refs + emit `audit_log` row with `actor_kind='worker'`.
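The `extract.pdf` fallback can be sketched as below. This is an illustration under assumptions, not the shipped handler: the function names are hypothetical, the 200-character threshold is applied to the whole document after stripping whitespace, and the 300 DPI rasterization setting is a guess.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable

MIN_TEXT_CHARS = 200  # threshold named in the report


def pdftotext_layout(pdf_path: str) -> str:
    """Extract the text layer with `pdftotext -layout`, writing to stdout ("-")."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def ocr_pages(pdf_path: str) -> str:
    """Rasterize each page with pdftoppm, then OCR the page images with Tesseract."""
    import pytesseract  # third-party; imported lazily so the module loads without it
    from PIL import Image

    with tempfile.TemporaryDirectory() as tmp:
        prefix = str(Path(tmp) / "page")
        # 300 DPI is an assumed setting, not taken from the report.
        subprocess.run(["pdftoppm", "-r", "300", "-png", pdf_path, prefix], check=True)
        texts = []
        for page in sorted(Path(tmp).glob("page-*.png")):
            with Image.open(page) as img:
                texts.append(pytesseract.image_to_string(img, lang="eng"))
        return "\n".join(texts)


def extract_pdf(
    pdf_path: str,
    text_layer: Callable[[str], str] = pdftotext_layout,
    ocr: Callable[[str], str] = ocr_pages,
    min_chars: int = MIN_TEXT_CHARS,
) -> tuple[str, str]:
    """Return (text, method); fall back to OCR when the text layer is too short."""
    text = text_layer(pdf_path)
    if len(text.strip()) >= min_chars:
        return text, "pdftotext"
    return ocr(pdf_path), "ocr"
```

Passing the two extractors as parameters keeps the threshold decision testable without Poppler or Tesseract installed.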
Phase C — Whisper + yt-dlp + GPU
- CT 311 resized from 4 cores / 4 GB to 6 cores / 8 GB.
- GPU passthrough: `/dev/nvidia0`, `/dev/nvidiactl`, `/dev/nvidia-uvm`, `/dev/nvidia-uvm-tools`, `/dev/nvidia-caps/nvidia-cap1` bind-mounted into CT 311 (shared with CT 102's Ollama).
- `model.py`: faster-whisper loader. `cuda_available()` probes `ctranslate2.get_cuda_device_count()`; uses CUDA + `float16` when present, CPU + `int8` otherwise. Model cache at `/var/lib/void/whisper-models`.
- `ingest.video`: `yt-dlp -J` for metadata + `yt-dlp -x --audio-format opus` for audio. faster-whisper transcribes; audio file deleted. Creates a `refs` row (`kind='video'`, `source_kind='youtube'` or `'video'`) idempotent on `sha256(space_id + url)`.
- `lib/api/routes/capture.js` (Node): detects `youtube.com / youtu.be / vimeo.com` URLs and enqueues `ingest.video` instead of `ingest.url`.
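A rough sketch of the device selection and the idempotency key from Phase C. `whisper_settings`, `load_model`, and `video_ref_key` are hypothetical names, and `space_id` is assumed to be a string; only `cuda_available()`, the CUDA/CPU pairing, the cache path, and the `sha256(space_id + url)` key come from the report.

```python
import hashlib


def cuda_available() -> bool:
    """True when CTranslate2 can see at least one CUDA device."""
    try:
        import ctranslate2  # third-party; installed alongside faster-whisper
    except ImportError:
        return False
    return ctranslate2.get_cuda_device_count() > 0


def whisper_settings(has_cuda: bool) -> tuple[str, str]:
    """Pick (device, compute_type): CUDA + float16 when present, CPU + int8 otherwise."""
    return ("cuda", "float16") if has_cuda else ("cpu", "int8")


def load_model(name: str = "small.en", cache_dir: str = "/var/lib/void/whisper-models"):
    """Load a faster-whisper model into the shared cache directory."""
    from faster_whisper import WhisperModel  # imported lazily; heavy dependency

    device, compute_type = whisper_settings(cuda_available())
    return WhisperModel(name, device=device, compute_type=compute_type, download_root=cache_dir)


def video_ref_key(space_id: str, url: str) -> str:
    """Idempotency key for a video ref: sha256 over space_id concatenated with url."""
    return hashlib.sha256((space_id + url).encode("utf-8")).hexdigest()
```

Because the key depends only on the space and the URL, re-capturing the same video in the same space maps to the same `refs` row rather than creating a duplicate.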
Phase D — Source-doc sync + alpha-4
- `safe_fetch.py`: Python port of `lib/ingest/safe_fetch.js` (scheme check, IP-range blocklist, redirect re-validation, `VOID_INGEST_ALLOW_PRIVATE` gate).
- `sync.source_doc`: `safe_fetch` upstream + sha256 diff against prior `body_sha` in metadata; updates `body_text` only on change.
- `lib/cron/sync_source_docs.js` + `lib/cron/index.js` (Node): `node-cron` schedules `runSync` at 03:00 local time, enqueueing `sync.source_doc` for every row with `sync_source='url'`.
- Version bumped to `2.0.0-alpha.4` in `package.json`, `server.js`, and the `/health` test assertion. CHANGELOG appended.
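The scheme check and IP-range blocklist in `safe_fetch.py` might look roughly like this. The function names and the exact set of blocked ranges are assumptions. The `allow_private` parameter stands in for the `VOID_INGEST_ALLOW_PRIVATE` gate, whose accepted values the report does not state. Redirect re-validation is not shown; it would amount to calling `check_url` again on every redirect target before following it.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_blocked_ip(address: str) -> bool:
    """True for loopback, private, link-local, multicast, reserved, or unspecified IPs."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_host(host: str) -> list[str]:
    """Return the host itself if it is an IP literal, otherwise its resolved addresses."""
    try:
        return [str(ipaddress.ip_address(host))]
    except ValueError:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
        return [info[4][0] for info in infos]


def check_url(url: str, allow_private: bool = False) -> str:
    """Validate scheme and destination; raise ValueError when the URL is not allowed."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    if not allow_private:
        # Every resolved address must be public, not just the first one.
        for address in resolve_host(parts.hostname):
            if is_blocked_ip(address):
                raise ValueError(f"blocked address: {address}")
    return url
```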
Security findings handled inline
| Finding | Source | Resolution |
|---|---|---|
| yt-dlp argv flag smuggling in `video.py` | reviewer | `_validate_url` checks scheme is http(s); `--` passed before positional URL to stop flag parsing. |
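The resolution in the table can be sketched as follows: validate the scheme, then place `--` before the URL so yt-dlp treats whatever follows as a positional argument. `validate_url`, `metadata_argv`, and `audio_argv` are hypothetical names (the report only names `_validate_url`), and the `-o` output template is an assumed detail.

```python
from urllib.parse import urlsplit


def validate_url(url: str) -> str:
    """Accept only http(s) URLs with a host; raise ValueError otherwise."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {url!r}")
    return url


def metadata_argv(url: str) -> list[str]:
    """argv for `yt-dlp -J`; the `--` ends option parsing before the URL."""
    return ["yt-dlp", "-J", "--", validate_url(url)]


def audio_argv(url: str, out_template: str) -> list[str]:
    """argv for audio extraction to opus; the URL stays positional after `--`."""
    return [
        "yt-dlp", "-x", "--audio-format", "opus",
        "-o", out_template,
        "--", validate_url(url),
    ]
```

Passing the list to `subprocess.run` without `shell=True` keeps the URL as a single argv element, so neither the shell nor yt-dlp's option parser interprets it.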
UI smoke
Plan 4 ships no SPA changes. The existing Plan 3 Jobs view shows extract.pdf / extract.image / ingest.video jobs alongside Node-side ones — both sides write to the same pgboss.job rows.
Open items for the user
- alpha-4 deploy. Standing rule per Plans 2/3: won't deploy without your explicit OK. alpha-3 stays live until then.
- `WHISPER_MODEL` default is `small.en`. Bump to `medium.en` once you've stress-tested transcription quality.
- yt-dlp cookies for age-gated content: add `YT_DLP_COOKIES_FILE` env when wanted (small handler tweak).
- Tesseract languages beyond English: install via `tesseract-ocr-<lang>` packages on CT 311 and pass `lang="..."` to `image_to_string`.
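For the two environment-driven items above, the handler tweak could be as small as the sketch below. The function names are hypothetical and the cookies change is not implemented yet; `--cookies FILE` is the yt-dlp option that reads a cookies file.

```python
import os
from typing import Mapping


def whisper_model_name(env: Mapping[str, str] = os.environ) -> str:
    """Model name from WHISPER_MODEL, defaulting to small.en."""
    return env.get("WHISPER_MODEL", "small.en")


def cookie_args(env: Mapping[str, str] = os.environ) -> list[str]:
    """Extra yt-dlp arguments when YT_DLP_COOKIES_FILE is set, else none."""
    path = env.get("YT_DLP_COOKIES_FILE")
    return ["--cookies", path] if path else []
```

The returned list would be spliced into the yt-dlp argv ahead of the `--` separator, so the unset case leaves the command unchanged.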
What's left after Plan 4
- Plan 5 — Companion chat in right rail.
- Plan 6 — Sacred Valley widgets ported from Void 1.x.