root 1f0e9a5f1b feat(workers): extract.pdf with Tesseract fallback
pdftotext first; falls back to per-page pdftoppm rasterization +
Tesseract OCR when the extracted text is < 200 chars. Updates
refs.body_text + metadata.extract.{method,chars} via the repo shim;
audit entry emitted with actor_kind='worker'.

born_digital.pdf fixture padded so pdftotext yields > 200 chars and
the test exercises the pdftotext path, not the OCR fallback.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 04:59:53 +10:00
2026-06-01 04:39:55 +10:00
2026-05-31 01:22:10 +10:00
2026-05-31 01:22:10 +10:00
2026-05-31 01:22:10 +10:00

Void 2.0

Homelab orchestrator + canonical knowledge store. Cradle-themed. Successor to Void 1.x (CT 301). Spec at /project/docs/superpowers/specs/2026-05-31-void-v2-design.md.

Layout

  • void-server (this repo) — Node API, MCP, UI, cron, agent runtime
  • void-workers — Python ingest workers (separate repo, later plan)

Quick start (dev)

  1. Provision void2-db LXC (see deploy/README.md)
  2. Install Postgres + pgvector on void2-db
  3. npm install
  4. cp .env.example .env and edit
  5. npm run migrate
  6. npm start
  7. curl -H "Authorization: Bearer $OWNER_TOKEN" http://localhost:3000/health
Description
No description provided
Readme 2.6 MiB
Languages
JavaScript 87.5%
CSS 6.4%
Python 5.2%
Shell 0.7%
HTML 0.2%