# Void 2.0 — Homelab Orchestrator & Knowledge Foundation

**Status:** IN PROGRESS — brainstorming, not yet a complete design
**Started:** 2026-05-31
**Owner:** mrhynesy@gmail.com

> This document is being filled in section by section as brainstorming progresses.
> Sections below marked `[locked]` are user-approved decisions. Sections marked
> `[pending]` are the remaining design work to complete before this becomes a
> proper spec ready for the writing-plans skill.

---

## Vision [locked]

Replace the current scattered homelab state (Void dashboard, Karakeep bookmarks, BookStack wiki, `/root/.claude/plans/*.md`, auto-memory entries, ad-hoc browser tab groups) with a single **Void 2.0** — a homelab orchestrator that:

- Acts as the canonical home for projects, tasks, knowledge, and deployed-resource state
- Ingests websites, videos, PDFs, screenshots, and files into a unified library
- Mirrors upstream documentation locally for offline + agent access
- Surfaces all of it to Claude and local AI agents via MCP, with per-agent permission tiers
- Preserves the Void's Cradle-themed aesthetic and agent personas
- Stays available during planned host maintenance via `pct migrate` (no automatic failover)
- Maintains privacy + security with selective remote access

Primary capture pain being solved: **"multiple grouped Chrome tabs as a poor project-management substitute."** Void 2.0 replaces that with proper project management.

---

## Direction & HA Shape [locked]

**Chosen direction:** Foundation-first Void 2.0 (Option 2 from initial framing). Not an evolution of Void — a clean rebuild with Void as the visible UI on top.

**HA model:** Planned-maintenance only. The user triggers the move before host shutdown; Proxmox migrates the LXCs to another node with `pct migrate` in restart mode (containers stop and restart on the target rather than live-migrating, ~10-60s pause). No automatic failover, no quorum, no clustering complexity.

**Infrastructure:**

| LXC | Purpose | Stateful? |
|---|---|---|
| `void2-db` | Postgres + pgvector | Yes — the canonical store |
| `void2-app` | Node API + Python workers + Void UI + cron | No (data in `void2-db`) |

Future-improvements list (parked):

- Build own bookmark capture front-end to replace Karakeep
- Extract MCP server to its own LXC if it grows independently
- True clustering if "instant failover" becomes a need

---

## Entity Map [locked]

| Entity | Lives in | Contains / links to |
|---|---|---|
| **Space** | top-level | Projects, Tasks, Pages, Refs, SourceDocs, Conversations, Resources |
| **Project** | a Space | Tasks (children); has-many Pages, Refs, SourceDocs, Conversations, Resources |
| **Task** | a Space, optionally also a Project | Pages, Refs, Conversations |
| **Page** (authored) | tagged | backlinks, attachments — your notes + AI-assisted commentary |
| **Reference** (captured) | tagged | source URL, local snapshot, metadata — websites/videos/PDFs/files/images |
| **Source Doc** (mirrored upstream) | bound to a Resource | version, last-synced, sync source — official docs from publisher |
| **Conversation** | attaches to Space/Project/Task/Resource | Messages — first-class, multi-agent |
| **Resource** (deployed service, rich) | a Space | dependencies, credentials refs, source docs, runbook pages, change history, monitoring config |

**Relationships are explicit, not implied.** Any entity can attach to any other via typed links (`project_pages`, `task_refs`, `resource_source_docs`, etc.).

---

## Capture Pipeline [locked — day-one inputs]

Day-one capture inputs:

1. **URLs / bookmarks** — Karakeep stays as inbox; webhook flows new bookmarks into Void 2.0 as References (with AI-suggested Project/Space tagging)
2. **YouTube / web videos** — `yt-dlp` for metadata + transcript; local Whisper if no transcript; AI summary + chapters via Ollama
3. **PDFs / documents** — text extract or Tesseract OCR; AI summary; full text indexed
4. **Screenshots / images** — Tesseract OCR; AI summary
5. **Generic files** — blob storage on host; indexed by name + tags

All AI summarization runs against local Ollama (CT 102).

---

## Agent Model [locked]

**Per-agent capability tiers.** Each AI agent (Claude, Mercy, Orthos, Dross, Eithan, Lindon, Yerin, Little Blue, future agents) has its own capability record.

- **Default for all agents:** `read` + `suggest`. Agents can search/read anything. Writes are *drafts* in a "pending changes" inbox the user approves.
- **Promotable per agent:** `write` capability, scoped (e.g., Mercy gets write-on-Pages but not Resources)
- **Audit log:** every agent action recorded with `agent_id` + timestamp + diff

MCP surface exposes Void 2.0 to Claude Code, Open WebUI, OpenClaw, and future agents through the same interface.

---

## Build Approach [locked]

**Approach A — Greenfield modular monolith.**

- New repo at `/project/src/void-v2`
- Two processes on `void2-app` LXC:
  - **`void-server`** (Node) — REST API + MCP + Void UI + cron + light ingest (Karakeep webhook)
  - **`void-workers`** (Python) — heavy ML ingest: yt-dlp, Whisper, Tesseract, PDF extract, embeddings via Ollama
- Postgres + pgvector on `void2-db` LXC
- Copy across from current Void (without inheriting its structure): agent persona files, blackflame theme CSS, Cradle naming, cron task list, schema YAMLs as initial Resource seed data
- Old Void on CT 301 keeps running until cutover; then archived

---

## Architecture Details [locked]

### Two processes, one job queue, strict boundaries

**`void-server` (Node)** owns: HTTP API, MCP server, Void UI, cron, agent runtime, light ingest (Karakeep webhook, manual paste).
Internal layout:

```
lib/
  db/       Postgres pool, migrations, repos/ (one file per entity)
  api/      HTTP routes (thin — just call repos)
  mcp/      MCP server, tool definitions, per-agent capability checks
  ingest/   Karakeep webhook, manual capture
  jobs/     Enqueue heavy work for workers (pg-boss client)
  cron/     Scheduler + one file per task
  agents/   Cradle persona runtime (Claude subprocess + Ollama via Mastra)
```

**Boundary rule:** HTTP and MCP both reach data only via `repos/`. No raw SQL in routes. Same repos enforce per-agent capability checks. This is what makes any later extraction (e.g., MCP as its own service) painless.

**`void-workers` (Python)** owns heavy ML ingest. One worker per kind: `video.py` (yt-dlp + Whisper), `pdf.py` (pdftotext / Tesseract), `image.py` (Tesseract), `file.py` (blob + indexing), `sourcedoc.py` (mirror upstream docs). They poll the job queue, claim work, write results to DB.

### Job queue: pg-boss

Postgres-backed. We don't add Redis/RabbitMQ — the DB is already there. pg-boss itself ships as a Node library with no official Python client, so `void-workers` needs a thin compatible client against the same queue tables. Failed jobs retry with backoff, then land in a dead-letter table.

**Redis rejected** — Postgres-on-local-LXC is sub-millisecond for indexed queries; the bottlenecks in Void 2.0 will be Ollama/Whisper/OCR (seconds–minutes), not the DB. Adding Redis would buy imperceptible performance wins at the cost of cache invalidation complexity and another LXC to manage. Reconsider only if profiling shows a specific bottleneck.

### Caching, if needed

- **In-process LRU** (JS `Map` with size cap) inside `void-server` for hot lookups. Zero ops cost.
- **Postgres `LISTEN`/`NOTIFY`** for real-time UI updates (transcription progress, etc.) if/when we want them. Built into Postgres — no extra service.

### Cron

Lives only in `void-server` (single process — no leader election needed). Light tasks run in-process; heavy tasks enqueue worker jobs.

### Audit log

Append-only.
Every mutating call (HTTP, MCP, cron, worker) writes one row: `actor_kind`, `actor_id`, `entity_type`, `entity_id`, `action`, `diff`, `occurred_at`. Powers: pending-changes inbox for agent drafts, Resource change history, "who did what when" forensics.

---

## Schema [locked]

All ids `uuid` (`gen_random_uuid()`). All entities have `created_at` / `updated_at`. Vector columns are `vector(1024)` everywhere — embeddings from `nomic-embed-text` (768 dims) padded with zeros so model swap to a 1024-dim model is a re-embed pass, not a migration. Slugs unique per-Space. Single implicit user for now; audit columns store `actor_kind` + `actor_id` so multi-user is a non-breaking later migration.

### Core entity tables

| Table | Key columns |
|---|---|
| `spaces` | slug, name, description, theme |
| `projects` | space_id, slug, name, status, started_at, completed_at |
| `tasks` | space_id, project_id (nullable), title, body, status, priority, due_at, position |
| `pages` | space_id, slug, title, body_md, body_html, parent_id, embedding |
| `page_revisions` | page_id, body_md, edited_by, created_at |
| `refs` | space_id, kind (`url\|video\|pdf\|image\|file`), source_url, title, summary, body_text, blob_path, metadata, embedding, source_kind, external_id |
| `source_docs` | resource_id, name, upstream_url, version, format, sync_source, local_path, last_synced, embedding |
| `resources` | space_id, slug, name, runtime_type (`lxc\|vm\|docker\|bare-metal`), host, url, version, status, monitoring (jsonb) |
| `resource_dependencies` | resource_id, depends_on, kind |
| `resource_credentials` | resource_id, label, vault_path, kind, notes |
| `conversations` | title, agent_id, participants, summary, embedding |
| `messages` | conversation_id, role, agent_id, body, metadata |
| `agents` | slug, name, kind, model, persona_path, capabilities (jsonb), scopes (jsonb) |

### Cross-cutting tables

| Table | Purpose |
|---|---|
| `tags` | normalized tag list (name, description, color) |
| `entity_tags` | (entity_type, entity_id, tag_id) — polymorphic tagging |
| `entity_links` | (from_type, from_id, to_type, to_id, relation) — any-to-any linkage |
| `attachments` | (entity_type, entity_id, filename, mime_type, blob_path, checksum) |
| `audit_log` | append-only mutation history |
| `pending_changes` | agent draft inbox awaiting approval |
| `pg-boss` tables | managed by the queue lib |

### Default lifecycle states

- Project: `idea | active | paused | done | abandoned`
- Task: `todo | doing | blocked | done`
- Resource: `running | stopped | down | unknown`

(State transitions and automation defined in the Status section, later.)

### Search strategy

- **Full-text** — Postgres `tsvector` + GIN on `pages.body_md`, `refs.title+summary+body_text`, `source_docs.body_text`, `messages.body`. One query, all knowledge types.
- **Semantic** — pgvector HNSW indexes on `pages.embedding`, `refs.embedding`, `source_docs.embedding`, `conversations.embedding`. Embeddings generated by Ollama at write time, async via worker.
- **Combined** — search API does FTS + vector in parallel, fuses with reciprocal-rank fusion. Filters by Space, Project, tags, kind.

### Key design decisions

1. **Polymorphic links over dedicated junction tables** — one `entity_links` table instead of ~20 pairwise junctions. Loses Postgres-enforced FK integrity on polymorphic columns; pays back in flexibility. Periodic integrity-check query catches orphans.
2. **Audit log is the only mutation history** — no per-entity history tables. Powers pending-changes inbox, Resource change history, and forensics from one mechanism.
3. **`page_revisions` is the exception** — full markdown snapshots, not diffs. Disk is cheap; debugging a corrupted page from a 12-step diff chain is not.
4. **JSONB for variable shape** — `metadata` columns on `refs` (kind-specific), `resources` (monitoring config), `agents` (capabilities, scopes). Add fields without migrations.
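
Two small helpers implied by this section can be sketched in a few lines: zero-padding a 768-dim embedding to fit `vector(1024)`, and the reciprocal-rank fusion step of the combined search. This is a minimal sketch under stated assumptions, not the actual implementation: the function names, the sample ids, and the constant `k=60` (a commonly used RRF default that this spec does not fix) are all illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

VECTOR_DIMS = 1024  # matches the vector(1024) columns in the schema


def pad_embedding(vec: Sequence[float], dims: int = VECTOR_DIMS) -> List[float]:
    """Right-pad an embedding with zeros so it fits a fixed-width column.

    Zero padding leaves dot products and cosine similarity unchanged
    between vectors that were padded the same way.
    """
    if len(vec) > dims:
        # Hypothetical policy: refuse rather than silently truncate.
        raise ValueError(f"embedding has {len(vec)} dims, column holds {dims}")
    return list(vec) + [0.0] * (dims - len(vec))


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several best-first id lists into one ranking.

    Each list contributes 1 / (k + rank) per id, with rank starting at 1.
    Ties are broken by id so the output is deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, entity_id in enumerate(ranking, start=1):
            scores[entity_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda entity_id: (-scores[entity_id], entity_id))


# Illustrative result lists from the FTS and vector queries.
fts_hits = ["page-a", "ref-b", "page-c"]
vector_hits = ["ref-b", "page-d", "page-a"]
fused = reciprocal_rank_fusion([fts_hits, vector_hits])
padded = pad_embedding([0.5] * 768)
```

Ids that appear in both lists accumulate two contributions, so they rise above ids found by only one of the two searches. Fusion works on ranks alone, so FTS scores and vector distances never need to be normalized against each other.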

---

## API Surface [locked]

### REST (Void UI ↔ void-server)

Standard CRUD per entity under `/api/`, JSON in/out, errors as `{error: {code, message, details}}`. Pagination via `?limit=&offset=`.

Endpoint groups: spaces, projects, tasks, pages (+ revisions, backlinks), refs, source_docs (+ resync), resources (+ dependencies, changes), conversations (+ messages), agents, search (unified FTS + vector with RRF), tags, links, pending-changes (approve/reject), audit, capture (karakeep webhook, manual url, file upload, youtube), jobs (observability).

**Auth:** Bearer token. Single owner token for the Void UI. Per-agent tokens in a separate `agent_tokens` table (hashed). Audit log records `actor_kind` + `actor_id` on every mutation.

### MCP (AI agents ↔ void-server)

Smaller, task-oriented surface — not full CRUD. Tools enforce per-agent capabilities; default-tier agents get writes routed to `pending_changes`.

Initial tools: `void.search`, `void.get_entity`, `void.list_projects`, `void.list_tasks`, `void.related`, `void.read_conversation`, `void.resource_status`, `void.draft_page`, `void.draft_task`, `void.draft_ref`, `void.append_journal`, `void.suggest_link`, `void.update_entity`.

**Transport:** both stdio (for Claude Code spawned subprocess) and HTTP/SSE (for Open WebUI, OpenClaw, remote agents). Same tool definitions, two transports. Capability checks happen in tool handlers, which call the same `repos/` as REST — one source of truth, two front doors.

---

## Capture Workers [locked]

### Job kinds (one Python module per kind)

`ingest.karakeep`, `ingest.url`, `ingest.youtube`, `ingest.video`, `ingest.pdf`, `ingest.image`, `ingest.file`, `sync.source_doc`, `embed.text`, `summarize.conversation`.
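
One way to keep the "one module per kind" rule honest is a small registry that maps each job kind string to exactly one handler. The sketch below is only an illustration of that idea: the `handles` decorator, the `dispatch` function, and the handler body are hypothetical names, and the payload shape is an assumption rather than anything this spec defines.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]

# Registry of job kind -> handler. In a real layout each handler would live
# in its own module and register itself on import (assumed convention).
HANDLERS: Dict[str, Handler] = {}


def handles(kind: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for exactly one job kind."""

    def register(fn: Handler) -> Handler:
        if kind in HANDLERS:
            raise ValueError(f"job kind {kind!r} already has a handler")
        HANDLERS[kind] = fn
        return fn

    return register


@handles("ingest.url")
def ingest_url(payload: dict) -> dict:
    # Placeholder body: a real handler would fetch and snapshot the page.
    return {"kind": "url", "source_url": payload["source_url"]}


def dispatch(kind: str, payload: dict) -> dict:
    """Route a claimed job to its handler; unknown kinds are a permanent error."""
    try:
        handler = HANDLERS[kind]
    except KeyError:
        raise LookupError(f"no handler registered for job kind {kind!r}") from None
    return handler(payload)


result = dispatch("ingest.url", {"source_url": "https://example.com/post"})
```

An unknown kind raises immediately instead of being retried, since no amount of retrying will make a handler appear.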
### Job lifecycle

```
queued → claimed → running → done
                      ↘ failed → retry (exp backoff: 10s, 60s, 5m) → dead-letter
```

Workers atomically claim via pg-boss, validate input, check idempotency, do work, write results in a transaction (entity row + audit log + downstream enqueues), mark done. Transient errors retry; permanent errors dead-letter immediately.

### Idempotency

Every job carries `idempotency_key`. For URL/Karakeep ingest: `key = sha256(source_url + space_id)`. If a successful job with that key exists, no-op.

### Concurrency (per-kind queues)

| Kind | Limit | Reason |
|---|---|---|
| `ingest.youtube`, `ingest.video` | **1** | Whisper GPU-bound on A2000 6GB |
| `ingest.pdf`, `ingest.image` | 2 | Tesseract CPU-bound |
| `ingest.url`, `ingest.karakeep`, `ingest.file` | 4 | Network/disk-bound |
| `sync.source_doc` | 1 | One source at a time; don't hammer upstream |
| `embed.text`, `summarize.conversation` | 2 | Ollama-bound |

### Blob storage

Content-addressed on local disk: `/var/lib/void/blobs//`. Deduplicates identical files. ZFS dataset replicated to Leonardo via existing syncoid daily. MinIO is a future option, not day-one.

### Dead-letter & monitoring

pg-boss managed dead-letter table. Void UI "Jobs" panel shows pending, running, recent completions, dead-letter with retry/delete actions.

### Downstream chaining

Finished jobs enqueue more jobs in the same transaction (e.g., source doc sync → embed each chunk). Keeps everything resumable: if Ollama is down, the entity saves without embedding, embed retries later.

---

## UI / Orchestrator Shape [locked]

### Shell

Three columns, Cradle aesthetic preserved (blackflame palette, Cradle naming).

- **Sidebar:** Spaces tree on top (collapsible, drag-to-reorder); global views below — Sacred Valley, Agents, Inbox (pending changes with count), Resources cross-space, full Search
- **Main pane:** context-dependent view (Space, Project, Page editor, Reference detail, Resource detail, Search, Sacred Valley, Inbox, Conversation)
- **Right rail:** always-visible context-aware chat companion, collapsible to slim tab. Agent scoped to current view; per-Space default agent. Drag-handle to resize.
- **Top bar:** universal capture button (paste/drop → AI suggests Space+Project → confirm), global search, pending-changes bell with count, user/agent toggle

### Views (main pane)

| View | Purpose |
|---|---|
| Space | Overview of projects, tasks, refs, pages, resources in that space |
| Project | Header (status/dates), Tasks, References, Pages, Conversations, Resources |
| Page editor | Markdown editor with split preview, FTS in-page, attach upload |
| Reference detail | Media preview + AI summary + metadata + tags + linked-from |
| Resource detail | Health header + dependencies graph + Source Docs + runbook Pages + change history |
| Search | Unified FTS + vector results, grouped by type, sidebar filters |
| Sacred Valley | Current gridstack dashboard, preserved (weather, speedtest, host-perf, briefings, service health) |
| Inbox | Pending changes grouped by agent, with diff viewer + approve/reject |
| Conversation | Full-window chat when right-rail isn't enough |

### Defaults

- **Landing page:** last-viewed Space, falling back to a "Home" overview of recent activity across all Spaces
- **Sacred Valley:** kept as a named sidebar view (not the default homepage)
- **Right-rail chat:** always visible, context-aware, collapsible
- **Capture button:** paste-anything modal → AI infers kind (URL/file/text) → suggests Space+Project from content + tags → user confirms or overrides

### Pending Changes Inbox

Items grouped by agent.
Each shows entity-type icon + agent's reason + diff viewer + approve/reject. Approving runs the mutation through the same repo as a direct write would (single code path).

---

## Security & Auth [locked]

### Authentication layers

| Layer | Mechanism | Scope |
|---|---|---|
| Owner via browser/mobile | Cloudflare Access (Google IDP, restricted email) → CF Tunnel → Void 2.0 | Full owner |
| AI agents via MCP | Bearer tokens, bcrypt-hashed in `agent_tokens`. Scoped by `agents.capabilities + scopes` | Per-agent tiered |
| void2-app → void2-db | Dedicated Postgres user, limited grants, LAN-only | Service account |
| void2-app → Ollama | LAN, no auth | LAN-only |

### Remote-access boundary

| Surface | Reachable how | Behind CF Access? |
|---|---|---|
| `void.hynesy.com` (UI) | CF Tunnel | Yes — Google auth, your email |
| `mcp.void.hynesy.com` (MCP HTTP/SSE for remote agents) | CF Tunnel | Yes — CF Access Service Tokens |
| Internal MCP (Claude Code, Open WebUI on CT 103) | Direct LAN | No — local |
| Postgres | LAN-only, firewalled | n/a |

### Secrets handling

- Bootstrap secrets in `.env` files on each LXC, `chmod 600`, owned by service user
- `resource_credentials.vault_path` is a *pointer string* (`env:NAME`, `file:/path`, or future `vault:id`). Void 2.0 resolver reads from env or file. Schema unchanged if/when we swap to Vaultwarden — only the resolver changes.
- Agent tokens shown plaintext **once** at creation, then bcrypt-hashed.
- No secrets in audit log (per-entity redaction before write).
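
The pointer-string design for `vault_path` can be illustrated with a small resolver. This is a hedged sketch assuming only the `env:NAME` and `file:/path` forms described above: the function name `resolve_secret`, the injectable `environ` argument, and the error types are hypothetical, and Python is used purely for illustration since the spec does not say which process hosts the resolver.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_secret(vault_path: str, environ: Mapping[str, str] = os.environ) -> str:
    """Turn a vault_path pointer into the secret it points at.

    Supported pointer forms (from the spec): env:NAME and file:/path.
    The vault: scheme is reserved for a future secrets store.
    Error messages mention only the pointer, never the secret value.
    """
    scheme, sep, target = vault_path.partition(":")
    if not sep or not target:
        raise ValueError(f"malformed vault_path pointer: {vault_path!r}")

    if scheme == "env":
        try:
            return environ[target]
        except KeyError:
            raise LookupError(f"environment variable {target} is not set") from None

    if scheme == "file":
        # Strip the trailing newline most editors append to secret files.
        return Path(target).read_text(encoding="utf-8").strip()

    if scheme == "vault":
        raise NotImplementedError("vault: pointers are not supported yet")

    raise ValueError(f"unsupported vault_path scheme: {scheme!r}")


# Usage with an explicit environment mapping (illustrative values only).
db_password = resolve_secret("env:DB_PASSWORD", {"DB_PASSWORD": "example-only"})
```

Because callers only ever pass the pointer string, adding a real `vault:` backend later means filling in one branch of this function while `resource_credentials` rows stay untouched.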
### Privacy posture

- All AI inference local by default (Ollama on CT 102)
- Claude API calls cross to Anthropic — documented egress channel; PII flagging not in v1
- Audit log retains every mutation for forensics

### Backup posture

- ZFS daily syncoid replication of `void2-db` + blob datasets to Leonardo
- Postgres `pg_dump` cron daily (restore-test friendly, independent of ZFS)
- Encrypted ZFS datasets for any off-site replica targets later (Farm)

### Out of scope (v1)

mTLS between internal services, field-level encryption in DB, HSMs, PII detection before LLM egress.

---

## Future Improvements (deferred)

These are intentionally **not** day-one work. Tracked so they don't get forgotten:

- **Vaultwarden secrets store** — user explicitly asked to be reminded. Day-one resolver was designed so this is a swap, not a schema change. See [auto-memory: project_void_v2_vaultwarden_followup].
- **Own bookmark capture front-end** to replace Karakeep
- **MinIO** for blob storage (S3-compatible access from elsewhere)
- **Extract MCP** to its own LXC if it grows independently
- **True clustering / instant failover** (Patroni) if zero-downtime maintenance becomes needed
- **PII detection** before Anthropic API egress
- **Mobile-optimized capture flow** (PWA install, share-target intent on Android)
- **Local STT** (Whisper) for voice notes as a capture kind
- **RSS / email** ingest

---

## Naming & Versioning [locked]

This project is **Void 2.0** — a full remaster of the existing Void (retroactively "Void 1.x") with the same Cradle aesthetic, expanded into a homelab orchestrator + canonical knowledge store.

"Codex" is **not** a name — just a way we referenced the data-layer concept during brainstorming. There is no `Codex` brand or module; the data layer is `lib/db/` (including `lib/db/repos/`) inside `void-server`.
### Repo / process / LXC naming

- **Repo:** `/project/src/void-v2`
- **Processes:** `void-server` (Node), `void-workers` (Python)
- **LXCs during cutover:** `void2-db`, `void2-app` (the `2` suffix avoids clashing with current CT 301 `void`). After CT 301 retirement: rename to plain `void-db`, `void-app`.
- **Domains:** `void.hynesy.com` (UI), `mcp.void.hynesy.com` (MCP HTTP/SSE)
- **MCP tool prefix:** `void.search`, `void.draft_page`, etc.

### Version strategy

Semver: `MAJOR.MINOR.PATCH`.

- **2.0.0** — initial Void 2.0 release after Void 1.x retirement
- Minor bumps for added features, patch bumps for fixes
- Major bumps reserved for architecture/schema changes that require migrations

### CHANGELOG

`CHANGELOG.md` at the root of `/project/src/void-v2`, following the [Keep a Changelog](https://keepachangelog.com) convention. Entry for **2.0.0** captures the differences from Void 1.x at a high level (architecture, schema, capture pipeline, agent model, naming). Subsequent releases get their own sections. Each entry: Added / Changed / Deprecated / Removed / Fixed.

A separate `docs/VERSION_HISTORY.md` carries the **narrative** version history — when each release happened, the headline thinking behind it, deferred items rolled in, lessons. Lives alongside the design spec for long-term archaeology. Each `MAJOR.x.x` release gets a section.

---

## Migration / Cutover Plan [locked]

### Existing data inventory

| Source | Location | Volume | Maps to |
|---|---|---|---|
| Void 1.x SQLite | CT 301 | wiki_pages (~25), messages, projects, conversations | Void 2.0 `pages`, `messages` (grouped into `conversations`), `projects` |
| BookStack | CT 104 MariaDB | ~17+ pages, hierarchy | `pages` (parent_id preserved); dedupe vs already-imported wiki_pages |
| Karakeep | CT 100 | bookmarks + AI summaries + tags | `refs` (kind=url), `external_id` = karakeep id |
| `/root/.claude/plans/*.md` | filesystem | 5 plan files | `pages` under each plan's Project |
| Void 1.x agent personas | `/project/src/void/characters/` | 7 agents × 3 files | `agents.persona_path` |
| Void 1.x schema YAMLs | `/project/src/void/schemas/` | 11 services | `resources` seed data + `resources.monitoring` jsonb |
| Void 1.x code (theme, cron logic) | source | selective | Reused inside `void-server` |
| Auto-memory entries | `/root/.claude/projects/-project/memory/*.md` | ~30 entries | **Mirrored** — see below |

### Migration script structure

Python migration tool in `void-workers/migrate/` with sub-commands:

```
void-migrate bookstack --source-db
void-migrate karakeep --source-db
void-migrate void1-sqlite --source-db
void-migrate plans --source-dir /root/.claude/plans/
void-migrate memory --source-dir /root/.claude/projects/-project/memory/
void-migrate void1-schemas --source-dir /project/src/void/schemas/
void-migrate void1-personas --source-dir /project/src/void/characters/
```

Each command is **idempotent** — uses source IDs / file paths as `external_id` so re-runs upsert rather than duplicate.

### Auto-memory: one-way mirror (files stay primary)

Auto-memory files remain the source-of-truth — Claude Code's harness reads them directly across sessions.
A worker mirrors them into Void 2.0 as Pages under a "Memory" Space:

- Mirror runs on file change (inotify) and nightly as safety net
- Pages get `external_id = file path`, idempotent upsert
- Edits in Void 2.0 UI flow back to files via a `::memory-update` marker (same pattern Path B established)
- Auto-memory remains canonical; Void 2.0 view is searchable, MCP-readable, visible in the UI

### Cutover: stand up alongside, big-bang switch with grace period

1. Build Void 2.0 on new LXCs (`void2-db`, `void2-app`) without touching CT 301
2. Run migration scripts (read-only access to BookStack + Karakeep + Void 1.x DBs)
3. Verify counts + spot-check content
4. **Cutover day:** swap `void.hynesy.com` CF tunnel target from CT 301 to `void2-app`
5. **Grace period (30 days):** CT 301 stays read-only as fallback
6. **Retire CT 301:** snapshot, stop, rename `void2-*` LXCs to `void-*`

### Cron / scheduled task migration

Existing Void 1.x cron (Dross briefing, Yerin alerts, Little Blue heal, hourly speedtest, Orthos council) ports directly to `void-server/lib/cron/tasks/`. Same logic, same timing, against Void 2.0's data.

---

## Testing Approach [locked]

| Layer | Coverage | How |
|---|---|---|
| Unit | Repos, capability checks, helpers (slug gen, idempotency keys, embedding pad/truncate) | Node: vitest. Python: pytest. |
| Integration | REST + MCP tools against a test DB | Postgres-in-docker; schema applied from migrations; reset per test |
| E2E | Happy paths: create Space/Project, capture URL, search, approve pending change, attach ref | Playwright against running test instance |
| Manual (runbook'd) | Capture workers (Whisper, OCR), agent runtime (Claude subprocess + Ollama), CF Access flows | `docs/testing/manual.md` — too heavy or external for CI |
| Migration scripts | All `void-migrate` sub-commands | Fixture DBs for BookStack + Void 1.x + Karakeep; assert counts + spot-check content |

**Coverage target:** ~70% on `lib/` modules.
Lower on routes/UI — covered by integration + E2E instead. No coverage chasing.

**CI:** GitHub Actions if you mirror to a remote; local pre-push hook otherwise. Runs unit + integration on every change to `void-server` or `void-workers`.

---

## Status / Lifecycle Model [locked]

| Entity | States | Transitions | Automation |
|---|---|---|---|
| Project | `idea`, `active`, `paused`, `done`, `abandoned` | Free (any-to-any) | None; manual |
| Task | `todo`, `doing`, `blocked`, `done` | Free | `done` sets `completed_at` |
| Resource | `running`, `stopped`, `down`, `unknown` | Auto + manual override | Health check cron updates; manual override pins until `maintenance_until` |
| Conversation | `open`, `summarized`, `archived` | Auto with overrides | `summarize.conversation` worker moves to `summarized` after 24h idle |
| Reference | `ingested`, `indexed`, `enriched` | Worker-driven | Pipeline: capture → FTS indexed → embedded + AI summary done |
| Pending Change | `pending`, `approved`, `rejected` | User-driven | None |

**Free transitions** everywhere user-facing. Homelab work is rarely linear; the audit log captures every transition.

**Resource status reconciliation:** health check cron writes `status` and `last_check`. Manual override during planned maintenance pins state until a `maintenance_until` timestamp.

---

## Pending Sections — to complete before this is plan-ready

(All sections locked. Spec ready for user review.)

---

## Decision Log

| Date | Decision | Why |
|---|---|---|
| 2026-05-30 | Foundation-first Void 2.0 over evolve-Void | Long-term HA requirement makes single-LXC SQLite a dead end |
| 2026-05-30 | 2 LXCs, planned-migration HA | User confirmed instant failover not needed |
| 2026-05-30 | Postgres + pgvector (no separate Qdrant) | Simpler — one DB does relational + vector |
| 2026-05-30 | Three-tier Space → Project → Task with sibling tasks | Matches how user organizes; allows ad-hoc TODOs |
| 2026-05-30 | Pages + References + Source Docs as three knowledge types | Authored vs captured vs upstream-mirrored are genuinely different |
| 2026-05-30 | Conversations first-class, attach to other entities | "Create project from chat" + AI needs prior conversation context |
| 2026-05-30 | Rich Resource entity (dependencies, creds refs, change history) | User wants real orchestrator, not just inventory |
| 2026-05-30 | Keep Karakeep as bookmark inbox; webhook into Void 2.0 | Karakeep works; building own is a deferred improvement |
| 2026-05-30 | Day-one capture: URLs, videos, PDFs, images, files | Full pipeline, no half-measures |
| 2026-05-30 | Agents: read+suggest default, per-agent tiered promotion | Balance usefulness with safety |
| 2026-05-30 | Greenfield Void 2.0 (Approach A), copy valuable bits from Void | Clean break from accumulated Void shape |
| 2026-05-31 | Two-process layout (Node server + Python workers) on one LXC | Right-tool-per-job; Python for ML, Node for API/UI/cron |
| 2026-05-31 | pg-boss job queue (not Redis/RabbitMQ) | Postgres is already there; one fewer service |
| 2026-05-31 | Skip Redis cache | DB isn't the bottleneck; Ollama/Whisper/OCR are. Reconsider only if profiling shows it. |
| 2026-05-31 | Audit log is append-only, polymorphic | One mechanism for change history + agent action tracking + pending-changes inbox |
| 2026-05-31 | `vector(1024)` everywhere with zero-padding for 768-dim embeds | Model swap is a re-embed pass, not a DDL migration |
| 2026-05-31 | Polymorphic `entity_links` over ~20 pairwise junction tables | Flexibility wins at this scale; periodic integrity check covers FK gap |
| 2026-05-31 | Single implicit user; audit columns ready for multi-user later | Multi-user is a non-breaking migration if ever needed |
| 2026-05-31 | MCP exposes task-oriented tools, not raw CRUD | Smaller surface for agents = safer + clearer semantics |
| 2026-05-31 | MCP supports both stdio + HTTP/SSE | Covers Claude Code (stdio) and network agents (HTTP) without bridges |
| 2026-05-31 | pg-boss with per-kind concurrency limits | GPU/CPU/network workloads have different parallelism needs |
| 2026-05-31 | Idempotency keys on all ingest jobs | Webhook replays + manual retries shouldn't duplicate content |
| 2026-05-31 | Content-addressed blob store; ZFS replicated via syncoid | Free dedup + your existing replication covers it |
| 2026-05-31 | Whisper concurrency stays at 1 | Conservative; tune after deploy if A2000 has headroom |
| 2026-05-31 | Three-column shell (sidebar / main / right-rail chat) | Matches orchestrator + chat-with-context workflow |
| 2026-05-31 | Sacred Valley kept as sidebar view, not landing page | Frees landing for last-viewed Space; dashboard still one click away |
| 2026-05-31 | Right-rail chat always visible, context-aware | Friction-free 'ask Mercy about this' across all views |
| 2026-05-31 | Universal capture button with AI Space/Project suggestion | One capture surface for all content kinds; reduces friction over per-page add-ref |
| 2026-05-31 | CF Access on UI + MCP-HTTP; LAN-direct for internal agents | Matches owner-via-internet + agent-on-LAN access patterns |
| 2026-05-31 | Env+file vault_path resolver day-one; Vaultwarden swap later | Pragmatic start; resolver swap doesn't change schema |
| 2026-05-31 | Agent tokens bcrypt-hashed, plaintext shown once | Standard bearer-token hygiene |
| 2026-05-31 | mTLS / field-level encryption deferred from v1 | Single-trust-domain LAN homelab; ZFS-at-rest covers it for now |
| 2026-05-31 | Renamed from "Codex" to **Void 2.0** | Preserve Cradle aesthetic + naming continuity from Void 1.x |
| 2026-05-31 | CHANGELOG.md (Keep a Changelog) + VERSION_HISTORY.md (narrative) | User wants major-version comparison + readable narrative archaeology |
| 2026-05-31 | Auto-memory: one-way mirror, files stay primary | Harness keeps working; knowledge stays unified |
| 2026-05-31 | Big-bang cutover with 30-day grace period on CT 301 | Minimal complexity; safety net against forgotten data |
| 2026-05-31 | Free state transitions; audit log records every change | Homelab work is rarely linear; don't over-validate |
| 2026-05-31 | Test coverage target ~70% on lib/, manual runbook for ML/agent flows | Where automation cost exceeds value, document instead |