docs: move void-v2 specs + plans into the repo
All Void 2.0 superpowers specs and implementation plans now live at
docs/superpowers/{specs,plans}/ inside the repo. Previously they were
at /project/docs/superpowers/ which was not under git.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3267
docs/superpowers/plans/2026-05-31-void-v2-plan1-foundation.md
Normal file
File diff suppressed because it is too large
150
docs/superpowers/plans/2026-05-31-void-v2-plan1-progress.md
Normal file
@@ -0,0 +1,150 @@
# Void 2.0 Plan 1 — Execution Progress

**Updated:** 2026-05-31 (session paused at 92% context)
**Plan:** `/project/docs/superpowers/plans/2026-05-31-void-v2-plan1-foundation.md`
**Spec:** `/project/docs/superpowers/specs/2026-05-31-void-v2-design.md`
**Repo:** `/project/src/void-v2/` (git init done, on branch `main`)
**Execution mode:** subagent-driven-development

## Status by Phase

| Phase | Plan Tasks | Status | Notes |
|---|---|---|---|
| A — Scaffolding | 1, 4 | **DONE** | Repo init + Node project init. 2 commits. |
| B — Infrastructure | 2, 3 | **DONE** | void2-db (CT 310 @ 192.168.1.12) + void2-app (CT 311 @ 192.168.1.13) running on Z. Postgres 16.14 + pgvector 0.8.2 + pgcrypto 1.3. DB user `void` + database `void` created. Verified reachable from this LXC. `.env` populated. |
| C — Migrations | 5, 6, 8, 10, 12, 14, 16 | **READY** | DB is reachable — can dispatch Task 5 implementer. |
| D — Repos | 7, 9, 11, 13, 15, 16-real | Blocked by B | All entity repos. |
| E — Auth + Server | 17, 18, 19, 20 | Blocked by B | Capability check, owner middleware, Express, /health smoke. |
| F — Deploy + docs | 21, 22 | Blocked by E | systemd, push.sh, completion doc. |

## Completed Tasks

### Task 1 — Repo Scaffolding [commit 0ede9fe]
- 5 files: `.gitignore`, `README.md`, `CHANGELOG.md`, `docs/VERSION_HISTORY.md`, `.env.example`
- Spec reviewed ✅, code quality reviewed ✅

### Task 4 — Node Project Init [commit 45186f7]
- `package.json` (name=void-server, version=2.0.0-alpha.1, type=module)
- 7 runtime deps + 2 dev deps installed
- `vitest.config.js`, `lib/log.js` (pino logger)
- **NOT YET REVIEWED** — spec + code quality review skipped due to token pressure. Re-do on resume if desired, OR proceed (Task 4 is mechanical setup with no functional code).

## Deviations from Plan

1. **Express 5.2.1 installed** (plan said "Express 4"). Express 5 is what `npm install express` currently resolves to. **Likely fine** — Express 5 forwards rejected promises from async handlers to the error middleware and removes some deprecated APIs, but our usage (JSON body parsing, simple middlewares) works on both. Flag any test that fails because of Express 5-specific behaviour.
2. **Debian 13** used for LXCs (plan said 12). Only the `debian-13-standard_13.1-2` template was on Z. No functional impact.
3. **Storage: `localzfs` on Z** (plan said `donatello-zfs`). Donatello + Leonardo ZFS pools are OFFLINE — leftover from your 2026-05-22 SATA bus incident. **HA migrate is NOT blocked** — `localzfs` is the standard pattern here (CT 104, 105, 106, 108-112 all run on it) and PVE storage replication to Z3 every 15 min is configured. Replication jobs `310-0` and `311-0` added with `*/15` schedule, matching the rest of the fleet. `pct migrate 310 z3` will work like every other CT. Donatello/Leonardo restoration is a separate issue, not Void-blocking.
4. **`su - postgres` instead of `sudo -u postgres`** — Debian 13 minimal doesn't ship sudo. Not a deviation in outcome, just an adjusted command form.
5. **DB password stored at `/root/void2-db-pass.txt` on Z** (chmod 600). Also baked into `/project/src/void-v2/.env` on this LXC.

## Awaiting User — Phase B (Tasks 2 + 3)

These need PVE host access. Agent inside /project cannot run them.

### Task 2 — Provision LXCs on PVE host `z`

Pick two free CT IDs (suggestion: 310 = `void2-db`, 311 = `void2-app`) and two free IPs on 192.168.1.0/24.

```bash
# On PVE host as root
pct create 310 local:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst \
  --hostname void2-db \
  --cores 2 --memory 4096 --swap 1024 \
  --net0 name=eth0,bridge=vmbr0,ip=192.168.1.X/24,gw=192.168.1.1 \
  --storage donatello-zfs --rootfs donatello-zfs:32 \
  --unprivileged 1 --features nesting=1 --onboot 1 --start 0
cat >> /etc/pve/lxc/310.conf <<'EOF'
lxc.apparmor.profile: unconfined
EOF
pct start 310

pct create 311 local:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst \
  --hostname void2-app \
  --cores 4 --memory 4096 --swap 1024 \
  --net0 name=eth0,bridge=vmbr0,ip=192.168.1.Y/24,gw=192.168.1.1 \
  --storage donatello-zfs --rootfs donatello-zfs:16 \
  --unprivileged 1 --features nesting=1 --onboot 1 --start 0
cat >> /etc/pve/lxc/311.conf <<'EOF'
lxc.apparmor.profile: unconfined
EOF
pct start 311
```

### Task 3 — Install Postgres 16 + pgvector on void2-db (CT 310)

```bash
pct enter 310

apt update
apt install -y curl ca-certificates gnupg
install -d /usr/share/postgresql-common/pgdg
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc \
  -o /usr/share/postgresql-common/pgdg/apt.postgresql.org.asc
echo "deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] \
https://apt.postgresql.org/pub/repos/apt $(. /etc/os-release; echo $VERSION_CODENAME)-pgdg main" \
  > /etc/apt/sources.list.d/pgdg.list
apt update
apt install -y postgresql-16 postgresql-16-pgvector

sed -i "s/^#listen_addresses.*/listen_addresses = '*'/" /etc/postgresql/16/main/postgresql.conf

cat >> /etc/postgresql/16/main/pg_hba.conf <<'EOF'
host void void 192.168.1.0/24 scram-sha-256
EOF

systemctl restart postgresql

DB_PASS=$(openssl rand -base64 24)
echo "Generated DB password (SAVE THIS): $DB_PASS"

sudo -u postgres psql <<EOF
CREATE USER void WITH PASSWORD '$DB_PASS';
CREATE DATABASE void OWNER void;
\c void
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
EOF
```

**SAVE THE DB PASSWORD** somewhere safe — needed for `.env` on resume.

### After user completes Tasks 2 + 3

Report back with:
1. `void2-db` IP
2. `void2-app` IP
3. The DB password
4. Confirmation that `psql -h <void2-db-ip> -U void -d void -c 'SELECT version();'` succeeds from any LAN host

Then update `/project/src/void-v2/.env`:
```
DATABASE_URL=postgres://void:<PASSWORD>@<VOID2_DB_IP>:5432/void
OWNER_TOKEN=<generate via openssl rand -base64 24>
PORT=3000
LOG_LEVEL=info
NODE_ENV=development
```

## How to Resume

In the next session, say something like:
> "Resume Plan 1 Void 2.0 execution. Read `/project/docs/superpowers/plans/2026-05-31-void-v2-plan1-progress.md` — Phases A+B done, start dispatching from Task 5."

Resume agent should:
1. Read this progress file
2. Verify DB still reachable: `PGPASSWORD=$(ssh root@192.168.1.124 'grep DB_PASS /root/void2-db-pass.txt | cut -d= -f2') psql -h 192.168.1.12 -U void -d void -c 'SELECT 1;'`
3. Continue dispatching implementer subagents for Task 5 onward
4. Token-saving advice: skip code-quality review for trivial scaffolding tasks (the spec compliance review is sufficient there); do full two-stage review for repo + auth + server code

## Next subagent dispatch (when ready)

**Task 5: Postgres Pool + Migration Runner** — the full task text is in the plan file, in the section covering the DB pool, the migration runner with idempotency, and the test helpers.

## Quick environment cheatsheet for resume

- Repo: `/project/src/void-v2/` (on `main`, 2 commits)
- DB: `192.168.1.12:5432`, user `void`, db `void`, password in `.env` and at `/root/void2-db-pass.txt` on Z (192.168.1.124)
- App LXC ready (CT 311 @ 192.168.1.13) but Node not installed there yet (Task 21 handles that)
- `cd /project/src/void-v2 && npm test` should work once tests exist; `.env` will be picked up by `dotenv`
- SSH to Z: `ssh root@192.168.1.124` (key auth works)
- Shell into void2-db/app: `pct exec 310 -- bash` / `pct exec 311 -- bash` from Z (via the SSH chain)
536
docs/superpowers/plans/2026-05-31-void-v2-plan2-api-and-shell.md
Normal file
@@ -0,0 +1,536 @@
# Void 2.0 — Plan 2: Core REST API + Void UI Shell

**Goal:** Expose all Plan 1 repos as a REST API and ship the Cradle-themed Void UI shell on top.

**Architecture:** Thin Express routes in `lib/api/routes/` call the existing `lib/db/repos/` (no raw SQL in routes). Shared zod-validate + error middleware. Static SPA in `public/` consumed by the bearer-protected `/api/*`. Agent bearer auth composes with owner. FTS-only search; vector search is Plan 3. Capture endpoints + jobs surface are Plan 3+.

**Tech stack:** Express 5, zod 4 (already in deps), supertest 7, vanilla ES modules in browser, marked.js (CDN) for markdown render. No new server deps beyond `marked` if we choose to vendor it.

**Out of scope (deferred):**
- Vector/RRF search (needs embeddings — Plan 3)
- Capture endpoints (`/api/capture/*`) and `/api/jobs` (needs pg-boss — Plan 3)
- MCP server (Plan 5)
- Sacred Valley gridstack widgets (Plan 6) — ship a placeholder card
- E2E Playwright tests (Plan 8 sweep)

---

## File structure

```
lib/api/
  index.js                # registers routers onto app
  errors.js               # NotFoundError, ValidationError, asyncWrap, errorMiddleware
  validate.js             # validate({ body?, params?, query? }) using zod
  pagination.js           # parsePagination(req) → { limit, offset }
  middleware/
    agent_auth.js         # bearer → agent actor; composes with ownerOnly
  routes/
    spaces.js
    projects.js
    tasks.js
    pages.js
    refs.js
    resources.js
    source_docs.js
    conversations.js
    messages.js
    agents.js
    tags.js
    links.js
    pending_changes.js
    audit.js
    search.js

public/
  index.html              # SPA shell
  style.css               # blackflame palette + three-column layout
  app.js                  # bootstrap, router, fetch wrapper (auth header)
  router.js               # hash router
  api.js                  # typed-ish wrappers over fetch
  components/
    sidebar.js
    topbar.js
    rightrail.js
    markdown_editor.js
  views/
    space.js
    project.js
    page.js
    reference.js
    resource.js
    search.js
    inbox.js
    sacred_valley.js      # placeholder
    home.js               # landing fallback
  vendor/
    marked.min.js         # vendored, no CDN at runtime

tests/api/
  helpers.js              # createApp + auth headers + reset/migrate
  spaces.test.js
  projects.test.js
  tasks.test.js
  pages.test.js
  refs.test.js
  resources.test.js
  source_docs.test.js
  conversations.test.js
  messages.test.js
  agents.test.js
  tags.test.js
  links.test.js
  pending_changes.test.js
  audit.test.js
  search.test.js
  agent_auth.test.js
  validate.test.js
  errors.test.js
```

`server.js` shrinks: build `app`, mount static, mount `/api` router from `lib/api/index.js`. Drop the inline `/api/spaces` smoke route.

---

## Conventions (apply to every route task)

1. **TDD always.** Write the failing supertest test first. Run it red. Then write the route. Run it green. Commit.
2. **Route file shape:**
   ```js
   import { Router } from 'express';
   import * as repo from '../../db/repos/<name>.js';
   import { validate } from '../validate.js';
   import { asyncWrap } from '../errors.js';
   import { z } from 'zod';
   export const router = Router();
   ```
3. **No raw SQL in routes** — every data access is `repo.fn(...)`.
4. **Mutations pass `req.actor`** to the repo.
5. **Errors:** throw `new NotFoundError(...)` / `new ValidationError(...)`. The shared error middleware shapes them as `{error:{code,message,details?}}`. Use `asyncWrap` or rely on Express 5's native promise handling (already default in 5.2).
6. **Pagination:** all `GET` list endpoints accept `?limit=&offset=` via `parsePagination`. Default `limit=50`, max `200`.
7. **Status codes:** `201` for create, `200` for read/update, `204` for delete, `400` for validation errors, `401` for unauthenticated, `403` for capability deny, `404` not found, `409` for conflicts.
8. **Test file shape:** import `tests/api/helpers.js`'s `setup()` which calls `resetDb` + `migrateUp` + returns `{ app, ownerHeaders }`. Each test seeds the minimum it needs (e.g. one space) via repos, then hits the route.
9. **Commit per task** with message `feat(api): <entity> routes` or `feat(ui): <view>` etc.
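
Convention 6 could be sketched roughly as below. This is a minimal sketch, not the plan's implementation: it assumes `req.query` carries raw strings as Express provides them, the local `ValidationError` is a stand-in for the class planned in `lib/api/errors.js`, and rejecting out-of-range values (rather than clamping them) is an assumption.

```javascript
// Stand-in for the ValidationError planned in lib/api/errors.js (assumed shape).
class ValidationError extends Error {
  constructor(message, details) {
    super(message);
    this.code = 'validation_error'; // assumed code string
    this.status = 400;
    this.details = details;
  }
}

// Parse ?limit=&offset= from an Express-style request object.
function parsePagination(req, { defaultLimit = 50, max = 200 } = {}) {
  const query = req.query ?? {};
  const limit = query.limit === undefined ? defaultLimit : Number(query.limit);
  const offset = query.offset === undefined ? 0 : Number(query.offset);

  // Reject non-integers and anything outside 1..max.
  if (!Number.isInteger(limit) || limit < 1 || limit > max) {
    throw new ValidationError(`limit must be an integer between 1 and ${max}`, { limit: query.limit });
  }
  if (!Number.isInteger(offset) || offset < 0) {
    throw new ValidationError('offset must be a non-negative integer', { offset: query.offset });
  }
  return { limit, offset };
}
```

With these assumptions, `parsePagination({ query: { limit: '10' } })` returns `{ limit: 10, offset: 0 }`, and `?limit=201` raises a `ValidationError` that the shared error middleware would turn into a `400`.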

---

## Task list

### Phase A — Plumbing

#### Task 1: Error + validation + pagination plumbing

**Files:** create `lib/api/errors.js`, `lib/api/validate.js`, `lib/api/pagination.js`, `lib/api/index.js`; create `tests/api/helpers.js`, `tests/api/errors.test.js`, `tests/api/validate.test.js`.

- `errors.js`: export classes `NotFoundError`, `ValidationError`, `ConflictError`, `ForbiddenError` each with `.code` and `.status`; export `errorMiddleware(err, req, res, next)` that maps known errors to `{error:{code,message,details?}}` with correct status, logs unknowns at 500.
- `validate.js`: `validate({ body, params, query })` returns middleware that runs the relevant zod schemas, on parse failure throws `ValidationError` with zod's `error.issues` as `details`.
- `pagination.js`: `parsePagination(req, { defaultLimit=50, max=200 })` → `{ limit, offset }`, throws `ValidationError` if out of range.
- `lib/api/index.js`: exports `mountApi(app)` that mounts an `/api` router (initially empty) under `ownerOnly`. We'll register each route module here in later tasks.
- Update `server.js` to call `mountApi(app)` and remove the inline `/api/spaces` route. The existing server smoke test must keep passing — it should now route through the new `spaces` router (added in Task 2). Until then, expect that test to break — fix it in Task 2.

Tests: validate.test exercises happy + zod failure; errors.test exercises the middleware status mapping and JSON shape.

Commit: `feat(api): error + validate + pagination plumbing`.
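
One possible shape for `errors.js` is sketched below, purely as an illustration. The `ApiError` base class, the `code` strings, and the use of `console.error` in place of the project's pino logger are all assumptions, not something the plan specifies.

```javascript
// Hypothetical base class so the middleware can recognise "known" API errors.
class ApiError extends Error {
  constructor(status, code, message, details) {
    super(message);
    this.status = status;
    this.code = code;
    this.details = details;
  }
}

class NotFoundError extends ApiError {
  constructor(message = 'Not found', details) { super(404, 'not_found', message, details); }
}
class ValidationError extends ApiError {
  constructor(message = 'Validation failed', details) { super(400, 'validation_error', message, details); }
}
class ConflictError extends ApiError {
  constructor(message = 'Conflict', details) { super(409, 'conflict', message, details); }
}
class ForbiddenError extends ApiError {
  constructor(message = 'Forbidden', details) { super(403, 'forbidden', message, details); }
}

// Express treats a four-argument function as an error handler.
function errorMiddleware(err, req, res, next) {
  if (err instanceof ApiError) {
    const error = { code: err.code, message: err.message };
    if (err.details !== undefined) error.details = err.details;
    return res.status(err.status).json({ error });
  }
  // Unknown error: log it and avoid leaking internals to the client.
  console.error(err);
  return res.status(500).json({ error: { code: 'internal_error', message: 'Internal server error' } });
}
```

Keeping the status and code on the error object means routes only have to `throw`, and the JSON envelope is built in one place.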

#### Task 2: Spaces routes

**Files:** create `lib/api/routes/spaces.js`, `tests/api/spaces.test.js`. Register router in `lib/api/index.js`.

Endpoints:
- `GET /api/spaces` → `repo.list()`
- `POST /api/spaces` body `{slug,name,description?,theme?}` → `repo.create(body, req.actor)`; `201`
- `GET /api/spaces/:id` → `repo.getById`; 404 if missing
- `GET /api/spaces/by-slug/:slug` → `repo.getBySlug`; 404 if missing
- `PATCH /api/spaces/:id` partial body → `repo.update`
- `DELETE /api/spaces/:id` → `repo.del`; `204`

Tests: list empty → `[]`; create → 201 + record exists in DB; create with bad slug → 400 + zod details; get unknown id → 404; patch updates; delete then get → 404. Re-enable existing `tests/server.test.js` expectations (the `[]` smoke should now serve from this router).

Commit: `feat(api): spaces routes`.

#### Task 3: Projects routes

**Files:** `lib/api/routes/projects.js`, `tests/api/projects.test.js`.

Endpoints:
- `GET /api/spaces/:space_id/projects?status=` → `repo.listBySpace`
- `POST /api/spaces/:space_id/projects` body `{slug,name,description?,status?,started_at?}`
- `GET /api/projects/:id`
- `PATCH /api/projects/:id`
- `DELETE /api/projects/:id`

Tests: list filter by status; create rejects unknown space (FK error → map to 400 with code `invalid_space`); patch flips status to `done` (does not auto-set `completed_at` — that's a client/UI concern, mirroring repo behavior). Verify `completed_at` only changes when caller passes it.

Commit: `feat(api): projects routes`.
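
The FK-to-400 mapping in the Task 3 tests might look something like the sketch below. It assumes the repo surfaces the raw database error with the Postgres SQLSTATE on `err.code` (as the `pg` driver does); `23503` is Postgres's `foreign_key_violation` code. The helper name `mapProjectCreateError` is hypothetical.

```javascript
// Postgres SQLSTATE for foreign_key_violation.
const FOREIGN_KEY_VIOLATION = '23503';

// Hypothetical helper: translate a FK failure on project create into a 400-style error.
// Any other error is returned unchanged so the shared middleware can handle it.
function mapProjectCreateError(err) {
  if (err && err.code === FOREIGN_KEY_VIOLATION) {
    const mapped = new Error('space_id does not reference an existing space');
    mapped.status = 400;
    mapped.code = 'invalid_space';
    return mapped;
  }
  return err;
}
```

A route could then wrap the repo call in `try { ... } catch (err) { throw mapProjectCreateError(err); }`, leaving the status and JSON shape to the error middleware.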

#### Task 4: Tasks routes

**Files:** `lib/api/routes/tasks.js`, `tests/api/tasks.test.js`.

Endpoints:
- `GET /api/spaces/:space_id/tasks?status=` → `repo.listBySpace`
- `GET /api/projects/:project_id/tasks` → `repo.listByProject`
- `POST /api/spaces/:space_id/tasks` body `{project_id?,title,body?,priority?,due_at?,position?}`
- `GET /api/tasks/:id`
- `PATCH /api/tasks/:id`
- `DELETE /api/tasks/:id`

Tests: create sibling task (no project_id); create child task (project_id present); listByProject ordered by `position NULLS LAST, created_at`; patch with `status:'done'` sets `completed_at`.

Commit: `feat(api): tasks routes`.

#### Task 5: Pages routes (+ revisions, backlinks)

**Files:** `lib/api/routes/pages.js`, `tests/api/pages.test.js`.

Endpoints:
- `GET /api/spaces/:space_id/pages` → `repo.listBySpace`
- `POST /api/spaces/:space_id/pages` body `{slug,title,body_md?,parent_id?}`
- `GET /api/pages/:id`
- `GET /api/spaces/:space_id/pages/by-slug/:slug`
- `PATCH /api/pages/:id`
- `DELETE /api/pages/:id`
- `GET /api/pages/:id/revisions` → `repo.listRevisions`
- `GET /api/pages/:id/backlinks` → `links.listTo('page', id)` joined with the source entity's title for display (route does the join via repos: read each from_type/from_id and resolve title)

Tests: create with body_md writes a revision; update body_md adds a revision; revisions ordered desc; backlinks returns rows when an `entity_links` row points at the page.

Commit: `feat(api): pages routes`.

#### Task 6: Refs routes

**Files:** `lib/api/routes/refs.js`, `tests/api/refs.test.js`.

Endpoints:
- `GET /api/refs?space_id=&kind=&limit=&offset=` → `repo.list`
- `POST /api/refs` body matches `repo.create` (all `FIELDS` whitelist)
- `GET /api/refs/:id`
- `PATCH /api/refs/:id`
- `DELETE /api/refs/:id`
- `POST /api/refs/upsert` body must include `space_id+source_kind+external_id` → `repo.upsertByExternal`

Tests: list with `kind=url` filter; upsert twice with same external_id returns the same row id with updated fields; pagination caps at 200.

Commit: `feat(api): refs routes`.

#### Task 7: Resources routes (+ deps)

**Files:** `lib/api/routes/resources.js`, `tests/api/resources.test.js`.

Endpoints:
- `GET /api/spaces/:space_id/resources` → `repo.listBySpace`
- `POST /api/spaces/:space_id/resources` body matches resource FIELDS
- `GET /api/resources/:id`
- `PATCH /api/resources/:id`
- `DELETE /api/resources/:id`
- `POST /api/resources/:id/dependencies` body `{depends_on, kind?}` → `repo.addDependency`; 400 on self-dep
- `GET /api/resources/:id/dependencies` → `repo.listDependencies`
- `DELETE /api/resources/:id/dependencies/:dep_id` → `repo.removeDependency`
- `GET /api/resources/:id/source-docs` → `source_docs.listByResource`
- `GET /api/resources/:id/changes` → `audit.listForEntity('resource', id)` — the resource change history is the audit log filtered to that resource

Tests: dependency create rejects self; cross-space dependency rejected by composite FK → mapped to 400 with `cross_space` code; listing dependencies returns the right rows; changes endpoint returns audit entries (create + each patch).

Commit: `feat(api): resources routes`.

#### Task 8: Source docs routes

**Files:** `lib/api/routes/source_docs.js`, `tests/api/source_docs.test.js`.

Endpoints:
- `POST /api/resources/:resource_id/source-docs` body matches source_docs FIELDS (minus resource_id, taken from URL)
- `GET /api/source-docs/:id`
- `PATCH /api/source-docs/:id`
- `DELETE /api/source-docs/:id`
- `POST /api/source-docs/:id/resync` — stub for now: returns `202 {queued:true, note:"workers land in Plan 3"}`. Behind a feature flag check `if (process.env.ENABLE_RESYNC === 'true')` → 503 otherwise. Document in the route comment that this hooks into `pg-boss` in Plan 3.

Tests: create requires resource_id from URL; resync returns 202/503 based on env.

Commit: `feat(api): source-docs routes`.
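
The resync stub could be as small as the sketch below. Passing the environment in as an argument (instead of reading `process.env` directly) is an assumption made here so the flag is easy to flip in tests, and the `resync_disabled` code string is hypothetical.

```javascript
// Build the POST /api/source-docs/:id/resync handler.
// `env` is injected (hypothetical design choice); in the app it would be process.env.
function makeResyncHandler(env) {
  return function resync(req, res) {
    if (env.ENABLE_RESYNC !== 'true') {
      return res.status(503).json({
        error: { code: 'resync_disabled', message: 'Resync is not enabled' },
      });
    }
    // Stub only: real queueing hooks into pg-boss in Plan 3.
    return res.status(202).json({ queued: true, note: 'workers land in Plan 3' });
  };
}
```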

### Phase B — Agents + auth

#### Task 9: Agent bearer auth middleware

**Files:** create `lib/api/middleware/agent_auth.js`, `tests/api/agent_auth.test.js`. Modify `lib/api/index.js`.

`agent_auth.js` exports `agentOrOwner(req, res, next)`:
1. Read `Authorization: Bearer <token>` (401 if absent).
2. If token equals `OWNER_TOKEN` → `req.actor = { kind:'user', id:null }`; next().
3. Else `agents.verifyToken(token)`:
   - null → 401.
   - row → `req.actor = { kind:'agent', id:row.id, capabilities:row.capabilities, scopes:row.scopes }`; next().

`lib/api/index.js`: swap `ownerOnly` for `agentOrOwner` on the `/api` router. Owner tests continue to pass (same token path). New agent token tests pass.

Tests: missing header → 401; wrong token → 401; owner token → 200 + actor.kind='user'; valid agent token → 200 + actor.kind='agent'; revoked agent token → 401.

Commit: `feat(api): agent bearer auth`.
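
The three steps of `agentOrOwner` can be sketched as follows. This is an illustration under assumptions: the `makeAgentOrOwner` factory and its injected `ownerToken` / `verifyToken` parameters are hypothetical (the plan only names `agentOrOwner` and `agents.verifyToken`), and `verifyToken` is assumed to resolve to a row or `null`.

```javascript
// Factory so the owner token and token lookup can be injected (hypothetical parameters).
function makeAgentOrOwner({ ownerToken, verifyToken }) {
  const unauthenticated = (res, message) =>
    res.status(401).json({ error: { code: 'unauthenticated', message } });

  return async function agentOrOwner(req, res, next) {
    // Step 1: require an "Authorization: Bearer <token>" header.
    const header = (req.headers && req.headers.authorization) || '';
    const match = /^Bearer\s+(\S+)$/i.exec(header);
    if (!match) return unauthenticated(res, 'Missing bearer token');
    const token = match[1];

    // Step 2: the owner token maps to the user actor.
    if (token === ownerToken) {
      req.actor = { kind: 'user', id: null };
      return next();
    }

    // Step 3: otherwise treat it as an agent token.
    const row = await verifyToken(token);
    if (!row) return unauthenticated(res, 'Invalid or revoked token');
    req.actor = {
      kind: 'agent',
      id: row.id,
      capabilities: row.capabilities,
      scopes: row.scopes,
    };
    return next();
  };
}
```

The plain `===` comparison keeps the sketch short; a real implementation may prefer a constant-time comparison such as Node's `crypto.timingSafeEqual` for the owner token.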

#### Task 10: Capability enforcement on mutating routes

**Files:** modify `lib/auth/capability.js` if needed; add helper `lib/api/cap.js`; add tests `tests/api/capability_routes.test.js`.

Add `requireWrite(entity_type)` middleware that calls `canAct(req.actor, 'write', entity_type)`:
- `allow` → next().
- `suggest` → divert: write the operation into `pending_changes` instead of running it, return `202 {pending:true, change_id}`. The handler still needs to know what payload to record. Strategy: middleware attaches a `req.divertToPending(payloadFactory)` helper the route calls just before invoking the repo. If `req.actor.kind === 'agent'` and tier is `suggest`, route does `await pending_changes.create({agent_id, entity_type, entity_id, action, payload, reason})` and returns 202.
- `deny` → 403.

For Plan 2, apply to: `POST/PATCH/DELETE` on `pages`, `refs`, `resources`, `source_docs`, `projects`, `tasks`, `tags`, `links`. (Read endpoints stay open to any authed agent.) `agents` writes are **owner-only** regardless (a hard `req.actor.kind === 'user'` check, 403 otherwise).

Tests: agent at `suggest` tier POSTing a page → 202 + pending row exists; agent at `allow` tier POSTing a page → 201 + page row exists; agent at `deny` tier → 403; agent attempting `POST /api/agents` → 403.

Commit: `feat(api): capability enforcement on writes`.
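
A rough sketch of the `requireWrite` strategy described above follows. It assumes `canAct` returns one of `'allow'`, `'suggest'`, or `'deny'` (synchronously or as a promise), and that `divertToPending` resolves to `true` when it has already answered with a 202; that boolean contract and the `makeRequireWrite` factory are assumptions for illustration.

```javascript
// canAct and pendingChanges are injected here; in the app they would come from
// lib/auth/capability.js and the pending_changes repo.
function makeRequireWrite({ canAct, pendingChanges }) {
  return function requireWrite(entityType) {
    return async function (req, res, next) {
      const tier = await canAct(req.actor, 'write', entityType);
      if (tier === 'deny') {
        return res.status(403).json({
          error: { code: 'forbidden', message: `write on ${entityType} denied` },
        });
      }

      // The route calls this just before touching the repo. It returns true
      // when the write was recorded as a pending change and a 202 was sent.
      req.divertToPending = async (payloadFactory) => {
        if (tier !== 'suggest' || req.actor.kind !== 'agent') return false;
        const change = await pendingChanges.create({
          agent_id: req.actor.id,
          entity_type: entityType,
          ...payloadFactory(), // { entity_id, action, payload, reason }
        });
        res.status(202).json({ pending: true, change_id: change.id });
        return true;
      };
      return next();
    };
  };
}
```

A route would then start with something like `if (await req.divertToPending(() => ({ entity_id: null, action: 'create', payload: req.body }))) return;` before calling the repo.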

#### Task 11: Agents routes

**Files:** `lib/api/routes/agents.js`, `tests/api/agents.test.js`.

All endpoints owner-only (see Task 10 rule). Endpoints:
- `GET /api/agents` → `repo.list`
- `POST /api/agents` body matches FIELDS → `repo.create`
- `GET /api/agents/:id`
- `PATCH /api/agents/:id/capabilities` body `{capabilities, scopes}` → `repo.setCapabilities`
- `POST /api/agents/:id/tokens` body `{label?}` → `repo.createToken`; response `{id, token}` (plaintext shown **once** — document in comment)
- `DELETE /api/agent-tokens/:token_id` → `repo.revokeToken`

Tests: token mint returns plaintext; mint then auth-as-agent works; revoke then auth fails.

Commit: `feat(api): agents routes + token mgmt`.

### Phase C — Cross-cutting

#### Task 12: Conversations + messages routes

**Files:** `lib/api/routes/conversations.js`, `lib/api/routes/messages.js`, `tests/api/conversations.test.js`, `tests/api/messages.test.js`.

Conversations:
- `GET /api/conversations?limit=&offset=` → `repo.list`
- `POST /api/conversations` → `repo.create`
- `GET /api/conversations/:id`
- `PATCH /api/conversations/:id/status` body `{status}` → `repo.setStatus`
- `PATCH /api/conversations/:id/summary` body `{summary}` → `repo.setSummary`

Messages:
- `GET /api/conversations/:conversation_id/messages?limit=` → `messages.listByConversation`
- `POST /api/conversations/:conversation_id/messages` body `{role,body,agent_id?,metadata?}` → `messages.append`

Tests: append message, list returns it ordered; setSummary flips status to `summarized`.

Commit: `feat(api): conversations + messages routes`.

#### Task 13: Tags routes

**Files:** `lib/api/routes/tags.js`, `tests/api/tags.test.js`.

Endpoints:
- `GET /api/tags` → `tags.list`
- `POST /api/tags` body `{name, description?, color?}` → `tags.upsert`
- `POST /api/<entity_type>/:entity_id/tags` body `{tag_id}` → `tags.attach`; 204
- `DELETE /api/<entity_type>/:entity_id/tags/:tag_id` → `tags.detach`; 204
- `GET /api/<entity_type>/:entity_id/tags` → `tags.listForEntity`

Allow `entity_type` values: `space|project|task|page|ref|resource|source_doc|conversation`. Validate via zod enum.

Tests: upsert idempotent; attach idempotent on conflict; listForEntity returns tags sorted by name.

Commit: `feat(api): tags routes`.

#### Task 14: Entity links routes

**Files:** `lib/api/routes/links.js`, `tests/api/links.test.js`.

Endpoints:
- `POST /api/links` body `{from_type,from_id,to_type,to_id,relation?}` → `links.create`
- `GET /api/links/from/:type/:id` → `links.listFrom`
- `GET /api/links/to/:type/:id` → `links.listTo`
- `DELETE /api/links/:id` → `links.remove`

Tests: create twice with same tuple returns same row (ON CONFLICT path); list from/to.

Commit: `feat(api): links routes`.

#### Task 15: Pending-changes + audit routes

**Files:** `lib/api/routes/pending_changes.js`, `lib/api/routes/audit.js`, `tests/api/pending_changes.test.js`, `tests/api/audit.test.js`.

Pending changes (owner-only):
- `GET /api/pending-changes?limit=` → `pending_changes.listPending`
- `POST /api/pending-changes/:id/approve` → load row; dispatch by `entity_type+action` through the same repo (`pages.create`, `refs.update`, etc.) using `req.actor` (the approving user); mark row `approved`. Single dispatch helper `applyPendingChange(row, actor)` lives in `lib/api/routes/pending_changes.js` and uses a small switch table mapping `entity_type` → repo module.
- `POST /api/pending-changes/:id/reject` → mark `rejected`.

Audit (owner-only):
- `GET /api/audit/entity/:type/:id?limit=` → `audit.listForEntity`
- `GET /api/audit/actor?actor_kind=&actor_id=&limit=` → `audit.listByActor`

Tests: agent at `suggest` POSTs a page (from Task 10) → owner approves → page now exists, pending row is `approved`, audit log shows the create with `actor_kind='user'` (the approver). Reject test marks row `rejected`, no entity created.

Commit: `feat(api): pending-changes + audit routes`.
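
The `applyPendingChange(row, actor)` dispatch could be sketched along these lines. The action names (`create`, `update`, `delete`), the argument order of `repo.update` and `repo.del`, and the `makeApplyPendingChange` factory that takes the repo table as a parameter are all assumptions; only `repo.create(body, actor)` is spelled out in the plan.

```javascript
// `repos` maps entity_type → repo module, e.g. { page: pagesRepo, ref: refsRepo }.
function makeApplyPendingChange(repos) {
  return async function applyPendingChange(row, actor) {
    // Own-property check so keys like "constructor" never resolve to a repo.
    if (!Object.prototype.hasOwnProperty.call(repos, row.entity_type)) {
      throw new Error(`no repo registered for entity_type "${row.entity_type}"`);
    }
    const repo = repos[row.entity_type];

    // The approving user is passed as the actor, so the audit log records them.
    switch (row.action) {
      case 'create':
        return repo.create(row.payload, actor);
      case 'update':
        return repo.update(row.entity_id, row.payload, actor);
      case 'delete':
        return repo.del(row.entity_id, actor);
      default:
        throw new Error(`unsupported action "${row.action}"`);
    }
  };
}
```

Routing approvals through the same repo functions as direct writes means validation and audit logging stay in one code path.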

### Phase D — Search

#### Task 16: FTS search endpoint

**Files:** create `lib/db/repos/search.js`, `lib/api/routes/search.js`, `tests/repos/search.test.js`, `tests/api/search.test.js`.

`search.fts({q, space_id?, kinds?, limit, offset})` runs `tsvector @@ plainto_tsquery` against four sources unioned with a `kind` discriminator:
- `pages` — fts column already exists in migration 002 (`fts_tsv`); fall back to `to_tsvector('english', title || ' ' || coalesce(body_md,''))` if not.
- `refs` — uses `refs.fts_tsv` from migration 002.
- `source_docs` — `to_tsvector('english', name || ' ' || coalesce(body_text,''))`.
- `messages` — uses `messages.fts_tsv` from migration 004.

Each branch returns `{kind, id, space_id, title_or_snippet, rank}`. Final SELECT orders by `ts_rank` desc and applies `limit/offset`. `kinds` filter restricts which branches run.

Endpoint:
- `GET /api/search?q=&space_id=&kinds=page,ref&limit=&offset=` → results grouped client-side; server returns flat array with `kind` discriminator.

Tests (repo): seed 1 page + 1 ref + 1 source_doc + 1 message containing the word "blackflame", search for "blackflame" returns 4 hits, kinds filter narrows correctly. Tests (api): 401 without auth, 200 with results.

**Vector search and RRF are explicitly deferred to Plan 3** — add a TODO comment in `search.js` linking to the spec section.

Commit: `feat(api): unified FTS search`.
|
||||||
|
|
||||||
|
### Phase E — Void UI shell
|
||||||
|
|
||||||
|
#### Task 17: Static serving + shell HTML/CSS + SPA bootstrap
|
||||||
|
|
||||||
|
**Files:** create `public/index.html`, `public/style.css`, `public/app.js`, `public/router.js`, `public/api.js`. Modify `server.js` to `app.use(express.static('public'))` BEFORE the `/api` mount and ABOVE the 404 catch-all.
|
||||||
|
|
||||||
|
`index.html`: three-column flex layout: `<aside id="sidebar">`, `<main id="main">`, `<aside id="rightrail">`. Header bar `<header id="topbar">`. Loads `app.js` as `<script type="module">`.
|
||||||
|
|
||||||
|
`style.css`: blackflame palette — copy variables from Void 1.x (`/project/src/void/public/css/` if accessible) or define minimum:
|
||||||
|
```css
:root {
  --bg: #0a0a0e;
  --panel: #14141c;
  --border: #2a2a36;
  --text: #e8e6ed;
  --muted: #888094;
  --accent: #ff4f2e; /* blackflame */
  --accent-dim: #7a2716;
}
```
|
||||||
|
Three-column grid: `grid-template-columns: 280px 1fr 360px`. Right rail collapsible to 40px. Top bar 48px height. Use system font stack + Cinzel or Cormorant via Google Fonts link for headings (Cradle aesthetic). Document the choice with a comment.
|
||||||
|
|
||||||
|
`router.js`: hash-based router (`#/space/:id`, `#/project/:id`, `#/page/:id`, `#/ref/:id`, `#/resource/:id`, `#/search?q=`, `#/inbox`, `#/sacred-valley`, `#/`). Exports `route(handler)`, `navigate(hash)`, `current()`.
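
A possible shape for the matching step inside `router.js`, as a sketch: the route patterns come from the list above, while `matchRoute` and its return shape are hypothetical and not something the plan prescribes.

```javascript
// Route patterns taken from the plan; `:id` marks a path parameter.
const ROUTES = [
  '#/space/:id', '#/project/:id', '#/page/:id', '#/ref/:id',
  '#/resource/:id', '#/search', '#/inbox', '#/sacred-valley', '#/',
];

// Returns { pattern, params, query } for a known hash, or null.
function matchRoute(hash) {
  const full = hash || '#/';
  const qIndex = full.indexOf('?');
  const path = qIndex === -1 ? full : full.slice(0, qIndex);
  const queryString = qIndex === -1 ? '' : full.slice(qIndex + 1);
  const query = Object.fromEntries(new URLSearchParams(queryString));
  const segments = path.split('/');

  for (const pattern of ROUTES) {
    const parts = pattern.split('/');
    if (parts.length !== segments.length) continue;
    const params = {};
    const matches = parts.every((part, i) => {
      if (part.startsWith(':')) {
        if (!segments[i]) return false; // empty id never matches
        params[part.slice(1)] = decodeURIComponent(segments[i]);
        return true;
      }
      return part === segments[i];
    });
    if (matches) return { pattern, params, query };
  }
  return null;
}
```

`route(handler)` could then call `matchRoute(location.hash)` on every `hashchange` event and hand the result to the registered view.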
|
||||||
|
|
||||||
|
`api.js`: thin fetch wrapper that reads owner token from `localStorage.void_token`. On 401, prompts for token via a modal. Methods: `api.get`, `api.post`, `api.patch`, `api.del`.
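
One way the wrapper could look, sketched with injected dependencies so it runs outside a browser. `createApi`, `fetchImpl`, `storage`, and `onUnauthorized` are hypothetical names; in the browser they would be `fetch`, `localStorage`, and the token modal.

```javascript
// Sketch of the fetch wrapper. Dependencies are injected (an assumption,
// not in the plan) so the same code is testable in Node.
function createApi({ fetchImpl, storage, onUnauthorized }) {
  async function request(method, path, body) {
    const token = storage.getItem('void_token') || '';
    const headers = { Authorization: `Bearer ${token}` };
    const init = { method, headers };
    if (body !== undefined) {
      headers['Content-Type'] = 'application/json';
      init.body = JSON.stringify(body);
    }
    const res = await fetchImpl(path, init);
    if (res.status === 401) {
      onUnauthorized(); // e.g. open the token modal
      throw new Error('unauthorized');
    }
    return res.status === 204 ? null : res.json();
  }

  return {
    get: (path) => request('GET', path),
    post: (path, body) => request('POST', path, body),
    patch: (path, body) => request('PATCH', path, body),
    del: (path) => request('DELETE', path),
  };
}
```

In `public/api.js` this would be wired once, for example `createApi({ fetchImpl: fetch.bind(window), storage: localStorage, onUnauthorized: showTokenModal })`, where `showTokenModal` is again a placeholder name.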
|
||||||
|
|
||||||
|
`app.js`: bootstrap — mount sidebar + topbar + rightrail, register all views via router (most as `views/home.js` stub initially), render current route.
|
||||||
|
|
||||||
|
Tests: smoke `tests/server.test.js` — `GET /` returns 200 + content-type `text/html`. (Don't deep-test the SPA here.)
|
||||||
|
|
||||||
|
Commit: `feat(ui): static shell + router + api wrapper`.
|
||||||
|
|
||||||
|
#### Task 18: Sidebar + topbar components
|
||||||
|
|
||||||
|
**Files:** `public/components/sidebar.js`, `public/components/topbar.js`, `public/components/rightrail.js`.
|
||||||
|
|
||||||
|
`sidebar.js`:
|
||||||
|
- Top section: Spaces tree — `api.get('/api/spaces')`, render as collapsible list. Each space header expands to show projects (`api.get('/api/spaces/:id/projects')` lazy on click). Drag-reorder deferred — render in static order for now.
|
||||||
|
- Bottom section: global links — Sacred Valley, Agents, Inbox (badge with pending count from `api.get('/api/pending-changes')` length), Resources (placeholder route), Search.
|
||||||
|
|
||||||
|
`topbar.js`:
|
||||||
|
- Universal capture button (placeholder: opens a modal that says "Capture lands in Plan 3" — proves the surface).
|
||||||
|
- Global search input (Enter → `router.navigate('#/search?q=' + encoded)`).
|
||||||
|
- Pending bell with count from same poll the sidebar uses (share state via a tiny `public/state.js` event bus).
|
||||||
|
- User/agent toggle (placeholder; owner-only for now).
|
||||||
|
|
||||||
|
`rightrail.js`: collapsible panel with a "Chat lands in Plan 5" placeholder + collapse toggle. Persist collapsed state in `localStorage.void_rail_collapsed`.
|
||||||
|
|
||||||
|
Tests: none — visual review only. Note this in the task block.
|
||||||
|
|
||||||
|
Commit: `feat(ui): sidebar + topbar + rightrail components`.
|
||||||
|
|
||||||
|
#### Task 19: Space view + Project view + Home
|
||||||
|
|
||||||
|
**Files:** `public/views/space.js`, `public/views/project.js`, `public/views/home.js`.
|
||||||
|
|
||||||
|
`home.js`: "Recent activity" — calls `api.get('/api/audit/actor?limit=20')` and renders entity-typed rows linking to their detail view.
|
||||||
|
|
||||||
|
`space.js`: header (name, description), three columns: Projects list, Recent tasks (`api.get('/api/spaces/:id/tasks?status=todo')`), Recent refs/pages. Each item is a hash-link.
|
||||||
|
|
||||||
|
`project.js`: header (name, status, started/completed), Tasks list with inline status toggle (PATCH on click), Pages list, Refs list, "Add task" inline form.
|
||||||
|
|
||||||
|
Tests: none (visual). After this task, manually verify: create a Space via curl, navigate UI, click through.
|
||||||
|
|
||||||
|
Commit: `feat(ui): space + project + home views`.
|
||||||
|
|
||||||
|
#### Task 20: Page editor + Reference detail
|
||||||
|
|
||||||
|
**Files:** `public/views/page.js`, `public/views/reference.js`, `public/components/markdown_editor.js`, `public/vendor/marked.min.js` (vendor from npm `marked` package).
|
||||||
|
|
||||||
|
`markdown_editor.js`: split pane — textarea on left, rendered preview on right via `marked.parse(value)`. Save button calls `api.patch('/api/pages/:id', {body_md})`. Show last revision timestamp.
|
||||||
|
|
||||||
|
`page.js`: header (title, slug), markdown editor, attachments list (read-only for now), backlinks panel calling `/api/pages/:id/backlinks`.
|
||||||
|
|
||||||
|
`reference.js`: media block (image preview if `kind=image`, embed if `kind=video`, link if `kind=url`), AI summary block (`ref.summary`), metadata table, tag list with attach/detach controls, linked-from list (`/api/links/to/ref/:id`).
|
||||||
|
|
||||||
|
Tests: none (visual). Manually: create a page via API, edit + save in UI, confirm revision count increments via API.
|
||||||
|
|
||||||
|
Commit: `feat(ui): page editor + reference detail`.
|
||||||
|
|
||||||
|
#### Task 21: Resource detail + Inbox
|
||||||
|
|
||||||
|
**Files:** `public/views/resource.js`, `public/views/inbox.js`.
|
||||||
|
|
||||||
|
`resource.js`: status header with runtime_type/host/url/status badge, dependencies list (with add via a small inline form), Source Docs list, runbook pages list (entity_links of kind `runbook`), change history (`/api/resources/:id/changes`).
|
||||||
|
|
||||||
|
`inbox.js`: list of pending changes grouped by agent, each item shows entity type icon, action, reason, JSON diff (preformatted), Approve and Reject buttons that call the respective endpoints and re-fetch. On approve, navigate to the resulting entity if the response carries `entity_id`.
|
||||||
|
|
||||||
|
Tests: none (visual). Manually create a fake pending change via API and approve it through the UI.
|
||||||
|
|
||||||
|
Commit: `feat(ui): resource + inbox views`.
|
||||||
|
|
||||||
|
#### Task 22: Search + Sacred Valley placeholder + version bump + CHANGELOG
|
||||||
|
|
||||||
|
**Files:** `public/views/search.js`, `public/views/sacred_valley.js`, `CHANGELOG.md`, `package.json`, `lib/db/migrate.js` (if version constant lives elsewhere — check first).
|
||||||
|
|
||||||
|
`search.js`: reads `?q` from hash, calls `/api/search?q=...`, renders results grouped by `kind`, sidebar filters for `kinds` and `space_id`. Empty state suggests typing.
|
||||||
|
|
||||||
|
`sacred_valley.js`: one placeholder card "Sacred Valley — widgets ported in Plan 6" with a screenshot of the Void 1.x dashboard linked. Keep the route registered so the sidebar link works.
|
||||||
|
|
||||||
|
Bump `package.json` version to `2.0.0-alpha.2`; same in `server.js` `VERSION` constant.
|
||||||
|
|
||||||
|
Add CHANGELOG entry under `## [2.0.0-alpha.2] — 2026-MM-DD` listing all 14 route groups, FTS search, UI shell, sidebar/topbar/rightrail, six views, agent bearer auth + capability dispatch on writes.
|
||||||
|
|
||||||
|
Tests: update `tests/server.test.js` version assertion to match. Re-run full suite — should still be green.
|
||||||
|
|
||||||
|
Commit: `chore: version 2.0.0-alpha.2 + changelog`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Verification at end of plan
|
||||||
|
|
||||||
|
```bash
cd /project/src/void-v2
npm test                      # all plan 1 tests + new api tests green
npm run migrate               # no-op
OWNER_TOKEN=test npm start &
sleep 1
curl -s localhost:3000/health
curl -s -H "Authorization: Bearer test" localhost:3000/api/spaces
curl -s -H "Authorization: Bearer test" \
  -H "Content-Type: application/json" \
  -d '{"slug":"home","name":"Home"}' \
  localhost:3000/api/spaces   # 201
# browser: http://localhost:3000/ → SPA loads, set token, see Home space
kill %1
```
|
||||||
|
|
||||||
|
Manual UI smoke (record outcome in `docs/plan-2-complete.md`):
|
||||||
|
1. Set token in modal → token persists in localStorage.
|
||||||
|
2. Create space "Home" via UI → space appears in sidebar.
|
||||||
|
3. Create project → appears in space view.
|
||||||
|
4. Create page → editor opens, edit + save → revision count increments via API check.
|
||||||
|
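5. Navigate `#/search?q=home` → results render; a hit requires a page, ref, source doc, or message containing "home", because spaces are not one of the four FTS sources in Task 16.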
|
||||||
|
6. Create pending-change via API → bell shows badge → approve via Inbox → entity created.
|
||||||
|
|
||||||
|
## What's left after Plan 2
|
||||||
|
|
||||||
|
- Plan 3: capture pipeline (pg-boss, Karakeep webhook, URL/YouTube/PDF/image workers, embeddings)
|
||||||
|
- Plan 4: heavy ingest (Whisper, Tesseract OCR) and `void-workers` Python service
|
||||||
|
- Plan 5: MCP server (stdio + HTTP/SSE) + Cradle agent runtime + right-rail chat
|
||||||
|
- Plan 6: Sacred Valley widgets ported into UI
|
||||||
|
- Plan 7: Void 1.x / BookStack / Karakeep / auto-memory migrations
|
||||||
|
- Plan 8: E2E Playwright + CI + tighten security follow-ups (drop SUPERUSER, fileParallelism revisit, polymorphic space_id question)
|
||||||
2430
docs/superpowers/plans/2026-06-01-void-v2-plan3-capture.md
Normal file
File diff suppressed because it is too large
656
docs/superpowers/specs/2026-05-31-void-v2-design.md
Normal file
@@ -0,0 +1,656 @@
|
|||||||
|
# Void 2.0 — Homelab Orchestrator & Knowledge Foundation
|
||||||
|
|
||||||
|
**Status:** IN PROGRESS — brainstorming, not yet a complete design
|
||||||
|
**Started:** 2026-05-31
|
||||||
|
**Owner:** mrhynesy@gmail.com
|
||||||
|
|
||||||
|
> This document is being filled in section by section as brainstorming progresses.
|
||||||
|
> Sections below marked `[locked]` are user-approved decisions. Sections marked
|
||||||
|
> `[pending]` are the remaining design work to complete before this becomes a
|
||||||
|
> proper spec ready for the writing-plans skill.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Vision [locked]
|
||||||
|
|
||||||
|
Replace the current scattered homelab state (Void dashboard, Karakeep bookmarks,
|
||||||
|
BookStack wiki, `/root/.claude/plans/*.md`, auto-memory entries, ad-hoc browser
|
||||||
|
tab groups) with a single **Void 2.0** — a homelab orchestrator that:
|
||||||
|
|
||||||
|
- Acts as the canonical home for projects, tasks, knowledge, and deployed-resource
|
||||||
|
state
|
||||||
|
- Ingests websites, videos, PDFs, screenshots, and files into a unified library
|
||||||
|
- Mirrors upstream documentation locally for offline + agent access
|
||||||
|
- Surfaces all of it to Claude and local AI agents via MCP, with per-agent
|
||||||
|
permission tiers
|
||||||
|
- Preserves the Void's Cradle-themed aesthetic and agent personas
|
||||||
|
- Stays available during planned host maintenance via `pct migrate`
|
||||||
|
(no automatic failover)
|
||||||
|
- Maintains privacy + security with selective remote access
|
||||||
|
|
||||||
|
Primary capture pain being solved: **"multiple grouped Chrome tabs as a poor
|
||||||
|
project-management substitute."** Void 2.0 makes that proper.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Direction & HA Shape [locked]
|
||||||
|
|
||||||
|
**Chosen direction:** Foundation-first Void 2.0 (Option 2 from initial framing).
|
||||||
|
Not an evolution of Void — a clean rebuild with Void as the visible UI on top.
|
||||||
|
|
||||||
|
**HA model:** Planned-maintenance only. User instructs the stack before host
|
||||||
|
shutdown; Proxmox live-migrates the LXCs to another node (~10-60s pause). No
|
||||||
|
automatic failover, no quorum, no clustering complexity.
|
||||||
|
|
||||||
|
**Infrastructure:**
|
||||||
|
|
||||||
|
| LXC | Purpose | Stateful? |
|---|---|---|
| `void2-db` | Postgres + pgvector | Yes (the canonical store) |
| `void2-app` | Node API + Python workers + Void UI + cron | No (data in `void2-db`) |
|
||||||
|
|
||||||
|
Future-improvements list (parked):
|
||||||
|
- Build own bookmark capture front-end to replace Karakeep
|
||||||
|
- Extract MCP server to its own LXC if it grows independent
|
||||||
|
- True clustering if "instant failover" becomes a need
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Entity Map [locked]
|
||||||
|
|
||||||
|
| Entity | Lives in | Contains / links to |
|---|---|---|
| **Space** | top-level | Projects, Tasks, Pages, Refs, SourceDocs, Conversations, Resources |
| **Project** | a Space | Tasks (children); has-many Pages, Refs, SourceDocs, Conversations, Resources |
| **Task** | a Space, optionally also a Project | Pages, Refs, Conversations |
| **Page** (authored) | tagged | backlinks, attachments; your notes + AI-assisted commentary |
| **Reference** (captured) | tagged | source URL, local snapshot, metadata; websites/videos/PDFs/files/images |
| **Source Doc** (mirrored upstream) | bound to a Resource | version, last-synced, sync source; official docs from publisher |
| **Conversation** | attaches to Space/Project/Task/Resource | Messages; first-class, multi-agent |
| **Resource** (deployed service, rich) | a Space | dependencies, credentials refs, source docs, runbook pages, change history, monitoring config |
|
||||||
|
|
||||||
|
**Relationships are explicit, not implied.** Any entity can attach to any other
|
||||||
|
via typed links (`project_pages`, `task_refs`, `resource_source_docs`, etc.).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Capture Pipeline [locked — day-one inputs]
|
||||||
|
|
||||||
|
Day-one capture inputs:
|
||||||
|
1. **URLs / bookmarks** — Karakeep stays as inbox; webhook flows new bookmarks
|
||||||
|
into Void 2.0 as References (with AI-suggested Project/Space tagging)
|
||||||
|
2. **YouTube / web videos** — `yt-dlp` for metadata + transcript; local Whisper
|
||||||
|
if no transcript; AI summary + chapters via Ollama
|
||||||
|
3. **PDFs / documents** — text extract or Tesseract OCR; AI summary; full text
|
||||||
|
indexed
|
||||||
|
4. **Screenshots / images** — Tesseract OCR; AI summary
|
||||||
|
5. **Generic files** — blob storage on host; indexed by name + tags
|
||||||
|
|
||||||
|
All AI summarization runs against local Ollama (CT 102).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Agent Model [locked]
|
||||||
|
|
||||||
|
**Per-agent capability tiers.** Each AI agent (Claude, Mercy, Orthos, Dross,
|
||||||
|
Eithan, Lindon, Yerin, Little Blue, future agents) has its own capability record.
|
||||||
|
|
||||||
|
- **Default for all agents:** `read` + `suggest`. Agents can search/read
|
||||||
|
anything. Writes are *drafts* in a "pending changes" inbox the user approves.
|
||||||
|
- **Promotable per agent:** `write` capability, scoped (e.g., Mercy gets
|
||||||
|
write-on-Pages but not Resources)
|
||||||
|
- **Audit log:** every agent action recorded with `agent_id` + timestamp + diff
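
The tiering above could reduce to a single decision function on the write path. This is a sketch under assumptions: the JSON shapes of `capabilities` (array of strings) and `scopes` (map from capability to entity types) are invented for illustration, as is the name `dispatchWrite`.

```javascript
// Decide what happens to an agent's write attempt.
// Assumed shapes (not confirmed by the spec):
//   agent.capabilities: ['read', 'suggest', 'write']
//   agent.scopes:       { write: ['page'] }
function dispatchWrite(agent, entityType) {
  const capabilities = agent.capabilities || [];
  const writeScope = (agent.scopes && agent.scopes.write) || [];

  if (capabilities.includes('write') && writeScope.includes(entityType)) {
    return 'apply'; // direct write, still audit-logged
  }
  if (capabilities.includes('suggest')) {
    return 'pending'; // lands in the pending-changes inbox
  }
  return 'deny';
}
```

With this shape, an agent promoted to write Pages still falls back to `pending` for Resources, which matches the Mercy example.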
|
||||||
|
|
||||||
|
MCP surface exposes Void 2.0 to Claude Code, Open WebUI, OpenClaw, and future
|
||||||
|
agents through the same interface.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Build Approach [locked]
|
||||||
|
|
||||||
|
**Approach A — Greenfield modular monolith.**
|
||||||
|
|
||||||
|
- New repo at `/project/src/void-v2`
|
||||||
|
- Two processes on `void2-app` LXC:
|
||||||
|
- **`void-server`** (Node) — REST API + MCP + Void UI + cron + light ingest
|
||||||
|
(Karakeep webhook)
|
||||||
|
- **`void-workers`** (Python) — heavy ML ingest: yt-dlp, Whisper, Tesseract,
|
||||||
|
PDF extract, embeddings via Ollama
|
||||||
|
- Postgres + pgvector on `void2-db` LXC
|
||||||
|
- Copy across from current Void (without inheriting its structure): agent
|
||||||
|
persona files, blackflame theme CSS, Cradle naming, cron task list, schema
|
||||||
|
YAMLs as initial Resource seed data
|
||||||
|
- Old Void on CT 301 keeps running until cutover; then archived
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Architecture Details [locked]
|
||||||
|
|
||||||
|
### Two processes, one job queue, strict boundaries
|
||||||
|
|
||||||
|
**`void-server` (Node)** owns: HTTP API, MCP server, Void UI, cron, agent
|
||||||
|
runtime, light ingest (Karakeep webhook, manual paste). Internal layout:
|
||||||
|
|
||||||
|
```
lib/
  db/       Postgres pool, migrations, repos/ (one file per entity)
  api/      HTTP routes (thin: just call repos)
  mcp/      MCP server, tool definitions, per-agent capability checks
  ingest/   Karakeep webhook, manual capture
  jobs/     Enqueue heavy work for workers (pg-boss client)
  cron/     Scheduler + one file per task
  agents/   Cradle persona runtime (Claude subprocess + Ollama via Mastra)
```
|
||||||
|
|
||||||
|
**Boundary rule:** HTTP and MCP both reach data only via `repos/`. No raw SQL in
|
||||||
|
routes. Same repos enforce per-agent capability checks. This is what makes any
|
||||||
|
later extraction (e.g., MCP as its own service) painless.
|
||||||
|
|
||||||
|
**`void-workers` (Python)** owns heavy ML ingest. One worker per kind:
|
||||||
|
`video.py` (yt-dlp + Whisper), `pdf.py` (pdftotext / Tesseract), `image.py`
|
||||||
|
(Tesseract), `file.py` (blob + indexing), `sourcedoc.py` (mirror upstream docs).
|
||||||
|
They poll the job queue, claim work, write results to DB.
|
||||||
|
|
||||||
|
### Job queue: pg-boss
|
||||||
|
|
||||||
|
Postgres-backed, Node + Python clients. We don't add Redis/RabbitMQ — the DB is
|
||||||
|
already there. Failed jobs retry with backoff, then land in a dead-letter table.
|
||||||
|
|
||||||
|
**Redis rejected** — Postgres-on-local-LXC is sub-millisecond for indexed
|
||||||
|
queries; the bottlenecks in Void 2.0 will be Ollama/Whisper/OCR (seconds–minutes),
|
||||||
|
not the DB. Adding Redis would buy invisible perf wins at the cost of cache
|
||||||
|
invalidation complexity and another LXC to manage. Reconsider only if profiling
|
||||||
|
shows a specific bottleneck.
|
||||||
|
|
||||||
|
### Caching, if needed
|
||||||
|
|
||||||
|
- **In-process LRU** (JS `Map` with size cap) inside `void-server` for hot
|
||||||
|
lookups. Zero ops cost.
|
||||||
|
- **`pg LISTEN/NOTIFY`** for real-time UI updates (transcription progress, etc.)
|
||||||
|
if/when we want them. Built into Postgres — no extra service.
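
A minimal sketch of the in-process LRU mentioned above, relying on the fact that a JS `Map` iterates keys in insertion order. The class name `LruCache` and its API are illustrative only.

```javascript
// Size-capped LRU on top of Map: re-inserting a key on every read or
// write moves it to the "most recent" end; the first key is the oldest.
class LruCache {
  constructor(maxEntries) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError('maxEntries must be a positive integer');
    }
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value); // mark as most recently used
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey); // evict least recently used
    }
  }
}
```

There is no TTL here; entries leave only by eviction, so callers that cache mutable rows would need to delete keys on write.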
|
||||||
|
|
||||||
|
### Cron
|
||||||
|
|
||||||
|
Lives only in `void-server` (single process — no leader election needed).
|
||||||
|
Light tasks run in-process; heavy tasks enqueue worker jobs.
|
||||||
|
|
||||||
|
### Audit log
|
||||||
|
|
||||||
|
Append-only. Every mutating call (HTTP, MCP, cron, worker) writes one row:
|
||||||
|
`actor_kind`, `actor_id`, `entity_type`, `entity_id`, `action`, `diff`,
|
||||||
|
`occurred_at`. Powers: pending-changes inbox for agent drafts, Resource change
|
||||||
|
history, "who did what when" forensics.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Schema [locked]
|
||||||
|
|
||||||
|
All ids `uuid` (`gen_random_uuid()`). All entities have `created_at` /
|
||||||
|
`updated_at`. Vector columns are `vector(1024)` everywhere — embeddings from
|
||||||
|
`nomic-embed-text` (768 dims) padded with zeros so model swap to a 1024-dim
|
||||||
|
model is a re-embed pass, not a migration. Slugs unique per-Space.
|
||||||
|
Single implicit user for now; audit columns store `actor_kind` + `actor_id` so
|
||||||
|
multi-user is a non-breaking later migration.
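
The zero-padding step could look like the sketch below. `padEmbedding` and `toPgVector` are hypothetical helper names; the bracketed text form is the literal format pgvector accepts for `vector` values.

```javascript
// Pad a shorter embedding (e.g. 768 dims) with trailing zeros so it fits
// a vector(1024) column. Trailing zeros leave dot products and L2 norms
// unchanged, so cosine similarity between padded vectors is preserved.
function padEmbedding(vector, targetDims = 1024) {
  if (vector.length > targetDims) {
    throw new RangeError(`embedding has ${vector.length} dims, max ${targetDims}`);
  }
  return [...vector, ...new Array(targetDims - vector.length).fill(0)];
}

// Serialise to pgvector's text form, e.g. '[0.1,0.2,0]'.
function toPgVector(vector) {
  return `[${vector.join(',')}]`;
}
```

Padded and natively 1024-dim vectors should not be compared with each other, which is why a model swap still needs the re-embed pass.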
|
||||||
|
|
||||||
|
### Core entity tables
|
||||||
|
|
||||||
|
| Table | Key columns |
|---|---|
| `spaces` | slug, name, description, theme |
| `projects` | space_id, slug, name, status, started_at, completed_at |
| `tasks` | space_id, project_id (nullable), title, body, status, priority, due_at, position |
| `pages` | space_id, slug, title, body_md, body_html, parent_id, embedding |
| `page_revisions` | page_id, body_md, edited_by, created_at |
| `refs` | space_id, kind (`url\|video\|pdf\|image\|file`), source_url, title, summary, body_text, blob_path, metadata, embedding, source_kind, external_id |
| `source_docs` | resource_id, name, upstream_url, version, format, sync_source, local_path, last_synced, embedding |
| `resources` | space_id, slug, name, runtime_type (`lxc\|vm\|docker\|bare-metal`), host, url, version, status, monitoring (jsonb) |
| `resource_dependencies` | resource_id, depends_on, kind |
| `resource_credentials` | resource_id, label, vault_path, kind, notes |
| `conversations` | title, agent_id, participants, summary, embedding |
| `messages` | conversation_id, role, agent_id, body, metadata |
| `agents` | slug, name, kind, model, persona_path, capabilities (jsonb), scopes (jsonb) |
|
||||||
|
|
||||||
|
### Cross-cutting tables
|
||||||
|
|
||||||
|
| Table | Purpose |
|---|---|
| `tags` | normalized tag list (name, description, color) |
| `entity_tags` | (entity_type, entity_id, tag_id); polymorphic tagging |
| `entity_links` | (from_type, from_id, to_type, to_id, relation); any-to-any linkage |
| `attachments` | (entity_type, entity_id, filename, mime_type, blob_path, checksum) |
| `audit_log` | append-only mutation history |
| `pending_changes` | agent draft inbox awaiting approval |
| `pg-boss` tables | managed by the queue lib |
|
||||||
|
|
||||||
|
### Default lifecycle states
|
||||||
|
|
||||||
|
- Project: `idea | active | paused | done | abandoned`
|
||||||
|
- Task: `todo | doing | blocked | done`
|
||||||
|
- Resource: `running | stopped | down | unknown`
|
||||||
|
|
||||||
|
(State transitions and automation defined in the Status section, later.)
|
||||||
|
|
||||||
|
### Search strategy
|
||||||
|
|
||||||
|
- **Full-text** — Postgres `tsvector` + GIN on `pages.body_md`,
|
||||||
|
`refs.title+summary+body_text`, `source_docs.body_text`, `messages.body`.
|
||||||
|
One query, all knowledge types.
|
||||||
|
- **Semantic** — pgvector HNSW indexes on `pages.embedding`, `refs.embedding`,
|
||||||
|
`source_docs.embedding`, `conversations.embedding`. Embeddings generated by
|
||||||
|
Ollama at write time, async via worker.
|
||||||
|
- **Combined** — search API does FTS + vector in parallel, fuses with
|
||||||
|
reciprocal-rank fusion. Filters by Space, Project, tags, kind.
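
Reciprocal-rank fusion itself is small enough to sketch. Each result scores `1 / (k + rank)` per list it appears in, and the scores are summed. The constant `k = 60` is the commonly used default, not a value the spec fixes, and `reciprocalRankFusion` is an illustrative name.

```javascript
// Fuse several ranked id lists (best first) into one ranking.
// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) || 0) + contribution);
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}
```

Because only ranks are used, the FTS `ts_rank` values and the vector distances never need to be put on a common scale.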
|
||||||
|
|
||||||
|
### Key design decisions
|
||||||
|
|
||||||
|
1. **Polymorphic links over dedicated junction tables** — one `entity_links`
|
||||||
|
table instead of ~20 pairwise junctions. Loses Postgres-enforced FK
|
||||||
|
integrity on polymorphic columns; pays back in flexibility. Periodic
|
||||||
|
integrity-check query catches orphans.
|
||||||
|
2. **Audit log is the only mutation history** — no per-entity history tables.
|
||||||
|
Powers pending-changes inbox, Resource change history, and forensics from
|
||||||
|
one mechanism.
|
||||||
|
3. **`page_revisions` is the exception** — full markdown snapshots, not diffs.
|
||||||
|
Disk is cheap; debugging a corrupted page from a 12-step diff chain is not.
|
||||||
|
4. **JSONB for variable shape** — `metadata` columns on `refs` (kind-specific),
|
||||||
|
`resources` (monitoring config), `agents` (capabilities, scopes). Add fields
|
||||||
|
without migrations.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## API Surface [locked]
|
||||||
|
|
||||||
|
### REST (Void UI ↔ void-server)
|
||||||
|
|
||||||
|
Standard CRUD per entity under `/api/`, JSON in/out, errors as
|
||||||
|
`{error: {code, message, details}}`. Pagination via `?limit=&offset=`.
|
||||||
|
|
||||||
|
Endpoint groups: spaces, projects, tasks, pages (+ revisions, backlinks),
|
||||||
|
refs, source_docs (+ resync), resources (+ dependencies, changes),
|
||||||
|
conversations (+ messages), agents, search (unified FTS + vector with RRF),
|
||||||
|
tags, links, pending-changes (approve/reject), audit, capture
|
||||||
|
(karakeep webhook, manual url, file upload, youtube), jobs (observability).
|
||||||
|
|
||||||
|
**Auth:** Bearer token. Single owner token for the Void UI. Per-agent tokens in
|
||||||
|
a separate `agent_tokens` table (hashed). Audit log records `actor_kind` +
|
||||||
|
`actor_id` on every mutation.
|
||||||
|
|
||||||
|
### MCP (AI agents ↔ void-server)
|
||||||
|
|
||||||
|
Smaller, task-oriented surface — not full CRUD. Tools enforce per-agent
|
||||||
|
capabilities; default-tier agents get writes routed to `pending_changes`.
|
||||||
|
|
||||||
|
Initial tools:
|
||||||
|
`void.search`, `void.get_entity`, `void.list_projects`, `void.list_tasks`,
|
||||||
|
`void.related`, `void.read_conversation`, `void.resource_status`,
|
||||||
|
`void.draft_page`, `void.draft_task`, `void.draft_ref`,
|
||||||
|
`void.append_journal`, `void.suggest_link`, `void.update_entity`.
|
||||||
|
|
||||||
|
**Transport:** both stdio (for Claude Code spawned subprocess) and HTTP/SSE
|
||||||
|
(for Open WebUI, OpenClaw, remote agents). Same tool definitions, two
|
||||||
|
transports. Capability checks happen in tool handlers, which call the same
|
||||||
|
`repos/` as REST — one source of truth, two front doors.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Capture Workers [locked]
|
||||||
|
|
||||||
|
### Job kinds (one Python module per kind)
|
||||||
|
|
||||||
|
`ingest.karakeep`, `ingest.url`, `ingest.youtube`, `ingest.video`,
|
||||||
|
`ingest.pdf`, `ingest.image`, `ingest.file`, `sync.source_doc`, `embed.text`,
|
||||||
|
`summarize.conversation`.
|
||||||
|
|
||||||
|
### Job lifecycle
|
||||||
|
|
||||||
|
```
queued → claimed → running → done
                      ↘ failed → retry (exp backoff: 10s, 60s, 5m) → dead-letter
```
|
||||||
|
|
||||||
|
Workers atomically claim via pg-boss, validate input, check idempotency,
|
||||||
|
do work, write results in a transaction (entity row + audit log + downstream
|
||||||
|
enqueues), mark done. Transient errors retry; permanent errors dead-letter
|
||||||
|
immediately.
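
The retry policy in the lifecycle diagram could be expressed as a pure function. This sketch assumes exactly three retries (10s, 60s, 5m) before dead-lettering, which is one reading of the diagram; `nextStep` and the `permanent` flag are hypothetical names.

```javascript
// Delays before retry 1, 2, and 3, in seconds (10s, 60s, 5m).
const RETRY_DELAYS_SECONDS = [10, 60, 300];

// failedAttempts: how many times the job has failed so far (>= 1).
// permanent: true when the worker classified the error as non-retryable.
function nextStep(failedAttempts, { permanent = false } = {}) {
  if (!Number.isInteger(failedAttempts) || failedAttempts < 1) {
    throw new RangeError('failedAttempts must be a positive integer');
  }
  if (permanent || failedAttempts > RETRY_DELAYS_SECONDS.length) {
    return { action: 'dead-letter' };
  }
  return {
    action: 'retry',
    delaySeconds: RETRY_DELAYS_SECONDS[failedAttempts - 1],
  };
}
```

In practice the queue library would own the scheduling; the function only documents the intended policy in a testable form.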
|
||||||
|
|
||||||
|
### Idempotency
|
||||||
|
|
||||||
|
Every job carries `idempotency_key`. For URL/Karakeep ingest:
|
||||||
|
`key = sha256(source_url + space_id)`. If a successful job with that key
|
||||||
|
exists, no-op.
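
A sketch of the key derivation using Node's built-in `crypto` module. It follows the formula literally (plain concatenation); the function name `idempotencyKey` is illustrative.

```javascript
const { createHash } = require('node:crypto');

// key = sha256(source_url + space_id), hex-encoded.
// Follows the spec's formula as written: no delimiter, no URL normalisation.
function idempotencyKey(sourceUrl, spaceId) {
  return createHash('sha256').update(sourceUrl + spaceId).digest('hex');
}
```

Because the spec does not mention URL normalisation, `https://example.com` and `https://example.com/` would produce different keys under this sketch.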
|
||||||
|
|
||||||
|
### Concurrency (per-kind queues)
|
||||||
|
|
||||||
|
| Kind | Limit | Reason |
|---|---|---|
| `ingest.youtube`, `ingest.video` | **1** | Whisper GPU-bound on A2000 6GB |
| `ingest.pdf`, `ingest.image` | 2 | Tesseract CPU-bound |
| `ingest.url`, `ingest.karakeep`, `ingest.file` | 4 | Network/disk-bound |
| `sync.source_doc` | 1 | One source at a time; don't hammer upstream |
| `embed.text`, `summarize.conversation` | 2 | Ollama-bound |
|
||||||
|
|
||||||
|
### Blob storage
|
||||||
|
|
||||||
|
Content-addressed on local disk: `/var/lib/void/blobs/<sha-prefix>/<sha>`.
|
||||||
|
Deduplicates identical files. ZFS dataset replicated to Leonardo via existing
|
||||||
|
syncoid daily. MinIO is a future option, not day-one.
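
The content-addressed layout could be computed as in this sketch. SHA-256 and a two-character prefix directory are assumptions (the spec only says `<sha-prefix>/<sha>`), and `blobPath` is a hypothetical helper name.

```javascript
const { createHash } = require('node:crypto');
const path = require('node:path');

// Map file content to its storage path: <root>/<sha-prefix>/<sha>.
// Identical content always yields the same path, which gives dedup for free.
// Assumptions: SHA-256 as the hash, first 2 hex chars as the prefix.
function blobPath(content, root = '/var/lib/void/blobs', prefixLength = 2) {
  const sha = createHash('sha256').update(content).digest('hex');
  return path.posix.join(root, sha.slice(0, prefixLength), sha);
}
```

A writer would check whether the path already exists before copying, so a re-captured file costs one hash and one `stat`.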
|
||||||
|
|
||||||
|
### Dead-letter & monitoring
|
||||||
|
|
||||||
|
pg-boss managed dead-letter table. Void UI "Jobs" panel shows pending,
|
||||||
|
running, recent completions, dead-letter with retry/delete actions.
|
||||||
|
|
||||||
|
### Downstream chaining
|
||||||
|
|
||||||
|
Finished jobs enqueue more jobs in the same transaction (e.g., source doc
|
||||||
|
sync → embed each chunk). Keeps everything resumable: if Ollama is down,
|
||||||
|
the entity saves without embedding, embed retries later.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## UI / Orchestrator Shape [locked]
|
||||||
|
|
||||||
|
### Shell
|
||||||
|
|
||||||
|
Three columns, Cradle aesthetic preserved (blackflame palette, Cradle naming).
|
||||||
|
|
||||||
|
- **Sidebar:** Spaces tree on top (collapsible, drag-to-reorder); global views
|
||||||
|
below — Sacred Valley, Agents, Inbox (pending changes with count), Resources
|
||||||
|
cross-space, full Search
|
||||||
|
- **Main pane:** context-dependent view (Space, Project, Page editor, Reference
|
||||||
|
detail, Resource detail, Search, Sacred Valley, Inbox, Conversation)
|
||||||
|
- **Right rail:** always-visible context-aware chat companion, collapsible to
|
||||||
|
slim tab. Agent scoped to current view; per-Space default agent. Drag-handle
|
||||||
|
to resize.
|
||||||
|
- **Top bar:** universal capture button (paste/drop → AI suggests Space+Project
|
||||||
|
→ confirm), global search, pending-changes bell with count, user/agent toggle
|
||||||
|
|
||||||
|
### Views (main pane)
|
||||||
|
|
||||||
|
| View | Purpose |
|---|---|
| Space | Overview of projects, tasks, refs, pages, resources in that space |
| Project | Header (status/dates), Tasks, References, Pages, Conversations, Resources |
| Page editor | Markdown editor with split preview, FTS in-page, attach upload |
| Reference detail | Media preview + AI summary + metadata + tags + linked-from |
| Resource detail | Health header + dependencies graph + Source Docs + runbook Pages + change history |
| Search | Unified FTS + vector results, grouped by type, sidebar filters |
| Sacred Valley | Current gridstack dashboard, preserved (weather, speedtest, host-perf, briefings, service health) |
| Inbox | Pending changes grouped by agent, with diff viewer + approve/reject |
| Conversation | Full-window chat when right-rail isn't enough |
|
||||||
|
|
||||||
|
### Defaults
|
||||||
|
|
||||||
|
- **Landing page:** last-viewed Space, falling back to a "Home" overview of
|
||||||
|
recent activity across all Spaces
|
||||||
|
- **Sacred Valley:** kept as a named sidebar view (not the default homepage)
|
||||||
|
- **Right-rail chat:** always visible, context-aware, collapsible
|
||||||
|
- **Capture button:** paste-anything modal → AI infers kind (URL/file/text)
|
||||||
|
→ suggests Space+Project from content + tags → user confirms or overrides
|
||||||
|
|
||||||
|
### Pending Changes Inbox
|
||||||
|
|
||||||
|
Items grouped by agent. Each shows entity-type icon + agent's reason + diff
|
||||||
|
viewer + approve/reject. Approving runs the mutation through the same repo as
|
||||||
|
a direct write would (single code path).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security & Auth [locked]
|
||||||
|
|
||||||
|
### Authentication layers
|
||||||
|
|
||||||
|
| Layer | Mechanism | Scope |
|---|---|---|
| Owner via browser/mobile | Cloudflare Access (Google IDP, restricted email) → CF Tunnel → Void 2.0 | Full owner |
| AI agents via MCP | Bearer tokens, bcrypt-hashed in `agent_tokens`. Scoped by `agents.capabilities + scopes` | Per-agent tiered |
| void2-app → void2-db | Dedicated Postgres user, limited grants, LAN-only | Service account |
| void2-app → Ollama | LAN, no auth | LAN-only |
|
||||||
|
|
||||||
|
### Remote-access boundary
|
||||||
|
|
||||||
|
| Surface | Reachable how | Behind CF Access? |
|
||||||
|
|---|---|---|
|
||||||
|
| `void.hynesy.com` (UI) | CF Tunnel | Yes — Google auth, your email |
|
||||||
|
| `mcp.void.hynesy.com` (MCP HTTP/SSE for remote agents) | CF Tunnel | Yes — CF Access Service Tokens |
|
||||||
|
| Internal MCP (Claude Code, Open WebUI on CT 103) | Direct LAN | No — local |
|
||||||
|
| Postgres | LAN-only, firewalled | n/a |

### Secrets handling

- Bootstrap secrets in `.env` files on each LXC, `chmod 600`, owned by service user
- `resource_credentials.vault_path` is a *pointer string* (`env:NAME`,
  `file:/path`, or future `vault:id`). Void 2.0 resolver reads from env or file.
  Schema unchanged if/when we swap to Vaultwarden — only the resolver changes.
- Agent tokens shown plaintext **once** at creation, then bcrypt-hashed.
- No secrets in audit log (per-entity redaction before write).
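
A minimal sketch of the pointer resolver described in the list above. The helper name `resolveSecret`, its signature, and the error messages are illustrative assumptions, not taken from the Void 2.0 codebase. It handles the `env:` and `file:` schemes and rejects `vault:` until a Vaultwarden-backed resolver exists:

```javascript
const fs = require('node:fs');

// Hypothetical helper: resolves a `vault_path` pointer string to its secret value.
// Supported schemes follow the spec: `env:NAME` and `file:/path`.
function resolveSecret(pointer, env = process.env) {
  const sep = pointer.indexOf(':');
  if (sep === -1) throw new Error(`Malformed vault_path pointer: ${pointer}`);
  const scheme = pointer.slice(0, sep);
  const target = pointer.slice(sep + 1);

  switch (scheme) {
    case 'env': {
      const value = env[target];
      if (value === undefined) throw new Error(`Environment variable ${target} is not set`);
      return value;
    }
    case 'file':
      // Drop the trailing newline that editors and `echo` usually append.
      return fs.readFileSync(target, 'utf8').trimEnd();
    case 'vault':
      // Reserved for the future Vaultwarden swap; only this branch would change.
      throw new Error('vault: pointers are not supported yet');
    default:
      throw new Error(`Unknown vault_path scheme: ${scheme}`);
  }
}
```

Because callers only ever see the pointer string and the resolved value, swapping the `vault` branch for a real lookup would leave `resource_credentials` rows untouched.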

### Privacy posture

- All AI inference local by default (Ollama on CT 102)
- Claude API calls cross to Anthropic — documented egress channel; PII flagging
  not in v1
- Audit log retains every mutation for forensics

### Backup posture

- ZFS daily syncoid replication of `void2-db` + blob datasets to Leonardo
- Postgres `pg_dump` cron daily (restore-test friendly, independent of ZFS)
- Encrypted ZFS datasets for any off-site replica targets later (Farm)

### Out of scope (v1)

mTLS between internal services, field-level encryption in DB, HSMs, PII
detection before LLM egress.

---

## Future Improvements (deferred)

These are intentionally **not** day-one work. Tracked so they don't get
forgotten:

- **Vaultwarden secrets store** — user explicitly asked to be reminded. Day-one
  resolver was designed so this is a swap, not a schema change. See
  [auto-memory: project_void_v2_vaultwarden_followup].
- **Own bookmark capture front-end** to replace Karakeep
- **MinIO** for blob storage (S3-compatible access from elsewhere)
- **Extract MCP** to its own LXC if it grows independently
- **True clustering / instant failover** (Patroni) if zero-downtime maintenance becomes needed
- **PII detection** before Anthropic API egress
- **Mobile-optimized capture flow** (PWA install, share-target intent on Android)
- **Local STT** (Whisper) for voice notes as a capture kind
- **RSS / email** ingest

---

## Naming & Versioning [locked]

This project is **Void 2.0** — a full remaster of the existing Void
(retroactively "Void 1.x") with the same Cradle aesthetic, expanded into a
homelab orchestrator + canonical knowledge store. "Codex" is **not** a name —
just a way we referenced the data-layer concept during brainstorming. There
is no `Codex` brand or module; the data layer is `lib/db/` / `lib/repos/`
inside `void-server`.

### Repo / process / LXC naming

- **Repo:** `/project/src/void-v2`
- **Processes:** `void-server` (Node), `void-workers` (Python)
- **LXCs during cutover:** `void2-db`, `void2-app` (the `2` suffix avoids
  clashing with current CT 301 `void`). After CT 301 retirement: rename to
  plain `void-db`, `void-app`.
- **Domains:** `void.hynesy.com` (UI), `mcp.void.hynesy.com` (MCP HTTP/SSE)
- **MCP tool prefix:** `void.search`, `void.draft_page`, etc.

### Version strategy

Semver: `MAJOR.MINOR.PATCH`.
- **2.0.0** — initial Void 2.0 release after Void 1.x retirement
- Minor bumps for added features, patch bumps for fixes
- Major bumps reserved for architecture/schema changes that require migrations

### CHANGELOG

`CHANGELOG.md` at the root of `/project/src/void-v2`, following the [Keep a
Changelog](https://keepachangelog.com) convention. Entry for **2.0.0**
captures the differences from Void 1.x at a high level (architecture, schema,
capture pipeline, agent model, naming). Subsequent releases get their own
sections. Each entry: Added / Changed / Deprecated / Removed / Fixed.

A separate `docs/VERSION_HISTORY.md` carries the **narrative** version
history — when each release happened, the headline thinking behind it,
deferred items rolled in, lessons. Lives alongside the design spec for
long-term archaeology. Each `MAJOR.x.x` release gets a section.

---

## Migration / Cutover Plan [locked]

### Existing data inventory

| Source | Location | Volume | Maps to |
|---|---|---|---|
| Void 1.x SQLite | CT 301 | wiki_pages (~25), messages, projects, conversations | Void 2.0 `pages`, `messages` (grouped into `conversations`), `projects` |
| BookStack | CT 104 MariaDB | ~17+ pages, hierarchy | `pages` (parent_id preserved); dedupe vs already-imported wiki_pages |
| Karakeep | CT 100 | bookmarks + AI summaries + tags | `refs` (kind=url), `external_id` = karakeep id |
| `/root/.claude/plans/*.md` | filesystem | 5 plan files | `pages` under each plan's Project |
| Void 1.x agent personas | `/project/src/void/characters/` | 7 agents × 3 files | `agents.persona_path` |
| Void 1.x schema YAMLs | `/project/src/void/schemas/` | 11 services | `resources` seed data + `resources.monitoring` jsonb |
| Void 1.x code (theme, cron logic) | source | selective | Reused inside `void-server` |
| Auto-memory entries | `/root/.claude/projects/-project/memory/*.md` | ~30 entries | **Mirrored** — see below |

### Migration script structure

Python migration tool in `void-workers/migrate/` with sub-commands:

```
void-migrate bookstack --source-db <conn>
void-migrate karakeep --source-db <conn>
void-migrate void1-sqlite --source-db <path>
void-migrate plans --source-dir /root/.claude/plans/
void-migrate memory --source-dir /root/.claude/projects/-project/memory/
void-migrate void1-schemas --source-dir /project/src/void/schemas/
void-migrate void1-personas --source-dir /project/src/void/characters/
```

Each command is **idempotent** — uses source IDs / file paths as `external_id`
so re-runs upsert rather than duplicate.

### Auto-memory: one-way mirror (files stay primary)

Auto-memory files remain the source-of-truth — Claude Code's harness reads them
directly across sessions. A worker mirrors them into Void 2.0 as Pages under a
"Memory" Space:

- Mirror runs on file change (inotify) and nightly as safety net
- Pages get `external_id = file path`, idempotent upsert
- Edits in Void 2.0 UI flow back to files via a `::memory-update` marker
  (same pattern Path B established)
- Auto-memory remains canonical; Void 2.0 view is searchable, MCP-readable,
  visible in the UI

### Cutover: stand up alongside, big-bang switch with grace period

1. Build Void 2.0 on new LXCs (`void2-db`, `void2-app`) without touching CT 301
2. Run migration scripts (read-only access to BookStack + Karakeep + Void 1.x DBs)
3. Verify counts + spot-check content
4. **Cutover day:** swap `void.hynesy.com` CF tunnel target from CT 301 to
   `void2-app`
5. **Grace period (30 days):** CT 301 stays read-only as fallback
6. **Retire CT 301:** snapshot, stop, rename `void2-*` LXCs to `void-*`

### Cron / scheduled task migration

Existing Void 1.x cron (Dross briefing, Yerin alerts, Little Blue heal, hourly
speedtest, Orthos council) ports directly to `void-server/lib/cron/tasks/`.
Same logic, same timing, against Void 2.0's data.

---

## Testing Approach [locked]

| Layer | Coverage | How |
|---|---|---|
| Unit | Repos, capability checks, helpers (slug gen, idempotency keys, embedding pad/truncate) | Node: vitest. Python: pytest. |
| Integration | REST + MCP tools against a test DB | Postgres-in-docker; schema applied from migrations; reset per test |
| E2E | Happy paths: create Space/Project, capture URL, search, approve pending change, attach ref | Playwright against running test instance |
| Manual (runbook'd) | Capture workers (Whisper, OCR), agent runtime (Claude subprocess + Ollama), CF Access flows | `docs/testing/manual.md` — too heavy or external for CI |
| Migration scripts | All `void-migrate` sub-commands | Fixture DBs for BookStack + Void 1.x + Karakeep; assert counts + spot-check content |

**Coverage target:** ~70% on `lib/` modules. Lower on routes/UI — covered by
integration + E2E instead. No coverage chasing.

**CI:** GitHub Actions if you mirror to a remote; local pre-push hook otherwise.
Runs unit + integration on every change to `void-server` or `void-workers`.

---

## Status / Lifecycle Model [locked]

| Entity | States | Transitions | Automation |
|---|---|---|---|
| Project | `idea`, `active`, `paused`, `done`, `abandoned` | Free (any-to-any) | None; manual |
| Task | `todo`, `doing`, `blocked`, `done` | Free | `done` sets `completed_at` |
| Resource | `running`, `stopped`, `down`, `unknown` | Auto + manual override | Health check cron updates; manual override pins until `maintenance_until` |
| Conversation | `open`, `summarized`, `archived` | Auto with overrides | `summarize.conversation` worker moves to `summarized` after 24h idle |
| Reference | `ingested`, `indexed`, `enriched` | Worker-driven | Pipeline: capture → FTS indexed → embedded + AI summary done |
| Pending Change | `pending`, `approved`, `rejected` | User-driven | None |

**Free transitions** everywhere user-facing. Homelab work is rarely linear; the
audit log captures every transition.

**Resource status reconciliation:** health check cron writes `status` and
`last_check`. Manual override during planned maintenance pins state until a
`maintenance_until` timestamp.
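
The reconciliation rule can be sketched as a pure function. This is an illustration under assumptions: the function name is hypothetical, and advancing `last_check` while the status is pinned is a guess the spec does not settle.

```javascript
// Hypothetical reconciliation step, run by the health check cron for one resource.
// `resource` mirrors the columns named above; `checkedStatus` is the probe result.
function reconcileResourceStatus(resource, checkedStatus, now = new Date()) {
  const pinned =
    resource.maintenance_until != null &&
    new Date(resource.maintenance_until) > now;

  return {
    ...resource,
    // While a maintenance window is open, the manually set status stays pinned.
    status: pinned ? resource.status : checkedStatus,
    // Assumption: the probe still ran, so the check timestamp advances either way.
    last_check: now,
  };
}
```

Once `maintenance_until` has passed, the next cron run overwrites the pinned value with whatever the probe reports, so no separate "unpin" job is needed.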

---

## Pending Sections — to complete before this is plan-ready

(All sections locked. Spec ready for user review.)

---

## Decision Log

| Date | Decision | Why |
|---|---|---|
| 2026-05-30 | Foundation-first Void 2.0 over evolve-Void | Long-term HA requirement makes single-LXC SQLite a dead end |
| 2026-05-30 | 2 LXCs, planned-migration HA | User confirmed instant failover not needed |
| 2026-05-30 | Postgres + pgvector (no separate Qdrant) | Simpler — one DB does relational + vector |
| 2026-05-30 | Three-tier Space → Project → Task with sibling tasks | Matches how user organizes; allows ad-hoc TODOs |
| 2026-05-30 | Pages + References + Source Docs as three knowledge types | Authored vs captured vs upstream-mirrored are genuinely different |
| 2026-05-30 | Conversations first-class, attach to other entities | "Create project from chat" + AI needs prior conversation context |
| 2026-05-30 | Rich Resource entity (dependencies, creds refs, change history) | User wants real orchestrator, not just inventory |
| 2026-05-30 | Keep Karakeep as bookmark inbox; webhook into Void 2.0 | Karakeep works; building own is a deferred improvement |
| 2026-05-30 | Day-one capture: URLs, videos, PDFs, images, files | Full pipeline, no half-measures |
| 2026-05-30 | Agents: read+suggest default, per-agent tiered promotion | Balance usefulness with safety |
| 2026-05-30 | Greenfield Void 2.0 (Approach A), copy valuable bits from Void | Clean break from accumulated Void shape |
| 2026-05-31 | Two-process layout (Node server + Python workers) on one LXC | Right-tool-per-job; Python for ML, Node for API/UI/cron |
| 2026-05-31 | pg-boss job queue (not Redis/RabbitMQ) | Postgres is already there; one fewer service |
| 2026-05-31 | Skip Redis cache | DB isn't the bottleneck; Ollama/Whisper/OCR are. Reconsider only if profiling shows it. |
| 2026-05-31 | Audit log is append-only, polymorphic | One mechanism for change history + agent action tracking + pending-changes inbox |
| 2026-05-31 | `vector(1024)` everywhere with zero-padding for 768-dim embeds | Model swap is a re-embed pass, not a DDL migration |
| 2026-05-31 | Polymorphic `entity_links` over ~20 pairwise junction tables | Flexibility wins at this scale; periodic integrity check covers FK gap |
| 2026-05-31 | Single implicit user; audit columns ready for multi-user later | Multi-user is a non-breaking migration if ever needed |
| 2026-05-31 | MCP exposes task-oriented tools, not raw CRUD | Smaller surface for agents = safer + clearer semantics |
| 2026-05-31 | MCP supports both stdio + HTTP/SSE | Covers Claude Code (stdio) and network agents (HTTP) without bridges |
| 2026-05-31 | pg-boss with per-kind concurrency limits | GPU/CPU/network workloads have different parallelism needs |
| 2026-05-31 | Idempotency keys on all ingest jobs | Webhook replays + manual retries shouldn't duplicate content |
| 2026-05-31 | Content-addressed blob store; ZFS replicated via syncoid | Free dedup + your existing replication covers it |
| 2026-05-31 | Whisper concurrency stays at 1 | Conservative; tune after deploy if A2000 has headroom |
| 2026-05-31 | Three-column shell (sidebar / main / right-rail chat) | Matches orchestrator + chat-with-context workflow |
| 2026-05-31 | Sacred Valley kept as sidebar view, not landing page | Frees landing for last-viewed Space; dashboard still one click away |
| 2026-05-31 | Right-rail chat always visible, context-aware | Friction-free 'ask Mercy about this' across all views |
| 2026-05-31 | Universal capture button with AI Space/Project suggestion | One capture surface for all content kinds; reduces friction over per-page add-ref |
| 2026-05-31 | CF Access on UI + MCP-HTTP; LAN-direct for internal agents | Matches owner-via-internet + agent-on-LAN access patterns |
| 2026-05-31 | Env+file vault_path resolver day-one; Vaultwarden swap later | Pragmatic start; resolver swap doesn't change schema |
| 2026-05-31 | Agent tokens bcrypt-hashed, plaintext shown once | Standard bearer-token hygiene |
| 2026-05-31 | mTLS / field-level encryption deferred from v1 | Single-trust-domain LAN homelab; ZFS-at-rest covers it for now |
| 2026-05-31 | Renamed from "Codex" to **Void 2.0** | Preserve Cradle aesthetic + naming continuity from Void 1.x |
| 2026-05-31 | CHANGELOG.md (Keep a Changelog) + VERSION_HISTORY.md (narrative) | User wants major-version comparison + readable narrative archaeology |
| 2026-05-31 | Auto-memory: one-way mirror, files stay primary | Harness keeps working; knowledge stays unified |
| 2026-05-31 | Big-bang cutover with 30-day grace period on CT 301 | Minimal complexity; safety net against forgotten data |
| 2026-05-31 | Free state transitions; audit log records every change | Homelab work is rarely linear; don't over-validate |
| 2026-05-31 | Test coverage target ~70% on lib/, manual runbook for ML/agent flows | Where automation cost exceeds value, document instead |
295
docs/superpowers/specs/2026-06-01-void-v2-plan3-capture.md
Normal file
@@ -0,0 +1,295 @@
# Void 2.0 — Plan 3 Design Spec: Capture pipeline + hybrid search

**Date:** 2026-06-01
**Builds on:** Plan 1 (Foundation, complete) and Plan 2 (API + UI shell, complete, version 2.0.0-alpha.2).
**Master spec:** `docs/superpowers/specs/2026-05-31-void-v2-design.md` — many decisions inherit from there.

## Goal

Wire the Plan 2 SPA's stub Capture button to a real ingest pipeline. Add a pg-boss-backed job queue, capture entry points (URL POST + Karakeep webhook + drag-drop attachment), a URL worker that turns links into `refs`, an embeddings worker that writes vectors into the existing `embedding` columns, and a hybrid FTS+vector search that replaces the Plan 2 FTS-only `/api/search`.

## Out of scope (Plan 4 and later)

- Whisper transcription, Tesseract OCR, yt-dlp video ingestion, scanned-PDF OCR.
- The Python `void-workers` service. Plan 3 stays single-process Node.
- AI Space/Project suggestion on capture (defer; capture takes explicit `space_id`).
- Embedding chunks table — Plan 3 uses one whole-doc embedding per entity row; chunks land later once we can measure recall on a real corpus.
- MCP server surface. Plan 5+.

## Decisions locked by brainstorm

| Question | Answer |
|---|---|
| Plan 3 slice | Node-side: pg-boss + `/api/capture` POST + Karakeep webhook + URL worker + embed.text worker + hybrid search + Jobs panel. Defers ML-heavy ingest to Plan 4. |
| Capture entry points | `/api/capture` POST + Karakeep webhook + drag-drop upload. Inbound email skipped. |
| Embedding granularity | Whole-doc per entity row. Add chunks table later. |
| Search rollout | `/api/search` replaced in-place with hybrid (FTS + vector via RRF). Vector branch degrades gracefully to FTS-only if Ollama is down or the row lacks an embedding. |
| AI Space/Project suggestion | Deferred. Capture requires `space_id`. SPA preselects the user's last-used space from `localStorage`. |
| Jobs visibility | `GET /api/jobs?state=` + `POST /api/jobs/:id/retry` + `DELETE /api/jobs/:id` + a minimal `#/jobs` SPA view (table grouped by state, 10 s polling, retry/delete per row). |
| Sequencing | Phase A → B → C → D (matches Plan 2 phasing). Each phase ends green and demoable. |

## Architecture

```
                   ┌──────────────────────────────────────────┐
                   │   void-server (CT 311, Node, single proc)│
                   │                                          │
/api/capture ───▶  │  routes/capture.js                       │
/api/ingest/       │  routes/ingest.js (Karakeep webhook)     │
  karakeep ─────▶  │      │                                   │
drag-drop ─────▶   │      ▼                                   │
                   │  jobs/queue.js (pg-boss client)          │
                   │      │                                   │
                   │      ▼                                   │
                   │  workers/ (in-process pollers)           │
                   │    ├─ url.js                             │
                   │    ├─ karakeep.js                        │
                   │    ├─ embed.js (Ollama HTTP)             │
                   │    └─ blob.js (drag-drop attachments)    │
                   │      │                                   │
                   │      ▼                                   │
                   │  lib/db/repos/ (existing) + repos/jobs.js│
                   │      │                                   │
                   └──────┼───────────────────────────────────┘
                          │
            ┌─────────────┼──────────────┐
            ▼             ▼              ▼
      ┌──────────┐  ┌──────────────┐  ┌──────────────┐
      │ Postgres │  │  Ollama      │  │  Blob FS     │
      │ (CT 310, │  │  (CT 102,    │  │  /var/lib/   │
      │ pgvector │  │   nomic-     │  │  void/blobs/ │
      │ + pgboss │  │   embed-text)│  │              │
      │  tables) │  └──────────────┘  └──────────────┘
      └──────────┘
```

**Process model.** Workers and HTTP handlers share the void-server Node process. pg-boss polls Postgres on its own interval; HTTP requests enqueue jobs and return immediately with a `job_id`. No separate worker process — that's Plan 4 when the Python service arrives.

**External dependencies.** Postgres (already there), Ollama on CT 102 at `http://192.168.1.185:11434` (running, `nomic-embed-text` pulled, 768-dim embeddings verified 2026-06-01). Graceful-degrade still applies if it goes down later. Blob storage is local FS on CT 311's root pool, content-addressed.

**No new entity tables.** refs / pages / source_docs / attachments are reused. The `embedding vector(1024)` columns exist from Plan 1 (migration 002 + 004). pg-boss creates its own schema (`pgboss.*`) on first run.

## Phase A — Queue + worker harness + Jobs API

**New files:**
- `lib/jobs/queue.js` — singleton pg-boss client; `start()`, `enqueue(name, data, opts)`, `subscribe(name, handler, opts)`.
- `lib/jobs/index.js` — registers all worker handlers on start; called from `server.js` boot.
- `lib/jobs/workers/echo.js` — trivial worker used to prove the harness. Removed at end of Phase D.
- `lib/api/routes/jobs.js` — `GET /api/jobs?state=`, `GET /api/jobs/:id`, `POST /api/jobs/:id/retry`, `DELETE /api/jobs/:id`. Owner-only.
- `tests/jobs/queue.test.js` — pg-boss roundtrip: enqueue → handler runs → result.
- `tests/api/jobs.test.js` — list/retry/delete via HTTP.

**Modify:**
- `server.js` — call `jobs.start()` on boot, `jobs.shutdown()` on SIGTERM.
- `package.json` — add `pg-boss@^10`.
- `lib/api/index.js` — mount `/api/jobs`.
- `public/router.js` + `public/app.js` + add `public/views/jobs.js` — minimal Jobs view (placeholder for now; fleshed out in Phase D).

**pg-boss config.** One pg-boss instance per process. Uses the existing `DATABASE_URL`. Default `pgboss` schema name. `newJobCheckIntervalSeconds: 2` (alpha-tier; tighten later if needed). `archiveCompletedAfterSeconds: 86_400` (1 day archive). `deleteAfterDays: 7`.

**Concurrency limits** per the master spec, surfaced via `subscribe(name, handler, {teamSize, teamConcurrency})`:

| Worker name | Team size | Reason |
|---|---|---|
| `ingest.url` | 4 | Network-bound |
| `ingest.karakeep` | 4 | Network-bound |
| `ingest.blob` | 2 | Disk + sha256 hashing |
| `embed.text` | 2 | Ollama-bound (single GPU on CT 102) |

**Retry policy.** Per-worker `retryLimit: 5`, `retryBackoff: true`, `retryDelay: 10` (seconds). Effective backoff sequence: 10 s, 20 s, 40 s, 80 s, 160 s, then dead-letter. The master spec called out 10 s / 60 s / 5 m, but pg-boss only exposes exponential backoff with a base delay; the resulting curve is close enough.
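
The sequence quoted above follows from doubling the base delay on each attempt. The sketch below reproduces it under that assumption; it is plain arithmetic with a hypothetical helper name, not a pg-boss API, and any jitter the library applies would shift the real delays.

```javascript
// Nominal delay before each retry under exponential backoff:
// delay(n) = retryDelay * 2^n for retry n = 0 .. retryLimit - 1.
function backoffSchedule(retryDelay, retryLimit) {
  return Array.from({ length: retryLimit }, (_, n) => retryDelay * 2 ** n);
}

const delays = backoffSchedule(10, 5);
const totalSeconds = delays.reduce((sum, d) => sum + d, 0);

console.log(delays);        // per-retry delays in seconds
console.log(totalSeconds);  // time from first failure to dead-letter, excluding run time
```

With `retryDelay: 10` and `retryLimit: 5` the waits sum to 310 seconds, so a job that keeps failing reaches the dead-letter state roughly five minutes after its first failure, which is in the same range as the master spec's final 5 m step.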

**Dead-letter.** pg-boss's archive table (`pgboss.archive`) keeps failed jobs. `/api/jobs?state=failed` queries it. Manual retry copies the job back to the active queue.

**Commit:** `feat(jobs): pg-boss harness + Jobs API`.

## Phase B — Capture API + URL worker + blob storage

**Capture POST.** `POST /api/capture` (owner or agent with write tier):

```json
{
  "space_id": "uuid",
  "url": "https://example.com/article",
  "hint": { "project_id": "uuid?", "title": "string?", "tags": ["string"] }
}
```

Response 202 with `{ job_id, idempotency_key, ref_id?: uuid }`. Idempotency key is `sha256(space_id + url)`. If a ref already exists for that key, the response carries the existing `ref_id` and `job_id: null` (no new job enqueued).
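
A minimal sketch of the key derivation and the two response shapes, using Node's built-in `crypto`. The function names and the `existingRefId` / `newJobId` parameters are hypothetical; only the `sha256(space_id + url)` rule and the response fields come from the text above.

```javascript
const crypto = require('node:crypto');

// sha256(space_id + url), hex-encoded, as described above.
function captureIdempotencyKey(spaceId, url) {
  return crypto.createHash('sha256').update(spaceId + url, 'utf8').digest('hex');
}

// Hypothetical response builder: reuse the existing ref if there is one,
// otherwise report the freshly enqueued job.
function captureResponse({ spaceId, url, existingRefId = null, newJobId = null }) {
  const idempotency_key = captureIdempotencyKey(spaceId, url);
  return existingRefId
    ? { job_id: null, idempotency_key, ref_id: existingRefId }
    : { job_id: newJobId, idempotency_key };
}
```

Because the key is a hash of the plain concatenation, two URLs that differ only by a trailing slash or a tracking parameter produce different keys; whether to normalise URLs before hashing is left open by this spec.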

**URL worker.** `lib/jobs/workers/url.js` for `ingest.url`:

1. Compute idempotency key. If a `refs` row already exists with `source_kind='url'` and `external_id=<key>`, return its id.
2. `fetch(url)` with `User-Agent: void-ingest/2.0` and 15 s timeout.
3. Run readability extraction (npm `@mozilla/readability` + `jsdom`). Pull `title`, `byline`, `excerpt`, `textContent`, `siteName`.
4. Insert a `refs` row: `kind='url'`, `source_url=url`, `title=readability.title`, `summary=readability.excerpt`, `body_text=readability.textContent` (truncate to 200 kB), `source_kind='url'`, `external_id=<idempotency_key>`, `metadata={ site_name, byline, content_length }`.
5. Return the ref. Embedding is handled by Phase C's repo-level trigger that wraps `refs.create`; in Phase B alone the ref simply lacks an embedding until Phase C ships.

**Drag-drop.** `POST /api/capture/upload` (multipart, owner or agent write):

- Field `file` — the binary.
- Field `space_id` — required.
- Field `meta` (json) — optional `{ title, kind, tags }`.

Multer stages uploads in `/var/lib/void/uploads-tmp/` (size cap 100 MB per file) and the worker moves the file into the content-addressed blob store on success.

Worker `ingest.blob`:

1. Stream the upload to a temp file. Hash with sha256 as it streams.
2. If `/var/lib/void/blobs/<sha-prefix>/<sha>` exists, this is a duplicate; reuse the existing path.
3. Otherwise move the temp file into place.
4. Determine `kind` from `Content-Type` / extension: `image` for image/*, `pdf` for application/pdf, `file` for everything else. Video/audio fall through to `file` in Plan 3 (Plan 4 picks them up).
5. Insert a `refs` row: `kind=<derived>`, `blob_path=<path>`, `title=filename || sha`, plus metadata.
6. The insert in step 5 goes through `refs.create`, so Phase C's trigger picks up the embed automatically. In Phase B, no embed runs.

**Blob storage.** New directory `/var/lib/void/blobs/` on CT 311, owned by `void:void`, mode 750. Layout `<first-2-chars-of-sha>/<full-sha>`. The deploy bootstrap step creates the directory. The path is already on `localzfs`, so replication picks it up.
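
The content-addressed layout and the `kind` mapping from the `ingest.blob` steps can be sketched with Node built-ins. The helper names are hypothetical, and the example hashes a `Buffer` in one call where the worker would feed the hash chunk by chunk while streaming.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

const BLOB_ROOT = '/var/lib/void/blobs';

// Layout from the spec: <first-2-chars-of-sha>/<full-sha> under the blob root.
function blobPathFor(shaHex, root = BLOB_ROOT) {
  return path.posix.join(root, shaHex.slice(0, 2), shaHex);
}

// Kind mapping from step 4 of the worker: images, PDFs, everything else.
function kindForMime(mime) {
  if (mime.startsWith('image/')) return 'image';
  if (mime === 'application/pdf') return 'pdf';
  return 'file'; // includes video/audio until Plan 4
}

// One-shot hash for the example; a streaming worker would call update() per chunk.
function sha256Hex(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('hex');
}

const sha = sha256Hex(Buffer.from('hello'));
console.log(blobPathFor(sha), kindForMime('image/png'));
```

Identical uploads hash to the same path, so the duplicate check in step 2 reduces to a single existence test on that path.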

**Files:**
- `lib/api/routes/capture.js` — both endpoints + multer config.
- `lib/jobs/workers/url.js`, `lib/jobs/workers/blob.js`.
- `lib/ingest/readability.js` — wraps `@mozilla/readability` for testability.
- `lib/ingest/blob_store.js` — sha + path resolution + write.
- `tests/api/capture.test.js`, `tests/jobs/workers/url.test.js`, `tests/jobs/workers/blob.test.js`.

**Deps to add:** `@mozilla/readability`, `jsdom`, `multer` (`pg-boss` was already added in Phase A).

**Commit:** `feat(jobs): capture API + URL + blob workers`.

## Phase C — Embeddings + hybrid search

**Ollama client.** `lib/ai/ollama.js`:

```js
// Base URL comes from the environment; the default is the CT 102 address.
const OLLAMA_URL = process.env.OLLAMA_URL || 'http://192.168.1.185:11434';

class OllamaError extends Error {
  constructor(status, body) {
    super(`Ollama responded ${status}: ${body}`);
    this.status = status; // lets callers distinguish 4xx from 5xx
  }
}

async function embedText(text, model = 'nomic-embed-text') {
  const res = await fetch(`${OLLAMA_URL}/api/embeddings`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt: text }),
    signal: AbortSignal.timeout(60_000)
  });
  if (!res.ok) throw new OllamaError(res.status, await res.text());
  const j = await res.json();
  return j.embedding; // 768-dim
}
```

`OLLAMA_URL` env var, default `http://192.168.1.185:11434`. The 768-dim vector is zero-padded to 1024 to match the `vector(1024)` column (per master spec, eases later model swap).
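
Zero-padding is safe for cosine search because trailing zeros add nothing to either the dot product or the norms. A small sketch, with hypothetical helper names, that pads a vector, formats it as a pgvector text literal, and checks that cosine similarity is unchanged:

```javascript
// Pads a shorter embedding with trailing zeros up to the column dimension.
function padEmbedding(vec, dim = 1024) {
  if (vec.length > dim) {
    throw new RangeError(`embedding has ${vec.length} dims, column holds ${dim}`);
  }
  return vec.concat(new Array(dim - vec.length).fill(0));
}

// pgvector accepts a bracketed text literal such as '[0.1,0.2,0.3]'.
function toVectorLiteral(vec) {
  return `[${vec.join(',')}]`;
}

function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / Math.sqrt(normA * normB);
}

const a = [0.2, -0.5, 0.1];
const b = [0.4, 0.3, -0.9];
console.log(cosine(a, b) === cosine(padEmbedding(a, 8), padEmbedding(b, 8)));
```

The same property means a later swap to a native 1024-dim model only needs a re-embed pass: old and new vectors share the column type, although vectors from different models should not be compared with each other.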

**Embed worker.** `embed.text` job payload `{ entity_type, entity_id }`. Worker:

1. Load the entity row.
2. Build the embedding string:
   - `page`: `${title}\n\n${body_md}`, truncated to ~6 k characters (≈ 1.5 k tokens; well under nomic's 8 k context).
   - `ref`: `${title || ''}\n${summary || ''}\n${body_text || ''}`, same truncation.
   - `source_doc`: `${name}\n${body_text || ''}`.
   - `conversation`: `${title || ''}\n${summary || ''}` — short by design; conversations get richer treatment in Plan 5.
3. Call `embedText`. On `OllamaError` or fetch timeout, throw — pg-boss retry kicks in with exponential backoff.
4. Zero-pad to 1024, UPDATE the entity's `embedding` column.
5. Emit an audit log entry `(actor_kind='worker', action='update', entity_type, entity_id, diff={embedding:'updated'})`.
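
Step 2 of the worker can be sketched as one pure function. The name `buildEmbedInput` and the exact 6000-character cap are assumptions (the spec says "~6 k"), and applying the cap to every entity type, not only `page` and `ref`, is a simplification made here.

```javascript
const MAX_EMBED_CHARS = 6000; // "~6 k characters" in the spec; exact value assumed

// Builds the text handed to the embedding model for one entity row.
function buildEmbedInput(entityType, row) {
  let text;
  switch (entityType) {
    case 'page':
      text = `${row.title}\n\n${row.body_md}`;
      break;
    case 'ref':
      text = `${row.title || ''}\n${row.summary || ''}\n${row.body_text || ''}`;
      break;
    case 'source_doc':
      text = `${row.name}\n${row.body_text || ''}`;
      break;
    case 'conversation':
      text = `${row.title || ''}\n${row.summary || ''}`;
      break;
    default:
      throw new Error(`Entity type ${entityType} is not embeddable`);
  }
  // Character-based cut; the title comes first so it always survives truncation.
  return text.slice(0, MAX_EMBED_CHARS);
}
```

Keeping this separate from the Ollama call makes the template and truncation rules testable without a network mock.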

**Re-embed triggers.** Write paths (`repo.create`, `repo.update`) for pages/refs/source_docs already exist. Add a small `lib/jobs/triggers.js` that wraps these — after a successful create/update of an embeddable entity, enqueue `embed.text` with a singleton key `${entity_type}:${entity_id}` so rapid re-edits coalesce. The trigger is called from repo level so MCP and cron paths get it too.

**Hybrid search.** Rewrite `lib/db/repos/search.js::fts` into `search.hybrid({ q, space_id?, kinds?, limit, offset })`:

1. FTS branch — current Plan 2 query unchanged, returns up to `limit * 3` results with `ts_rank`.
2. Vector branch — embed `q` via Ollama (with a 5 s timeout — search must stay snappy). For each kind, run an ANN query against the matching table's `embedding` column using HNSW (`<=>` cosine distance). Returns up to `limit * 3` per kind. If Ollama times out or errors, skip this branch entirely — log a `search.vector_skipped` event and continue with FTS-only.
3. RRF fusion — for each unique `(kind, id)`, sum `1 / (60 + rank_fts) + 1 / (60 + rank_vec)`. The `60` constant matches the canonical RRF paper. Sort by fused score descending, then slice to `[offset, offset + limit)`.
4. Vector-only rows (no FTS match) and FTS-only rows (no embedding yet) both participate; missing rank is treated as infinity, giving `1 / inf = 0` from that branch.
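
The fusion in steps 3 and 4 can be sketched as follows. The function name and the option defaults are hypothetical; ranks are taken as 1-based positions in each branch's result list, and a row missing from a branch simply contributes nothing from it.

```javascript
// Reciprocal Rank Fusion over two ranked lists of { kind, id, ... } rows.
function rrfFuse(ftsRows, vecRows, { k = 60, limit = 20, offset = 0 } = {}) {
  const fused = new Map();

  const accumulate = (rows) => {
    rows.forEach((row, index) => {
      const key = `${row.kind}:${row.id}`;
      const entry = fused.get(key) || { ...row, rank: 0 };
      entry.rank += 1 / (k + index + 1); // index + 1 is the 1-based rank
      fused.set(key, entry);
    });
  };

  accumulate(ftsRows);
  accumulate(vecRows);

  return [...fused.values()]
    .sort((a, b) => b.rank - a.rank)
    .slice(offset, offset + limit);
}

const fts = [{ kind: 'page', id: 'a' }, { kind: 'ref', id: 'b' }];
const vec = [{ kind: 'ref', id: 'b' }, { kind: 'page', id: 'c' }];
console.log(rrfFuse(fts, vec).map((r) => r.id));
```

In this example `b` appears in both branches (1/62 + 1/61) and so outranks `a` (1/61 only) and `c` (1/62 only). Passing an empty `vecRows` reproduces the FTS order, which is the behaviour wanted when the vector branch is skipped.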
|
||||||
|
|
||||||
|
Result shape unchanged: `{ kind, id, space_id, title_or_snippet, rank }`. The `rank` field now carries the fused RRF score.
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- `lib/ai/ollama.js` (new).
|
||||||
|
- `lib/jobs/workers/embed.js` (new).
|
||||||
|
- `lib/jobs/triggers.js` (new).
|
||||||
|
- `lib/db/repos/search.js` (rewrite).
|
||||||
|
- `tests/ai/ollama.test.js` — fetch mock.
|
||||||
|
- `tests/jobs/workers/embed.test.js` — fetch mock; verifies zero-pad + audit.
|
||||||
|
- `tests/repos/search.test.js` (existing) — extended with vector-fixture rows + RRF assertions.
|
||||||
|
|
||||||
|
**Embedding-test strategy.** Tests insert fixture vectors directly (no Ollama needed). One integration test under `tests/integration/embed_live.test.js` hits a real Ollama, marked `skip()` if `OLLAMA_URL` is unreachable.
|
||||||
|
|
||||||
|
**Repos that emit triggers:** pages.create, pages.update, refs.create, refs.update, refs.upsertByExternal, source_docs.create, source_docs.update. Conversation embeds are summary-only and re-fire when `setSummary` is called.
|
||||||
|
|
||||||
|
**Commit:** `feat(jobs): embed worker + hybrid search`.
|
||||||
|
|
||||||
|
## Phase D — Karakeep webhook + drag-drop UI + Jobs UI
|
||||||
|
|
||||||
|
**Karakeep webhook.** `POST /api/ingest/karakeep`. Authenticated by `X-Karakeep-Signature: sha256=<hex>` HMAC of the raw body with `KARAKEEP_WEBHOOK_SECRET` env. If the signature is missing or wrong: 401.
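
The signature check could be sketched as follows, using Node's built-in `crypto` module. The function name `verifyKarakeepSignature` is hypothetical, and the sketch assumes the route has access to the unparsed request body as a `Buffer` or string.

```javascript
const crypto = require('node:crypto');

// Returns true only when `header` is "sha256=<hex>" and the hex digest matches
// the HMAC-SHA256 of the raw body under `secret`.
function verifyKarakeepSignature(rawBody, header, secret) {
  if (typeof header !== 'string' || !header.startsWith('sha256=')) return false;
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest();
  // Malformed hex decodes to a shorter buffer, which fails the length check.
  const received = Buffer.from(header.slice('sha256='.length), 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}
```

The route would respond with 401 whenever this returns `false`. Because the HMAC covers the raw bytes, the body has to be captured before JSON parsing, for example with `express.raw()` on this route.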
|
||||||
|
|
||||||
|
Payload (Karakeep's webhook shape, normalized): `{ event, bookmark_id, tags }`.
|
||||||
|
|
||||||
|
For `event === 'bookmark.created'`:
|
||||||
|
1. Look up the existing space-mapping from env: `KARAKEEP_DEFAULT_SPACE_ID` (a UUID). Future work: per-tag space routing.
|
||||||
|
2. Enqueue `ingest.karakeep` with `{ bookmark_id, space_id }`.
|
||||||
|
|
||||||
|
`ingest.karakeep` worker:
|
||||||
|
1. Fetch the bookmark via Karakeep's API: `GET https://karakeep.hynesy.com/api/v1/bookmarks/{bookmark_id}` with `KARAKEEP_API_TOKEN`.
|
||||||
|
2. Build the same payload an `ingest.url` job would use (URL + title + tags) and call the URL handler directly. Tags propagate to the `entity_tags` table via repo.
|
||||||
|
3. If Karakeep returns 404 (bookmark deleted), mark the job done — no error.
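
A rough sketch of the fetch step with the 404 rule from step 3 is shown below. The name `fetchBookmark`, the injectable `fetchImpl`, and the `Authorization: Bearer` header format are assumptions for illustration; the plan only says the request is made with `KARAKEEP_API_TOKEN`.

```javascript
// Fetch one bookmark; resolve to null on 404 so the worker can finish without error.
// `baseUrl` would come from configuration; `fetchImpl` exists so tests can inject a mock.
async function fetchBookmark(bookmarkId, { baseUrl, token, fetchImpl = fetch }) {
  const url = `${baseUrl}/api/v1/bookmarks/${encodeURIComponent(bookmarkId)}`;
  const res = await fetchImpl(url, {
    headers: { Authorization: `Bearer ${token}` }, // header format is an assumption
  });
  if (res.status === 404) return null; // bookmark deleted: nothing to ingest
  if (!res.ok) throw new Error(`Karakeep responded with ${res.status}`); // let the job retry
  return res.json();
}
```

The worker would mark the job done when this resolves to `null` and hand the bookmark to the URL handler otherwise; any other non-2xx status throws so that the job queue's retry policy applies.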
|
||||||
|
|
||||||
|
**Drag-drop UI.** `public/components/dropzone.js` — wraps a target element, intercepts drag events, POSTs each file to `/api/capture/upload`, shows toast progress. Wire onto `<main>` so dropping anywhere in the main area works. Pre-fills `space_id` with `localStorage.last_space_id` (set when the user navigates to a space view).
|
||||||
|
|
||||||
|
**Jobs UI fill-in.** Expand `public/views/jobs.js`:
|
||||||
|
- Group rows by `state` (active / completed / failed).
|
||||||
|
- Each row: `id (8 chars)`, `name`, `state`, relative `created_at`, `last_error?`, action buttons.
|
||||||
|
- Polls `/api/jobs?state=active,failed` every 10 s.
|
||||||
|
- The retry button sends `POST /api/jobs/:id/retry`; the delete button sends `DELETE /api/jobs/:id`.
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- `lib/api/routes/ingest.js`.
|
||||||
|
- `lib/jobs/workers/karakeep.js`.
|
||||||
|
- `lib/karakeep/client.js` — thin wrapper.
|
||||||
|
- `public/components/dropzone.js`.
|
||||||
|
- `public/views/jobs.js` (expand).
|
||||||
|
- `tests/api/ingest.test.js` — HMAC check, valid/invalid signature.
|
||||||
|
- `tests/jobs/workers/karakeep.test.js` — Karakeep API mocked via fetch interceptor.
|
||||||
|
|
||||||
|
**Commit:** `feat(jobs): Karakeep webhook + drag-drop + Jobs UI`.
|
||||||
|
|
||||||
|
## Error handling & idempotency
|
||||||
|
|
||||||
|
- **Idempotency keys.** URL and Karakeep workers compute `sha256(space_id + url)` (URL) or `sha256(space_id + 'karakeep:' + bookmark_id)` (Karakeep). Stored as `refs.external_id` with `source_kind` set to `'url'` or `'karakeep'`. The unique index `idx_refs_external_unique` already enforces this from Plan 1. A duplicate ingest finds the existing ref and short-circuits.
|
||||||
|
- **Singleton embed jobs.** Embed jobs are sent with the pg-boss option `singletonKey: '${entity_type}:${entity_id}'`, so rapid edits coalesce into one pending embed. If a job is already in-flight when a new edit lands, a follow-up is enqueued.
|
||||||
|
- **Capture rate limit.** Out of scope. The `agentOrOwner` gate is enough at single-user scale.
|
||||||
|
- **Ollama down.** Embed jobs throw and retry under pg-boss backoff. After dead-letter (≈ 5 min cumulative), the entity stays without an embedding; hybrid search falls back to FTS for those rows. The operator restores Ollama, then either calls `POST /api/jobs/:id/retry` or waits for the periodic re-embed cron planned for a future phase.
|
||||||
|
- **Karakeep down.** The webhook still accepts events. The worker's jobs dead-letter, and the operator replays the tag mapping manually.
|
||||||
|
- **Blob upload partial.** Stream to a temp file; rename on success only. Failed uploads leave a temp file behind; a daily cron in Plan 4 sweeps temp files older than 24 h.
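
The key formats from the idempotency and singleton bullets above can be written out as a small sketch. The helper names are hypothetical, and hex encoding of the digest is an assumption; the plan only specifies `sha256(...)` over the concatenated strings.

```javascript
const crypto = require('node:crypto');

// Hex-encoded SHA-256 of a UTF-8 string (the encoding is an assumption).
const sha256Hex = (s) => crypto.createHash('sha256').update(s, 'utf8').digest('hex');

// Key for URL ingests: sha256(space_id + url).
function urlIdempotencyKey(spaceId, url) {
  return sha256Hex(spaceId + url);
}

// Key for Karakeep ingests: sha256(space_id + 'karakeep:' + bookmark_id).
function karakeepIdempotencyKey(spaceId, bookmarkId) {
  return sha256Hex(spaceId + 'karakeep:' + bookmarkId);
}

// Singleton key for embed jobs: '<entity_type>:<entity_id>'.
function embedSingletonKey(entityType, entityId) {
  return `${entityType}:${entityId}`;
}
```

Because the space ID is part of the hashed input, the same URL captured into two different spaces produces two distinct keys, while a repeat capture into the same space maps to the existing ref.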
|
||||||
|
|
||||||
|
## Observability
|
||||||
|
|
||||||
|
- Pino structured logs already in place. New log keys: `job_id`, `job_name`, `entity_type`, `entity_id`, `idempotency_key`, `outcome`.
|
||||||
|
- `/api/jobs` is the operator surface; the SPA Jobs view fronts it.
|
||||||
|
- pg-boss's archive table is the source of truth for completed/failed jobs; no separate audit needed for job lifecycle (the audit log captures entity-level changes the workers cause).
|
||||||
|
|
||||||
|
## Testing strategy
|
||||||
|
|
||||||
|
- **Unit:** workers and the Ollama client get unit tests with `fetch` mocked (vitest's `vi.fn`).
|
||||||
|
- **Repo:** `tests/repos/search.test.js` extended; new `tests/repos/jobs.test.js` covers `pg-boss`-backed list/retry helpers.
|
||||||
|
- **API:** capture, ingest, jobs routes via supertest. HMAC signature pass/fail. Idempotency on second capture of the same URL.
|
||||||
|
- **Integration (gated):** one test that hits real Ollama; auto-skipped if `OLLAMA_URL` is unreachable. Real pg-boss roundtrips happen inside the existing test DB, using `resetDb` and awaiting the pg-boss instance's `stop()` between suites to avoid cross-talk.
|
||||||
|
- **No new vitest config.** `fileParallelism: false` already in place from Plan 1 — pg-boss is happier serialized too.
|
||||||
|
|
||||||
|
## Migrations
|
||||||
|
|
||||||
|
- **No new SQL migrations from Void.** pg-boss creates its own schema on first `start()`.
|
||||||
|
- One-time CT 311 ops: create `/var/lib/void/blobs/` and chown `void:void`.
|
||||||
|
|
||||||
|
## Deploy delta
|
||||||
|
|
||||||
|
- `.env` adds `OLLAMA_URL`, `KARAKEEP_WEBHOOK_SECRET`, `KARAKEEP_API_TOKEN`, `KARAKEEP_API_URL`, `KARAKEEP_DEFAULT_SPACE_ID`. Documented in `deploy/README.md`.
|
||||||
|
- `deploy/push.sh` unchanged (rsync still works).
|
||||||
|
- Snapshot CT 310 + 311 before deploying Plan 3 (standing rule). The Phase A first-deploy is the "major update" — pg-boss creates new tables in the shared DB.
|
||||||
|
|
||||||
|
## Known follow-ups (not Plan 3)
|
||||||
|
|
||||||
|
- AI Space/Project suggestion on capture.
|
||||||
|
- Embedding chunks table.
|
||||||
|
- pdf-text-extract for born-digital PDFs (Plan 4 likely handles this with Tesseract too).
|
||||||
|
- Per-tag Karakeep → Space routing instead of one default space.
|
||||||
|
- Recurring re-embed cron for rows where `embedding IS NULL`.
|
||||||
|
- Real-time Jobs UI via `pg LISTEN/NOTIFY` instead of polling.
|
||||||
|
|
||||||
|
## Open items for the user
|
||||||
|
|
||||||
|
- **Karakeep secrets.** Plan 3 Phase D needs `KARAKEEP_API_TOKEN` (issued from Karakeep settings) and a chosen `KARAKEEP_DEFAULT_SPACE_ID`. Surfaceable when the phase starts.
|
||||||
|
- **The 29-day-old `knowledge_pipeline` memory** (Karakeep → Qdrant → MCP) is now superseded by Void 2.0's pgvector-only architecture. After Plan 3 ships, that memory should be marked obsolete or deleted to avoid future-me reading it as authoritative.
|
||||||