Files
Void-Homelab/docs/code-review-2026-06-01.md

69 lines
3.7 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Void 2.0 — Code Review & Optimisation Notes (2026-06-01, alpha.6)
Full read-through of `lib/` (~4.2k LOC), the public SPA, and the Python workers.
Overall the codebase is in good shape: ~1:1 test-to-code ratio, consistent repo +
audit pattern, clean tool registry, disciplined safe-DOM and SSRF handling. Items
below are improvements, ordered by value. Severity ≠ urgency — most are "before
scale / before more agents", not "broken now".
> **Resolved 2026-06-02:** #1 (pool error handler), #2 (upsert-arm guard), #3
> (`verifyToken` O(1) selector+verifier), and #7 (`context` column allow-list)
> are **done** — see the security sweep doc + their tests. Remaining open: #4
> (FTS index alignment — needs a prod `EXPLAIN`), #5 (dedupe bearer parsing), #6
> (doc/code symbol drift).
## Correctness / robustness
1. **`pool.js` has no error handler or statement timeout.** `lib/db/pool.js`
creates a `pg.Pool` with no `pool.on('error', …)`. An error on an *idle*
pooled client (e.g. DB restart, replication failover on the .215 cluster)
emits an `error` event with no listener → **process crash**. Add a logging
error handler. Also consider `statement_timeout` / `query_timeout` so a
pathological query can't pin a connection indefinitely (the pool is only
`max: 10`).
2. **`applyPendingChange`'s `upsert` arm assumes a refs-shaped repo.** After
today's fix, `case 'upsert'` calls `repo.upsertByExternal(...)`. Only
`refsRepo` has that method; the action only originates from the refs route so
it's safe today, but it's an implicit coupling. Cheap guard: assert
`typeof repo.upsertByExternal === 'function'` and throw a clear
`ValidationError` otherwise, so a future entity emitting `upsert` fails loud,
not with a bare `TypeError`.
## Performance / scale
3. **`verifyToken` is O(n·bcrypt) per request.** (Also in the security sweep.)
Loads every non-revoked token and bcrypt-compares each. Fine at 12 agents,
quadratic-feeling as agents grow. Selector+verifier split makes it O(1). This
is the single highest-value optimisation once Yerin/others get their own tokens.
4. **Hybrid-search FTS uses inline `to_tsvector(...)` expressions.** `search.js`
computes `to_tsvector('english', title || …)` at query time per branch. This
only avoids a seq scan if a **GIN index on the identical expression** exists
for each table. Verify the expression indexes match these exact expressions
(incl. the `coalesce(...)` shapes); a drift means silent seq scans. Worth a
`EXPLAIN` check on prod-sized data.
## Cleanliness / maintainability
5. **`owner.js` and `agent_auth.js` duplicate bearer parsing + owner check.**
Both split `Authorization`, both compare `OWNER_TOKEN`. Now that both use
`timingSafeStrEqual`, factor the "parse bearer + is-owner?" step into one
helper to keep the two in lockstep.
6. **Doc/code name drift.** Completion docs reference `divertToPending` /
`applyPendingChange` — those are now the real names (good), but earlier notes
also referenced a `n()` export. Keep `docs/plan-*-complete.md` symbol names in
sync with the code; stale symbol names cost a grep on every revisit.
7. **`context` tool `SELECT *`.** Project an explicit column list (see security
sweep LOW) — also makes the tool's output contract stable for the UI.
## Things that are good (keep doing)
- Tool registry (`createRegistry`) is a clean extension point — the new
`securityRegistry` reused it with zero changes.
- Audit emission + recursive redaction is consistent across repos.
- SSRF `safeFetch` is genuinely well built (DNS-pin TOCTOU defeat + per-hop
re-validation).
- Test isolation on `void_test` with a prod-guard in `tests/helpers/setup.js`.