3.7 KiB
Void 2.0 — Code Review & Optimisation Notes (2026-06-01, alpha.6)
Full read-through of lib/ (~4.2k LOC), the public SPA, and the Python workers.
Overall the codebase is in good shape: ~1:1 test-to-code ratio, consistent repo +
audit pattern, clean tool registry, disciplined safe-DOM and SSRF handling. Items
below are improvements, ordered by value. Severity ≠ urgency — most are "before
scale / before more agents", not "broken now".
Resolved 2026-06-02: #1 (pool error handler), #2 (upsert-arm guard), #3 (
verifyTokenO(1) selector+verifier), and #7 (contextcolumn allow-list) are done — see the security sweep doc + their tests. Remaining open: #4 (FTS index alignment — needs a prodEXPLAIN), #5 (dedupe bearer parsing), #6 (doc/code symbol drift).
Correctness / robustness
-
pool.jshas no error handler or statement timeout.lib/db/pool.jscreates apg.Poolwith nopool.on('error', …). An error on an idle pooled client (e.g. DB restart, replication failover on the .215 cluster) emits anerrorevent with no listener → process crash. Add a logging error handler. Also considerstatement_timeout/query_timeoutso a pathological query can't pin a connection indefinitely (the pool is onlymax: 10). -
applyPendingChange'supsertarm assumes a refs-shaped repo. After today's fix,case 'upsert'callsrepo.upsertByExternal(...). OnlyrefsRepohas that method; the action only originates from the refs route so it's safe today, but it's an implicit coupling. Cheap guard: asserttypeof repo.upsertByExternal === 'function'and throw a clearValidationErrorotherwise, so a future entity emittingupsertfails loud, not with a bareTypeError.
Performance / scale
-
verifyTokenis O(n·bcrypt) per request. (Also in the security sweep.) Loads every non-revoked token and bcrypt-compares each. Fine at 1–2 agents, quadratic-feeling as agents grow. Selector+verifier split makes it O(1). This is the single highest-value optimisation once Yerin/others get their own tokens. -
Hybrid-search FTS uses inline
to_tsvector(...)expressions.search.jscomputesto_tsvector('english', title || …)at query time per branch. This only avoids a seq scan if a GIN index on the identical expression exists for each table. Verify the expression indexes match these exact expressions (incl. thecoalesce(...)shapes); a drift means silent seq scans. Worth aEXPLAINcheck on prod-sized data.
Cleanliness / maintainability
-
owner.jsandagent_auth.jsduplicate bearer parsing + owner check. Both splitAuthorization, both compareOWNER_TOKEN. Now that both usetimingSafeStrEqual, factor the "parse bearer + is-owner?" step into one helper to keep the two in lockstep. -
Doc/code name drift. Completion docs reference
divertToPending/applyPendingChange— those are now the real names (good), but earlier notes also referenced an()export. Keepdocs/plan-*-complete.mdsymbol names in sync with the code; stale symbol names cost a grep on every revisit. -
contexttoolSELECT *. Project an explicit column list (see security sweep LOW) — also makes the tool's output contract stable for the UI.
Things that are good (keep doing)
- Tool registry (
createRegistry) is a clean extension point — the newsecurityRegistryreused it with zero changes. - Audit emission + recursive redaction is consistent across repos.
- SSRF
safeFetchis genuinely well built (DNS-pin TOCTOU defeat + per-hop re-validation). - Test isolation on
void_testwith a prod-guard intests/helpers/setup.js.