Void-Homelab/docs/security-sweep-2026-06-01.md

# Void 2.0 — Security Sweep (2026-06-01, alpha.6)

Fresh security pass over the live alpha.6 surfaces. Severity is for the current
deployment context: **single owner token, owner-only authn, one suggest-tier
companion agent, LAN + Cloudflare-Access perimeter**. Several items move up in
severity once additional agents or wider exposure land.

Legend: ✅ fixed this pass · 🔧 recommended (needs your call / migration) · ℹ️ note.

---

## ✅ FIXED — Owner token compared in non-constant time  (was MEDIUM)
`lib/auth/owner.js` and `lib/api/middleware/agent_auth.js` compared the bearer
token with `===` / `!==`. String `===` short-circuits on the first differing
byte, leaking token length and prefix through response timing — over enough
samples this enables byte-by-byte recovery of `OWNER_TOKEN`.

**Fix:** new `lib/auth/timingSafeStrEqual` (constant-time via
`crypto.timingSafeEqual` with a length pre-check so it never throws on a
length mismatch). Both auth paths now use it. Tests: `tests/auth/safe_compare.test.js`.

---

## ✅ FIXED (2026-06-02) — `verifyToken` O(n) bcrypt scan
Resolved via the selector+verifier split (migration `010_token_selector.sql`).
New tokens are `vk_<selector>.<verifier>`: the non-secret `selector` is indexed
and locates exactly one row (O(1)); only the `verifier` is bcrypt-hashed. Legacy
NULL-selector tokens still verify via a fallback scan over the shrinking legacy
set. Dropped the useless `idx_agent_tokens_hash`. Tests:
`tests/repos/token_selector.test.js`. Original finding below.

## 🔧 ~~HIGH~~ — `verifyToken` does an O(n) bcrypt scan over every token
`lib/db/repos/agents.js::verifyToken` loads **all** non-revoked agent tokens and
runs `bcrypt.compare` against each (cost factor 12 ≈ 250 ms/compare).

- **Auth latency scales linearly with token count.** With N agents/tokens, every
  authenticated request pays N bcrypt comparisons.
- **DoS lever:** an attacker who can hit an authenticated endpoint with a `vk_`-
  prefixed token forces a full-table bcrypt scan per request.

**Recommended fix (needs a migration — left for your sign-off):** give each token
a non-secret lookup key. Store `token_id = first 8 chars of the random body`
(or a separate indexed `selector` column), index it, and `bcrypt.compare` exactly
the one row it points at. Keeps bcrypt's offline-cracking resistance while making
verification O(1). This is the standard "selector + verifier" split.

---

## 🔧 HIGH — `void` DB role still has SUPERUSER  (carried over)
Documented in `security-followups.md`: the `void` role was granted SUPERUSER so
the test harness could `CREATE EXTENSION`. On prod (CT 311 → CT 310 DB) this is
far more privilege than the app needs. Revoke on prod; create extensions once as
a superuser during bootstrap, then run the app as a non-superuser role.

---

## 🔧 MEDIUM — Companion subprocess inherits the full server environment
`lib/ai/claude_cli.js` clones `process.env` for the `claude` child and only
deletes `ANTHROPIC_API_KEY` / `ANTHROPIC_AUTH_TOKEN`. The child therefore also
inherits `OWNER_TOKEN`, `DB_PASS`/connection strings, and the Karakeep secrets.

Today the companion is constrained to `mcp__void__*` tools only (built-ins like
Bash/Read are stripped via `--tools`), so it has no primitive to read its own
env — contained in practice. But it is one config slip (re-enabling a built-in)
away from full secret exposure.

**Recommended (defense in depth):** pass an explicit allow-list env to the child
(HOME, PATH, the few `CLAUDE_*` / `VOID_CLAUDE_HOME` vars, and only the MCP
server's own needs) rather than the whole environment.

---

## ✅ FIXED (2026-06-02) — `context` tool `SELECT *` of the active entity
`lib/ai/agent/tools/context.js` now projects a `SAFE_COLUMNS` allow-list for
`resources` (scalar fields only — never the `monitoring`/`metadata` JSON blobs).
Also added `resource` to the tool's `TABLE` map (it was previously unhandled).
Test in `tests/ai/agent/tools/context.test.js`.

---

## ✅ Reviewed and sound (no action)
- **SSRF guard** (`lib/ingest/safe_fetch.js`): http/https-only, blocks loopback /
  RFC1918 / link-local / CGNAT(100.64) / 0.0.0.0 / IPv6 ULA+link-local+v4-mapped,
  validates **all** DNS records, pins resolved IPs into the undici dispatcher to
  defeat TOCTOU rebinding, and re-validates every redirect hop. Solid.
- **Karakeep webhook HMAC** (`lib/api/routes/ingest.js`): `timingSafeEqual` over
  the raw body, guarded against length-mismatch throw, fails closed on missing
  secret/sig. Good.
- **Audit redaction** (`lib/db/repos/audit.js`): redacts token/password/api_key/
  secret/authorization recursively at write time.
- **Capability model** (`lib/auth/capability.js`): agents default-deny; mutations
  require `write`+scope or fall to `suggest`; unknown actions deny.
- **XSS**: safe-DOM invariant in `public/dom.js`; assistant markdown only via the
  DOMPurify-sanitized path.
- **Bearer-only API** (no cookies) ⇒ CSRF not applicable.

---

## Carried-over from `security-followups.md` (design decisions, your call)
- **[HIGH]** Polymorphic `entity_tags`/`entity_links`/`attachments` lack `space_id`
  (cross-tenant linkage possible at the DB layer). Defensible for single-tenant
  alpha; gate at the REST/MCP layer until multi-tenant is load-bearing.
- **[MEDIUM]** `tags.name` UNIQUE is global, not per-space.
- **[MEDIUM]** No cascade on polymorphic parent delete.