docs(devices): spec for LAN device discovery (MAC inventory + review/name)

Persistent MAC-keyed lan_devices store fed by an hourly arp-scan on CT 311;
diffs new vs known, mirrors the services discovered→promote flow for naming/
editing. Upsert-by-MAC keeps the table bounded. Borrows decoupled-scanner +
MAC-identity lessons from scanopy.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
root
2026-06-08 20:44:19 +10:00
parent 0866459b23
commit d513ca8fa4

@@ -0,0 +1,156 @@
# Design: LAN Device Discovery (MAC inventory + review/name)
**Date:** 2026-06-08
**Status:** Approved (brainstorm), pending implementation plan
**Repo:** void-v2 (CT 311 / `void-app`)
## Summary
Replace the static, hand-maintained `public/devices.json` with a **persistent,
MAC-keyed device store** fed by a recurring ARP scan. Each scan **logs MACs to
the DB and diffs against what's known** — new devices land in a review queue;
known devices just get their IP / `last_seen` / presence updated. The owner can
**add a discovered device, edit it, and give it a name** for reference (mirrors
the Void's existing services "discovered → promote" pattern).
## Background
- **Static today:** `public/views/devices_band.js` does `fetch('/devices.json')`
— a curated, manually-edited list (IP/MAC/vendor/group/flag). It re-reads a
static file; nothing is persisted or diffed.
- **Existing precedent to mirror:** `monitored_services` uses
`source='discovered' AND NOT enabled` as a review queue; `PATCH /services/:id`
promotes + edits. We reproduce this shape for devices.
- **Separate from `network_hosts`:** that table is the homelab-guest inventory
(Proxmox `BC:24:11:*` LXCs, infra_audit). The devices band is IoT / personal /
unknown LAN gear — kept separate.
- **Scan engine:** the Void host (CT 311) has `ip`/`arp` but **not**
`nmap`/`arp-scan`. We add `arp-scan` (chosen for reliable L2 ARP sweeps that
ICMP-blocking devices can't dodge, plus a built-in OUI vendor DB).
### Lessons borrowed from Scanopy (self-hosted discovery tool)
- **Decouple scanner from storage/UI** — the scanner just scans and reports; the
server owns dedup + persistence. → isolated `lib/infra/scan.js`.
- **MAC is the identity, IP is a mutable attribute** — key on MAC, update IP each
scan (handles DHCP churn). → `mac` primary key.
- **Scheduled rescans + timestamp inventory** — periodic batch with
`first_seen`/`last_seen`/`present`, diff by "MAC seen before?". → hourly cron.
- **Vendor via OUI** — `arp-scan` ships an OUI database; vendor is free.
- **Randomized MACs are an open problem** even for Scanopy — so we at least
**flag** locally-administered MACs so the user knows OUI can't ID them.
## Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Scan engine | **`arp-scan --localnet` on CT 311**, hourly cron | Reliable L2 sweep + built-in OUI; self-contained (no external scanner dep). |
| Cadence | **Hourly** (staggered, e.g. `7 * * * *`) | "No rush"; device drift is slow. |
| DB growth | **Upsert by MAC — one row per device, no per-scan history** | Table is bounded by distinct devices ever seen (dozens to hundreds), not scan count → no bloat. |
| Identity | **MAC primary key**; IP a mutable column | Survives DHCP IP changes. |
| Review flow | Mirror services `discovered → promote` | New MAC → `status='new'`; owner names/edits → `status='known'`. |
| Source of truth | **DB** (`lan_devices`); `devices.json` becomes the one-time migration seed, then removed | Single source of truth. |
## Architecture
scan (arp-scan) → `parseArpScan` (+randomized flag) → `upsertScan` by MAC →
`markAbsent` for unseen → review queue (`status='new'`) → owner names/groups/promotes
→ known devices render in the band.
## Components
### Migration `024_lan_devices.sql`
Table `lan_devices`:
- `mac text PRIMARY KEY`
- `ip text`, `vendor text`
- `name text` (owner-given reference name, null until named)
- `grp text` (Smart Home | Entertainment | Personal | Network | Flagged)
- `note text`
- `status text NOT NULL DEFAULT 'new'` (`new` | `known` | `ignored`)
- `randomized boolean NOT NULL DEFAULT false` (locally-administered MAC)
- `flagged boolean NOT NULL DEFAULT false`
- `first_seen timestamptz NOT NULL DEFAULT now()`
- `last_seen timestamptz NOT NULL DEFAULT now()`
- `present boolean NOT NULL DEFAULT true`
**Seed (embedded SQL, from the current curated `devices.json`):**
- Devices **with a MAC**: non-flagged → `status='known'` with their name/group;
flagged (e.g. `.15` ASUS) → `status='new'`, `flagged=true`.
- The `.13` Orbi satellite and `.171` Galaxy Tab S4 fixes carry over as `known`.
- MAC-less curated entries (`.21/.22/.34/.35/.51`, currently offline) are **not
seeded** — they reappear as `new` (with a real MAC) the first time they're seen
online. (Documented so it's expected, not a gap.)
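
The seed rules above can be sketched as a small mapping function. This is an illustration only: the `devices.json` field names (`ip`, `mac`, `vendor`, `name`, `group`, `flag`) and the helper name `toSeedRows` are assumptions, and the spec embeds the seed as SQL rather than generating it at runtime. Whether flagged entries keep their curated name/group is also an assumption here.

```javascript
// Hypothetical helper: maps curated devices.json entries to lan_devices seed rows.
// Field names on the input entries are assumed, not taken from the real file.
function toSeedRows(entries) {
  return entries
    // MAC-less entries are skipped; they reappear as 'new' once seen online.
    .filter((e) => typeof e.mac === 'string' && e.mac.trim() !== '')
    .map((e) => {
      const flagged = Boolean(e.flag);
      return {
        mac: e.mac.trim().toLowerCase(), // lowercase normalization is an assumption
        ip: e.ip ?? null,
        vendor: e.vendor ?? null,
        name: e.name ?? null,
        grp: e.group ?? null,
        // Flagged devices go back into the review queue; the rest are already known.
        status: flagged ? 'new' : 'known',
        flagged,
      };
    });
}
```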
### `lib/infra/scan.js` (decoupled scanner)
- `parseArpScan(text) -> [{ ip, mac, vendor, randomized }]` is a **pure** parser of
`arp-scan` tab-separated output (skips banner/footer); `randomized` is true when the first
octet has the locally-administered bit set (`& 0x02`).
- `isRandomizedMac(mac) -> boolean` — pure helper.
- `runScan({ exec }) -> rows` — shells `arp-scan --localnet -x` (interface
auto/`-I eth0`), returns `parseArpScan(stdout)`. `exec` injected for tests.
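
A minimal sketch of the two pure helpers described above, assuming each result line has the shape `IP<TAB>MAC<TAB>vendor`. Lowercasing the MAC and returning `null` for an empty vendor are assumptions of this sketch, not requirements stated in the spec.

```javascript
// Sketch only: line layout (ip <TAB> mac <TAB> vendor) is an assumption.
const MAC_RE = /^[0-9a-f]{2}(:[0-9a-f]{2}){5}$/;
const IPV4_RE = /^\d{1,3}(\.\d{1,3}){3}$/;

// True when the locally-administered bit (0x02) of the first octet is set.
function isRandomizedMac(mac) {
  const firstOctet = parseInt(mac.slice(0, 2), 16);
  return (firstOctet & 0x02) !== 0;
}

// Pure parser: keeps only lines that look like "ip<TAB>mac<TAB>vendor".
// Banner, footer, blank and malformed lines fail the checks and are skipped.
function parseArpScan(text) {
  const rows = [];
  for (const line of String(text).split(/\r?\n/)) {
    const [rawIp, rawMac, ...rest] = line.split('\t');
    if (!rawIp || !rawMac) continue;
    const ip = rawIp.trim();
    const mac = rawMac.trim().toLowerCase();
    if (!IPV4_RE.test(ip) || !MAC_RE.test(mac)) continue;
    rows.push({
      ip,
      mac,
      vendor: rest.join('\t').trim() || null,
      randomized: isRandomizedMac(mac),
    });
  }
  return rows;
}
```

Because the parser filters by shape rather than by position, it behaves the same whether or not the banner and footer are suppressed on the command line.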
### `lib/db/repos/lan_devices.js`
- `upsertScan(rows)` — insert unseen MACs as `status='new'`; for existing, update
`ip`, `vendor`, `last_seen=now()`, `present=true` (never overwrite owner
`name`/`grp`/`status`).
- `markAbsent(seenMacs)` sets `present=false` for MACs not in the latest scan.
- `listKnown()` (`status='known'`, grouped by `grp`), `listDiscovered()`
(`status='new'`), `get(mac)`, `update(mac, {name, grp, status, note, flagged})`,
`remove(mac)`. (`ignored` devices show in neither.)
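
The repo semantics above can be illustrated with an in-memory stand-in for the table. This is a sketch of the intended behavior, not the real repo: `createDeviceStore` is a hypothetical name, a `Map` replaces the database, and grouping by `grp` is omitted for brevity.

```javascript
// In-memory stand-in for the lan_devices table, keyed by MAC (illustration only).
function createDeviceStore(now = () => new Date()) {
  const table = new Map();

  function upsertScan(rows) {
    const ts = now();
    for (const { mac, ip, vendor, randomized } of rows) {
      const existing = table.get(mac);
      if (existing) {
        // Scanner-owned fields only; name/grp/status/note/flagged stay untouched.
        Object.assign(existing, { ip, vendor, last_seen: ts, present: true });
      } else {
        table.set(mac, {
          mac, ip, vendor, randomized: Boolean(randomized),
          name: null, grp: null, note: null,
          status: 'new', flagged: false,
          first_seen: ts, last_seen: ts, present: true,
        });
      }
    }
  }

  // Returns how many rows flipped to absent.
  function markAbsent(seenMacs) {
    if (seenMacs.length === 0) return 0; // guard: never blanket-mark on an empty scan
    const seen = new Set(seenMacs);
    let changed = 0;
    for (const dev of table.values()) {
      if (dev.present && !seen.has(dev.mac)) {
        dev.present = false;
        changed += 1;
      }
    }
    return changed;
  }

  // Owner-editable fields only; returns null for an unknown MAC.
  function update(mac, patch) {
    const dev = table.get(mac);
    if (!dev) return null;
    for (const key of ['name', 'grp', 'status', 'note', 'flagged']) {
      if (key in patch) dev[key] = patch[key];
    }
    return dev;
  }

  const get = (mac) => table.get(mac) ?? null;
  const listKnown = () => [...table.values()].filter((d) => d.status === 'known');
  const listDiscovered = () => [...table.values()].filter((d) => d.status === 'new');

  return { upsertScan, markAbsent, update, get, listKnown, listDiscovered };
}
```

Since rows are keyed by MAC, rescanning the same device any number of times leaves exactly one row, which is what keeps the table bounded.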
### Cron (`lib/cron/index.js`)
Add hourly (`7 * * * *`): `runScan()` → `upsertScan` → `markAbsent`. Wrapped in
try/catch, so a scan failure is logged and never crashes the cron.
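
One possible shape for the job body, with `runScan`, the repo, and the logger injected so it runs without `arp-scan` or a database. The name `scanJob`, the return value, and the early return on an empty scan are assumptions of this sketch; registering it on the `7 * * * *` schedule is left to whatever scheduler `lib/cron/index.js` already uses.

```javascript
// Hypothetical job body: dependencies are injected so the sketch is testable in isolation.
async function scanJob({ runScan, repo, log = console }) {
  try {
    const rows = await runScan();
    if (rows.length === 0) {
      // An empty result is treated as a failed scan: leave presence flags alone.
      log.warn('lan scan returned no rows; skipping upsert/markAbsent');
      return { ok: true, seen: 0 };
    }
    await repo.upsertScan(rows);
    await repo.markAbsent(rows.map((r) => r.mac));
    return { ok: true, seen: rows.length };
  } catch (err) {
    // Missing binary, missing capability, or non-zero exit all land here.
    log.error(`lan scan failed: ${err.message}`);
    return { ok: false, seen: 0 };
  }
}
```

Catching inside the job rather than in the scheduler keeps the DB untouched on failure, so known devices still render from the last good scan.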
### API `lib/api/routes/devices.js` (mount `/api/devices`, owner-gated)
- `GET /` — known devices grouped for the band.
- `GET /discovered` returns the `status='new'` review queue.
- `PATCH /:mac` — set `name`/`grp`/`status`/`note`/`flagged` (this is "add from
discovered" + "edit" + "name"); promoting = `status:'known'`.
- `DELETE /:mac` — remove.
- `POST /scan` — run a scan immediately (owner).
- `:mac` param validated against a MAC regex.
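
A dependency-free sketch of the `PATCH /:mac` validation. The spec elsewhere names zod for the 400 path; a plain regex is swapped in here only so the example is self-contained. The function name `validatePatch`, the colon-separated MAC format, and the return shape are assumptions.

```javascript
// Assumed MAC format: six colon-separated hex octets, case-insensitive.
const MAC_PARAM_RE = /^[0-9a-f]{2}(?::[0-9a-f]{2}){5}$/i;
const ALLOWED_STATUS = new Set(['new', 'known', 'ignored']);
const PATCHABLE = ['name', 'grp', 'status', 'note', 'flagged'];

// Hypothetical validator: returns a 400-style result or a cleaned patch.
function validatePatch(macParam, body = {}) {
  if (typeof macParam !== 'string' || !MAC_PARAM_RE.test(macParam)) {
    return { status: 400, error: 'invalid MAC' };
  }
  const patch = {};
  for (const key of PATCHABLE) {
    // Copy only owner-editable fields; scanner-owned columns are ignored.
    if (Object.prototype.hasOwnProperty.call(body, key)) patch[key] = body[key];
  }
  if ('status' in patch && !ALLOWED_STATUS.has(patch.status)) {
    return { status: 400, error: 'invalid status' };
  }
  return { status: 200, mac: macParam.toLowerCase(), patch };
}
```

Promoting a discovered device is then just a patch with `status: 'known'` plus whatever name and group the owner chose.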
### Frontend
- `public/views/devices_band.js` — fetch `/api/devices` (grouped) instead of the
static file; render the MAC (existing `.dv-mac` style from today's change).
- **Discovered review** — a section/panel listing `/api/devices/discovered`, each
with an **Add / Edit** form (name + group select + notes) that `PATCH`es to
promote; plus inline edit for known devices and an Ignore/Delete action.
- Remove `public/devices.json` (superseded by the DB).
## Infra setup (one-time, on CT 311)
Run `apt install arp-scan`, then grant the binary raw-socket capability so the non-root
`void` service user can run it:
`setcap cap_net_raw,cap_net_admin+eip /usr/sbin/arp-scan`. Captured in
`deploy/README.md`. If the capability/tool is missing, the scan logs a clear
error and the feature degrades to "no new discoveries" (existing data still shows).
## Error handling
- `arp-scan` missing / unprivileged / non-zero exit → `runScan` throws; cron
catches, logs, leaves the DB untouched (known devices still render).
- Empty/garbled scan output → `parseArpScan` returns `[]`; `markAbsent([])` is a
no-op guard (never blanket-marks everything absent on a failed scan).
- Bad MAC in PATCH → 400 via zod.
## Testing
- **`parseArpScan` / `isRandomizedMac`** — pure unit tests (sample arp-scan
output incl. a randomized MAC, banner/footer lines, a malformed line).
- **`lan_devices` repo** (vitest + test DB) — `upsertScan` inserts new vs updates
existing without clobbering owner fields; `markAbsent` flips presence; promote
via `update`.
- **API** (supertest) — `/discovered` lists only `new`; `PATCH` promotes/edits;
owner-gated.
- **Frontend** (jsdom) — band renders groups + MAC from `/api/devices`;
discovered panel renders the add/edit form.
- **Manual** — `POST /api/devices/scan`, confirm new devices appear, name one,
see it move to the band.
## Out of scope (YAGNI)
- Service/port fingerprinting, SNMP/LLDP topology (that's Scanopy's job).
- Multi-subnet/VLAN scanning (single `/24`).
- Auto-pruning stale `new` devices (revisit only if the queue gets noisy).
- Push notifications on new-device discovery.
## References
- Scanopy — github.com/scanopy/scanopy ; scanopy.net (self-hosted discovery/topology, AGPL-3.0).