Void-Homelab/deploy/README.md
2026-06-08 21:28:51 +10:00

# Deploy notes — Void 2.0
## DB role posture (CT 310 — `void2-db`, alpha-9+)
- The `void` DB role is **NOSUPERUSER** (least privilege). It owns the `void` + `void_test`
databases and the `public` schema, so it can run all migrations and the test-harness
`resetDb` without superuser.
- The `vector` (pgvector) extension was marked **trusted** so the non-superuser `void` role
can `CREATE EXTENSION vector` (needed by `tests/helpers/db.js` on each reset):
```bash
echo 'trusted = true' >> /usr/share/postgresql/16/extension/vector.control
```
**⚠ Re-apply this after any pgvector package upgrade** (the package may overwrite the
control file). `pgcrypto` ships trusted already.
- Revert (emergency): as `postgres` on CT 310, `ALTER ROLE void SUPERUSER;`.
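Because the `trusted = true` tweak has to be re-applied after package upgrades, an idempotent variant avoids appending a duplicate line each time. The following is a minimal sketch, not something from the repo: `ensure_trusted` is a hypothetical helper name, and it assumes the control file ends with a newline.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): append 'trusted = true' to an
# extension control file only when that setting is not already present.
ensure_trusted() {
  local ctl="$1"
  if grep -Eq '^[[:space:]]*trusted[[:space:]]*=[[:space:]]*true' "$ctl"; then
    echo "unchanged: $ctl"
  else
    echo 'trusted = true' >> "$ctl"
    echo "updated: $ctl"
  fi
}

# Usage (as root on CT 310), with the path from the command above:
#   ensure_trusted /usr/share/postgresql/16/extension/vector.control
```

Running it twice leaves a single `trusted = true` line, so it should be safe to call from a post-upgrade checklist.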
## App deploy (CT 311 — `void2-app`)
One-time setup on the target host:
```bash
# Node 22 (from nodesource if Debian's default is older)
curl -fsSL https://deb.nodesource.com/setup_22.x | bash -
apt install -y nodejs
# Service user + working dir
useradd -r -m -d /opt/void-server void
mkdir -p /opt/void-server
chown void: /opt/void-server
# systemd
install -m 644 void-server.service /etc/systemd/system/void-server.service
systemctl daemon-reload
systemctl enable void-server
# Secrets — /opt/void-server/.env must contain:
# DATABASE_URL=postgres://void:<password>@<db-host>:5432/void
# OWNER_TOKEN=<32+ char secret>
# PORT=3000
# NODE_ENV=production
chmod 600 /opt/void-server/.env
chown void: /opt/void-server/.env
```
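Before the first push, it can help to confirm that the secrets file is complete and locked down. This is a rough sketch under stated assumptions: `check_env_file` is a hypothetical name, the key list is copied from the comments in the setup block above, and `stat -c` assumes GNU coreutils (as on Debian).

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the repo): verify that an env
# file defines the four keys listed in the setup notes and has mode 600.
check_env_file() {
  local file="$1" key failed=0
  for key in DATABASE_URL OWNER_TOKEN PORT NODE_ENV; do
    if ! grep -q "^${key}=" "$file"; then
      echo "missing: $key"
      failed=1
    fi
  done
  # GNU stat prints the octal permission bits with %a.
  if [ "$(stat -c '%a' "$file")" != "600" ]; then
    echo "mode is not 600: $file"
    failed=1
  fi
  return "$failed"
}

# Usage on CT 311:
#   check_env_file /opt/void-server/.env && echo ".env looks sane"
```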
Then from the dev box:
```bash
cd /project/src/void-v2
./deploy/push.sh
```
## Maintenance
```bash
journalctl -u void-server -f # follow logs
systemctl status void-server # check status
systemctl restart void-server # cycle
# Run migrations on the deployed copy:
ssh root@void2-app 'cd /opt/void-server && npm run migrate'
```
## Notes
- `.env` is excluded from the rsync to avoid clobbering production secrets with dev values.
- The push script uses `--omit=dev` to skip test deps on the target.
- `tests/` is excluded; the tests are for the dev environment only.
## Workers (Python void-workers — Plan 4+)
Runs alongside void-server as a second systemd unit.
One-time setup on CT 311:
```bash
apt install -y python3.12 python3.12-venv python3-pip \
ffmpeg tesseract-ocr tesseract-ocr-eng poppler-utils
useradd -r -m -d /opt/void-workers -s /bin/bash voidworkers
mkdir -p /opt/void-workers /var/lib/void/whisper-models
chown voidworkers: /opt/void-workers
chown -R voidworkers: /var/lib/void/whisper-models
# voidworkers needs to read the shared blob store
usermod -aG void voidworkers
chmod -R g+rX /var/lib/void/blobs
install -m 644 deploy/void-workers.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable void-workers
```
`/opt/void-workers/.env` (mode 600, owned by voidworkers):
```
DATABASE_URL=postgres://void:<pw>@192.168.1.215:5432/void
BLOB_ROOT=/var/lib/void/blobs
WHISPER_MODEL=small.en
WHISPER_CACHE=/var/lib/void/whisper-models
```
Deploy after edits:
```bash
cd /project/src/void-v2
./deploy/push-workers.sh
```
## SQL_ASCII cluster note
`void2-db` was initialized as SQL_ASCII (not UTF-8). The data is already
UTF-8 in practice, but Python's psycopg refuses to decode it without an
explicit `client_encoding=UTF8` parameter. Workers set this on every
connection (in `workers/void_workers/`, the workers' equivalent of `lib/db/pool.py`).
Node's `pg` library is more lenient and doesn't need this. If you ever
re-run `initdb` on the cluster, use `--encoding=UTF8 --locale=C.UTF-8`.
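For ad-hoc tools that read `DATABASE_URL` directly, the same setting can be carried in the connection URI, since libpq accepts `client_encoding` as a URI query parameter. This is only an illustration and not how the workers do it (they set the parameter in code); `with_utf8_client_encoding` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append client_encoding=UTF8 to a libpq connection URI
# unless the URI already carries a client_encoding parameter.
with_utf8_client_encoding() {
  local url="$1"
  case "$url" in
    *client_encoding=*) printf '%s\n' "$url" ;;                       # already set
    *\?*)               printf '%s&client_encoding=UTF8\n' "$url" ;;  # has other params
    *)                  printf '%s?client_encoding=UTF8\n' "$url" ;;  # no params yet
  esac
}

# Usage: confirm what the server and client report.
#   psql "$(with_utf8_client_encoding "$DATABASE_URL")" \
#     -c 'SHOW server_encoding' -c 'SHOW client_encoding'
```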
## Plan 6 (alpha-8)
- **Migrations 012-014** (dashboard_layout, speedtest_results, service_status) are applied by the standard `npm run migrate`; no manual steps are needed.
- **`speedtest-cli` on CT 311** — the hourly speedtest job requires it:
```bash
pip install --break-system-packages speedtest-cli
```
Until installed, speedtest jobs will fail but the Sacred Valley speedtest card still renders any existing history without error.
- **Icon cache** — the server writes cached service icons to `ICON_CACHE` (default `/var/lib/void/icons`) and auto-creates the directory on first use. You can pre-create and own it for clarity:
```bash
mkdir -p /var/lib/void/icons
chown void: /var/lib/void/icons
```
## LAN device discovery (2.1.0)
The hourly device scan (`lib/cron` → `runDeviceScanCycle`) shells out to `arp-scan`. The
service runs as the non-root `void` user, so the `arp-scan` binary needs a raw-socket
capability:
```bash
apt-get install -y arp-scan
setcap cap_net_raw,cap_net_admin+eip "$(readlink -f "$(command -v arp-scan)")"
# verify as the service user (run from the service WorkingDirectory so the
# OUI vendor files resolve):
runuser -u void -- sh -c 'cd /opt/void-server && arp-scan --localnet --plain | head'
```
**⚠ Re-apply the `setcap` after any `arp-scan` package upgrade.** The upgrade
replaces the binary and drops the capability, after which scans silently find
nothing. Migration 024 creates `lan_devices` and seeds it from the old
`devices.json`, so the band still renders even before the first scan runs.
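Since a dropped capability fails silently, a small check after upgrades can catch it early. The sketch below is an assumption-laden example, not part of the repo: `has_net_raw` and `check_arp_scan_caps` are hypothetical names, and the check simply looks for `cap_net_raw` in the `getcap` output (whose exact format differs between libcap versions).

```bash
#!/usr/bin/env bash
# Hypothetical check: succeed only if the given getcap output mentions
# cap_net_raw. Substring match, so it tolerates both getcap output formats.
has_net_raw() {
  case "$1" in
    *cap_net_raw*) return 0 ;;
    *)             return 1 ;;
  esac
}

# Resolve the real arp-scan binary (same readlink trick as the setcap line
# above) and test its file capabilities.
check_arp_scan_caps() {
  local bin
  bin="$(readlink -f "$(command -v arp-scan)")" || return 1
  has_net_raw "$(getcap "$bin")"
}

# Usage on CT 311:
#   check_arp_scan_caps || echo "arp-scan lost cap_net_raw; re-run setcap"
```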
- **Service registry** — edit `config/services.json` to the real homelab service URLs and CT numbers. The committed seed values are best-guess placeholders and should be updated before the health band is meaningful.
## Deploy safety (push.sh, hardened)
`./deploy/push.sh` now does an atomic-ish, self-verifying deploy:
1. **Snapshots** the current remote code (excluding `node_modules` and `.env`) to `/opt/void-server.prev` for rollback.
2. rsyncs the new code (`--delete`; preserves `node_modules` + `.env`).
3. Runs **`npm install --omit=dev` + `npm run migrate`** as part of the deploy (no more separate manual migrate step).
4. Restarts `void-server`.
5. **Health-gates**: polls `/health` until it reports the expected `package.json` version + `db_ok` (≈25s).
6. **Auto-rolls-back** on any failure: restores the `.prev` snapshot, reinstalls, restarts.
Override the health endpoint with `HEALTH_URL=…` if the target IP differs.
Caveat: forward-only migrations are not auto-reverted on rollback (they're additive by convention, so a code rollback against the new schema is safe; a destructive migration needs manual care).
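The health gate in step 5 can be pictured as a bounded polling loop. What follows is a minimal sketch and not the code in `push.sh`: `wait_healthy` is a hypothetical function, the `/health` response is assumed to be JSON with `version` and `db_ok` fields, and 25 attempts at 1 second apart is a guess at how the ≈25s budget is spent.

```bash
#!/usr/bin/env bash
# Hypothetical health gate:
#   wait_healthy EXPECTED_VERSION ATTEMPTS DELAY_SECONDS PROBE_COMMAND...
# Runs PROBE_COMMAND until its output reports the expected version and
# db_ok=true, or until the attempts run out.
wait_healthy() {
  local expected="$1" attempts="$2" delay="$3" body i
  shift 3
  for ((i = 1; i <= attempts; i++)); do
    body="$("$@" 2>/dev/null)" || body=""
    body="${body//[[:space:]]/}"   # drop whitespace so formatting doesn't matter
    if [[ "$body" == *"\"version\":\"$expected\""* && "$body" == *'"db_ok":true'* ]]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage (field names and timing are assumptions):
#   wait_healthy "$EXPECTED_VERSION" 25 1 curl -fsS "$HEALTH_URL" \
#     || echo "health gate failed; roll back"
```

Passing the probe as a command keeps the loop testable without a running server; the substring match is deliberately crude, and a JSON parser such as `jq` would be more robust if it is available on the dev box.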