Void-Homelab/deploy/README.md
2026-06-08 21:28:51 +10:00

Deploy notes — Void 2.0

DB role posture (CT 310 — void2-db, alpha-9+)

  • The void DB role is NOSUPERUSER (least privilege). It owns the void + void_test databases and the public schema, so it can run all migrations and the test-harness resetDb without superuser.
  • The vector (pgvector) extension was marked trusted so the non-superuser void role can CREATE EXTENSION vector (needed by tests/helpers/db.js on each reset):
    echo 'trusted = true' >> /usr/share/postgresql/16/extension/vector.control
    
    ⚠ Re-apply this after any pgvector package upgrade (the package may overwrite the control file). pgcrypto ships trusted already.
  • Revert (emergency): as postgres on CT 310, run ALTER ROLE void SUPERUSER;
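
Because the trusted flag has to be re-applied after package upgrades, an idempotent variant of the echo above avoids appending the line twice. This is a minimal sketch: mark_trusted is a hypothetical helper name, and the control-file path is the one quoted in the note above.

```shell
# Hypothetical helper (not part of the repo): append "trusted = true" to a
# PostgreSQL extension control file only if the line is not already there.
mark_trusted() {
  control_file="$1"
  if grep -Eq '^[[:space:]]*trusted[[:space:]]*=[[:space:]]*true' "$control_file"; then
    echo "already trusted: $control_file"
  else
    echo 'trusted = true' >> "$control_file"
    echo "marked trusted: $control_file"
  fi
}

# On CT 310, as root (path taken from the note above):
# mark_trusted /usr/share/postgresql/16/extension/vector.control
```

Running it again after a pgvector upgrade is safe either way: it appends the line if the package overwrote the control file and does nothing otherwise.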

App deploy (CT 311 — void2-app)

One-time setup on the target host:

# Node 22 (from nodesource if Debian's default is older)
curl -fsSL https://deb.nodesource.com/setup_22.x | bash -
apt install -y nodejs

# Service user + working dir
useradd -r -m -d /opt/void-server void
mkdir -p /opt/void-server
chown void: /opt/void-server

# systemd
install -m 644 void-server.service /etc/systemd/system/void-server.service
systemctl daemon-reload
systemctl enable void-server

# Secrets — /opt/void-server/.env must contain:
#   DATABASE_URL=postgres://void:<password>@<db-host>:5432/void
#   OWNER_TOKEN=<32+ char secret>
#   PORT=3000
#   NODE_ENV=production
chmod 600 /opt/void-server/.env
chown void: /opt/void-server/.env
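
Before the first start it can help to confirm that the .env file really has the four keys listed above. The following is a sketch only: check_env is a hypothetical helper, and the 32-character minimum for OWNER_TOKEN is taken from the comment above rather than from the server code.

```shell
# Hypothetical helper: verify that an env file defines the keys the
# deploy notes list, and that OWNER_TOKEN is at least 32 characters.
check_env() {
  env_file="$1"
  status=0
  for key in DATABASE_URL OWNER_TOKEN PORT NODE_ENV; do
    # "KEY=" followed by at least one character
    if ! grep -q "^${key}=." "$env_file"; then
      echo "missing or empty: $key" >&2
      status=1
    fi
  done
  token=$(sed -n 's/^OWNER_TOKEN=//p' "$env_file" | head -n 1)
  if [ "${#token}" -lt 32 ]; then
    echo "OWNER_TOKEN is shorter than 32 characters" >&2
    status=1
  fi
  return "$status"
}

# Usage on CT 311:
# check_env /opt/void-server/.env && echo ".env looks complete"
```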

Then from the dev box:

cd /project/src/void-v2
./deploy/push.sh

Maintenance

journalctl -u void-server -f          # follow logs
systemctl status void-server          # check status
systemctl restart void-server         # cycle

# Run migrations on the deployed copy:
ssh root@void2-app 'cd /opt/void-server && npm run migrate'

Notes

  • .env is excluded from the rsync to avoid clobbering production secrets with dev values.
  • The push script runs npm install with --omit=dev, so dev-only (test) dependencies are not installed on the target.
  • tests/ is excluded as well; it is only needed in the dev environment.

Workers (Python void-workers — Plan 4+)

Runs alongside void-server as a second systemd unit.

One-time setup on CT 311:

apt install -y python3.12 python3.12-venv python3-pip \
               ffmpeg tesseract-ocr tesseract-ocr-eng poppler-utils

useradd -r -m -d /opt/void-workers -s /bin/bash voidworkers
mkdir -p /opt/void-workers /var/lib/void/whisper-models
chown voidworkers: /opt/void-workers
chown -R voidworkers: /var/lib/void/whisper-models

# voidworkers needs to read the shared blob store
usermod -aG void voidworkers
chmod -R g+rX /var/lib/void/blobs

install -m 644 deploy/void-workers.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable void-workers

/opt/void-workers/.env (mode 600, owned by voidworkers):

DATABASE_URL=postgres://void:<pw>@192.168.1.215:5432/void
BLOB_ROOT=/var/lib/void/blobs
WHISPER_MODEL=small.en
WHISPER_CACHE=/var/lib/void/whisper-models

Deploy after edits:

cd /project/src/void-v2
./deploy/push-workers.sh

SQL_ASCII cluster note

void2-db was initialized as SQL_ASCII (not UTF-8). The data is already UTF-8 in practice, but Python's psycopg refuses to decode it without an explicit client_encoding=UTF8 parameter. Workers set this on every connection (in the workers/void_workers/ equivalent of lib/db/pool.py). Node's pg lib is more lenient and doesn't need this. If you ever re-initdb the cluster, use --encoding=UTF8 --locale=C.UTF-8.
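
For ad-hoc Python scripts that do not go through the workers' pool code, one alternative is to carry the parameter in the connection URL itself, since libpq accepts client_encoding as a URI query parameter. This is a sketch under that assumption, not how the workers do it: with_client_encoding is a hypothetical helper, and the URL in the usage line is a placeholder.

```shell
# Hypothetical helper: add client_encoding=UTF8 to a postgres:// URL
# unless a client_encoding parameter is already present.
with_client_encoding() {
  url="$1"
  case "$url" in
    *client_encoding=*) printf '%s\n' "$url" ;;                       # already set
    *\?*)               printf '%s&client_encoding=UTF8\n' "$url" ;;  # has other params
    *)                  printf '%s?client_encoding=UTF8\n' "$url" ;;  # no params yet
  esac
}

# Usage (placeholder URL):
# DATABASE_URL="$(with_client_encoding 'postgres://void:<pw>@<db-host>:5432/void')"
```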

Plan 6 (alpha-8)

  • Migrations 012–014 (dashboard_layout, speedtest_results, service_status) are applied by the standard npm run migrate; no manual steps are needed.
  • speedtest-cli on CT 311 — the hourly speedtest job requires it:
    pip install --break-system-packages speedtest-cli
    
    Until installed, speedtest jobs will fail but the Sacred Valley speedtest card still renders any existing history without error.
  • Icon cache — the server writes cached service icons to ICON_CACHE (default /var/lib/void/icons) and auto-creates the directory on first use. You can pre-create and own it for clarity:
    mkdir -p /var/lib/void/icons
    chown void: /var/lib/void/icons
    

LAN device discovery (2.1.0)

The hourly device scan (runDeviceScanCycle in lib/cron) shells out to arp-scan. The service runs as the non-root void user, so arp-scan needs a raw-socket capability:

apt-get install -y arp-scan
setcap cap_net_raw,cap_net_admin+eip "$(readlink -f "$(command -v arp-scan)")"
# verify as the service user (run from the service WorkingDirectory so the
# OUI vendor files resolve):
runuser -u void -- sh -c 'cd /opt/void-server && arp-scan --localnet --plain | head'

⚠ Re-apply the setcap after any arp-scan package upgrade: the upgrade replaces the binary and drops the capability, after which scans silently find nothing. Migration 024 creates lan_devices and seeds it from the old devices.json, so the band still renders even before the first scan runs.
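
Since a dropped capability fails silently, a quick check after package upgrades can catch it. The sketch below assumes getcap is available (on Debian it ships in the same package as setcap); both function names are hypothetical.

```shell
# Hypothetical helper: succeeds if a getcap output line mentions cap_net_raw.
# Matching on the substring copes with both the older "= caps+eip" and the
# newer "caps=eip" output formats.
caps_include_net_raw() {
  case "$1" in
    *cap_net_raw*) return 0 ;;
    *)             return 1 ;;
  esac
}

# Hypothetical wrapper: resolve the real arp-scan binary and inspect it.
check_arp_scan_cap() {
  path="$(command -v arp-scan)" || { echo "arp-scan is not installed" >&2; return 2; }
  bin="$(readlink -f "$path")"
  caps="$(getcap "$bin" 2>/dev/null)"
  if caps_include_net_raw "$caps"; then
    echo "ok: $caps"
  else
    echo "cap_net_raw missing on $bin; re-run the setcap above" >&2
    return 1
  fi
}

# Usage on CT 311, as root:
# check_arp_scan_cap
```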

  • Service registry — edit config/services.json to the real homelab service URLs and CT numbers. The committed seed values are best-guess placeholders and should be updated before the health band is meaningful.

Deploy safety (push.sh, hardened)

./deploy/push.sh now does an atomic-ish, self-verifying deploy:

  1. Snapshots the current remote code (excluding node_modules and .env) to /opt/void-server.prev for rollback.
  2. rsyncs the new code (with --delete; node_modules and .env on the target are preserved).
  3. Runs npm install --omit=dev and npm run migrate as part of the deploy (no more separate manual migrate step).
  4. Restarts void-server.
  5. Health-gates: polls /health until it reports the expected package.json version + db_ok (≈25s).
  6. Auto-rolls-back on any failure: restores the .prev snapshot, reinstalls, restarts.

Override the health endpoint with HEALTH_URL=… if the target IP differs. Caveat: forward-only migrations are not auto-reverted on rollback (they're additive by convention, so a code rollback against the new schema is safe; a destructive migration needs manual care).
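
The health gate in step 5 amounts to a bounded retry loop. The sketch below shows one way such a loop could look; it is not the code in push.sh. poll_until and check_health are hypothetical names, EXPECTED_VERSION is an assumed variable, and the JSON shape matched by the grep patterns (compact "version" and "db_ok" fields) is an assumption about the /health response.

```shell
# poll_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or ATTEMPTS is exhausted.
poll_until() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Assumed response shape: {"version":"<x.y.z>","db_ok":true,...}
check_health() {
  body="$(curl -fsS --max-time 2 "${HEALTH_URL:?set HEALTH_URL}")" || return 1
  printf '%s' "$body" | grep -q "\"version\":\"$EXPECTED_VERSION\"" &&
    printf '%s' "$body" | grep -q '"db_ok":true'
}

# Usage: about 25 one-second attempts, then hand over to the rollback path.
# poll_until 25 1 check_health || echo "health gate failed" >&2
```

Keeping the retry loop separate from the check makes the loop easy to exercise with any command, for example poll_until 3 1 true.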