Files

root a9191cee00 feat(workers): free Ollama VRAM before loading Whisper on the GPU

Whisper (CT 311) and Ollama (CT 102) share one A2000. Before loading
Whisper on CUDA, ask Ollama to unload its models (GET /api/ps then POST
/api/generate keep_alive:0) and wait for the card to clear, so the GPU
load has headroom. Best-effort and stdlib-only; Ollama reloads
cooperatively, and the existing CUDA->CPU fallback covers any failure.
Toggle via OLLAMA_FREE_BEFORE_STT; endpoint via OLLAMA_URL.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-05 21:12:05 +10:00
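The unload sequence described in the commit message can be sketched roughly as follows, using only the standard library. This is an illustrative sketch, not the code from the commit: the function names `free_ollama_vram` and `_http_json`, the default URL, the timeout values, and the set of strings that disable the toggle are all assumptions. It also assumes that "wait for the card to clear" can be approximated by polling `/api/ps` until no models are listed; the real worker may check the GPU differently.

```python
import json
import os
import time
import urllib.request

# Assumed default; the commit only says the endpoint comes from OLLAMA_URL.
DEFAULT_OLLAMA_URL = "http://127.0.0.1:11434"


def _http_json(method, url, payload=None, timeout=5.0):
    """Send a JSON request with urllib and return the decoded JSON body."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = resp.read()
    return json.loads(body) if body else {}


def free_ollama_vram(base_url=None, wait_s=10.0, poll_s=0.5, http=_http_json):
    """Best-effort: ask Ollama to unload its models, then wait for them to go.

    Returns True if /api/ps reports no loaded models before the deadline.
    Returns False when disabled, on timeout, or on any error, so the caller
    can still attempt the CUDA load and rely on its CPU fallback.
    """
    # Hypothetical toggle parsing: enabled unless set to a "falsy" string.
    flag = os.environ.get("OLLAMA_FREE_BEFORE_STT", "1").strip().lower()
    if flag in ("0", "false", "no", "off"):
        return False
    base_url = (base_url or os.environ.get("OLLAMA_URL", DEFAULT_OLLAMA_URL)).rstrip("/")
    try:
        # GET /api/ps lists the models Ollama currently has loaded.
        loaded = http("GET", f"{base_url}/api/ps").get("models", [])
        for model in loaded:
            # An empty generate request with keep_alive 0 asks Ollama to unload.
            http("POST", f"{base_url}/api/generate",
                 {"model": model["name"], "keep_alive": 0, "stream": False})
        deadline = time.monotonic() + wait_s
        while True:
            if not http("GET", f"{base_url}/api/ps").get("models", []):
                return True
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_s)
    except Exception:
        # Never let a failed unload block transcription.
        return False
```

A caller would invoke `free_ollama_vram()` immediately before constructing the Whisper model on CUDA and ignore the return value apart from logging it, since a `False` result still leaves the existing CUDA->CPU fallback in place.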

tests           feat(workers): free Ollama VRAM before loading Whisper on the GPU   2026-06-05 21:12:05 +10:00
void_workers    feat(workers): free Ollama VRAM before loading Whisper on the GPU   2026-06-05 21:12:05 +10:00
.gitignore      feat(workers): Python skeleton + config + structlog                 2026-06-01 04:41:33 +10:00
pyproject.toml  feat(workers): Python skeleton + config + structlog                 2026-06-01 04:41:33 +10:00
README.md       feat(workers): Python skeleton + config + structlog                 2026-06-01 04:41:33 +10:00

README.md

void-workers

Python ML ingest service that runs alongside void-server (Node). It is a sibling of lib/ in the void-v2 repo.

Local dev

cd workers
python3.12 -m venv .venv
. .venv/bin/activate
pip install -e ".[all]"
export DATABASE_URL="postgres://..."
python -m void_workers.runner

Tests

pip install -e ".[test,all]"
DATABASE_URL="postgres://..." pytest -v

See ../docs/superpowers/plans/2026-06-01-void-v2-plan4-workers.md for the full plan and ../docs/superpowers/specs/2026-06-01-void-v2-plan4-workers.md for the design.