faster-whisper (small.en, GPU with CPU fallback) on CT 102 → POST /api/voice/transcribe (multer → whisper client) → mic in the bubble records (MediaRecorder), uploads, and drops the transcript into the input for review-and-send.
Infra scripts are in deploy/whisper/. Retention (P2b) is next.
NOTE: the mic needs a secure context (the https domain), not the LAN IP.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
faster-whisper service (Dross voice STT)
Runs on CT 102 (the Ollama box, 192.168.1.185), bare-metal (no Docker), on the
RTX A2000 with CPU fallback. Exposes an OpenAI-style /transcribe endpoint, consumed by
void-app's lib/voice/whisper.js (WHISPER_URL=http://192.168.1.185:8001).
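
For testing the /transcribe endpoint without going through void-app, a minimal Python sketch of a client follows. It is not the lib/voice/whisper.js code. The multipart field name `file`, the `audio/webm` content type, and the assumption that the reply is JSON are guesses that should be checked against server.py; `build_multipart` and `transcribe` are hypothetical helper names.

```python
import json
import os
import urllib.request
import uuid

WHISPER_URL = "http://192.168.1.185:8001"  # value from this README


def build_multipart(field, filename, data, content_type="audio/webm"):
    """Encode one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path, base_url=WHISPER_URL, field="file", timeout=60):
    """POST an audio file to /transcribe and return the decoded JSON reply.

    ASSUMPTION: the field name and the JSON reply are not confirmed by the README.
    """
    with open(path, "rb") as fh:
        body, ctype = build_multipart(field, os.path.basename(path), fh.read())
    req = urllib.request.Request(
        f"{base_url}/transcribe",
        data=body,
        headers={"Content-Type": ctype},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `transcribe("clip.webm")` from a machine on the LAN would exercise the same service that the void-app route talks to.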
Install (on CT 102)
scp deploy/whisper/server.py root@192.168.1.185:/opt/whisper_server.py
scp deploy/whisper/setup.sh root@192.168.1.185:/root/setup.sh
ssh root@192.168.1.185 'bash /root/setup.sh && install -m644 /opt/whisper_server.py /opt/whisper/server.py && systemctl enable --now whisper'
curl http://192.168.1.185:8001/health # {"ok":true,"model":"small.en","device":"cuda"}
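
The same health check can be scripted, which is handy for noticing when the service has silently fallen back to CPU. This is a rough sketch that only assumes the /health payload shape shown in the curl comment above; `parse_health` and `check_health` are hypothetical names.

```python
import json
import urllib.request


def parse_health(raw):
    """Summarise a /health payload (str or bytes of JSON)."""
    info = json.loads(raw)
    return {
        "ok": bool(info.get("ok")),
        "model": info.get("model"),
        # "cuda" means the GPU path loaded; anything else is treated as fallback
        "on_gpu": info.get("device") == "cuda",
    }


def check_health(base_url="http://192.168.1.185:8001", timeout=5):
    """Fetch /health from the whisper service and summarise it."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return parse_health(resp.read())
```

If `check_health()["on_gpu"]` is false after a reboot, the CUDA libraries or LD_LIBRARY_PATH in the unit are the first things to look at.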
- venv at /opt/whisper/venv; model small.en (env WHISPER_MODEL); CUDA libs via nvidia-cublas-cu12 / nvidia-cudnn-cu12 pip wheels (LD_LIBRARY_PATH in the unit).
- GPU → CPU fallback is in server.py load().
- CT 102 disk was expanded +20G (was 89% full) before install.
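
The GPU → CPU fallback noted above can be sketched roughly as below. This is not the actual server.py code: the compute types (float16 on cuda, int8 on cpu), the `attempts` parameter, and the `load_whisper` wrapper are assumptions for illustration. The model class is passed in as `factory` so the fallback logic can be exercised without a GPU.

```python
import os


def load(model_name, factory, attempts=(("cuda", "float16"), ("cpu", "int8"))):
    """Try each (device, compute_type) in order; return (model, device).

    ASSUMPTION: the compute types are illustrative, not read from server.py.
    """
    last_error = None
    for device, compute_type in attempts:
        try:
            model = factory(model_name, device=device, compute_type=compute_type)
            return model, device
        except Exception as exc:  # e.g. CUDA libraries missing or no GPU visible
            last_error = exc
    raise RuntimeError(f"could not load {model_name} on any device") from last_error


def load_whisper():
    """Load the model named by WHISPER_MODEL (default small.en) via faster-whisper."""
    from faster_whisper import WhisperModel  # imported lazily; needs the venv

    return load(os.environ.get("WHISPER_MODEL", "small.en"), WhisperModel)
```

The device returned by `load` is what a /health handler could report, so a "cpu" value there would indicate that the first attempt failed.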