faster-whisper (small.en, GPU+CPU fallback) on CT 102 → POST /api/voice/transcribe (multer→whisper client) → mic in the bubble records (MediaRecorder), uploads, drops the transcript into the input to review-and-send.

Infra scripts in deploy/whisper/. Retention (P2b) next.

NOTE: mic needs a secure context (the https domain), not the LAN IP.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
17 lines
920 B
Markdown
# faster-whisper service (Dross voice STT)

Runs on **CT 102** (the Ollama box, `192.168.1.185`), bare-metal (no Docker), on the
RTX A2000 with CPU fallback. Exposes an OpenAI-style `/transcribe` endpoint, consumed by
void-app's `lib/voice/whisper.js` (`WHISPER_URL=http://192.168.1.185:8001`).

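For poking at the endpoint without going through void-app, a minimal Python sketch of a client follows. The request shape is an assumption, not taken from `server.py`: a multipart upload with a field named `file`, a `audio/webm` clip (what MediaRecorder typically produces), and a JSON reply with a `text` key. `build_multipart` and `transcribe` are hypothetical helper names; adjust them to whatever the server actually expects.

```python
import json
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes, content_type: str):
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(audio: bytes, base_url: str = "http://192.168.1.185:8001",
               timeout: float = 60.0) -> str:
    """POST an audio clip to /transcribe and return the transcript.

    ASSUMPTION: field name "file" and a {"text": ...} JSON response.
    """
    body, ctype = build_multipart("file", "clip.webm", audio, "audio/webm")
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/transcribe",
        data=body,
        headers={"Content-Type": ctype},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp).get("text", "")


# Usage (needs network access to CT 102):
#   with open("clip.webm", "rb") as f:
#       print(transcribe(f.read()))
```

Only the standard library is used, so the sketch runs from any box on the LAN without extra installs.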
## Install (on CT 102)

```
# scp takes a single destination, so copy each file separately
scp deploy/whisper/server.py root@192.168.1.185:/opt/whisper_server.py
scp deploy/whisper/setup.sh root@192.168.1.185:/root/setup.sh
ssh root@192.168.1.185 'bash /root/setup.sh && install -m644 /opt/whisper_server.py /opt/whisper/server.py && systemctl enable --now whisper'
curl http://192.168.1.185:8001/health # {"ok":true,"model":"small.en","device":"cuda"}
```

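The `/health` reply also shows whether the GPU actually loaded. A small sketch that turns the payload into a one-line status, assuming only the three keys shown in the `curl` comment above (`ok`, `model`, `device`); `fetch_health` and `summarize_health` are hypothetical names:

```python
import json
import urllib.request


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/health and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_health(payload: dict) -> str:
    """Turn a /health payload into a one-line status string."""
    if not payload.get("ok"):
        return "unhealthy"
    model = payload.get("model", "unknown")
    device = payload.get("device", "unknown")
    if device != "cuda":
        # Service is up but not on the GPU, e.g. the CPU fallback kicked in.
        return f"ok, but {model} is running on {device}"
    return f"ok: {model} on {device}"


# Usage (needs network access to CT 102):
#   print(summarize_health(fetch_health("http://192.168.1.185:8001")))
```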
- venv at `/opt/whisper/venv`; model `small.en` (env `WHISPER_MODEL`); CUDA libs via the
  `nvidia-cublas-cu12`/`nvidia-cudnn-cu12` pip wheels (`LD_LIBRARY_PATH` is set in the systemd unit).
- GPU → CPU fallback is in `server.py` `load()`.
- **CT 102 disk was expanded by 20G** (it was 89% full) before install.
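
The GPU → CPU fallback can be sketched roughly as below. This is not the code in `server.py`, just one plausible shape for it: try CUDA first, then retry on CPU if construction fails. The compute types (`float16` on GPU, `int8` on CPU) are assumptions, and `model_factory` is a hypothetical stand-in for `faster_whisper.WhisperModel` so the sketch runs without the library installed.

```python
import os


def load(model_factory, model_name=None):
    """Try to build the model on CUDA, then fall back to CPU.

    `model_factory` is called like faster_whisper.WhisperModel:
    model_factory(name, device=..., compute_type=...).
    Returns (model, device_used).
    """
    name = model_name or os.environ.get("WHISPER_MODEL", "small.en")
    attempts = [("cuda", "float16"), ("cpu", "int8")]  # assumed compute types
    last_error = None
    for device, compute_type in attempts:
        try:
            model = model_factory(name, device=device, compute_type=compute_type)
            return model, device
        except Exception as exc:  # e.g. no GPU visible or CUDA libs missing
            last_error = exc
    raise RuntimeError(f"could not load {name} on any device") from last_error


# Usage with the real library (assumes faster-whisper is installed):
#   from faster_whisper import WhisperModel
#   model, device = load(WhisperModel)
```

Returning the device alongside the model makes it easy to report it from `/health`. Some CUDA problems only surface at the first transcription rather than at load time, so a load-time fallback like this one would not catch them.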