feat(workers): ingest.video via yt-dlp + Whisper

yt-dlp pulls metadata (title, description, uploader, thumbnail) and
the bestaudio stream (opus). faster-whisper transcribes it, and the
audio file is deleted afterwards. Creates a refs row with kind='video';
source_kind is 'youtube' for YouTube URLs and the generic 'video'
otherwise. Idempotent on sha256(space_id + url) via refs.external_id.
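
A minimal sketch of how the idempotency key and the source_kind choice described above might be computed. The function names, the host list, the UTF-8 encoding, the hex digest, and the separator-free concatenation are all assumptions for illustration; the handler in this commit may differ.

```python
import hashlib
from urllib.parse import urlparse

# Assumption: hosts treated as YouTube. The real handler may detect this differently.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def external_id(space_id: str, url: str) -> str:
    """Hex sha256 of space_id + url (hypothetical helper; encoding is assumed)."""
    return hashlib.sha256(f"{space_id}{url}".encode("utf-8")).hexdigest()


def source_kind(url: str) -> str:
    """Return 'youtube' for YouTube hosts and the generic 'video' otherwise."""
    host = (urlparse(url).hostname or "").lower()
    return "youtube" if host in YOUTUBE_HOSTS else "video"
```

Re-running the job for the same space and URL yields the same key, so a lookup (or a unique constraint) on refs.external_id can skip the duplicate. Plain concatenation is only unambiguous if space_id has a fixed length, such as a UUID.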

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
root
2026-06-01 10:07:33 +10:00
parent e64f1345f6
commit 1ba7aae439
3 changed files with 142 additions and 1 deletions

@@ -1,7 +1,8 @@
-from . import echo, pdf, image
+from . import echo, pdf, image, video
 
 REGISTRY = {
     echo.NAME: echo.handle,
     pdf.NAME: pdf.handle,
     image.NAME: image.handle,
+    video.NAME: video.handle,
 }
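
For context, a hedged sketch of how a worker loop might use a registry like this one to route a job to its handler. The dispatch function, the payload shape, the handler signature, and the 'ingest.video' name value (taken from the commit title) are assumptions, not code from this repository.

```python
from typing import Any, Callable, Dict

# Assumption: handlers take a payload dict and return a result dict.
Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


def video_handle(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for video.handle; the real one downloads and transcribes."""
    return {"kind": "video", "url": payload["url"]}


# Mirrors the REGISTRY shape in the diff: job name -> handler callable.
REGISTRY: Dict[str, Handler] = {"ingest.video": video_handle}


def dispatch(job_name: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Look up the handler by job name and call it; reject unknown names."""
    try:
        handler = REGISTRY[job_name]
    except KeyError:
        raise ValueError(f"no handler registered for job {job_name!r}") from None
    return handler(payload)
```

Keying the registry on each module's NAME keeps the job name defined in one place, so adding a handler is a two-line change like the one in this diff.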