test(workers): pdf/image test fixtures
born_digital.pdf (pdftotext extractable), scanned.pdf (image-only, OCR fallback target), eng_text.png (clear Tesseract-readable text).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
9
workers/tests/fixtures/README.md
vendored
Normal file
@@ -0,0 +1,9 @@
# Test fixtures

Used by `tests/test_pdf.py` and `tests/test_image.py`. Each fixture must satisfy one invariant:

1. **`born_digital.pdf`**: must contain the literal string `void-workers` when extracted via `pdftotext`. Generated by running `ps2pdf` on `/tmp/text.ps`.
2. **`scanned.pdf`**: `pdftotext` must return **near-empty** output (the OCR fallback test depends on this). Generated by converting `eng_text.png` to a single-page image-only PDF: `convert -density 200 eng_text.png scanned.pdf`.
3. **`eng_text.png`**: must contain the literal string `blackflame palette`, rendered clearly enough for Tesseract to read it. Generated with `convert -size 800x200 xc:white -font DejaVu-Sans -pointsize 36 -fill black -annotate +50+100 "blackflame palette" eng_text.png`.

Regenerate via the snippets in `../../docs/superpowers/plans/2026-06-01-void-v2-plan4-workers.md` Task B1.
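After regenerating the fixtures, the three invariants can be spot-checked with a short script. The following is a minimal sketch, not the code in `tests/test_pdf.py` or `tests/test_image.py`: the helper names, the `NEAR_EMPTY_MAX_CHARS` threshold, and the whitespace normalisation are all assumptions made for illustration. It assumes `pdftotext` and `tesseract` are on `PATH`.

```python
import subprocess
from pathlib import Path

# Hypothetical threshold: how many non-whitespace characters still count
# as "near-empty". The real tests may use a different rule.
NEAR_EMPTY_MAX_CHARS = 5


def run_stdout(args: list[str]) -> str:
    """Run a command and return its stdout, raising if it exits non-zero."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


def pdftotext_output(pdf_path: Path) -> str:
    """Extract text with `pdftotext <pdf> -` (the `-` sends output to stdout)."""
    return run_stdout(["pdftotext", str(pdf_path), "-"])


def tesseract_output(image_path: Path) -> str:
    """OCR an image with `tesseract <image> stdout`."""
    return run_stdout(["tesseract", str(image_path), "stdout"])


def contains_marker(text: str, marker: str) -> bool:
    """True if `marker` occurs in `text` once whitespace runs are collapsed.

    Collapsing whitespace is an assumption: it tolerates a line break
    between the two words of an OCR result.
    """
    return marker in " ".join(text.split())


def is_near_empty(text: str, max_chars: int = NEAR_EMPTY_MAX_CHARS) -> bool:
    """True if `text` has at most `max_chars` non-whitespace characters."""
    return len("".join(text.split())) <= max_chars


def check_fixtures(fixtures_dir: Path) -> None:
    """Assert the three README invariants against the files in `fixtures_dir`."""
    born_digital = pdftotext_output(fixtures_dir / "born_digital.pdf")
    assert contains_marker(born_digital, "void-workers")

    scanned = pdftotext_output(fixtures_dir / "scanned.pdf")
    assert is_near_empty(scanned)

    ocr_text = tesseract_output(fixtures_dir / "eng_text.png")
    assert contains_marker(ocr_text, "blackflame palette")
```

Calling `check_fixtures(Path("workers/tests/fixtures"))` from the repository root would run all three checks. The text predicates are kept separate from the subprocess calls so they can be exercised without either tool installed.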