born_digital.pdf (pdftotext extractable), scanned.pdf (image-only, OCR fallback target), eng_text.png (clear Tesseract-readable text). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# Test fixtures

Used by `tests/test_pdf.py` and `tests/test_image.py`. The tests depend on three invariants:

1. **`born_digital.pdf`**: the output of `pdftotext` must contain the literal string `void-workers`. Generated by writing `/tmp/text.ps` and converting it with `ps2pdf`.
2. **`scanned.pdf`**: `pdftotext` must return **near-empty** output (the OCR fallback test depends on this). Generated by converting `eng_text.png` to a single-page image-only PDF: `convert -density 200 eng_text.png scanned.pdf`.
3. **`eng_text.png`**: must contain the literal string `blackflame palette`, rendered clearly enough for Tesseract to read it. Generated with `convert -size 800x200 xc:white -font DejaVu-Sans -pointsize 36 -fill black -annotate +50+100 "blackflame palette" eng_text.png`.

To regenerate the fixtures, use the snippets under Task B1 of `../../docs/superpowers/plans/2026-06-01-void-v2-plan4-workers.md`.
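
For orientation, the generation commands quoted in this README can be collected into one Python helper. This is only a sketch, not the plan's actual snippets: `build_commands`, `regenerate`, and `POSTSCRIPT_SOURCE` are hypothetical names, and the PostScript page is an assumed stand-in because the contents of `/tmp/text.ps` are not shown here.

```python
import subprocess
from pathlib import Path

# Assumption: the real /tmp/text.ps is not reproduced in this README, so this
# minimal PostScript page is a hypothetical stand-in that prints the required string.
POSTSCRIPT_SOURCE = """%!PS
/Helvetica findfont 24 scalefont setfont
72 700 moveto
(void-workers) show
showpage
"""


def build_commands(fixture_dir, ps_path="/tmp/text.ps"):
    """Return the argv lists that rebuild the three fixtures, in dependency order."""
    directory = Path(fixture_dir)
    png = str(directory / "eng_text.png")
    return [
        # born_digital.pdf: text-based PDF produced from the PostScript source.
        ["ps2pdf", str(ps_path), str(directory / "born_digital.pdf")],
        # eng_text.png: must exist before scanned.pdf, which is built from it.
        ["convert", "-size", "800x200", "xc:white", "-font", "DejaVu-Sans",
         "-pointsize", "36", "-fill", "black", "-annotate", "+50+100",
         "blackflame palette", png],
        # scanned.pdf: single-page image-only PDF.
        ["convert", "-density", "200", png, str(directory / "scanned.pdf")],
    ]


def regenerate(fixture_dir, ps_path="/tmp/text.ps"):
    """Write the stand-in PostScript file, then run each command, failing fast."""
    Path(ps_path).write_text(POSTSCRIPT_SOURCE)
    for argv in build_commands(fixture_dir, ps_path):
        subprocess.run(argv, check=True)
```

Calling `regenerate(".")` from the fixtures directory would run the three commands in order, assuming Ghostscript (`ps2pdf`) and ImageMagick (`convert`) are installed. Building the PNG before `scanned.pdf` matters because the second `convert` call reads it.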
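
The three invariants listed above could be checked after regeneration with something like the following. It is a sketch rather than the code in `tests/test_pdf.py` or `tests/test_image.py`: the helper names are hypothetical, and the 10-character threshold for "near-empty" is an assumption, since this README does not quantify it.

```python
import subprocess
from pathlib import Path

# Assumption: "near-empty" is not quantified in this README; 10 non-whitespace
# characters is a hypothetical threshold chosen for illustration.
NEAR_EMPTY_MAX_CHARS = 10


def is_near_empty(text, max_chars=NEAR_EMPTY_MAX_CHARS):
    """True when text has at most max_chars non-whitespace characters."""
    return len("".join(text.split())) <= max_chars


def normalize(text):
    """Collapse whitespace runs so line breaks in extracted text do not hide a match."""
    return " ".join(text.split())


def pdf_text(pdf_path):
    """Extract text with pdftotext; '-' as the output file sends it to stdout."""
    result = subprocess.run(["pdftotext", str(pdf_path), "-"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def image_text(png_path):
    """OCR an image with the tesseract CLI; 'stdout' as the output base prints the text."""
    result = subprocess.run(["tesseract", str(png_path), "stdout"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def check_fixtures(fixture_dir):
    """Map each fixture name to whether its invariant currently holds."""
    directory = Path(fixture_dir)
    return {
        "born_digital.pdf": "void-workers" in normalize(pdf_text(directory / "born_digital.pdf")),
        "scanned.pdf": is_near_empty(pdf_text(directory / "scanned.pdf")),
        "eng_text.png": "blackflame palette" in normalize(image_text(directory / "eng_text.png")),
    }
```

`check_fixtures(".")` returns a dictionary of booleans, so a failed invariant can be reported by fixture name. Counting only non-whitespace characters keeps the page-break character that `pdftotext` usually emits from counting against an image-only PDF.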