born_digital.pdf (pdftotext extractable), scanned.pdf (image-only, OCR fallback target), eng_text.png (clear Tesseract-readable text). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
861 B
861 B
Test fixtures
Used by tests/test_pdf.py and tests/test_image.py. Three invariants:
born_digital.pdf— must contain the literal stringvoid-workerswhen extracted viapdftotext. Generated from/tmp/text.psthenps2pdf.scanned.pdf—pdftotextmust return near-empty output (the OCR fallback test depends on this). Generated by convertingeng_text.pngto a single-page image-only PDF:convert -density 200 eng_text.png scanned.pdf.eng_text.png— must contain the literal stringblackflame palette, rendered clearly enough for Tesseract to read it. Generated withconvert -size 800x200 xc:white -font DejaVu-Sans -pointsize 36 -fill black -annotate +50+100 "blackflame palette" eng_text.png.
Regenerate via the snippets in ../../docs/superpowers/plans/2026-06-01-void-v2-plan4-workers.md Task B1.