Skip to content

fix: use file path instead of stream for PDF thumbnail generation#4334

Merged
ildyria merged 8 commits into
LycheeOrg:masterfrom
mitpjones:fix/pdf-thumbnail-large-files
May 5, 2026
Merged

fix: use file path instead of stream for PDF thumbnail generation#4334
ildyria merged 8 commits into
LycheeOrg:masterfrom
mitpjones:fix/pdf-thumbnail-large-files

Conversation

@mitpjones
Copy link
Copy Markdown
Contributor

@mitpjones mitpjones commented May 4, 2026

Summary

  • Before: Uploading a large (> 50 MB) multi-page PDF produced no thumbnail — the upload appeared to succeed but the thumb was silently missing
  • After: PDF thumbnails are generated correctly regardless of file size (up to 100 MB limit imposed by docker/nginx.conf and PHP settings post_max_size=100M / upload_max_filesize=100M)
  • Root cause: Ghostscript (Imagick's PDF delegate) cannot seek backwards in an anonymous PHP stream, causing silent "Page drawing error" failures. Fixed by passing PDFs to Imagick via a real filesystem path instead of a stream; remote/non-local files are copied to a named temp file first (cleaned up in a finally block)
  • Non-PDF uploads are unaffected

Root cause

When readImageFile() is called with an anonymous stream, Ghostscript receives no filename or format hint and cannot seek backward. Modern PDFs commonly require non-linear reads to resolve cross-reference tables and compressed object streams, so any such PDF fails silently.

Test plan

  • Manual regression test: upload a multi-page PDF > 50 MB without this fix and confirm no thumbnail is generated; apply the fix and confirm a thumbnail is generated for the same file
  • Added testPdfUploadCreatesThumbnail() to PhotosAddHandlerImagickTest — uploads a PDF and asserts a 200×200 thumb is produced
  • Existing image-handler tests continue to pass

Summary by CodeRabbit

  • Bug Fixes

    • Fixed PDF thumbnail generation so first-page thumbnails reliably render for complex/large PDFs and temporary files are always cleaned up on failure.
  • Tests

    • Added a test verifying PDF uploads produce a 200x200 thumbnail variant.
  • Documentation

    • Added feature spec, plan, and task docs describing the PDF thumbnail fix, local vs remote PDF behavior, and acceptance criteria.
  • Chores

    • CI now installs Ghostscript in test runners to support PDF rendering.

mitpjones and others added 5 commits May 4, 2026 19:23
Ghostscript requires a real seekable file path with a .pdf extension to
render PDF pages correctly. Using readImageFile() with an anonymous stream
causes "Page drawing error" for complex PDFs because Ghostscript cannot
seek back through the stream to resolve cross-references and object streams.

For local files, pass the file path directly to readImage('[0]').
For remote files, copy to a named temp file first.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…test

031 was already taken by configurable-webhooks. Renamed folder and updated
all FR/NFR IDs, paths, and references to 038. Also includes post-implementation
cleanup: Safe\fclose, Safe\tempnam, Safe\unlink imports replacing @Unlink
suppression, and the testPdfUploadCreatesThumbnail regression test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mitpjones mitpjones requested a review from a team as a code owner May 4, 2026 08:51
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c3d875c7-83a3-4db0-84b7-bb271aa3e8f4

📥 Commits

Reviewing files that changed from the base of the PR and between eb26e19 and bac9d75.

📒 Files selected for processing (1)
  • .github/workflows/php_tests.yml
✅ Files skipped from review due to trivial changes (1)
  • .github/workflows/php_tests.yml

📝 Walkthrough

Walkthrough

ImagickHandler::load() now branches PDF vs non‑PDF. PDFs use a real filesystem path (native local, local Flysystem, or a named .pdf temp file for remote sources) and render only the first page via readImage($path . '[0]') with finally cleanup. Non‑PDFs keep stream buffering and readImageFile().

Changes

PDF Thumbnail Generation Fix

Layer / File(s) Summary
Architecture / Spec / Tasks
docs/specs/4-architecture/features/038-pdf-thumbnail-fix/spec.md, plan.md, tasks.md
Adds spec/plan/tasks describing root cause (anonymous PHP streams + Ghostscript), mandates real .pdf paths for PDFs, temp-file handling for remote files, guaranteed cleanup, and marks implementation complete.
Imports / Helpers
app/Image/Handlers/ImagickHandler.php
Adds Safe\* stream/files helpers and file-type imports; introduces private getLocalPath(MediaFile $file): ?string to resolve native/local Flysystem paths or return null for remote files.
Core Implementation
app/Image/Handlers/ImagickHandler.php
Refactors load() to branch early: PDFs -> loadPdf() which uses getLocalPath() or writes a named .pdf temp file, calls Imagick::readImage($pdfPath . '[0]'), and ensures temp-file deletion in finally; non‑PDFs -> loadFromStream() with seekability buffering and readImageFile(); autoRotate() stays after branching.
Tests / CI
tests/ImageProcessing/Image/Handlers/PhotosAddHandlerImagickTest.php, .github/workflows/php_tests.yml
Adds testPdfUploadCreatesThumbnail() verifying PDF thumbnail generation and updates CI to install ghostscript alongside existing image tooling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐇 I nibbled bytes from streams at night,
I stitched a .pdf path just right.
First page I plucked, a tiny bloom,
No more Ghostscript gloom—hop! —thumbs resume. 🥕✨

🚥 Pre-merge checks | ✅ 1
✅ Passed checks (1 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/specs/4-architecture/features/038-pdf-thumbnail-fix/spec.md (1)

47-59: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the required footer at the bottom of this spec.

This new markdown doc ends without the mandated --- + *Last updated: 2026-05-04* footer.

As per coding guidelines, "**/*.md: Use Markdown format for documentation" and "At the bottom of documentation files, add an hr line followed by Last updated: [date of the update]`."


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7154ea7d-f5f5-47c2-ba8d-32f7955a8fff

📥 Commits

Reviewing files that changed from the base of the PR and between 2938638 and 9467ea3.

📒 Files selected for processing (5)
  • app/Image/Handlers/ImagickHandler.php
  • docs/specs/4-architecture/features/038-pdf-thumbnail-fix/plan.md
  • docs/specs/4-architecture/features/038-pdf-thumbnail-fix/spec.md
  • docs/specs/4-architecture/features/038-pdf-thumbnail-fix/tasks.md
  • tests/ImageProcessing/Image/Handlers/PhotosAddHandlerImagickTest.php

Comment thread app/Image/Handlers/ImagickHandler.php Outdated
Comment thread docs/specs/4-architecture/features/038-pdf-thumbnail-fix/plan.md
Comment thread docs/specs/4-architecture/features/038-pdf-thumbnail-fix/tasks.md
- Use tempnam() + rename() to avoid leaving an orphaned placeholder file
- Wrap stream copy in its own try/finally to guarantee fclose() runs
- Move readImage() inside the outer try/finally so temp file is cleaned up
  regardless of where an exception is thrown
- Add is_file() guard before unlink() in case creation failed early
- Add Safe\rename import
- Update spec/plan/tasks doc footers and stale last-updated dates

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ildyria
Copy link
Copy Markdown
Member

ildyria commented May 4, 2026

Hi @mitpjones thank you for your PR.

Can I just ask you to extract the content of the if ($is_pdf) into a private function and similar for the normal path in order to reduce the complexity of the function.

Other than that LGTM. :)

Reduces cyclomatic complexity of load() as requested in PR review.
PDF and stream loading logic now live in dedicated private methods
with their own resource cleanup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mitpjones
Copy link
Copy Markdown
Contributor Author

Hi @ildyria,

getting a bit late for me here now. I'll check tomorrow if you want any further changes.

Thanks,
Tim

PDF thumbnail generation requires Ghostscript as an Imagick delegate.
Without it, readImage() on a PDF path fails silently and no thumbnail
is produced, causing testPdfUploadCreatesThumbnail to fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 65.62500% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.36%. Comparing base (2938638) to head (bac9d75).
⚠️ Report is 6 commits behind head on master.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ildyria ildyria merged commit a931423 into LycheeOrg:master May 5, 2026
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants