Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add basic support for JBIG2 by using jbig2dec #3163

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

stefan6419846
Copy link
Collaborator

Closes #1989.

@stefan6419846
Copy link
Collaborator Author

This requires at least version 0.15 of jbig2dec, which is not properly documented in the changelog and required me to look through the commits to find the offending change and its tags: ArtifexSoftware/jbig2dec@c0353a0

Copy link

codecov bot commented Feb 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.54%. Comparing base (f15ddca) to head (be7075b).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3163      +/-   ##
==========================================
+ Coverage   96.53%   96.54%   +0.01%     
==========================================
  Files          53       53              
  Lines        8914     8946      +32     
  Branches     1634     1639       +5     
==========================================
+ Hits         8605     8637      +32     
  Misses        185      185              
  Partials      124      124              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@j-t-1
Copy link
Contributor

j-t-1 commented Feb 27, 2025

Good new functionality.

The rationale for having decode_parms is so external code has a uniform interface for filters. Previously I found this very confusing, partly due to not knowing the PDF specification well. This has two negatives:

  1. We are relying on comments to explain an argument not being used (rather than this being fully in code by not having the argument, and then not needing the comment).
  2. Coding to external library users who may not even use a uniform interface, complicates our interfaces by having an unused argument. Sometimes this is mentioned in code reviews.

Because this function has kwargs, decode_parms could be swallowed by kwargs, that is remove decode_parms argument when not used.

@stefan6419846
Copy link
Collaborator Author

I have more or less copied the existing filters and used the same approach here. For now, I would stay with the current approach and maybe discuss this in a dedicated issue, although there probably will not be much input about this and there is no real urgency to change the current approach in this regard.

@j-t-1
Copy link
Contributor

j-t-1 commented Feb 27, 2025

Appreciate all the added functionality.

I have more or less copied the existing filters and used the same approach here. For now, I would stay with the current approach and maybe discuss this in a dedicated issue, although there probably will not be much input about this and there is no real urgency to change the current approach in this regard.

Agree low urgency, and understandable good approach.

Just opened the specification and of the ten filters, four do not have parameters:

ASCIIHexDecode
ASCII85Decode
RunLengthDecode
JPXDecode

So I was mistaken about JBIG2Decode parameters, and think that keeping the argument is the right thing to do here for this PR (even though it is not used).

So now we have three cases:

  1. decode_parms used.
  2. decode_parms not used (ASCIIHexDecode, ASCII85Decode, RunLengthDecode, JPXDecode).
  3. decode_parms used by PDF specification and not used by pypdf (JBIG2Decode).

I may raise an issue as suggested, as think that removing the argument when not used by the PDF specification (2) will simplify this for pypdf developers and follow the design principle of moving semantics out of comments into code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NotImplementedError: unsupported filter /JBIG2Decode
2 participants