@nico nico commented Nov 13, 2025

There can be up to four custom Huffman tables for refinement in
text regions. Text regions also must refer to at least one symbol
dictionary, so customizing all refinement tables means that a text
region refers to five other segments.

That causes issues, especially in other decoders, see #26393.

Also, Acrobat Reader apparently can't handle a customized table for
refinement data sizes.

So instead of adding a single test that customizes all four tables,
add a bunch of tests, with the goal of most of them decoding in
several other JBIG2 decoders.

  1. Add a file with custom tables for refinement position.
    Decodes fine in PDFium, Acrobat Reader.
    Decodes mostly fine in Preview.app.

  2. Add a file with custom tables for refinement dimensions.
    Decodes fine in PDFium, Acrobat Reader.
    Decodes mostly fine in Preview.app.

  3. Add a file with custom tables for refinement data size.
    Decodes fine in PDFium.
    Decodes mostly fine in Preview.app.

  4. Add a file with custom tables for refinement position and
    dimensions. This has 5 referred-to segments.
    Decodes fine in Acrobat Reader (and HEAD PDFium).
    Decodes mostly fine in Preview.app.

  5. Finally, add one with all customized.
    PDFium can decode this as of a few hours ago, but nothing else can.
    (Preview.app comes fairly close.) Given that it's just a combination
    of tests 1-4 and those all decode fine somewhere, there's some reason
    to believe that the test itself is correct.

For the contents of the custom Huffman tables, I used the same
contents as in the default Huffman tables (B.14 for the first four,
B.1 for the last), but changed them a bit. For B.14, I put the 1-length
symbol in each of the four slots it doesn't occupy in the normal B.14
table, and for B.1 I swapped the first two entries.
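
To illustrate the B.14 variants, here is a minimal Python sketch (not the code used in this PR). It assumes that standard table B.14 covers the values -2..2 with prefix lengths 3, 3, 1, 3, 3, which is my reading of Annex B of the JBIG2 spec, and it assigns codes with the usual canonical prefix-code procedure. The function names are made up for this example, and which variant goes with which refinement table is not specified here.

```python
# Assumption: standard table B.14 as I read it in Annex B of the JBIG2 spec.
B14_VALUES = [-2, -1, 0, 1, 2]
B14_PREFIX_LENGTHS = [3, 3, 1, 3, 3]


def assign_prefix_codes(prefix_lengths):
    """Canonical prefix-code assignment; returns one bit string per table line."""
    max_len = max(prefix_lengths)
    len_count = [0] * (max_len + 1)
    for n in prefix_lengths:
        len_count[n] += 1
    len_count[0] = 0  # lines with prefix length 0 are unused and get no code

    codes = [""] * len(prefix_lengths)
    first_code = 0
    for cur_len in range(1, max_len + 1):
        # The first code of this length follows the last code of the previous length.
        first_code = (first_code + len_count[cur_len - 1]) * 2
        cur_code = first_code
        for i, n in enumerate(prefix_lengths):
            if n == cur_len:
                codes[i] = format(cur_code, f"0{cur_len}b")
                cur_code += 1
    return codes


def moved_short_code_variants(prefix_lengths):
    """Yield copies of the table with the 1-length entry moved to each other slot."""
    short = prefix_lengths.index(1)
    for slot in range(len(prefix_lengths)):
        if slot == short:
            continue
        lengths = list(prefix_lengths)
        lengths[short], lengths[slot] = lengths[slot], lengths[short]
        yield lengths


for lengths in moved_short_code_variants(B14_PREFIX_LENGTHS):
    print(dict(zip(B14_VALUES, assign_prefix_codes(lengths))))
```

Each variant keeps the same set of prefix lengths as B.14, so it is still a complete prefix code; only the value that gets the short code changes.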


Also, fix a bug in our implementation of writing custom tables that these tests found :^)

@github-actions github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Nov 13, 2025
@nico nico force-pushed the jbig2-stuff branch 2 times, most recently from d9e5770 to e9b41f5 Compare November 13, 2025 02:45
nico added 3 commits November 13, 2025 13:26
The custom tables in this test are meant to be the default tables with
some rows swapped, but I had a typo in the tables that are meant to
be B.1 with rows swapped: B.1 has a lowest value of 0, not 1.

No real behavior change (the test checked what it was supposed to
check with the "wrong" value too), just pedantry.

* The Huffman "refine one" test was added in SerenityOS#26388
* We now have tests for everything tested by Annex H except for
  a symbol refinement segment referring to itself, so use a more
  precise bullet for just that

This makes no difference when the default table is used, but it does
make a difference if a custom table is present.

@nico nico merged commit cadb38a into SerenityOS:master Nov 13, 2025
13 checks passed
@nico nico deleted the jbig2-stuff branch November 13, 2025 23:20
@github-actions github-actions bot removed the 👀 pr-needs-review PR needs review from a maintainer or community member label Nov 13, 2025