@nico nico commented Nov 17, 2025

Encoding a generic refinement region that refines the page contents
requires that the writer has access to the current page contents. For
some region types (text, halftone), computing the output bitmap isn't
trivial, and handling all the composition operators and so on is also
not trivial.

Reimplementing all this in the writer doesn't seem ideal.

So instead, when encountering a generic refinement region refining the
page contents in the writer, collect the encoded segment data for all
segments for the current page (and for the "global page" 0) that are in
front of the refinement region, and call into the loader to decode the
data so far, and then refine the bitmap returned by the loader.

This requires storing the encoded data for each segment, but we do that
already for implementing writing files in random access organization.

It also requires making the list of segments available on
JBIG2EncodingContext.
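
To make the segment collection step concrete, here is a minimal sketch of it. All names (`EncodedSegment`, `Bitmap`, `DecodeSegments`, `segments_in_front_of`, `reference_bitmap_for_page_refinement`) are hypothetical stand-ins, not the actual LibGfx types or functions, and the sketch uses the standard library instead of AK so it compiles on its own. The loader is modeled as a callback, on the assumption that it can decode an arbitrary list of encoded segments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for an already-encoded segment; not the LibGfx type.
struct EncodedSegment {
    uint32_t page_association { 0 }; // 0 means the "global page".
    std::vector<uint8_t> data;       // Encoded bytes, kept around for random-access output.
};

// Hypothetical bitmap, one byte per pixel for simplicity.
struct Bitmap {
    size_t width { 0 };
    size_t height { 0 };
    std::vector<uint8_t> pixels;
};

// Hypothetical loader entry point: decodes the given segments into a page bitmap.
using DecodeSegments = std::function<Bitmap(std::vector<EncodedSegment const*> const&)>;

// Returns the segments in front of `refinement_index` that belong to
// `page_number` or to the global page 0, in file order.
std::vector<EncodedSegment const*> segments_in_front_of(
    std::vector<EncodedSegment> const& segments, size_t refinement_index, uint32_t page_number)
{
    assert(refinement_index <= segments.size());
    std::vector<EncodedSegment const*> result;
    for (size_t i = 0; i < refinement_index; ++i) {
        auto const& segment = segments[i];
        if (segment.page_association == page_number || segment.page_association == 0)
            result.push_back(&segment);
    }
    return result;
}

// Asks the loader for the page contents decoded so far. The caller then
// refines the returned bitmap when encoding the refinement region.
Bitmap reference_bitmap_for_page_refinement(
    std::vector<EncodedSegment> const& segments, size_t refinement_index,
    uint32_t page_number, DecodeSegments const& decode)
{
    return decode(segments_in_front_of(segments, refinement_index, page_number));
}
```

Passing the loader as a callback keeps the sketch independent of any real decoder; in the writer, this would presumably be a direct call into the loader instead.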


This approach is O(n^2) in the number of refinement regions that refine the page, since each one re-decodes everything in front of it. But n is usually 0 or 1, and only the writer hits this code path. It doesn't seem worth the complexity of adding a code path that incrementally draws only the segments that are new since the last time.
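
As a rough worked count of that cost, under an assumed layout that is not taken from the PR: suppose a page has $b$ other segments followed by $n$ page-refining refinement regions. The $i$-th refinement region then makes the loader decode the $b$ base segments plus the $i - 1$ earlier refinement regions, so the total number of segment decodes is

$$
\sum_{i=1}^{n} \bigl(b + (i - 1)\bigr) = n\,b + \frac{n(n-1)}{2}.
$$

For $n = 1$ this is just $b$, a single extra decode of the page, and for $n = 0$ it is zero; the quadratic term only matters with many refinement regions on one page.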

nico added 7 commits November 17, 2025 16:04
This was blocked on not having test files. We're about to teach the
writer to generate test files and to add test files, so we can add
support for this now :^)
...and add some spec comments.

No behavior change.
The page refinement code will have to apply a crop to the page buffer.
With this, that's easy to do.

No behavior change.
They decode ok in Preview.app, Acrobat Reader, poppler, and HEAD PDFium.
@github-actions github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Nov 17, 2025
@nico nico merged commit 95e1462 into SerenityOS:master Nov 17, 2025
12 checks passed
@nico nico deleted the jbig2-refine-page branch November 17, 2025 23:54
@github-actions github-actions bot removed the 👀 pr-needs-review PR needs review from a maintainer or community member label Nov 17, 2025
nico added a commit to nico/serenity that referenced this pull request Nov 18, 2025
Similar to SerenityOS#26410, the challenge with refining halftone and text regions
is that the writer needs to know the decoded halftone and text region
bitmap. That data is easily available in the loader, but not in the
writer.

Similar to SerenityOS#26410, the approach is to call into the *loader* with a
list of segments needed to decode the intermediate region's data.

...and then some minor plumbing to hook up the intermediate region
types in jbig2-from-json.

With this, we can write all region segment types :^)
nico added a commit that referenced this pull request Nov 18, 2025