Conversation

Contributor

@1c3t3a 1c3t3a commented Nov 21, 2025

This change closes gaps we observed in the current metadata extraction for PNG. There are two significant changes:

  • For all metadata types, also search for metadata in the zTXt, tEXt, and iTXt chunks.
  • If we find metadata in the zTXt chunk, handle it like ImageMagick metadata and interpret it as a hex string prefixed by a newline-separated header.

We tested these methods extensively against all internal test files we could find, and also ran them against all test images in libpng and pngsuite; so far we have not found a single image whose zTXt metadata was not encoded like the ImageMagick hex string.
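As a rough illustration of the format described above, a minimal sketch of such a decoder might look like this. The simplified signature, the exact layout (empty line, label, padded decimal size, hex body), and the size cap are assumptions for illustration; the PR's real `parse_raw_profile` differs in detail:

```rust
/// Decode an ImageMagick "raw profile" style text payload (assumed layout:
/// a leading newline, a label line, a right-padded decimal byte count, then
/// newline-wrapped hex digits). Returns `None` if the structure doesn't match.
fn parse_raw_profile(buffer: &str) -> Option<Vec<u8>> {
    let mut parts = buffer.splitn(4, '\n');
    let _empty = parts.next()?; // the text starts with '\n'
    let _label = parts.next()?; // e.g. "exif"; discarded here
    let size: usize = parts.next()?.trim().parse().ok()?;
    // Hypothetical cap mirroring the 4 MB limit discussed in the review.
    if size > 4 * 1024 * 1024 {
        return None;
    }
    let body = parts.next()?;
    // Strip the newlines that wrap the hex digits, then decode pairwise.
    let hex: Vec<u8> = body.bytes().filter(|b| !b.is_ascii_whitespace()).collect();
    if hex.len() != size * 2 {
        return None;
    }
    let mut out = Vec::with_capacity(size);
    for pair in hex.chunks(2) {
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        out.push((hi * 16 + lo) as u8);
    }
    Some(out)
}

fn main() {
    // "45786966" is hex for the bytes b"Exif".
    assert_eq!(parse_raw_profile("\nexif\n       4\n45786966"), Some(b"Exif".to_vec()));
    assert_eq!(parse_raw_profile("not a raw profile"), None);
    println!("ok");
}
```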

@1c3t3a 1c3t3a force-pushed the png-metadata branch 2 times, most recently from a0e290d to 0275027 on November 21, 2025 at 06:21.
Member

@197g 197g left a comment


I like it, the internal test data sounds encouraging. Regarding images, what's the status of attribution to include them here?

Comment on lines +108 to +112
56, 66, 73, 77, 4, 4, 0, 0, 0, 0, 0, 49, 28, 2, 110, 0, 24, 65, 73, 45, 71, 101, 110, 101,
114, 97, 116, 101, 100, 32, 119, 105, 116, 104, 32, 71, 111, 111, 103, 108, 101, 28, 2, 90,
0, 8, 75, 105, 110, 103, 115, 116, 111, 110, 28, 2, 0, 0, 2, 0, 4, 0, 56, 66, 73, 77, 4,
37, 0, 0, 0, 0, 0, 16, 67, 89, 196, 70, 206, 234, 16, 4, 50, 89, 230, 125, 147, 191, 230,
81,
Member


I assume this changed as a result of the hex decoding? At least the original test data looks like mostly hex digits at a glance.

let parse = || {
let mut parts = buffer.splitn(4, '\n');

// Skip the first two parts, grab the size (3rd part), then the body.
Member


Nit: We're now dropping the contents of the second part — what would be contained in it? The example only says "Optional", but if I understand the ImageMagick source correctly, it is any ASCII whitespace padding up to the first non-whitespace character.
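To make the split concrete, here is what `splitn(4, '\n')` yields on a payload in the assumed layout (the sample string and the labels on each part are illustrative; the exact contents of the second part are the open question above):

```rust
/// Split a raw-profile text payload into its four parts using the same
/// `splitn(4, '\n')` call as the code under review.
fn profile_parts(buffer: &str) -> Vec<&str> {
    buffer.splitn(4, '\n').collect()
}

fn main() {
    // Hypothetical payload: "4578696620686578" is hex for b"Exif hex" (8 bytes).
    let parts = profile_parts("\nexif\n       8\n4578696620686578");
    assert_eq!(parts[0], "");                 // text begins with '\n'
    assert_eq!(parts[1], "exif");             // the dropped second part
    assert_eq!(parts[2], "       8");         // right-padded decimal byte count
    assert_eq!(parts[3], "4578696620686578"); // hex body (may span newlines)
    println!("ok");
}
```

Note that `splitn(4, …)` leaves any further newlines intact inside the fourth part, which is why the hex body can stay newline-wrapped.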

///
/// This method parses such data and returns the decoded payload.
/// If the structure doesn't match, we return an empty `Vec<u8>`.
fn parse_raw_profile(buffer: &str, header: Option<&str>) -> Vec<u8> {
Contributor


Is this something that would make sense to expose as a public method from the png crate?

Contributor Author


Oh yeah, that would really make sense! It should be a method on ZTXtChunk. I will open a PR shortly to add that.

let size_str = parts.nth(2)?;
let body = parts.next()?;

// Parse size and validate a 4MB limit.
Contributor


Where does this 4MB come from? Is it from the underlying standards, or something you're adding to avoid running out of memory?
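One concrete reading of the guard being asked about: treat the decoded size as untrusted input and reject it before allocating. The helper and constant names here are hypothetical, and whether the 4 MB figure is standards-based is exactly the open question:

```rust
// Assumed cap matching the 4 MB limit discussed in the review.
const MAX_PROFILE_SIZE: usize = 4 * 1024 * 1024;

/// Parse the size field from the raw profile header and apply the cap,
/// so a hostile file cannot request an arbitrarily large allocation.
fn checked_size(size_str: &str) -> Option<usize> {
    let size: usize = size_str.trim().parse().ok()?;
    (size <= MAX_PROFILE_SIZE).then_some(size)
}

fn main() {
    assert_eq!(checked_size("    1024"), Some(1024));
    assert_eq!(checked_size("99999999999"), None); // over the cap
    assert_eq!(checked_size("not a number"), None);
    println!("ok");
}
```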
