Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoiding scratch_buffer (a field of struct Reader<R>) #417

Open
anforowicz opened this issue Oct 4, 2023 · 1 comment
Open

Avoiding scratch_buffer (a field of struct Reader<R>) #417

anforowicz opened this issue Oct 4, 2023 · 1 comment

Comments

@anforowicz
Copy link
Contributor

There is a TODO in next_interlaced_row asking to "change the interface of next_interlaced_row to take an output buffer instead of making us return a reference to a buffer that we own". I very much agree with this TODO - it seems that it would be best to output directly to the final buffer (as the next_frame API does) rather than forcing the caller to copy the bytes. I assume that outputting directly to the final buffer would be good for:

  • Reducing the number of memcpy-like calls
  • Reducing the number of L1 cache misses
  • Reducing the memory pressure overall

FWIW, the performance considerations above mostly do not affect the next_frame API (which calls into lower-level functions like next_interlaced_row_impl for non-interlaced images) and therefore mostly do not affect png crate's benchmarks. OTOH, users of the png crate who wish to post-process the output (e.g. to transform RGB into RGBA, or alpha-multiply) may wish to do such post-processing row-by-row (while the freshly decoded row is still hot in the L1 cache). More specifically, the performance considerations to apply to:

  • image::codecs::png::PngReader::read (which calls next_row)
  • Current prototype integrating the png crate into Chromium (currently built on top of the image crate, but working directly with the png crate also wouldn't help because of the current shape of the next_row API)

So (given the presence of TODO + performance benefits), should I just go ahead and make a breaking change to the png::Reader::next_row and png::Reader::next_interlaced_row APIs?

  • Increase the crate version number in Cargo.toml
  • Remove struct Row<'data>
  • Remove struct InterlacedRow<'data>
  • Modify fn next_row:
    • Old: pub fn next_row(&mut self) -> Result<Option<Row>, DecodingError>
    • New: pub fn next_row(&mut self, buf: &mut [u8]) -> Result<Some<usize>, DecodingError>, documenting that:
      • This function may panic if buf is too small for the next row
      • This function returns the length of the row, or None is there is no next row
  • Modify fn next_interlaced_row:
    • Old: pub fn next_interlaced_row(&mut self) -> Result<Option<InterlacedRow>, DecodingError>
    • New: pub fn next_interlaced_row(&mut self, buf: &mut [u8]) -> Result<Option<InterlaceInfo>, DecodingError>, documenting that:
      • This function may panic if buf is too small for the next interlaced row
      • This function returns row info, or None if there is no next row

WDYT? Are there some alternative API designs that we should consider first?

@fintelia
Copy link
Contributor

fintelia commented Oct 4, 2023

That sounds good to me. If we're doing a breaking change, it might make sense to drop next_row entirely given that it is just a very thin wrapper over next_interlaced_row:

pub fn next_row(&mut self) -> Result<Option<Row>, DecodingError> {
self.next_interlaced_row()
.map(|v| v.map(|v| Row { data: v.data }))
}

anforowicz added a commit to anforowicz/image-png that referenced this issue Nov 1, 2023
This commit is desirable, because it avoids copying of image data across
intermediate buffers.  Avoiding such copying seems important based on
the data gathered in
image-rs#416 (comment)
Removal of `Reader::scrach_buffer` was not included in the initial
series of commits used to test the copy avoidance approach on
performance, but it should have a similar effect.

Fixes image-rs#417
anforowicz added a commit to anforowicz/image-png that referenced this issue Nov 1, 2023
This commit is desirable, because it avoids copying of image data across
intermediate buffers.  Avoiding such copying seems important based on
the data gathered in
image-rs#416 (comment)
Removal of `Reader::scrach_buffer` was not included in the initial
series of commits used to test the copy avoidance approach on
performance, but it should have a similar effect.

Fixes image-rs#417
anforowicz added a commit to anforowicz/image-png that referenced this issue Nov 15, 2023
This commit is desirable, because it avoids copying of image data across
intermediate buffers.  Avoiding such copying seems important based on
the data gathered in
image-rs#416 (comment)
Removal of `Reader::scrach_buffer` was not included in the initial
series of commits used to test the copy avoidance approach on
performance, but it should have a similar effect.

Fixes image-rs#417
anforowicz added a commit to anforowicz/image-png that referenced this issue Jul 17, 2024
This commit is desirable, because it avoids copying of image data across
intermediate buffers.  This commit was motivated by the data gathered in
image-rs#416 (comment)
The commit results in the followin performance gains seen in the
recently introduced `row-by-row/128x128-4k-idat` benchmark:

- time: [-18.401% -17.807% -17.192%] (p = 0.00 < 0.05)
- time: [-9.4276% -8.8789% -8.2860%] (p = 0.00 < 0.05)
- time: [-12.389% -11.780% -11.181%] (p = 0.00 < 0.05)

Fixes image-rs#417
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants