
Library wide memory limits #938

Closed
birktj opened this issue May 5, 2019 · 18 comments
Labels
discussion draft (Add if this issue includes suggested code, compares interfaces, prepares wording, etc.) · kind: enhancement · medium · topic: meta data (Towards dealing with meta data)

Comments

@birktj
Member

birktj commented May 5, 2019

By now most decoder libraries have some sort of Limits struct or sanity checks that stop them from trying to decode images that would make them run out of memory. However, this library doesn't have anything like that which could be used to configure the limits of the individual decoders. We should also use them to set a limit on the size of ImageBuffer and the like: with image-rs/image-png#115, for example, the png decoder will happily decode an image that is 100 000 x 100 000 pixels as long as it doesn't use too much memory. In these cases we should then have a second check that we are not allocating an ImageBuffer of that size.
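For illustration, a minimal sketch of such a second check (the helper name and byte budget are made up, not an existing API): compute the buffer size with overflow checks and refuse to allocate if it exceeds the budget.

fn checked_buffer_size(width: u32, height: u32, bytes_per_pixel: u64, max_bytes: u64) -> Option<u64> {
    // Refuse to allocate an ImageBuffer whose byte size overflows or exceeds the budget.
    let size = u64::from(width)
        .checked_mul(u64::from(height))?
        .checked_mul(bytes_per_pixel)?;
    if size <= max_bytes {
        Some(size)
    } else {
        None
    }
}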

@birktj birktj added the discussion draft, kind: enhancement, and medium labels on May 5, 2019
@Shnatsel
Contributor

TIFF now also implements a memory limit based on the maximum buffer size, just like the png crate:
https://github.com/image-rs/image-tiff/blob/f6a91065d937fac56e2b70dae90da20e40fa46c1/src/decoder/mod.rs#L125-L136

@HeroicKatora
Member

HeroicKatora commented Aug 31, 2019

And y4m also adopted one based on buffer size here (even though that's less relevant for images right now).

@fintelia
Contributor

fintelia commented Sep 1, 2019

There is also the ImageDecoder::total_bytes method, which is implemented for all decoders. That doesn't count intermediate allocations done while decoding an image, but those should generally be quite small. Perhaps that method could be used to have the new Reader struct take an optional memory limit on the size of images to load?
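A rough sketch of how such a check against total_bytes might look (the helper and its max_bytes parameter are illustrative, not an existing API, and it assumes the newer error module's limit error types):

use image::error::{ImageError, ImageResult, LimitError, LimitErrorKind};
use image::ImageDecoder;

// Illustrative helper: refuse to decode if the reported output size exceeds a byte budget.
fn check_total_bytes<'a>(decoder: &impl ImageDecoder<'a>, max_bytes: u64) -> ImageResult<()> {
    if decoder.total_bytes() > max_bytes {
        return Err(ImageError::Limits(LimitError::from_kind(
            LimitErrorKind::InsufficientMemory,
        )));
    }
    Ok(())
}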

@HeroicKatora
Member

HeroicKatora commented Oct 2, 2019

With the new io::Reader interface we could also support this in a semi-opaque way by adding a method to set a limit. This could have three possible internal states:

  • Do not enforce limits (default for now)
  • Enforce some useful default limits if the format supports limits (default in the next major version)
  • Enforce exact custom limits, error if none are supported
impl<R> Reader<R> {
    pub fn no_limits(&mut self) { }
    pub fn smart_limits(&mut self) { }
    pub fn set_limits(&mut self, _: Limits) { }
}
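One way those three states could be represented internally (purely illustrative; Limits here stands in for the yet-to-be-designed configuration type passed to set_limits above):

struct Limits; // placeholder for the limit configuration referenced above

// Illustrative internal state for the Reader, not an agreed-upon design.
enum LimitMode {
    // Do not enforce limits (current default).
    NoLimits,
    // Format-specific default limits where the decoder supports them.
    SmartLimits,
    // Exact user-provided limits; error if the format cannot enforce them.
    Exact(Limits),
}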

@fintelia
Contributor

fintelia commented Oct 2, 2019

I posted something very similar in the other thread. If we want to have limits on decoder::dimensions or decoder::total_bytes then we wouldn't need any special buy-in from implementations. I don't think any of them do large allocations at creation time, only when actually directed to decode the image, so we can query metadata without risking OOMs.

Ideally the limits should be in terms of total memory used, but the amount of memory used for the output seems like a reasonable compromise for now: for reasonably sized images the allocations required are at most 2x (or 3x?) the total output size.

@birktj
Member Author

birktj commented Oct 1, 2020

I'm going to have a second try at this. I think I have found a reasonably extensible approach that should work nicely.

I propose to add the following struct:

impl Limits {
    /// Decode without any limits on resource usage.
    pub fn no_limits() -> Limits { }

    /// Set a relaxed limit on the number of bytes that may be allocated during decoding. This is a
    /// "best effort" limit which may be exceeded in some cases.
    pub fn set_max_bytes_relaxed(&mut self, max_bytes: u64) { }

    /// Set a strict limit on the number of bytes that may be allocated during decoding. If the
    /// decoder cannot guarantee to uphold this limit then decoding will fail.
    pub fn set_max_bytes_strict(&mut self, max_bytes: u64) { }

    // Easy to add other limits in the future.
}
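For reference, one possible internal shape for this Limits struct that would back the is_strict() / get() calls used in the ImageDecoder default implementation further down (illustrative only, not part of the proposal):

#[derive(Clone, Default)]
pub struct Limits {
    max_bytes: LimitValue,
}

// Possible backing type for a single limit, covering "no limit", relaxed and strict.
#[derive(Clone)]
enum LimitValue {
    Unlimited,
    Relaxed(u64),
    Strict(u64),
}

impl Default for LimitValue {
    fn default() -> Self {
        LimitValue::Unlimited
    }
}

impl LimitValue {
    fn is_strict(&self) -> bool {
        matches!(self, LimitValue::Strict(_))
    }

    fn get(&self) -> Option<u64> {
        match self {
            LimitValue::Unlimited => None,
            LimitValue::Relaxed(n) | LimitValue::Strict(n) => Some(*n),
        }
    }
}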

And extend the Reader struct with the following methods. I think the default for Reader should be some reasonable default set of limits (currently relaxed).

impl<R> Reader<R> {
    /// Disable all limits.
    pub fn no_limits(&mut self) { }

    /// Set resource limits for decoding.
    pub fn set_limits(&mut self, limits: Limits) { }
}

And extend the ImageDecoder trait with the following method with a default implementation:

trait ImageDecoder {
    /// Set limits for this decoder.
    fn set_limits(&mut self, limits: Limits) -> ImageResult<()> {
        if limits.max_bytes.is_strict() {
            return Err(ImageError::Limits(error::LimitError::from_kind(
                error::LimitErrorKind::CannotGuarantee)))
        }

        if let Some(max_bytes) = limits.max_bytes.get() {
            if max_bytes < self.total_bytes() {
                return Err(ImageError::InsufficientMemory)
            }
        }

        Ok(())
    }
}
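And a hypothetical usage example, assuming Reader::set_limits forwards the limits to the chosen decoder before decoding (none of this is in the crate today):

fn load_with_limit(path: &str) -> image::ImageResult<image::DynamicImage> {
    // Start from an unrestricted Limits (the only constructor sketched above) and add a strict cap.
    let mut limits = Limits::no_limits();
    // Strict limit: decoding fails if the decoder cannot guarantee to stay under 512 MiB.
    limits.set_max_bytes_strict(512 * 1024 * 1024);

    let mut reader = image::io::Reader::open(path)?;
    reader.set_limits(limits); // proposed API from this comment
    reader.decode()
}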

@Shnatsel
Contributor

Shnatsel commented Oct 1, 2020

I'm not convinced that decoding failing at runtime is a good idea for strict limits. I'd rather have the strict limits method not be implemented on those decoders.

Speaking of relaxed and strict limits, I think it makes sense to separately limit the maximum size of the final image (which is easy to enforce for all decoders) and the RAM allocated in addition to storing the final image (which some decoders might not be able to enforce).
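A sketch of that split (field names made up for illustration): a cap on the final image size that every decoder can check up front, plus an optional cap on auxiliary allocations that only some decoders can honour.

pub struct Limits {
    // Maximum size of the final decoded image in bytes; enforceable for every
    // decoder via dimensions() / total_bytes() before any large allocation.
    pub max_image_bytes: Option<u64>,
    // Maximum memory allocated in addition to the output buffer; decoders that
    // cannot track their intermediate allocations would have to reject this.
    pub max_intermediate_bytes: Option<u64>,
}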

@birktj
Member Author

birktj commented Oct 1, 2020

For it to work together with io::Reader it would have to be a runtime check, right?

The idea is that it should be easy to add different kinds of fine-grained limits in the future. However, your idea of separating the total image size from other potential allocations sounds like a nice starting point. The maximum image size doesn't even need a strict/relaxed distinction.

@Shnatsel
Contributor

Shnatsel commented Oct 1, 2020

Yes, I suppose we would need a runtime check.

Total size of the resulting image is relatively easy to enforce, so I'd start with adding that to everything and then see if we need any limits that are more specialized. If we do, we can add them on a per-decoder basis.

Also, I wonder how this would interact with multi-frame images, such as animated GIF or APNG? I'm tempted to set a limit per frame, since it's easy to hand-roll a total size limit if you have a frame size limit, but not vice versa.
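To illustrate the per-frame idea (names made up; a real check would live inside the decoder before each frame buffer is allocated, this only shows the accounting):

use image::error::{ImageError, LimitError, LimitErrorKind};
use image::{AnimationDecoder, Frame, ImageResult};

// Collect frames, rejecting any single frame larger than the per-frame budget.
fn collect_frames<'a>(
    decoder: impl AnimationDecoder<'a>,
    max_frame_bytes: u64,
) -> ImageResult<Vec<Frame>> {
    let mut frames = Vec::new();
    for frame in decoder.into_frames() {
        let frame = frame?;
        let buf = frame.buffer(); // RGBA, 4 bytes per pixel
        let bytes = u64::from(buf.width()) * u64::from(buf.height()) * 4;
        if bytes > max_frame_bytes {
            return Err(ImageError::Limits(LimitError::from_kind(
                LimitErrorKind::InsufficientMemory,
            )));
        }
        frames.push(frame);
    }
    Ok(frames)
}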

@fintelia
Contributor

fintelia commented Oct 1, 2020

@birktj how does your design deal with adding additional strict limits later? Wouldn't decoders have to know about the change or else they'd silently ignore the new limits?

@birktj
Member Author

birktj commented Oct 1, 2020

Looking over the tiff decoder I see that we also still have the problem of setting limits before metadata is read. There the entire IFD (image file directory) is read into memory in order to get the metadata. This is problematic because it is one of the things we would probably like more fine-grained limits for. (Of course this may simply be a problem with the implementation in the tiff crate that we can fix later.)

This was also discussed in #1088 (comment)

@birktj
Member Author

birktj commented Oct 1, 2020

@fintelia I realize this would be a slight problem for external decoders. For internal decoders I thought the best approach would be to destructure the Limits struct to make sure all checks were accounted for; however, this is obviously not a possibility for external decoders.

Maybe some sort of ack_limit and remaining_limits could work, but this feels a bit messy?
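A very rough sketch of that idea (method names taken from the comment, everything else invented): the decoder marks each limit it actually enforces, and decoding fails if any strict limit was never acknowledged.

#[derive(Default)]
pub struct Limits {
    max_bytes: Option<u64>,
    max_bytes_strict: bool,
    max_bytes_acked: bool,
}

impl Limits {
    // Called by a decoder that enforces the byte limit itself.
    pub fn ack_limit(&mut self) {
        self.max_bytes_acked = true;
    }

    // True if a strict limit exists that no decoder acknowledged, in which case
    // decoding must fail rather than silently ignore it.
    pub fn remaining_limits(&self) -> bool {
        self.max_bytes.is_some() && self.max_bytes_strict && !self.max_bytes_acked
    }
}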

@HeroicKatora
Member

Looking over the tiff decoder I see that we also still have the problem of setting limits before metadata is read.

Instead of requiring limits before reading, could we avoid buffering the whole IFD to memory? Maybe we could skip all tags that are not relevant to the methods required by ImageDecoder and then re-read the image by seeking to the start.

@birktj
Member Author

birktj commented Oct 1, 2020

Yeah, I believe we can probably just consider this a quirk in the tiff crate that could be fixed over there at some point.

@birktj
Member Author

birktj commented Oct 1, 2020

Actually, on closer inspection it doesn't look so bad: it seems the hashmap only contains pointers to the real data, and reading of potentially large datatypes like strings is delayed until they are needed. This should mean that setting limits after the decoder is created should be fine.

@Shnatsel
Contributor

Shnatsel commented Oct 1, 2020

Unless I can create usize::MAX hashtable entries or some such. Not sure about TIFF, but Vorbis totally lets me do that.

@HeroicKatora
Member

HeroicKatora commented Oct 1, 2020

The number of baseline tags is finite and IIRC they can only appear once. Maybe we should even use a tinyvec plus a bitset or similar to read them into, instead of a HashMap, in any case.
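Purely illustrative (not actual image-tiff code): because the baseline tags form a small fixed set, a fixed-size array plus a "seen" bitset bounds memory without a HashMap.

const BASELINE_TAG_COUNT: usize = 64; // assumed upper bound, for the sketch only

struct BaselineTags {
    seen: u64,                          // one "seen" bit per baseline tag slot
    offsets: [u64; BASELINE_TAG_COUNT], // file offset of each tag's value
}

impl BaselineTags {
    fn new() -> Self {
        BaselineTags { seen: 0, offsets: [0; BASELINE_TAG_COUNT] }
    }

    fn insert(&mut self, slot: usize, offset: u64) {
        self.seen |= 1 << slot;
        self.offsets[slot] = offset;
    }

    fn get(&self, slot: usize) -> Option<u64> {
        if self.seen & (1 << slot) != 0 {
            Some(self.offsets[slot])
        } else {
            None
        }
    }
}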

@fintelia
Contributor

fintelia commented Oct 1, 2020

Also, TIFF tags are only 16 bits, so even if the image contained every possible tag (most of which aren't even defined...) the hashtable would only take a few megabytes.

@HeroicKatora HeroicKatora added the topic: meta data label on Mar 12, 2021