Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large CPU and memory consumption on decoding a crafted GIF file #1052

Closed
Shnatsel opened this issue Oct 2, 2019 · 16 comments · Fixed by #2090
Closed

Large CPU and memory consumption on decoding a crafted GIF file #1052

Shnatsel opened this issue Oct 2, 2019 · 16 comments · Fixed by #2090

Comments

@Shnatsel
Copy link
Contributor

Shnatsel commented Oct 2, 2019

Tested on latest git commit as of this writing, ac93e75

Actual behavior

Decoding a crafted file mere 469 bytes in size takes over 7 gigabytes of memory. And not just virtual memory - it does get actually used and backed by physical memory.

This sample does not pose a problem as is, but it's possible that by applying the same technique it's possible to create very small files that OOM the system.

Expected

imagemagick rejects these files immediately. For the exact reason for each file see https://ncry.pt/p/6bPn#OPffqnjbN7RwGKJMwA0BQl3wuNlynKoUxw2dXZx9UKY

Reproduction steps

Same reproduction code as in #876

I'm attaching all samples classified as hangs by AFL so that you have the most complete info. The file id:000002,src:000000+000052,op:splice,rep:128 is the most egregious in terms of time and memory consumption.

image_gif_timeouts.zip

@HeroicKatora
Copy link
Member

HeroicKatora commented Oct 2, 2019

Woah! gif only has 16-bit sizes so this is definitely excessive.

@HeroicKatora
Copy link
Member

Or just enough to be possible with 2^32 * 4 bytes (width×height×bpp). Either way, it's a good reminder to have better limits.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Oct 2, 2019

For more on memory limits see #938

@fintelia
Copy link
Contributor

fintelia commented Oct 2, 2019

In fact, the code from #876 does implement a memory limit. width * height < 2_000_000_000 at 4 bytes per pixel enforces a limit of about 7.45 GiB. I don't think this particular case is especially worrying, but we should have a more general solution.

I alluded to this in that other thread, but perhaps our new Reader API should have:

/// Set the limit on total memory use. Attempting to read a file which would take more 
/// space than this will fail with ImageError::InsufficientMemory. Defaults to 128MB, set to 
/// `usize::MAX` to disable.
Reader::with_memory_limit(bytes: usize);
/// Set the limit on image size. Defaults to 16,384 x 16,384.
Reader::with_size_limit(width: u32, height: u32);

@HeroicKatora
Copy link
Member

Good catch, I had for some reason thought the limit was 2_000_000 pixels and there was a verification bug in the existing code. I'm glad it's just a little less bad.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Oct 3, 2019

It's worth noting that the 2_000_000_000 bytes check is not actually there in the library. The code in #876 that does the check is just a fuzzing harness and is not present in the crate itself.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Jan 3, 2024

I've updated the reproducing code to the latest version of image, and I can still reproduce the abnormally high memory consumption.

These files blow past the explicitly set size and allocation limits to use up over 7GB memory in image v0.24.7.

Code used:

extern crate image;

use image::AnimationDecoder;
use image::io::Limits;

use std::fs::File;
use std::io::Read;
use std::env;
 
fn main() {
    let filename = env::args().skip(1).next().expect("No file given");
    println!("Decoding {:?}", filename);
    let mut file = File::open(filename).unwrap();
 
    let mut contents: Vec<u8> = Vec::new();
    // Returns amount of bytes read and append the result to the buffer
    let _num_read = file.read_to_end(&mut contents).unwrap();
    gif_decode(&contents);
}

fn gif_decode(data: &[u8]) {
    let mut limits = Limits::default();
    limits.max_image_height = Some(16384); // 16384 * 16384 * 4 bytes per pixel = 1 GiB
    limits.max_image_width = Some(16384);
    limits.max_alloc = Some(1024 * 1024 * 1024); // 1 GiB
    if let Ok(decoder) = image::codecs::gif::GifDecoder::with_limits(data, limits) {

        let mut frames = decoder.into_frames();
        while let Some(Ok(_)) = frames.next() {
            // we don't use the frames for anything
        }
    }
}

Notable differences from the previous code:

  1. No longer uses .collect_frames(), reads frames one-by-one and discards the contents immediately
  2. Uses the built-in limiting facilities for both dimensions and memory usage

If I did the math right, both size and memory limiting are ineffective for these files - they both constrain memory usage to 1GiB, but I am seeing up to 7GiB memory usage in practice.

@fintelia
Copy link
Contributor

fintelia commented Jan 3, 2024

If you call GifDecoder::set_limits that should at least return an error if the dimensions exceed what you set.

Really, we should deprecate GifDecoder::with_limits and for now have it internally call new followed by set_limits.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Jan 3, 2024

Using .new() + a separate .set_limits() call does enforce the width/height limits!

These files still blow right past the allocation limit, but I'm not sure if that's even supported for gif.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Jan 3, 2024

It might be useful to at least do a crude check against the memory limit: width * height * bit_depth <= limits.max_alloc and return an error if the image clearly exceeds that.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Jan 3, 2024

Oh, there is native support for a memory limit in the gif crate: https://docs.rs/gif/latest/gif/struct.MemoryLimit.html

It's just not wired up to image for some reason.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Jan 3, 2024

Hmm, I think the gif crate limits are not expressive enough. They only allow setting the limit of up to u32 bytes: https://docs.rs/gif/latest/src/gif/reader/mod.rs.html#32-63

But you can have a GIF image that decodes into up to width * height * bits_per_pixel which can be at most u16::MAX * u16::MAX * 4 which is 16GiB, while u32::MAX is 4GiB.

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Jan 3, 2024

I've opened a pull request for the gif crate to fix its memory limit implementation: image-rs/image-gif#156

Once that's merged and shipped, I can pass through the memory limit set in image to the underlying gif decoder and finally fix this.

@fintelia
Copy link
Contributor

fintelia commented Jan 4, 2024

I just checked thegif crate. It only checks the memory limit in read_next_frame which is never called from the image crate. It also only applies to the memory allocation used for the output buffer so any scratch buffers used during the decoding process aren't counted.

However, the general convention in the image crate is that decoders should not count output buffers towards the allocation limit because the API of ImageDecoder::read_image is specifically designed so that the allocation happens in the caller's code and not within the decoder. The higher-level io::Reader API then decrements the allocation limit by the output buffer size before decoding starts

@Shnatsel
Copy link
Contributor Author

Shnatsel commented Jan 4, 2024

It is great that image implements a generic approach that counts the output buffer size towards the limit! But it does not appear to be working in this case.

If I take the above code and uncomment the limits on the dimensions, it creates a very large output buffer and consumes 7GiB again. I suspect that check is not performed in GifDecoder.into_frames().next(). Any tips on how to plumb it in there somewhere in a robust manner?

@fintelia
Copy link
Contributor

fintelia commented Jan 4, 2024

The into_frames API is kind of a mess, so I'm not sure if there's clean way to do it. But you probably have to pass in either the limits or the error that the limits are exceeded as a field within the object that gets returned. And then when next is called, you'd return that error instead of decoding a frame.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants