Memory leaks in Image.close() after saving as pdf or png #6448

Closed
mansuf opened this issue Jul 17, 2022 · 8 comments · Fixed by #6456
@mansuf

mansuf commented Jul 17, 2022

What did you do?

I'm converting 16 PNG images (3000x4000 resolution) to a single PDF file. Because Pillow doesn't close the images after converting them, memory usage can grow very large. So I modified the code at https://github.com/python-pillow/Pillow/blob/9.2.0/src/PIL/PdfImagePlugin.py#L218-L227 to look like this:

            existing_pdf.write_obj(contents_refs[page_number], stream=page_contents)

            page_number += 1

        # Close image after converting to save some memory
        im_sequence.close() # <--- This is the modified code

    #
    # trailer
    existing_pdf.write_xref_and_trailer()
    if hasattr(fp, "flush"):
        fp.flush()
    existing_pdf.close()

What did you expect to happen?

Memory usage stays low during the conversion.

What actually happened?

This worked with Pillow 9.0.1, where memory usage stayed between 20 and 60 MB. After installing 9.1.0, 9.1.1, or 9.2.0, it rose to 700+ MB.
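As a rough sanity check on the 700+ MB figure, the following sketch estimates how much memory 16 fully decoded images of this size would occupy. It assumes the images are decoded as RGB or RGBA, which Pillow holds in memory at 4 bytes per pixel; the function name is made up for illustration, and other modes (for example palette or greyscale) would give a smaller number.

```python
def estimated_decoded_bytes(width, height, image_count, bytes_per_pixel=4):
    """Estimate the memory held by fully decoded images.

    bytes_per_pixel=4 is an assumption that matches Pillow's in-memory
    layout for RGB and RGBA images.
    """
    return width * height * bytes_per_pixel * image_count


total = estimated_decoded_bytes(3000, 4000, 16)
print(f"{total} bytes, about {total / 1024**2:.0f} MiB")
```

If all 16 images stay decoded until the end of the save, an estimate in this range would be consistent with the usage reported for 9.1.0 and later.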

See the following recordings:

Pillow 9.0.1
Pillow_v9.0.1.mp4
Pillow 9.1.0
Pillow_v9.1.0.mp4
Pillow 9.1.1
Pillow_v9.1.1.mp4
Pillow 9.2.0
Pillow_v9.2.0.mp4

What are your OS, Python and Pillow versions?

  • OS: Windows 10 Home Single Language 64-bit (10.0, Build 19044)
  • Python: 3.10.2 64-bit
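  • Pillow: 9.2.0

import time
from pathlib import Path
from glob import glob
from PIL import Image

base_path = Path('./test_images')
pdf_file = Path('test.pdf')

# Remove any PDF left over from a previous run
if pdf_file.exists():
    pdf_file.unlink()

# The root_dir argument requires Python 3.10 or later
raw_files = glob('*', root_dir=base_path)

# Open every image up front; the first one becomes the base image
files = [Image.open(base_path / i) for i in raw_files]

im = files.pop(0)

print('Converting')
im.save(pdf_file, save_all=True, append_images=files)
print('Done converting')

# Keep the process alive so its memory usage can be observed
time.sleep(9000)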
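For comparison, here is a minimal sketch that releases the source images from user code once the save has finished, without patching Pillow. The helper name `save_images_as_pdf` is hypothetical, and the sketch assumes the inputs are in a mode the PDF writer accepts (for example RGB). It would not be expected to lower peak memory during the save itself, which is what the modification above targets.

```python
from contextlib import ExitStack
from pathlib import Path

from PIL import Image


def save_images_as_pdf(image_paths, pdf_path):
    """Save all images into one PDF, then close every source image."""
    image_paths = list(image_paths)
    if not image_paths:
        raise ValueError("at least one image is required")

    with ExitStack() as stack:
        # Register each opened image so it is closed when the block
        # exits, even if save() raises.
        images = [stack.enter_context(Image.open(p)) for p in image_paths]
        first, rest = images[0], images[1:]
        first.save(pdf_path, format="PDF", save_all=True, append_images=rest)
```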
@mansuf changed the title from "Memory leaks in Image.close()" to "Memory leaks in PIL.PngImagePlugin.PngImageFile.close()" on Jul 17, 2022
@radarhere
Member

Could we have copies of the images? It would be helpful to remove as much ambiguity as possible from your example.

You changed the title to "Memory leaks in PIL.PngImagePlugin.PngImageFile.close". So you believe that this problem can be replicated without saving as PDFs and without using your Pillow modification?
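One way to make a report like this reproducible without sharing the original files is to generate synthetic images of the same dimensions. The sketch below is only an illustration: the helper name, file names, and colours are invented, and solid-colour images may not behave exactly like the reporter's PNGs.

```python
import tempfile
from pathlib import Path

from PIL import Image


def make_test_images(directory, count=16, size=(3000, 4000)):
    """Write `count` solid-colour RGB PNG files and return their paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for index in range(count):
        # Vary the colour slightly so the files are not identical.
        color = ((index * 15) % 256, 128, 255 - (index * 15) % 256)
        path = directory / f"synthetic_{index:02d}.png"
        with Image.new("RGB", size, color) as im:
            im.save(path)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Small sizes keep this demonstration quick.
        created = make_test_images(tmp, count=2, size=(64, 64))
        print([p.name for p in created])
```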

@radarhere
Member

radarhere commented Jul 18, 2022

If you're concerned about memory when saving PDFs, you might also be interested in the method in #4067 (comment), which uses less memory. If that works, it would also remove the need for your Pillow modification.

@mansuf
Author

mansuf commented Jul 18, 2022

> Could we have copies of the images? It would be helpful to remove as much ambiguity as possible from your example.

Unfortunately, I cannot share the images publicly in this issue for copyright reasons. Is there a way I can send them privately?

> You changed the title to "Memory leaks in PIL.PngImagePlugin.PngImageFile.close". So you believe that this problem can be replicated without saving as PDFs and without using your Pillow modification?

I assume that the variable im_sequence is a PIL.PngImagePlugin.PngImageFile. And no, the problem only happens in the low-level function _save() in PdfImagePlugin.py.

@mansuf
Author

mansuf commented Jul 18, 2022

> If you're concerned about memory when saving PDFs, you might also be interested in #4067 (comment). If that works, it would also remove the need for your Pillow modification.

I have tried this before. In my case, converting thousands of images to a PDF file in a single run this way is slow. I looked at the Pillow source and found that https://github.com/python-pillow/Pillow/blob/9.2.0/src/PIL/PdfImagePlugin.py#L112 is the cause, because the PDF plugin loads the whole contents of the existing PDF file. So modifying Pillow is the best option so far.
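For context, a page-at-a-time approach might look like the sketch below. It assumes the linked comment refers to the `append` option of Pillow's PDF writer; the helper name is hypothetical. Only one decoded image is held at a time, but each appended page reopens the existing PDF, which matches the slowdown described above when there are thousands of pages.

```python
from pathlib import Path

from PIL import Image


def save_pdf_incrementally(image_paths, pdf_path):
    """Write one PDF page per image, appending pages one at a time."""
    pdf_path = Path(pdf_path)
    for index, image_path in enumerate(image_paths):
        with Image.open(image_path) as im:
            # Convert to RGB so that images with an alpha channel can
            # still be written; this also yields an independent copy.
            page = im.convert("RGB")
        try:
            # The first page creates the file; later pages are appended.
            page.save(pdf_path, format="PDF", append=index > 0)
        finally:
            page.close()
```

The trade-off is time for memory: peak memory stays near the size of a single decoded image, while the total work grows with the size of the PDF written so far.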

@mansuf
Author

mansuf commented Jul 18, 2022

I cannot send you the original images that I used for my tests, but this one should do.

Source: https://commons.wikimedia.org/wiki/File:Lattinaichnusa.png

@mansuf changed the title from "Memory leaks in PIL.PngImagePlugin.PngImageFile.close()" to "Memory leaks in Image.close() after saving as other formats" on Jul 21, 2022
@mansuf changed the title from "Memory leaks in Image.close() after saving as other formats" to "Memory leaks in Image.close() after saving as pdf or png" on Jul 21, 2022
@mansuf
Author

mansuf commented Jul 21, 2022

After further testing with every extension available (PIL.Image.EXTENSION), these are the results I found.

Image I used: [image attachment]

Not affected

  • BMP
  • DIB
  • GIF
  • TIFF
  • JPEG
  • PPM
  • PCX
  • DDS
  • EPS
  • JPEG2000
  • ICNS
  • ICO
  • IM
  • MPO
  • SGI
  • TGA
  • WEBP

Failed

  • BLP (ValueError: Unsupported BLP image mode)
  • BUFR (OSError: BUFR save handler not installed)
  • CUR (KeyError: 'CUR')
  • DCX (KeyError: 'DCX')
  • FITS (OSError: FITS save handler not installed)
  • FLI (KeyError: 'FLI')
  • FTEX (KeyError: 'FTEX')
  • GBR (KeyError: 'GBR')
  • GRIB (OSError: GRIB save handler not installed)
  • HDF5 (OSError: HDF5 save handler not installed)
  • IPTC (KeyError: 'IPTC')
  • MPEG (KeyError: 'MPEG')
  • MSP (OSError: cannot write mode RGB as MSP)
  • PALM (OSError: cannot write mode RGB as Palm)
  • PCD (KeyError: 'PCD')
  • PIXAR (KeyError: 'PIXAR')
  • PSD (KeyError: 'PSD')
  • SUN (KeyError: 'SUN')
  • WMF (OSError: WMF save handler not installed)
  • XBM (OSError: cannot write mode RGB as XBM)
  • XPM (KeyError: 'XPM')

Affected

  • PDF
  • PNG
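A survey like the one above could be scripted roughly as follows. This is a sketch with an invented helper name, not the script that produced these results. It only reports whether each registered format can be saved, which corresponds to the "Failed" column; telling "Affected" from "Not affected" would additionally require measuring memory, which is not shown here.

```python
import io

from PIL import Image


def survey_save_support(im):
    """Try saving `im` in every registered format; map format -> outcome."""
    Image.init()  # make sure all bundled plugins are registered
    results = {}
    for fmt in sorted(set(Image.EXTENSION.values())):
        buffer = io.BytesIO()
        try:
            im.save(buffer, format=fmt)
        except Exception as exc:
            # Record the error in the same "Type: message" style as above.
            results[fmt] = f"{type(exc).__name__}: {exc}"
        else:
            results[fmt] = "saved"
    return results


if __name__ == "__main__":
    with Image.new("RGB", (16, 16), (200, 100, 50)) as sample:
        for fmt, outcome in survey_save_support(sample).items():
            print(f"{fmt}: {outcome}")
```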

@radarhere
Member

Thanks for the test image. Opening that image 16 times with your modification applied, I'm able to replicate the problem. The change in behaviour comes from bb9338e.

I've created PR #6456 to resolve this. If you could test it and confirm, that would be helpful.

@mansuf
Author

mansuf commented Jul 22, 2022

I'm confident that the PR resolves the problem. Thank you very much 👍
