Process sometimes stuck when decorating pipeline function with some error handling and an error occurs #113

Open
charlielito opened this issue Dec 4, 2023 · 0 comments
Labels
bug Something isn't working

charlielito commented Dec 4, 2023

Describe the bug
I have a simple pipeline that reads images from URLs. The pipeline function is decorated with a try-except wrapper that logs any error raised while fetching an image and returns None instead. In the main process, I explicitly raise an Exception whenever a result is None. Intermittently (some runs finish as expected), when an error occurs (e.g. the URL is invalid) and the function returns None, the process gets stuck after the exception is raised and the script freezes. Sometimes the explicitly raised Exception is not even shown, and the script still hangs.

Minimal code to reproduce

import logging
from io import BytesIO

import numpy as np
import pypeln as pl
import requests
from PIL import Image
from tqdm import tqdm


def try_catch_and_log(function):
    def wrapper_fun(row):
        try:
            return function(row)
        except Exception as e:
            logging.exception(e)
            # logging.error(e)
            return None

    return wrapper_fun


@try_catch_and_log
def url_to_numpy(url):
    print(url)
    response = requests.get(url)
    image = Image.open(BytesIO(response.content))
    numpy_array = np.array(image)
    return numpy_array


with open("data.txt") as f:
    lines = f.readlines()
images_urls = [line.split("\t")[0] for line in lines]

stage = pl.thread.map(url_to_numpy, images_urls, workers=10, maxsize=100)

for result in tqdm(stage, total=len(images_urls)):
    if result is None:
        raise Exception("WTF")
    else:
        print(result.shape)

The data.txt is attached:

data.txt

NOTE: The behavior is intermittent; you may need to run the script several times before it hangs.
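Since the hang is intermittent, a small harness that runs the script repeatedly under a timeout may make it easier to confirm. The following is a minimal sketch using only the standard library; `run_once` and `count_hangs` are hypothetical helper names, and the script name `wtf.py` (taken from the traceback in this report), the run count, and the timeout in the usage comment are assumptions to adjust.

```python
import subprocess
import sys


def run_once(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it was killed after timeout_s seconds."""
    try:
        # subprocess.run kills the child process itself when the timeout expires
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


def count_hangs(cmd, runs, timeout_s):
    """Run cmd `runs` times and count how many runs had to be killed."""
    return sum(run_once(cmd, timeout_s) is None for _ in range(runs))


# Example usage (assumed script name and limits):
# hangs = count_hangs([sys.executable, "wtf.py"], runs=20, timeout_s=120)
# print(f"{hangs} of 20 runs did not exit within 120 s")
```

A run that exits with the expected traceback returns a nonzero exit code, while a frozen run is reported as `None`, so the two outcomes are easy to tell apart.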

Expected behavior
The script should raise the error and exit.
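For comparison, here is a minimal sketch of the expected behavior that uses the standard library's `concurrent.futures.ThreadPoolExecutor` in place of pypeln. It is not a claim about how pypeln works internally. `fake_fetch` is a hypothetical, network-free stand-in for `url_to_numpy`, and `consume` is a hypothetical name for the main loop.

```python
import logging
from concurrent.futures import ThreadPoolExecutor


def try_catch_and_log(function):
    # Same decorator as in the report: log the error and return None
    def wrapper_fun(row):
        try:
            return function(row)
        except Exception as e:
            logging.exception(e)
            return None

    return wrapper_fun


@try_catch_and_log
def fake_fetch(url):
    # Hypothetical stand-in for url_to_numpy: fails for URLs containing "bad"
    if "bad" in url:
        raise ValueError(f"cannot identify image file {url!r}")
    return len(url)


def consume(urls, workers=10):
    results = []
    # Leaving the with-block waits for the worker threads to finish,
    # after which the exception raised below propagates to the caller
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(fake_fetch, urls):
            if result is None:
                raise RuntimeError("fetch failed")
            results.append(result)
    return results
```

With this pattern, calling `consume` on a list containing a failing URL logs the original error, raises `RuntimeError`, and returns control to the caller, which is the outcome expected from the pypeln version as well.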

Library Info
OS: Ubuntu 20.04
pypeln version: 0.4.9
python 3.7.12

Screenshots
When the script freezes, it prints the exception but never finishes executing.

Traceback (most recent call last):
  File "wtf.py", line 14, in wrapper_fun
    return function(row)
  File "wtf.py", line 27, in url_to_numpy
    image = Image.open(BytesIO(response.content))
  File "/opt/conda/lib/python3.7/site-packages/PIL/Image.py", line 3148, in open
    "cannot identify image file %r" % (filename if filename else fp)
PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7fc1a875d590>
https://farm8.staticflickr.com/7081/7317587058_b4b5f72cd7_o.jpg
  0%|                                                                                                                                                                   | 0/289 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "wtf.py", line 40, in <module>
    raise Exception("WTF")
Exception: WTF

If you press Ctrl+C, the following appears:

^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 2035, in shutdown
    h.acquire()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 843, in acquire
    self.lock.acquire()
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 2045, in shutdown
    h.release()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 850, in release
    self.lock.release()
RuntimeError: cannot release un-acquired lock

Sometimes the first exception is not shown, and you have to press Ctrl+C to stop the script.
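To see where the process is stuck without relying on Ctrl+C, one option might be to arm a `faulthandler` watchdog near the top of the script and to list the threads that are still alive. This is a diagnostic sketch only; `arm_hang_watchdog` and `live_threads` are hypothetical helper names, and the 30-second default is an assumption.

```python
import faulthandler
import sys
import threading


def arm_hang_watchdog(timeout_s=30.0, file=sys.stderr):
    """Dump the stack of every thread to `file` if the process is still running after timeout_s."""
    # exit=False: only report, do not terminate the process
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=file)


def live_threads():
    """Return the names of all threads, other than the main thread, that are still alive."""
    main = threading.main_thread()
    return [t.name for t in threading.enumerate() if t is not main]
```

Calling `arm_hang_watchdog()` before building the pipeline and printing `live_threads()` just before the explicit `raise` should show which threads are still running and, if the process freezes, what each of them is waiting on. If the script finishes normally, `faulthandler.cancel_dump_traceback_later()` cancels the pending dump.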

charlielito added the bug label on Dec 4, 2023