Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Importing 3rd party package ruamel.yaml anywhere causes failure to validate workflow #638

Open
adamh-oai opened this issue Sep 8, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@adamh-oai
Copy link

To reproduce:

  1. Checkout https://github.com/temporalio/samples-python
  2. poetry add ruamel.yaml
  3. Add import ruamel.yaml to the top of hello/hello_activity.py
  4. poetry run hello/hello_activity.py

This produces the error here: https://gist.github.com/adamh-oai/dfdb9b07b89bf3cee10da34ba2582805

Important parts:

  File "/Users/adamh/.virtualenvs/temporalio-samples-G-Ux_4V2-py3.11/lib/python3.11/site-packages/ruamel/yaml/representer.py", line 1132, in <module>
    RoundTripRepresenter.add_representer(TimeStamp, RoundTripRepresenter.represent_datetime)
  File "/Users/adamh/.virtualenvs/temporalio-samples-G-Ux_4V2-py3.11/lib/python3.11/site-packages/ruamel/yaml/representer.py", line 135, in add_representer
    cls.yaml_representers[data_type] = representer
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
TypeError: unhashable type: '_RestrictedProxy'

The above exception was the direct cause of the following exception:
...
  File "/Users/adamh/.virtualenvs/temporalio-samples-G-Ux_4V2-py3.11/lib/python3.11/site-packages/temporalio/worker/_workflow.py", line 118, in __init__
    raise RuntimeError(f"Failed validating workflow {defn.name}") from err
RuntimeError: Failed validating workflow GreetingWorkflow

This behavior is surprising to me because the workflow/activity is not actually using the ruamel package. The error persists if I move the workflow/activity to a separate module and wrap in a imports_passed_through block. Wrapping the ruamel import in this same block does resolve the error. It seems like the ruamel import may have side effects on other modules. So, I have a workaround, but (short of disabling sandboxing entirely) I'm concerned that in a large codebase with unpredictable 3rd party libraries this will be a recurring issue.

@adamh-oai adamh-oai added the bug Something isn't working label Sep 8, 2024
@cretz
Copy link
Member

cretz commented Sep 9, 2024

This is a sandbox issue. Despite those short "hello" samples being in the same file, we actually recommend workflows be in their own file. Can you replicate if the workflows are in a separate file and import the activities via "passthrough" similar to what's shown in the quickstart of the README: https://github.com/temporalio/sdk-python?tab=readme-ov-file#implementing-a-workflow?

@adamh-oai
Copy link
Author

Yes, it happens if the workflows are in their own module and imported via an unsafe block:

The error persists if I move the workflow/activity to a separate module and wrap in a imports_passed_through block. Wrapping the ruamel import in this same block does resolve the error. It seems like the ruamel import may have side effects on other modules. So, I have a workaround, but (short of disabling sandboxing entirely) I'm concerned that in a large codebase with unpredictable 3rd party libraries this will be a recurring issue.

@adamh-oai
Copy link
Author

A related issue: #301

@cretz
Copy link
Member

cretz commented Sep 9, 2024

Yes, it happens if the workflows are in their own module and imported via an unsafe block:

Ah, I see the bug already opened for making the proxy item hashable. I wonder then if this is a duplicate of that. In this case our wrapping of datetime causes this issue.

Wrapping the ruamel import in this same block does resolve the error. It seems like the ruamel import may have side effects on other modules.

To confirm, are you using this library inside the workflow and want the import reloaded every time the workflow is run? We recommend passing through all imports you know you'll use deterministically and/or you don't want completely reloaded every time. It's going to be a performance issue to reload third party libraries for every workflow run, but I understand the safety benefit.

I'm concerned that in a large codebase with unpredictable 3rd party libraries this will be a recurring issue.

Yes, most do not use many third party libraries inside their workflows, mostly only in activities and other code. But yes, our goal is to try our best to make third party libraries usable inside workflows but these hiccups happen here and there and we need to fix the linked bug.

@adamh-oai
Copy link
Author

To confirm, are you using this library inside the workflow and want the import reloaded every time the workflow is run?

This import is not used inside the workflow; in the real case where I hit this I was using it to load a yaml file for configuring the temporal client, but the workflows themselves don't import it or use it.

@cretz
Copy link
Member

cretz commented Sep 9, 2024

but the workflows themselves don't import it or use it.

This error only occurs in workflow code when loading it in a sandbox, so somehow this yaml code is getting executed/loaded in a workflow. Maybe the activity imports weren't being passed through when you split the workflow file off onto its own?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants