Presto import triggers directory creation which does not always work #25

Open
kvantricht opened this issue Oct 18, 2023 · 2 comments

@kvantricht
Contributor

Hi guys,

During a Presto import, this line is always executed:

data_dir.mkdir(exist_ok=True, parents=True)

On some systems, such as our cluster, where Presto is shipped and deployed inside a Python wheel, this call fails. For now I have worked around it locally by catching the error instead of letting it propagate, and the rest of the code (computing Presto encodings) then ran on the cluster without issue. So it's something to look at.
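
For illustration, here is a minimal sketch of that kind of workaround. It is not the actual patch: the helper name `ensure_dir` is hypothetical and not part of Presto, and catching `OSError` is an assumption, since the exact exception raised on the cluster is not stated here.

```python
import warnings
from pathlib import Path


def ensure_dir(path: Path) -> bool:
    """Try to create `path`; warn instead of failing if that is not possible.

    Hypothetical helper, not part of the Presto codebase.
    Returns True if the directory exists afterwards, False otherwise.
    """
    try:
        path.mkdir(exist_ok=True, parents=True)
    except OSError as exc:
        # Assumption: the failure surfaces as an OSError subclass
        # (for example PermissionError). Warn and carry on so that
        # importing the package does not crash.
        warnings.warn(f"Could not create directory {path}: {exc}")
        return False
    return True
```

Code that only needs the model and input construction could then continue after a failed call, while code that actually writes to the directory can check the return value first.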

@gabrieltseng
Collaborator

Which functions are you using from the Presto codebase?

I think it would make sense to split a lot of the functionality out into a (much simpler) pip-installable package, but my guess is that only two things (the Presto model itself, and the utils function to make data) are necessary.

As a note, we do have single_file_presto.py, which is intended for easy integration into other applications (it only requires copying a single file into the other application), but this doesn't solve the requirements issues.

@kvantricht
Contributor Author

At the moment we only use construct_single_presto_input and Presto.load_pretrained, and from the loaded model we only use the encoder. But we're only getting started ;-)
I'll take a look at the single file implementation!
