Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support pd.read_pickle and pd.to_pickle in cudf.pandas #16105

Merged

Conversation

Matt711
Copy link
Contributor

@Matt711 Matt711 commented Jun 26, 2024

Description

Closes #15459

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@Matt711 Matt711 requested a review from a team as a code owner June 26, 2024 19:59
@Matt711 Matt711 requested review from wence- and lithomas1 June 26, 2024 19:59
@github-actions github-actions bot added Python Affects Python cuDF API. cudf.pandas Issues specific to cudf.pandas labels Jun 26, 2024
Copy link
Contributor Author

@Matt711 Matt711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm including the source for pd.read_pickle and pd.to_pickle so that pickle protocol is kept intact.

python/cudf/cudf/pandas/pickle.py Outdated Show resolved Hide resolved
@@ -937,8 +938,13 @@ def _fast_slow_function_call(
domain="cudf_pandas",
):
slow_args, slow_kwargs = _slow_arg(args), _slow_arg(kwargs)
with disable_module_accelerator():
result = func(*slow_args, **slow_kwargs)
if args[0].__name__ == "read_pickle":
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result = func(*slow_args, **slow_kwargs) led to infinite recursion so I had to get the argument needed for read_pickle. Is there a better way to do this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what is the part leading to recursion?

If the goal is to always just call pandas' read_pickle/to_pickle, it may make sense to manually define these method wrapper in python/cudf/cudf/pandas/_wrappers/pandas.py.

@Matt711 Matt711 added the non-breaking Non-breaking change label Jun 26, 2024
@Matt711 Matt711 changed the title Support pd.read_pickle in cudf.pandas Support pd.read_pickle and pd.to_pickle in cudf.pandas Jun 26, 2024
@Matt711 Matt711 force-pushed the feature/support-pandas-pickling branch from b604a54 to a6375a0 Compare June 27, 2024 19:09
python/cudf/cudf/pandas/_wrappers/pandas.py Outdated Show resolved Hide resolved
@Matt711 Matt711 self-assigned this Jul 2, 2024
Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A suggestion for simplification

Copy link

copy-pr-bot bot commented Jul 8, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Matt711 Matt711 added the bug Something isn't working label Jul 8, 2024
@Matt711
Copy link
Contributor Author

Matt711 commented Jul 9, 2024

/merge

@rapids-bot rapids-bot bot merged commit 341e014 into rapidsai:branch-24.08 Jul 9, 2024
81 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cudf.pandas Issues specific to cudf.pandas non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[BUG] Unpickling objects with pd.read_pickle() doesn't work with cudf.pandas enabled
4 participants