-
Notifications
You must be signed in to change notification settings - Fork 651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REFACTOR-#6939: Make 'modin.pandas.DataFrame._to_pandas' a public method #6940
Conversation
… a public method Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
modin/utils.py
Outdated
if ( | ||
isinstance(getattr(result, "name", None), str) | ||
isinstance(obj, BaseQueryCompiler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this additional condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, this logic was only applied to objects having public .to_pandas()
method (query compilers, low-level modin frames), so I thought that putting this condition is necessary. But on the second though, it seems that this logic works just fine for high-level dataframes as well, so removed the condition in the last commit
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
@dchigarev @YarShev it seems to me that we forgot that there is a special namespace for functionality that is not in Pandas - |
@anmyachev, that was done intentionally. We wanted to simplify conversion of a Modin DataFrame to the other data format by adding methods to the DataFrame object itself. We considered that would not break the statement: cc @dchigarev |
@YarShev by expanding the scope of |
We should probably have more people to achieve an agreement on this matter before releasing 0.28.0. |
I don't mind copying the # I want both of these calls to work:
df.to_pandas()
df.modin.to_pandas() Why: Why I'm not convinced with:
In a scenario where users may need Once again, I'm not against hiding the rest of the pandas API extensions in the |
We can add a link to an additional interface almost at the very beginning of the documentation; it is difficult to imagine a situation where even the readme is not read when working with the new library.
We can also add a new method reference in to |
Originally, we wanted to be able to go from and to pandas with the import change. If someone writes a workflow in Modin, it can also work in pandas by design. If we support APIs like this at the top level, it is not easy for users to find these types of APIs and change them (especially if we add more). Putting this under the |
I suppose we will proceed with |
What do these changes do?
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
modin.pandas.DataFrame._to_pandas
a public method #6939docs/development/architecture.rst
is up-to-date