GH-41910: [Python] Add support for Pyodide #37822

Merged on Jul 5, 2024 (140 commits)

joemarshall
Contributor

@joemarshall joemarshall commented Sep 21, 2023

pyarrow knows about ARROW_ENABLE_THREADING and doesn't use threads if they are not enabled in libarrow.

Split from #37696
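
As an illustration of the behaviour described above (not code from this PR), the pattern is roughly "use a thread pool when threading is available, otherwise do the same work serially". The helper name `run_all` and its `threading_enabled` parameter are assumptions made up for this sketch; it uses only the Python standard library and does not call into pyarrow.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(tasks, threading_enabled):
    """Run zero-argument callables, using a thread pool only when allowed.

    Hypothetical helper: `threading_enabled` stands in for whatever flag the
    underlying library exposes (ARROW_ENABLE_THREADING in the case of libarrow).
    """
    if threading_enabled and len(tasks) > 1:
        with ThreadPoolExecutor(max_workers=4) as pool:
            futures = [pool.submit(task) for task in tasks]
            # Collect in submission order so both code paths return the same list.
            return [future.result() for future in futures]
    # Threads are unavailable (e.g. a single-threaded WebAssembly build):
    # fall back to running everything on the calling thread.
    return [task() for task in tasks]


tasks = [lambda i=i: i * i for i in range(5)]
serial = run_all(tasks, threading_enabled=False)
threaded = run_all(tasks, threading_enabled=True)
```

Both calls should produce the same results in the same order; only the execution strategy differs.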

@joemarshall
Contributor Author

@kou And these are the python changes

@jorisvandenbossche changed the title from "GH23221 - python changes for pyodide build" to "GH-23221: [Python] python changes for pyodide build" on Sep 25, 2023
@apache apache deleted a comment from github-actions bot Sep 25, 2023
@jorisvandenbossche
Member

@joemarshall thanks for the PR!

We might want to expose is_threading_enabled() in pyarrow publicly (it might be useful for downstream packages as well?), in __init__.py

> 2. pyarrow sets defaults for inclusion of submodules based on their inclusion in the arrow build. e.g. pyarrow.parquet is built only if ARROW_PARQUET is set. This makes it possible to build in situations where you don't have access to set the build environment variables (e.g. in cross compiling situations like pyodide).

This part is not actually included here? (or I don't understand the sentence)
(and it might also make sense to leave that for a third PR since it is changing the build process beyond emscripten?)

@joemarshall
Contributor Author

> @joemarshall thanks for the PR!
>
> We might want to expose is_threading_enabled() in pyarrow publicly (it might be useful for downstream packages as well?), in __init__.py
>
> > 2. pyarrow sets defaults for inclusion of submodules based on their inclusion in the arrow build. e.g. pyarrow.parquet is built only if ARROW_PARQUET is set. This makes it possible to build in situations where you don't have access to set the build environment variables (e.g. in cross compiling situations like pyodide).
>
> This part is not actually included here? (or I don't understand the sentence) (and it might also make sense to leave that for a third PR since it is changing the build process beyond emscripten?)

Sorry, I missed out putting in the setup.py changes. They're in now.

About is_threading_enabled(), it is currently in pyarrow.lib.is_threading_enabled(). Does it need to be top-level?
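
For downstream packages, a check along these lines might be enough whichever location is chosen. This is a sketch, not part of the PR: the wrapper `arrow_threading_enabled` and its `default` parameter are made up for illustration. Only `pyarrow.lib.is_threading_enabled()` comes from the discussion above, and the `getattr` guard assumes that pyarrow builds without this change lack the function.

```python
def arrow_threading_enabled(default=True):
    """Report whether the installed pyarrow says libarrow threading is enabled.

    Hypothetical wrapper. Falls back to `default` when pyarrow is not
    installed or does not provide `pyarrow.lib.is_threading_enabled`.
    """
    try:
        import pyarrow.lib as palib
    except ImportError:
        return default
    check = getattr(palib, "is_threading_enabled", None)
    if callable(check):
        return bool(check())
    return default


use_threads = arrow_threading_enabled()
```

A caller could then pass `use_threads` on to whichever of its own code paths would otherwise start worker threads.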

@joemarshall
Contributor Author

joemarshall commented Sep 28, 2023

Oh, and for now I have restricted the auto-setting of PYARROW_* so that it happens only on emscripten. I don't know whether that makes sense, but right now it isn't possible to build for emscripten without that change or something similar.
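
The idea of deriving PYARROW_* defaults from the Arrow build, limited to emscripten, could be sketched as below. This is not the logic in setup.py or CMakeLists.txt from this PR. The function `default_component_flags`, the `PYARROW_WITH_` prefix used for the derived names, and the set of accepted "truthy" strings are all assumptions chosen for the example.

```python
TRUTHY = ("1", "on", "true", "yes")


def default_component_flags(arrow_options, environ, is_emscripten):
    """Derive pyarrow component flags from Arrow build options.

    Hypothetical helper: `arrow_options` maps names such as "ARROW_PARQUET"
    to booleans, and `environ` is a mapping like os.environ.
    """
    flags = {}
    for arrow_name, enabled in arrow_options.items():
        if not arrow_name.startswith("ARROW_"):
            continue
        # Assumed naming scheme: ARROW_PARQUET -> PYARROW_WITH_PARQUET.
        pyarrow_name = "PYARROW_WITH_" + arrow_name[len("ARROW_"):]
        if pyarrow_name in environ:
            # An explicit setting always wins over the inferred default.
            flags[pyarrow_name] = environ[pyarrow_name].strip().lower() in TRUTHY
        elif is_emscripten:
            # Only infer defaults when cross-compiling for emscripten.
            flags[pyarrow_name] = bool(enabled)
    return flags


arrow_build = {"ARROW_PARQUET": True, "ARROW_FLIGHT": False}
inferred = default_component_flags(arrow_build, {}, is_emscripten=True)
native = default_component_flags(arrow_build, {}, is_emscripten=False)
overridden = default_component_flags(
    arrow_build, {"PYARROW_WITH_PARQUET": "0"}, is_emscripten=True
)
```

On a native build with no variables set, nothing is inferred, which matches the restriction described above; an explicitly set variable takes precedence on any platform.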

python/CMakeLists.txt (review thread: outdated, resolved)
@github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label on Oct 3, 2023
python/pyarrow/io.pxi (review thread: outdated, resolved)
python/setup.py (review thread: outdated, resolved)
python/CMakeLists.txt (review thread: outdated, resolved)
@github-actions bot removed the "awaiting committer review" label on Oct 3, 2023
@github-actions bot added the "awaiting review" label on Jul 5, 2024
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
@github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label on Jul 5, 2024
@joemarshall
Contributor Author

@ianmcook changes merged in; I'm running tests locally right now. We could probably fire off the emscripten tests here as well, to be sure.

@joemarshall
Contributor Author

Tests pass here, so hopefully we're ready to merge.

@ianmcook
Member

ianmcook commented Jul 5, 2024

@github-actions crossbow submit test-conda-python-emscripten

github-actions bot commented Jul 5, 2024

Revision: fa0e497

Submitted crossbow builds: ursacomputing/crossbow @ actions-e5764a79dd

Task Status
test-conda-python-emscripten GitHub Actions

@ianmcook
Member

ianmcook commented Jul 5, 2024

@github-actions crossbow submit -g wheel

github-actions bot commented Jul 5, 2024

Revision: fa0e497

Submitted crossbow builds: ursacomputing/crossbow @ actions-a1c5cd96da

Task Status
wheel-macos-big-sur-cp310-arm64 GitHub Actions
wheel-macos-big-sur-cp311-arm64 GitHub Actions
wheel-macos-big-sur-cp312-arm64 GitHub Actions
wheel-macos-big-sur-cp38-arm64 GitHub Actions
wheel-macos-big-sur-cp39-arm64 GitHub Actions
wheel-macos-catalina-cp310-amd64 GitHub Actions
wheel-macos-catalina-cp311-amd64 GitHub Actions
wheel-macos-catalina-cp312-amd64 GitHub Actions
wheel-macos-catalina-cp38-amd64 GitHub Actions
wheel-macos-catalina-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-arm64 GitHub Actions
wheel-manylinux-2-28-cp311-amd64 GitHub Actions
wheel-manylinux-2-28-cp311-arm64 GitHub Actions
wheel-manylinux-2-28-cp312-amd64 GitHub Actions
wheel-manylinux-2-28-cp312-arm64 GitHub Actions
wheel-manylinux-2-28-cp38-amd64 GitHub Actions
wheel-manylinux-2-28-cp38-arm64 GitHub Actions
wheel-manylinux-2-28-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp39-arm64 GitHub Actions
wheel-manylinux-2014-cp310-amd64 GitHub Actions
wheel-manylinux-2014-cp310-arm64 GitHub Actions
wheel-manylinux-2014-cp311-amd64 GitHub Actions
wheel-manylinux-2014-cp311-arm64 GitHub Actions
wheel-manylinux-2014-cp312-amd64 GitHub Actions
wheel-manylinux-2014-cp312-arm64 GitHub Actions
wheel-manylinux-2014-cp38-amd64 GitHub Actions
wheel-manylinux-2014-cp38-arm64 GitHub Actions
wheel-manylinux-2014-cp39-amd64 GitHub Actions
wheel-manylinux-2014-cp39-arm64 GitHub Actions
wheel-windows-cp310-amd64 GitHub Actions
wheel-windows-cp311-amd64 GitHub Actions
wheel-windows-cp312-amd64 GitHub Actions
wheel-windows-cp38-amd64 GitHub Actions
wheel-windows-cp39-amd64 GitHub Actions

@ianmcook
Member

ianmcook commented Jul 5, 2024

@joemarshall could you please sync your fork with upstream main? I think that will resolve the CI failures.

@joemarshall
Contributor Author

> @joemarshall could you please sync your fork with upstream main? I think that will resolve the CI failures.

Try now

@raulcd
Member

raulcd commented Jul 5, 2024

@github-actions crossbow submit -g wheel

github-actions bot commented Jul 5, 2024

Revision: 71a2f6a

Submitted crossbow builds: ursacomputing/crossbow @ actions-8f6301a1d5

Task Status
wheel-macos-big-sur-cp310-arm64 GitHub Actions
wheel-macos-big-sur-cp311-arm64 GitHub Actions
wheel-macos-big-sur-cp312-arm64 GitHub Actions
wheel-macos-big-sur-cp38-arm64 GitHub Actions
wheel-macos-big-sur-cp39-arm64 GitHub Actions
wheel-macos-catalina-cp310-amd64 GitHub Actions
wheel-macos-catalina-cp311-amd64 GitHub Actions
wheel-macos-catalina-cp312-amd64 GitHub Actions
wheel-macos-catalina-cp38-amd64 GitHub Actions
wheel-macos-catalina-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-arm64 GitHub Actions
wheel-manylinux-2-28-cp311-amd64 GitHub Actions
wheel-manylinux-2-28-cp311-arm64 GitHub Actions
wheel-manylinux-2-28-cp312-amd64 GitHub Actions
wheel-manylinux-2-28-cp312-arm64 GitHub Actions
wheel-manylinux-2-28-cp38-amd64 GitHub Actions
wheel-manylinux-2-28-cp38-arm64 GitHub Actions
wheel-manylinux-2-28-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp39-arm64 GitHub Actions
wheel-manylinux-2014-cp310-amd64 GitHub Actions
wheel-manylinux-2014-cp310-arm64 GitHub Actions
wheel-manylinux-2014-cp311-amd64 GitHub Actions
wheel-manylinux-2014-cp311-arm64 GitHub Actions
wheel-manylinux-2014-cp312-amd64 GitHub Actions
wheel-manylinux-2014-cp312-arm64 GitHub Actions
wheel-manylinux-2014-cp38-amd64 GitHub Actions
wheel-manylinux-2014-cp38-arm64 GitHub Actions
wheel-manylinux-2014-cp39-amd64 GitHub Actions
wheel-manylinux-2014-cp39-arm64 GitHub Actions
wheel-windows-cp310-amd64 GitHub Actions
wheel-windows-cp311-amd64 GitHub Actions
wheel-windows-cp312-amd64 GitHub Actions
wheel-windows-cp38-amd64 GitHub Actions
wheel-windows-cp39-amd64 GitHub Actions

@ianmcook
Member

ianmcook commented Jul 5, 2024

Checks look good to me. The CI failures are unrelated.

@kou do you think this is good to merge now? If so please go ahead.

Member

@kou kou left a comment

+1

@kou kou merged commit 2de8008 into apache:main Jul 5, 2024
59 of 63 checks passed
@kou removed the "awaiting committer review" label on Jul 5, 2024
@github-actions bot added the "awaiting merge" label on Jul 5, 2024

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 2de8008.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 31 possible false positives for unstable benchmarks that are known to sometimes produce them.

raulcd added a commit that referenced this pull request Jul 8, 2024
pyarrow knows about ARROW_ENABLE_THREADING and doesn't use threads if they are not enabled in libarrow.

Split from #37696 

* GitHub Issue: #41910

Lead-authored-by: Joe Marshall <joe.marshall@nottingham.ac.uk>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Jul 9, 2024
pyarrow knows about ARROW_ENABLE_THREADING and doesn't use threads if they are not enabled in libarrow.

Split from apache#37696 

* GitHub Issue: apache#41910

Lead-authored-by: Joe Marshall <joe.marshall@nottingham.ac.uk>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@jorisvandenbossche
Member

Thanks @joemarshall for the amazing effort and everyone for the reviews!


8 participants