[HOTFIX] Update cuda-python dependency to 11.7.1 #952
Closed
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
…apidsai#906) Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Divye Gala (https://github.com/divyegala) URL: rapidsai#906
…ir` -> `raft::KeyValuePair` (rapidsai#905) cc @Nyrio Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Divye Gala (https://github.com/divyegala) URL: rapidsai#905
Part of rapidsai#535. Implementation of the raft::stats API with mdspan, with the C++ tests. 14 of 22 files are implemented; the remaining files will come in a following PR. Authors: - Micka (https://github.com/lowener) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#802
This coincides pretty well with the `pairwise_distance_argmin` building block that's being exposed in Scikit-learn, except it's faster and saves a lot of GPU memory by fusing the argmin with the pairwise distances, so we never have to store the n^2 distances. cc @betatim Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#924
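To illustrate the memory saving described above, here is a hypothetical pure-Python sketch (not the RAFT or Scikit-learn implementation; all names are invented) contrasting a naive argmin that materializes the full distance matrix with a fused version that keeps only a running minimum per query:

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points (lists of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def argmin_naive(queries, points):
    # Materializes the full len(queries) x len(points) distance matrix
    # before reducing: this is the O(n^2) memory cost the fused kernel avoids.
    dists = [[sq_dist(q, p) for p in points] for q in queries]
    return [min(range(len(points)), key=row.__getitem__) for row in dists]

def argmin_fused(queries, points):
    # Fuses the reduction with the distance computation: only a running
    # (index, distance) pair is kept per query, never the full matrix.
    out = []
    for q in queries:
        best_idx, best_d = 0, float("inf")
        for j, p in enumerate(points):
            d = sq_dist(q, p)
            if d < best_d:
                best_idx, best_d = j, d
        out.append(best_idx)
    return out
```

The two functions return identical results; only the peak memory differs, which is the point of fusing the argmin into the distance kernel.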
…uce_rows_by_keys` (rapidsai#909) `accumulate_into_selected` achieves much better performance than the previous implementation of `reduce_rows_by_keys` for large `nkeys` (`sum_rows_by_key_large_nkeys_kernel_rowmajor`). According to the benchmark that I added for this primitive, the difference is a factor of 240x for sizes relevant to IVF-Flat (and a factor of ~10x for smaller `nkeys`, e.g. 64). This is mostly because the legacy implementation, probably in an attempt to reduce atomic conflicts, assigned a key and a tile of the matrix to each block, and the block only reduced the rows corresponding to the assigned key. With a very large number of keys, e.g. 1k, this results in blocks iterating over a large number of rows (possibly tens of thousands) while only reading and accumulating 1 in 1k of them.

This PR:
- Replaces `sum_rows_by_key_large_nkeys_rowmajor` with `accumulate_into_selected` (I didn't find any cases in which the old kernel performed better).
- Removes `accumulate_into_selected` from `ann_utils.cuh`.
- Fixes support for custom iterators in `reduce_rows_by_keys`.
- Uses the raft prims in `calc_centers_and_sizes`.

Perf notes:
- The original kmeans gets a 15-20% speedup for large numbers of clusters.
- The performance of `ivf_flat::build` stays the same as before.
- There are a few extra steps since I separated the cluster size count from the reduction by key, but they are negligible in comparison.

Question: the change breaks support for host-side-only arrays in `calc_centers_and_sizes`. Is that actually a possibility? Should I add a branch and not use the raft prims when all arrays are host-side?

cc @achirkin @tfeher @cjnolet Authors: - Louis Sugy (https://github.com/Nyrio) Approvers: - Tamas Bela Feher (https://github.com/tfeher) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#909
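For readers unfamiliar with the primitive being discussed, here is a hypothetical pure-Python sketch of what reduce-rows-by-key computes (the function name and signature are invented, not the RAFT API): every input row is accumulated into the output row selected by its key, which `accumulate_into_selected` does on the GPU with atomics.

```python
def reduce_rows_by_key(matrix, keys, nkeys):
    """Sum the rows of `matrix` into `nkeys` output rows, where row i of
    the input is added into output row keys[i]."""
    ncols = len(matrix[0]) if matrix else 0
    out = [[0.0] * ncols for _ in range(nkeys)]
    for row, k in zip(matrix, keys):
        for j, v in enumerate(row):
            out[k][j] += v  # on the GPU this accumulation is an atomicAdd
    return out
```

The legacy kernel instead assigned one key per block and scanned all rows looking for matches, which is why its cost grew with `nkeys` even though each block used only a fraction of what it read.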
Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Divye Gala (https://github.com/divyegala) URL: rapidsai#932
Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#933
…he docs. (rapidsai#943) Lots of improvements to the docs all around:
- adding all new ANN docs and cleaning up neighborhood docs in general
- new quick-start tutorial in the docs
- docs added to many different APIs including core, clustering, and solvers
- making sure to include docs for mdspan/mdarray in the `core` docs
- updates to build.sh to only compile nn/dist libs when the appropriate tests/benchmarks are selected

Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#943
This PR fixes the calculation of the maximum cluster size. The previous calculation could return values much larger than the actual ones, which could lead to OOM while allocating temporary buffers when building an index for large datasets. Authors: - Tamas Bela Feher (https://github.com/tfeher) Approvers: - Artem M. Chirkin (https://github.com/achirkin) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#938
…e) (rapidsai#947) Closes rapidsai#946 Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#947
This PR removes the stale issue labeler workflow Authors: - Ray Douglass (https://github.com/raydouglass) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#951
Reorder channels.
bdice approved these changes on Oct 25, 2022
Switch to using a centralized rapids-cmake function for getting Google benchmark. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#954
This is a different approach / follow-up PR to rapidsai#663 for issue rapidsai#497. I implemented a `layout_padded_general` within raft to statically enforce padding on mdspan accesses.
* The layout has template parameters for `ValueType`, `StorageOrder` (default `row_major_t`), and `ByteAlignment` (default 128).
* In order to *not* require changes upstream, I skipped `submdspan` functionality for now. I have a branch on an mdspan fork where I tested this, though (https://github.com/mfoerste4/mdspan/tree/layout_padded).

Authors: - Malte Förster (https://github.com/mfoerste4) Approvers: - Artem M. Chirkin (https://github.com/achirkin) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#725
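The stride arithmetic behind such a padded layout can be sketched in a few lines. This is a hypothetical illustration only (the function name is invented; `layout_padded_general` enforces this statically in C++ templates): each row is padded so that its byte size is a multiple of the alignment, so every row starts on an aligned boundary.

```python
def padded_row_elems(ncols, itemsize, byte_alignment=128):
    """Smallest element count >= ncols whose byte size is a multiple of
    byte_alignment; this is the row stride a padded layout would use."""
    row_bytes = ncols * itemsize
    # Ceiling division to the next alignment boundary, then back to elements.
    padded_bytes = -(-row_bytes // byte_alignment) * byte_alignment
    return padded_bytes // itemsize
```

For example, a row of 100 float32 values (400 bytes) would be padded to 512 bytes, i.e. a stride of 128 elements, assuming the default 128-byte alignment.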
I recall having made this change initially and I'm wondering if it accidentally got reverted, since there have been so many hands in the IVF-Flat code recently. For proper end-to-end testing, we need the IVF-Flat test code to invoke the mdspan APIs (which in turn invoke the non-mdspan APIs). Authors: - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Divye Gala (https://github.com/divyegala) URL: rapidsai#955
Follow up to rapidsai#924, removing a `print` that managed to sneak in. cc @cjnolet Authors: - Tim Head (https://github.com/betatim) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#958
Adds a benchmark for `fusedL2NN`. The benchmark uses the wrapper `fusedL2NNMinReduce`, compiled in the distance library, for faster compilation times. Authors: - Louis Sugy (https://github.com/Nyrio) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#936
Closes rapidsai#873 Closes rapidsai#944 Authors: - Ben Frederickson (https://github.com/benfred) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#942
We can close this PR, since it is a duplicate of #963.
This should resolve a segfault we are seeing with cuda-python=11.7.0 (rapidsai/cudf#11941).
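The pin to 11.7.1 expresses a minimum-version constraint that excludes the broken 11.7.0 release. As a hypothetical illustration (not part of this PR or any RAPIDS package), a simple version comparison of the kind such a pin encodes can be sketched with a tuple comparison:

```python
def version_tuple(s):
    """Parse a dotted version string like '11.7.1' into a comparable tuple."""
    return tuple(int(p) for p in s.split("."))

def meets_pin(installed, minimum="11.7.1"):
    """True if the installed version satisfies the minimum required by the
    pin, e.g. cuda-python>=11.7.1 to avoid the 11.7.0 segfault."""
    return version_tuple(installed) >= version_tuple(minimum)
```

Real packaging tools use richer version semantics (pre-releases, epochs), so this sketch only covers plain dotted releases like the ones involved here.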