[HOTFIX] Update cuda-python dependency to 11.7.1 #952

Closed
wants to merge 54 commits into from

Conversation

shwina
Contributor

@shwina shwina commented Oct 25, 2022

This should resolve a segfault we are seeing with cuda-python=11.7.0 (rapidsai/cudf#11941).

raydouglass and others added 30 commits September 23, 2022 11:39
[gpuCI] Forward-merge branch-22.10 to branch-22.12 [skip gpuci]
Part of rapidsai#535.
Implementation of the raft::stats API with mdspan, with the C++ tests
14/22 Files implemented. The remaining files will come in a following PR.

Authors:
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#802
cjnolet and others added 8 commits October 19, 2022 17:45
This coincides pretty well with the `pairwise_distances_argmin` building block that's being exposed in scikit-learn, except it's faster and saves a lot of GPU memory by fusing the argmin with the pairwise distances, so we never have to store the n^2 distances.
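
To illustrate the memory argument, here is a minimal pure-Python sketch of a fused argmin. It is not the RAFT kernel or its API: the function name `fused_l2_argmin` and the choice of squared L2 distance are assumptions made for this example. The point is only that each query row keeps a running minimum, so the full distance matrix is never materialized.

```python
from typing import List, Sequence, Tuple


def fused_l2_argmin(
    xs: Sequence[Sequence[float]],
    ys: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """For each row of xs, return (index of nearest row of ys, squared L2 distance).

    Hypothetical helper for illustration only. Extra memory is O(len(xs))
    for the result, instead of O(len(xs) * len(ys)) for a distance matrix.
    """
    result = []
    for x in xs:
        best_idx, best_dist = -1, float("inf")
        for j, y in enumerate(ys):
            # Distance is computed and consumed immediately, never stored.
            d = sum((a - b) ** 2 for a, b in zip(x, y))
            if d < best_dist:  # strict '<' keeps the first index on ties
                best_idx, best_dist = j, d
        result.append((best_idx, best_dist))
    return result


nearest = fused_l2_argmin([[0, 0], [5, 5]], [[1, 0], [4, 5], [10, 10]])
```

An unfused version would first build the `len(xs)` by `len(ys)` matrix and then take a row-wise argmin, which is where the n^2 storage comes from.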


cc @betatim

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#924
…uce_rows_by_keys` (rapidsai#909)

`accumulate_into_selected` achieves much better performance than the previous implementation of `reduce_rows_by_keys` for large `nkeys` (`sum_rows_by_key_large_nkeys_kernel_rowmajor`). According to the benchmark that I added for this primitive, the difference is a factor of 240x for sizes relevant to IVF-Flat (and a factor of ~10x for smaller `nkeys`, e.g. 64).

This is mostly because the legacy implementation, probably in an attempt to reduce atomic conflicts, assigned a key and a tile of the matrix to each block, and each block reduced only the rows corresponding to its assigned key. With a very large number of keys, e.g. 1k, this results in blocks iterating over a large number of rows (possibly tens of thousands) while only reading and accumulating 1 in 1k of them.
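
The difference in access pattern can be sketched sequentially in Python. This is an illustration only, not the CUDA kernels: both function names are hypothetical, the "one block per key" structure is modeled as an outer loop over keys, and the atomics the GPU version would need are ignored. The `rows_visited` counter stands in for the wasted iteration described above.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def reduce_rows_per_key_scan(
    rows: Sequence[Sequence[float]], keys: Sequence[int], nkeys: int
) -> Tuple[Matrix, int]:
    """Legacy-style pattern: every key scans all rows and keeps only its own."""
    ncols = len(rows[0]) if rows else 0
    sums = [[0.0] * ncols for _ in range(nkeys)]
    rows_visited = 0
    for k in range(nkeys):  # stands in for "one block per key"
        for row, key in zip(rows, keys):
            rows_visited += 1
            if key == k:
                for c, v in enumerate(row):
                    sums[k][c] += v
    return sums, rows_visited


def reduce_rows_accumulate(
    rows: Sequence[Sequence[float]], keys: Sequence[int], nkeys: int
) -> Tuple[Matrix, int]:
    """Accumulate-style pattern: each row is read once and added to the
    output row selected by its key."""
    ncols = len(rows[0]) if rows else 0
    sums = [[0.0] * ncols for _ in range(nkeys)]
    rows_visited = 0
    for row, key in zip(rows, keys):
        rows_visited += 1
        for c, v in enumerate(row):
            sums[key][c] += v  # would be an atomic add on the GPU
    return sums, rows_visited


rows = [[1, 2], [3, 4], [5, 6], [7, 8]]
keys = [0, 1, 0, 1]
scan_sums, scan_visits = reduce_rows_per_key_scan(rows, keys, nkeys=2)
acc_sums, acc_visits = reduce_rows_accumulate(rows, keys, nkeys=2)
```

In this model the scan variant visits `nkeys * nrows` rows while the accumulate variant visits `nrows`, which matches the intuition for why the gap grows with `nkeys`.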

This PR:

- Replaces `sum_rows_by_key_large_nkeys_rowmajor` with `accumulate_into_selected` (I didn't find any cases in which the old kernel performed better).
- Removes `accumulate_into_selected` from `ann_utils.cuh`.
- Fixes support for custom iterators in `reduce_rows_by_keys`.
- Uses the raft prims in `calc_centers_and_sizes`.

Perf notes:

- The original kmeans gets a 15-20% speedup for large numbers of clusters.
- The performance of `ivf_flat::build` stays the same as before.
- There are a bunch of extra steps since I separated the cluster size count from the reduction by key, but their cost is negligible in comparison.

Question: the change breaks support for host-side-only arrays in `calc_centers_and_sizes`. Is that actually a possibility? Should I add a branch and not use the raft prims when all arrays are host-side?

cc @achirkin @tfeher @cjnolet

Authors:
  - Louis Sugy (https://github.com/Nyrio)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#909
…he docs. (rapidsai#943)

Lots of improvements to the docs all around:

- adding all new ANN docs and cleaning up neighborhood docs in general
- new quick-start tutorial in the docs
- docs added to many different APIs including core, clustering, and solvers
- making sure to include docs for mdspan/mdarray in the `core` docs
- updates to build.sh to only compile nn/dist libs when the appropriate tests/benchmarks are selected

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#943
This PR fixes the calculation of max cluster size. 

The previous calculation could return values much larger than the actual ones, which could lead to OOM when allocating temporary buffers while building an index for large datasets.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#938
@shwina shwina requested a review from a team as a code owner October 25, 2022 19:04
raydouglass and others added 3 commits October 25, 2022 19:09
This PR removes the stale issue labeler workflow

Authors:
  - Ray Douglass (https://github.com/raydouglass)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: rapidsai#951
vyasr and others added 8 commits October 25, 2022 23:53
Switch to using a centralized rapids-cmake function for getting Google benchmark.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#954
This is a different approach / followup PR of rapidsai#663 for issue rapidsai#497.

I implemented a `layout_padded_general` within raft to statically enforce padding on mdspan accesses.
* The layout has template parameters for `ValueType`, `StorageOrder` (default `row_major_t`), and `ByteAlignment` (default 128).
* In order to *not* require changes upstream, I skipped `submdspan` functionality for now. I have a branch on an mdspan fork where I tested this, though (https://github.com/mfoerste4/mdspan/tree/layout_padded).
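
As a rough illustration of what byte-aligned padding means for indexing, here is a Python sketch of the arithmetic. It is not the `layout_padded_general` implementation: the helper names are hypothetical, and it assumes a row-major layout where each row is padded up to a whole number of `byte_alignment`-sized blocks.

```python
def padded_row_stride(n_cols: int, value_size_bytes: int, byte_alignment: int = 128) -> int:
    """Elements between the starts of consecutive rows (hypothetical helper).

    Assumes each row is rounded up to a multiple of byte_alignment bytes, so
    every row starts on an aligned offset relative to the base pointer.
    """
    if value_size_bytes <= 0 or byte_alignment % value_size_bytes != 0:
        raise ValueError("byte_alignment must be a positive multiple of the value size")
    elems_per_block = byte_alignment // value_size_bytes
    n_blocks = -(-n_cols // elems_per_block)  # ceiling division
    return n_blocks * elems_per_block


def padded_offset(i: int, j: int, n_cols: int, value_size_bytes: int,
                  byte_alignment: int = 128) -> int:
    """Linear element offset of entry (i, j) under the assumed padded layout."""
    return i * padded_row_stride(n_cols, value_size_bytes, byte_alignment) + j


# 33 four-byte values per row: 128 bytes hold 32 of them, so rows pad to 64.
stride = padded_row_stride(33, 4)
offset = padded_offset(2, 3, 33, 4)
```

Fixing the alignment as a template parameter, as the PR describes, would let this stride be derived from the extents alone rather than stored or passed around at runtime.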

Authors:
  - Malte Förster (https://github.com/mfoerste4)

Approvers:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#725
I recall having made this change initially, and I'm wondering if it accidentally got reverted since there have been so many hands in the IVF flat code recently.

For proper end-to-end testing, we need ivf flat testing code to invoke the mdspan APIs (which in turn invoke the non-mdspan APIs).

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: rapidsai#955
Follow up to rapidsai#924, removing a `print` that managed to sneak in.

cc @cjnolet

Authors:
  - Tim Head (https://github.com/betatim)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#958
Adds a benchmark for `fusedL2NN`. The benchmark uses the wrapper `fusedL2NNMinReduce` compiled in the distance library, for faster compilation times.

Authors:
  - Louis Sugy (https://github.com/Nyrio)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#936
@galipremsagar
Contributor

We can close this PR, since it is a duplicate of #963.

@vyasr vyasr closed this Nov 1, 2022