Support max_n and min_n reductions on GPU #1196
Merged
Closes #1177.
This adds support for `max_n` and `min_n` reductions on a GPU, both with and without `dask`. The key change is new CUDA mutex functionality to support CUDA `append` functions (i.e. individual pixel callbacks) that do more than a simple get/set operation. Because of the massively parallel nature of CUDA hardware, multiple threads can access the same `canvas` pixel at the same time, and until now we have been restricted to CUDA atomic operations (https://numba.readthedocs.io/en/stable/cuda/intrinsics.html#supported-atomic-operations) in `append` functions. With the new mutex we can lock access to a particular pixel to a single thread at a time, and thus perform more complicated operations, such as those required for `max_n`, without any race conditions.

In the implementation we need to get the mutex (a `cupy` array) to the CUDA `append` functions. This is achieved within the `expand_aggs_and_cols` framework by appending the mutex array in the `make_info` function, which is where other arrays and/or dataframe columns are extracted and passed to `append` functions. This ensures that there is only ever a single shared mutex, even if multiple reductions need it.

This implementation is limited by what is currently available in
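As a rough illustration of the locking pattern (names here are hypothetical, not datashader's actual API), an `append` function acquires a per-pixel lock, does its multi-step work, then releases. This is a plain Python/NumPy sketch for a single thread; a real CUDA kernel would spin on an atomic compare-and-swap primitive instead of the ordinary reads and writes shown here:

```python
import numpy as np

def lock(mutex, i):
    """Acquire slot i of the mutex array.

    In a real CUDA kernel this would spin on something like
    cuda.atomic.compare_and_swap; a plain read-modify-write stands in
    here because this sketch runs single-threaded on the CPU.
    """
    while mutex[i] != 0:   # spin while another thread holds the lock
        pass
    mutex[i] = 1           # acquire

def unlock(mutex, i):
    mutex[i] = 0           # release

# One mutex slot per canvas pixel (flattened). Under numba 0.56 the
# mutex can effectively only be locked as a whole, i.e. one shared slot.
mutex = np.zeros(1, dtype=np.int32)
lock(mutex, 0)
# ... perform the multi-step append under the lock ...
unlock(mutex, 0)
print(mutex[0])  # 0 after release
```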
`numba` 0.56, which means we can only lock/unlock the mutex as a whole rather than its individual elements/pixels, so the performance will not be great. Numba PR numba/numba#8790 will allow us to lock individual pixels, so when `numba` 0.57 is released I will write another PR to use the fast route if it is available and otherwise drop back to this slower one.

There is no support yet for `where(max_n)` on CUDA, but this will follow in another PR soon.
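For context on why `max_n` needs more than a single atomic operation: each pixel keeps its n largest values in sorted order, so an append must shift existing entries before inserting, which is a multi-step read/write sequence. A minimal CPU sketch (hypothetical helper, not the actual datashader implementation; a CUDA kernel would use an explicit loop rather than slice assignment):

```python
import numpy as np

def append_max_n(agg, x, y, value):
    """Insert `value` into the descending top-n array for pixel (y, x).

    agg has shape (height, width, n), initialised to -inf. Shifting the
    smaller entries down one slot before inserting is the multi-step
    operation that requires a per-pixel lock on the GPU.
    """
    slots = agg[y, x]
    for i in range(slots.shape[0]):
        if value > slots[i]:
            slots[i + 1:] = slots[i:-1]  # shift smaller entries down
            slots[i] = value
            return

agg = np.full((1, 1, 3), -np.inf)
for v in [2.0, 5.0, 1.0, 4.0]:
    append_max_n(agg, 0, 0, v)
print(agg[0, 0])  # [5. 4. 2.]
```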