
[FEA] Support for half-float mixed precision in brute-force #2382

Merged: 20 commits merged into rapidsai:branch-24.10 on Aug 22, 2024

Conversation

rhdong (Member)

@rhdong rhdong commented Jul 17, 2024

  • distance supports half-float
  • SDDMM supports half-float
  • gemm supports multi-type composition
  • transpose & copy support half
  • random supports half
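For context, the "half-float mixed precision" above means __half inputs with fp32 accumulation. A minimal illustrative sketch of that pattern (a toy squared-L2 kernel with invented names, not the PR's actual brute-force code):

#include <cuda_fp16.h>

// Illustrative only: __half inputs, fp32 accumulation. The kernel name and
// reduction strategy are assumptions for this sketch, not RAFT's code.
__global__ void l2_sq_half_mixed(const __half* __restrict__ x,
                                 const __half* __restrict__ y,
                                 float* __restrict__ dist,  // zero-initialized
                                 int dim)
{
  float acc = 0.f;  // accumulate in fp32 to limit fp16 rounding error
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < dim;
       i += gridDim.x * blockDim.x) {
    float d = __half2float(x[i]) - __half2float(y[i]);
    acc += d * d;
  }
  atomicAdd(dist, acc);  // simple reduction; production code would tree-reduce
}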
@rhdong rhdong requested review from benfred and cjnolet July 17, 2024 05:22
@rhdong rhdong requested review from a team as code owners July 17, 2024 05:22
@rhdong rhdong added the enhancement, 3 - Ready for Review, feature request, and non-breaking labels and removed the cpp and CMake labels Jul 17, 2024
@@ -219,38 +219,7 @@ if(BUILD_TESTS)
NAME
LINALG_TEST
PATH
linalg/add.cu
Member:

I think this file was accidentally checked in - we should probably keep the linalg/sparse tests here

Member Author:

OMG, I forgot to restore the test cases a second time... T_T

* @param a the value to convert
*/
template <typename T>
HDI auto half2float(T& a)
Member:

I think this API should be renamed - it not only converts from half to float, but also handles the pass-through case when T=float.

I also think we probably want both input and output template parameters for the conversion - rather than a template parameter for the input only, with the output assumed to always be float.

The faiss codebase has a generic 'convert' operator that does this
(https://github.com/facebookresearch/faiss/blob/dd72e4121dc6684c6fbf289949ba4526a54d9f3b/faiss/gpu/utils/ConversionOperators.cuh#L39-L45), which seems to work pretty well.
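For illustration, a conversion operator with both input and output template parameters, in the spirit of the faiss code linked above, could look like this (a hedged sketch; convert_value is a hypothetical name, not RAFT's or faiss's actual API):

#include <cuda_fp16.h>
#include <type_traits>

// Hypothetical generic conversion with explicit output and input types.
template <typename OutT, typename InT>
__host__ __device__ inline OutT convert_value(const InT& x)
{
  if constexpr (std::is_same_v<InT, __half> && !std::is_same_v<OutT, __half>) {
    return static_cast<OutT>(__half2float(x));   // widen half via float
  } else if constexpr (!std::is_same_v<InT, __half> &&
                       std::is_same_v<OutT, __half>) {
    return __float2half(static_cast<float>(x));  // narrow to half via float
  } else {
    return static_cast<OutT>(x);                 // identity or plain cast
  }
}

Call sites would then read convert_value<float>(h) or convert_value<__half>(f), and the same-type case compiles down to a no-op.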

@@ -102,7 +102,13 @@ template <typename T>
RAFT_INLINE_FUNCTION auto asin(T x)
{
#ifdef __CUDA_ARCH__
return ::asin(x);
if constexpr (std::is_same<T, __half>::value) {
Member:

do we need half support for the asin function?

I'm wondering if we should either remove the half support for this function, or add half support for all the other trigonometric functions in this file

Member Author:

Yeah, maybe we can. For now it's a trade-off: some distance algorithms need asin when supporting half, so removing it would cause a compilation error. Adding half support to everything in this file would be ideal, but the workload could get out of control.
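To make the trade-off concrete: CUDA provides no half-precision asin intrinsic, so the half branch shown in the diff most plausibly promotes to float and converts back. A hedged sketch of such an overload (asin_sketch is an invented name; the PR's actual body may differ):

#include <cuda_fp16.h>
#include <type_traits>

// Device-side sketch: asin with half support via fp32 promotion.
template <typename T>
__device__ auto asin_sketch(T x)
{
  if constexpr (std::is_same<T, __half>::value) {
    return __float2half(::asinf(__half2float(x)));  // no native half asin
  } else {
    return ::asin(x);
  }
}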

@github-actions github-actions bot removed the CMake label Jul 17, 2024
@cjnolet cjnolet (Member) left a comment

I'm a little concerned by the transpose kernel, and I think this PR needs some additional benchmarks - both on the perf w/ half precision and on the increase in build time and binary size.

Given that we're butting right up against code freeze, I'd like to retarget this PR to 24.10 so we can make sure we're happy w/ the perf and changes to the build.

const half* __restrict__ in,
half* __restrict__ out)
{
__shared__ half tile[TILE_DIM][TILE_DIM + 1];
Member:

Transpose of a non-symmetric non-square matrix is non-trivial to parallelize. This seems to be doing it all in a single kernel. This makes me wonder if we might be lacking test coverage here.
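For reference, a bounds-checked tiled transpose that handles rectangular shapes usually follows the pattern below (an illustrative sketch with assumed TILE_DIM and BLOCK_ROWS values and an invented kernel name, not the PR's actual code). The TILE_DIM + 1 padding seen in the diff exists to avoid shared-memory bank conflicts on the transposed reads:

#include <cuda_fp16.h>

constexpr int TILE_DIM   = 32;  // assumed tile width
constexpr int BLOCK_ROWS = 8;   // assumed rows handled per loop step

// Transpose a row-major n_rows x n_cols half matrix. Launch with
// block(TILE_DIM, BLOCK_ROWS) and grid(ceil(n_cols/TILE_DIM), ceil(n_rows/TILE_DIM)).
__global__ void transpose_half_sketch(const half* __restrict__ in,
                                      half* __restrict__ out,
                                      int n_rows, int n_cols)
{
  // The extra column avoids shared-memory bank conflicts.
  __shared__ half tile[TILE_DIM][TILE_DIM + 1];

  int x = blockIdx.x * TILE_DIM + threadIdx.x;  // input column
  int y = blockIdx.y * TILE_DIM + threadIdx.y;  // input row
  for (int j = 0; j < TILE_DIM; j += BLOCK_ROWS) {
    if (x < n_cols && (y + j) < n_rows) {
      tile[threadIdx.y + j][threadIdx.x] = in[(y + j) * n_cols + x];
    }
  }
  __syncthreads();

  x = blockIdx.y * TILE_DIM + threadIdx.x;  // output column (= input row)
  y = blockIdx.x * TILE_DIM + threadIdx.y;  // output row (= input column)
  for (int j = 0; j < TILE_DIM; j += BLOCK_ROWS) {
    if (x < n_rows && (y + j) < n_cols) {
      out[(y + j) * n_rows + x] = tile[threadIdx.x][threadIdx.y + j];
    }
  }
}

The boundary checks make a single kernel correct for arbitrary non-square shapes; the reviewer's point stands that those edge cases are exactly what deserves dedicated test coverage.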

@rhdong rhdong changed the base branch from branch-24.08 to branch-24.10 July 26, 2024 17:06
@rhdong (Member Author) commented Aug 13, 2024

/ok to test

@cjnolet (Member) commented Aug 20, 2024

/ok to test

@cjnolet (Member) commented Aug 21, 2024

/ok to test

@raydouglass (Member) commented

/okay to test

@cjnolet (Member) commented Aug 21, 2024

@rhdong before we merge this, we need to make sure it doesn't break cuML. Can you build cuML with this RAFT branch and just verify it doesn't need to be updated?

@rhdong (Member Author) commented Aug 21, 2024

> @rhdong before we merge this, we need to make sure it doesn't break cuML. Can you build cuML with this RAFT branch and just verify it doesn't need to be updated?

OK, I will verify it.

@cjnolet (Member) commented Aug 22, 2024

/merge

@rapids-bot rapids-bot bot merged commit db07998 into rapidsai:branch-24.10 Aug 22, 2024
58 checks passed
Labels
3 - Ready for Review, cpp, enhancement, feature request, non-breaking
4 participants