
Add nextafter op #2510

Open · jacobhinkle wants to merge 3 commits into base: devel
Conversation

jacobhinkle (Collaborator) commented:
The `nextafter(x, y)` operation provides the nearest distinct representable floating-point value to `x` in the direction of `y`. In CUDA these are obtained with the builtins `nextafter` and `nextafterf`.

In PyTorch, the `torch.nextafter` function is defined for any pair of arguments which would normally be promoted to either float32 or float64. So arguments which are both ints, bools, complex, or half-precision floats are not supported. This PR implements a binary op macro with a `TypePromotionConfig` that enforces that rule.

Test is currently segfaulting for scalar-only inputs.
```cpp
if (config.require_full_precision_promoted) {
  TORCH_CHECK(
      common_dtype == c10::ScalarType::Float ||
          common_dtype == c10::ScalarType::Double,
      /* error message elided */);
}
```
jacobhinkle (Collaborator, Author) commented:
I am not sure that this is the right place to perform this check.

jacobhinkle (Collaborator, Author) commented:

Hmm. I got this wrong too. PyTorch actually supports bfloat16, just not float16 for this op. It was added with a manual implementation taken from musl: https://github.com/pytorch/pytorch/pull/61829/files#diff-ece04c31934b3504382e10ed3e9a69f03ffabd81ad1a2a890aab19b1642f53c0R120

jacobhinkle added a commit to NVIDIA/Fuser that referenced this pull request Mar 30, 2023
The `nextafter(x, y)` operation provides the nearest distinct
representable floating point value to `x` between `x` and `y`. In CUDA
these are obtained with the builtins `nextafter` and `nextafterf`. Other
types such as bfloat16 are not directly supported, though PyTorch has
implemented that case based on some code in musl:
https://github.com/pytorch/pytorch/pull/61829/files#diff-ece04c31934b3504382e10ed3e9a69f03ffabd81ad1a2a890aab19b1642f53c0R120.

In PyTorch, the torch.nextafter function is defined for any pair of
arguments which would normally be promoted to either float32 or float64.
So arguments which are both ints, bools, complex, or half-precision
floats are not supported. This PR implements a binary op macro with a
TypePromotionConfig that enforces that rule.

This is a translation/update of
csarofeen/pytorch#2510.

---------

Co-authored-by: Jacob Hinkle <jhinkle@nvidia.com>