Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

finfo should not require float type for the result fields #405

Open
asmeurer opened this issue Mar 18, 2022 · 7 comments
Open

finfo should not require float type for the result fields #405

asmeurer opened this issue Mar 18, 2022 · 7 comments
Labels
API change Changes to existing functions or objects in the API.

Comments

@asmeurer
Copy link
Member

Right now finfo requires that the output fields be float https://data-apis.org/array-api/latest/API_specification/generated/signatures.data_type_functions.finfo.html#signatures.data_type_functions.finfo. However, making the results 0-D arrays would be better.

For the spec itself, float is fine, but it's problematic for any library that implements higher precision data types like float128:

>>> import numpy as np
>>> np.finfo(np.float128)
finfo(resolution=1.0000000000000000715e-18, min=-1.189731495357231765e+4932, max=1.189731495357231765e+4932, dtype=float128)
>>> float('-1.189731495357231765e+4932')
-inf

float is essentially a float64, so the various values for float128 cannot be represented as floats. NumPy uses scalars for the values, which for the spec would be 0-D arrays:

>>> type(np.finfo(np.float128).min)
<class 'numpy.float128'>

Even if there are no plans to add float128 to the spec, it's useful for libraries that do have it to have a consistent return type for finfo, since float and a 0-D array aren't 100% duck type compatible.

For iinfo obviously this problem isn't present since int can represent any integer, but it may be good to change it to 0-D array as well just for consistency (although I should note that numpy.iinfo just uses int for its fields).

@kgryte kgryte added the API change Changes to existing functions or objects in the API. label Mar 21, 2022
@kgryte
Copy link
Contributor

kgryte commented Mar 21, 2022

Thanks for opening this issue @asmeurer. Agreed that 0-D arrays probably make the most sense here. As we've already departed from NumPy behavior for reductions (e.g., sum et al returning 0-D arrays whereas the main namespace returns scalars), returning 0-D arrays for finfo and iinfo would be consistent.

Given that NumPy, CuPy, etc have the array API implementation under an experimental status, probably not too late to make this change for these libraries. However, Torch is already shipping finfo and iinfo according to the current spec: https://pytorch.org/docs/1.11/type_info.html?highlight=finfo#torch.torch.finfo, so moving to returning a 0-D array would be a breaking change.

@rgommers perhaps you have thoughts?

@rgommers
Copy link
Member

This is not a proposal that I immediately think is a good idea, because:

  • There are no current issues, only a hypothetical future one. NumPy is the only library with a >64-bits dtype, and it has array scalars too so there's only a problem if it really becomes possible to completely remove array scalars.
    • Side note: I hope we can get rid of float96/float128 in NumPy, because they're a massive pain and a not-so-useful relic of the past.
  • It'd be a breaking change for every library that has finfo and wants to implement array API standard support in its main namespace (which long-term is everyone I hope), not just PyTorch.

If we designed this from scratch with no concern for backwards compatibility, then I'd agree 0-D arrays would be preferred.

@leofang
Copy link
Contributor

leofang commented Mar 21, 2022

I don't have strong opinion here, but whatever we do for finfo it must be done consistently for other constants as well: https://data-apis.org/array-api/latest/API_specification/constants.html
(We had a long discussion on the very same issue for constants, on my cell and couldn't find it immediately 😅)

@rgommers
Copy link
Member

gh-154 is the discussion on constants. The reason for them to be included was different though: #154 (comment). I don't think we ever discussed making constants 0-D arrays, only whether to include them or not.

@leofang
Copy link
Contributor

leofang commented Mar 21, 2022

No we didn't discuss making them 0D arrays, but both pi, e, etc and finfo are constants that require uniform treatments in the standard. If there is desire to keep pi on GPU, it's equally possible there's desire to have eps stay on GPU 🙂

@leofang
Copy link
Contributor

leofang commented Mar 21, 2022

(So, in short, I'm just adding another reason for finfo to keep floating type)

@rgommers
Copy link
Member

One other concern similar to float128 that was brought up today is custom dtypes, for example a decimal type. NumPy does allow such things, but only as object dtype:

>>> import decimal
>>> d = decimal.Decimal('0.142857')
>>> x = np.array([d, d])
>>> x
array([Decimal('0.142857'), Decimal('0.142857')], dtype=object)
>>> np.finfo(x)
Traceback (most recent call last):
  Input In [12] in <cell line: 1>
    np.finfo(x)
  File ~/anaconda3/envs/dev/lib/python3.9/site-packages/numpy/core/getlimits.py:473 in __new__
    raise ValueError("data type %r not inexact" % (dtype))
ValueError: data type <class 'numpy.object_'> not inexact

@jakirkham mentioned that cuDF has a native decimal dtype, so that could in principle be made to work with finfo. That said, the solution is the same as for float128 (a custom object being returned, see below).

A >64-bit type like float128 or another custom dtype like decimal can always be supported as a custom object which duck types with a Python float. We had this same discussion for constants (see my comment above), and I think we've had it as well in places where the return value of a function is a tuple of arrays. In all those cases it is fine to return custom objects that duck type with the Python builtin types in the standard. One common reason that libraries may do this is to be able to keep the results on a GPU.

So the proposed solution to close this issue is: add a new section to the standard which explicitly mentions that it's fine to implement duck type objects, the reasons why libraries may do that, and give examples for constants (pi, e, inf, nan), finfo and a function that returns a tuple.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change Changes to existing functions or objects in the API.
Projects
None yet
Development

No branches or pull requests

4 participants