
Add clip #482

Closed
jakirkham opened this issue Sep 23, 2022 · 11 comments · Fixed by #715
Labels
API extension Adds new functions or objects to the API.
Comments

@jakirkham
Member

The clip function is quite handy for constraining data to a certain range. It would be nice to include this in the spec.

@rgommers
Member

gh-187 says clip is universally implemented by libraries. This looks to me like a useful function to add. IIRC we had some issues with clip behavior in numpy; we should dig those up to check for possible semantics issues.

@rgommers rgommers added the API extension Adds new functions or objects to the API. label Sep 23, 2022
@rgommers
Member

rgommers commented Oct 5, 2022

IIRC we had some issues with clip behavior in numpy; we should dig those up to check for possible semantics issues.

This wasn't too bad, actually; a lot of it was segfaults and whether or not it was a ufunc. The relevant ones:

And of course we need to choose keyword names, always a hard problem. Existing signatures don't match:

  • np.clip(a, a_min, a_max, out=None, **kwargs)
  • torch.clip(input, min=None, max=None, *, out=None)

No name is ideal: min/max are the nicer names, but not everyone may be happy that they conflict with builtin names. a_min and a_max are bad though, given that we consistently name our input arrays x and not a. So how about:

def clip(x, min, max, /):  # all positional-only args to avoid incompatibilities

@shoyer
Contributor

shoyer commented Oct 5, 2022

No name is ideal: min/max are the nicer names, but not everyone may be happy that they conflict with builtin names. a_min and a_max are bad though, given that we consistently name our input arrays x and not a. So how about:

def clip(x, min, max, /):  # all positional-only args to avoid incompatibilities

I would lean toward clip(x, /, min=None, max=None). The naming conflict is only inside the array library, not in user code, which can name these variables whatever it likes.

Side note: this is yet another example (like unstack and moveaxis) of functionality that could perhaps have a universal default implementation in terms of other array API functionality (specifically where).
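To illustrate that side note, a default implementation of clip in terms of where could look roughly like the sketch below. NumPy is used as a stand-in for a generic array namespace, and the min/max keyword names follow the proposal above; none of this is settled API.

```python
import numpy as np

def clip(x, /, min=None, max=None):
    # Sketch: clip expressed purely via where-style selection.
    # `np` stands in for a generic array-API namespace; the
    # signature mirrors the proposal above and is illustrative only.
    out = x
    if min is not None:
        out = np.where(out < min, min, out)
    if max is not None:
        out = np.where(out > max, max, out)
    return out
```

For example, `clip(np.array([1, 5, 9]), min=2, max=8)` yields `[2, 5, 8]`, and omitting one bound leaves that side unconstrained.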

@rgommers
Member

rgommers commented Oct 5, 2022

The naming conflict is only inside the array library, not in user code, which can call these variables whatever it likes.

Only if they're positional, right?

I actually do like your signature better - clip(x, max=3) is nicer than having to write clip(x, None, 3). It's just another thing then that we have to deal with when we aim to converge NumPy's main namespace to the array API standard. But I guess it's not too bad: there's already **kwargs in np.clip, so using aliases is easy.
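Such an alias could be as small as the hypothetical shim below, which layers the proposed min/max keyword names on top of np.clip's existing a_min/a_max positional arguments (this is a sketch, not an actual NumPy API).

```python
import numpy as np

def clip(x, /, min=None, max=None, out=None):
    # Hypothetical compatibility shim: np.clip already accepts None
    # for either bound, so the proposed keyword names map directly
    # onto its a_min/a_max positional slots.
    return np.clip(x, min, max, out=out)
```

With this, `clip(x, max=3)` works as proposed while delegating all the actual clipping to np.clip.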

@shoyer
Contributor

shoyer commented Oct 5, 2022

The naming conflict is only inside the array library, not in user code, which can call these variables whatever it likes.

Only if they're positional, right?

I only get lint errors for overriding builtins if I define variables with the same name as a builtin. Something like clip(x, max=3) is fine -- the lint error is for defining a variable named max, not for using a keyword argument.

clip(x, max=max) would be problematic, but there's nothing in the interface that requires calling the variable max. Users can write something like clip(x, max=a_max) if desired.

@jakirkham
Member Author

cc @seberg (in case you have thoughts here)

@seberg
Contributor

seberg commented Oct 6, 2022

I agree that clip is clear enough that it seems fine to allow passing the bounds positionally, and min, max seems OK (no strong opinions on it though).
I would lean against a_min, etc., even if that is what NumPy has. The alternative I may seriously consider is lower and upper (I have not quite made up my mind whether that is the better or worse name for bounds – mathematically speaking).

@rgommers
Member

rgommers commented Oct 7, 2022

Questions that were brought up yesterday:

  • Should the min and max inputs participate in determining the output dtype, or should the output dtype equal the input dtype?
    • Tentative answer: == input_dtype
  • Are min and max allowed to be arrays, or only scalars?
    • Tentative answer: yes, because otherwise there's no way to clip along an axis.
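The second point can be made concrete with NumPy's existing broadcasting behavior: array-valued bounds are what make per-axis clipping expressible at all (illustrative example only).

```python
import numpy as np

x = np.array([[0, 5, 10],
              [2, 7, 12]])
lo = np.array([1, 6, 11])        # one lower bound per column

# The array-valued `min` broadcasts against x; a single scalar
# could not express a different bound for each column.
clipped = np.clip(x, lo, None)   # [[ 1,  6, 11], [ 2,  7, 12]]
```

If min and max were restricted to scalars, this per-column clip would require manually tiling the bounds or looping over the axis.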

@oleksandr-pavlyk
Contributor

If min and max do not participate in the type promotion, this makes clip not equivalent to xp.maximum(xp.minimum(x, max), min) in functional behavior.

In NumPy, both of clip's min and max arguments participate in type promotion:

In [3]: np.clip(np.arange(10, dtype=np.int8), 0, np.inf)
Out[3]: array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
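The same promotion shows up in the explicit minimum/maximum chain, which is why exempting min and max from promotion would break the equivalence. A quick check (exact result dtype may vary between NumPy versions, but it is floating either way):

```python
import numpy as np

x = np.arange(10, dtype=np.int8)

# Chained form of clip(x, 0, inf): the float bound np.inf drags
# the result to a floating dtype, just as np.clip does above.
chained = np.maximum(np.minimum(x, np.inf), 0)
assert chained.dtype.kind == "f"   # promoted away from int8
```

Under the tentative "output dtype == input dtype" answer, clip would return int8 here while the chained form returns a float array.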

@seberg
Contributor

seberg commented Oct 25, 2023

I don't want to volunteer it, but it shouldn't be hard to change that in NumPy or fix in a compat namespace. And yes, it will mean that it isn't the same as a chained minimum/maximum call.
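A compat-namespace version of that fix might simply cast the promoted result back to the input dtype. This is a sketch of one possible approach under the "output dtype == input dtype" answer above, not actual compat-namespace code.

```python
import numpy as np

def clip(x, /, min=None, max=None):
    # Sketch of the compat fix: clip as usual, then force the
    # result back to x's dtype so min/max no longer influence
    # the output dtype. Assumes the clipped values fit in x.dtype.
    return np.clip(x, min, max).astype(x.dtype, copy=False)
```

With this wrapper, `clip(np.arange(10, dtype=np.int8), 0, np.inf)` keeps dtype int8 instead of promoting to float as in the NumPy example above.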

@kgryte
Contributor

kgryte commented Oct 25, 2023

At this point, I'd find it a bit inconsistent if clip does not follow type promotion rules.

We explicitly changed, e.g., the bitwise_left|right_shift specifications to ensure consistent type promotion rules throughout.

At the time of the original API addition, I had advocated for returning an array having the same dtype as the input, but objections were raised (in PyTorch) about having certain APIs exempt from normal type promotion rules.
