Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make commutative operations commutative in both magnitude and units #1019

Open
jthielen opened this issue Feb 7, 2020 · 7 comments
Open
Assignees
Milestone

Comments

@jthielen
Copy link
Contributor

jthielen commented Feb 7, 2020

As brought up in pydata/xarray#3706 (comment), having + (and other should-be-commutative operations like np.where) give differing results based on order was an unexpected result for a duck array. In other words, something like the following would ideally be an invariant:

(a + b).units == (b + a).units

A suggestion by @shoyer was:

[...] I think it would be worth considering adding some sort of unit casting hierarchy into pint's unit registries. It almost doesn't matter what exactly the rules are as long as it exists.

So, instead of ensuring consistent units by converting to the first unit, there will need to be some hierarchical way of giving a preference to units of a given dimensionality.

While there might be some fully general way of doing so, it would have to be very complex. So, I would think the best way forward is to decide on some heuristics on which unit to choose, and then fall back to base units if those heuristics do not resolve a consistent unit.

I like working in base units when possible, so having the set of heuristics be minimal (perhaps just choosing the closer-to-unity prefix, and always converting to base units when in different systems) would be good with me, but I don't know what others' thoughts on this are.

@jules-ch
Copy link
Collaborator

jules-ch commented Feb 9, 2020

True we should have commutative operations on any operations IMO it would make the use of the library more consistent. I don't think it will change the way we use pint anyway.

Only requirement for this is simplicity since it won't change how operation is conducted.

My thoughts at the moment :

  • Take units if you have the same unit or prefix ones
  • Fall back to system base unit for dimensionality if using a particular system
  • Use base units if you're using a basic registry without system support

But maybe we should be opinionated about it and do operations on simple base units. We need some more thought s on the matter.

@dalito
Copy link
Contributor

dalito commented Feb 9, 2020

I am not sure if falling back to base unit is a good solution. For example:

1 foot + 12 inch = ?

What is the expected result: (a) 24 inch, (b) 2 foot, (c) 0.6096meter? For the typical user (c) is least expected, probably (b) is the nices. On the other hand, for 1 foot - 10 inch the nicest result would be "2 inch".

@jules-ch
Copy link
Collaborator

jules-ch commented Feb 9, 2020

What is the expected result: (a) 24 inch, (b) 2 foot, (c) 0.6096meter? For the typical user (c) is least expected, probably (b) is the nices. On the other hand, for 1 foot - 10 inch the nicest result would be "2 inch".

That's why we should not let the user guess which unit he is going to have after such operations.
I think the current public API gives enough flexibility with things like systems, convert functions to / to_compact to let the user be explicit if he wants a particular unit output.

@dalito
Copy link
Contributor

dalito commented Feb 9, 2020

Your rules are simple enough to remember and I would be OK with them. Maybe good documentation can prevent false expectations. And you are right that pint offers enough options to get the desired output.

@hgrecco
Copy link
Owner

hgrecco commented Feb 10, 2020

I agree. Can we have a PR?

@jules-ch
Copy link
Collaborator

I can give it a shot.

@jules-ch jules-ch added this to the 0.18 milestone May 26, 2021
@jules-ch jules-ch self-assigned this May 26, 2021
@jules-ch
Copy link
Collaborator

@hgrecco About operation with same unit, I think i'm gonna change my mind.
Taking the most precise unit is simpler & maybe the best option.

Numpy does that with timedelta dtype.

Like so :

>>> a = np.timedelta64(1, 'D') 
>>> b = np.timedelta64(4, 'ms')
>>> a+b
numpy.timedelta64(86400004,'ms')
>>> b+a
numpy.timedelta64(86400004,'ms')

When you have multiple units things get tricky.
I need to go deeper in the options that we have.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants