_ArrayLike[Any] vs. ndarray[Any] #14

Open
kjyv opened this issue Mar 3, 2017 · 5 comments

@kjyv

kjyv commented Mar 3, 2017

It seems a bit unclear to me how to properly annotate functions that expect an ndarray. Many ndarray functions (e.g. flatten) return _ArrayLike which is then not recognized to be an ndarray. I guess I can't and shouldn't use _ArrayLike in my own annotations. Is this currently not supported properly by mypy or do I have to do this differently?
Btw., the numpy docs actually specify the return type of e.g. flatten to be ndarray, not "array_like".

@kjyv
Author

kjyv commented Mar 3, 2017

It seems to me that many methods are defined in the _ArrayLike class and return _ArrayLike but the actual numpy methods are only defined in numpy.ndarray and always return an ndarray. Maybe they should move to class ndarray to remove any type confusion. E.g. slicing an ndarray should also create an ndarray and not a variable of type _ArrayLike.
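A quick runtime check illustrates the point about method and slicing results. This is only a sketch of NumPy's runtime behaviour; it says nothing about how the stubs currently annotate these calls.

```python
import numpy as np

arr = np.ones(5)

# At runtime, both the method call and the slice produce real ndarray instances.
flat = arr.flatten()
view = arr[0:3]

print(type(flat).__name__)  # ndarray
print(type(view).__name__)  # ndarray
print(view.base is arr)     # True: a basic slice is a view on the original array
```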

@shoyer

shoyer commented Mar 15, 2017

Agreed. _ArrayLike is a useful class for annotating function signatures, but I don't think it makes sense for methods.

@dmoisset
Contributor

@kjyv can you show a small sample snippet that shows this? I usually try to use abstractions in annotations instead of concrete types (it's what you usually mean in Python, given duck typing), although there are some tricky things around numpy semantics that can make this general advice wrong.

@shoyer

shoyer commented Mar 20, 2017

We're often a little sloppy on terminology, but it can be useful to distinguish between "array likes" and "duck arrays" (I think that's the source of our confusion here).

To quote @njsmith:

NB: we should probably be careful to distinguish between "array-likes"
(which is a term that's already well established to mean "anything that can
be passed to np.asarray", and includes scalars, lists, memoryviews, among
others), versus what we've been calling "duck arrays", i.e. objects that
act like ndarray while not actually being ndarrays or even necessarily
convertible to ndarrays.
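As a rough illustration of the "array-like" half of that distinction, the sketch below passes a scalar, a list, and a memoryview to np.asarray. The `candidates` list is just an illustrative choice of inputs taken from the examples named in the quote.

```python
import numpy as np

# Each of these is an "array-like" in the established sense:
# np.asarray accepts it and returns an ndarray.
candidates = [
    3.0,                      # scalar
    [1, 2, 3],                # list
    memoryview(b"\x01\x02"),  # memoryview over two bytes
]

converted = [np.asarray(c) for c in candidates]
for c, a in zip(candidates, converted):
    print(type(c).__name__, "->", type(a).__name__, a.shape)
```

None of these inputs has array methods such as .transpose, which is what separates them from duck arrays.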

We definitely want at least support for NumPy ArrayLike types (which, given the way NumPy currently works, means basically any arbitrary object), but DuckArray should be a stronger constraint of some form. The challenge is that DuckArray is not entirely well defined, because various NumPy functions that handle duck arrays only look for the particular properties they need. There's no single notion of a duck array, so it really should be considered a collection of protocols.
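One way to picture "a collection of protocols" is a set of small structural types, each describing a single capability. The sketch below assumes Python 3.8+ for typing.Protocol; the names SupportsShape, SupportsTranspose, and Grid are hypothetical and not part of NumPy or these stubs.

```python
from typing import Any, Protocol, Tuple, runtime_checkable


# Hypothetical protocols: each one captures a single capability.
@runtime_checkable
class SupportsShape(Protocol):
    @property
    def shape(self) -> Tuple[int, ...]: ...


@runtime_checkable
class SupportsTranspose(Protocol):
    def transpose(self, *axes: int) -> Any: ...


class Grid:
    """Toy duck array: exposes a shape but has no transpose method."""

    def __init__(self, rows: int, cols: int) -> None:
        self.shape = (rows, cols)


g = Grid(2, 3)
print(isinstance(g, SupportsShape))      # True
print(isinstance(g, SupportsTranspose))  # False
```

An object can satisfy one protocol and not another, which matches the observation that each function only looks for the properties it needs.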

Supposing that @overload is handled in order of definition (python/typing#253), a "proper" type definition for transpose might look something like:

@overload
def transpose(array: SupportsTranspose, indices: Tuple[int, ...] = None) -> SupportsTranspose: ...

@overload
def transpose(array: ArrayLike, indices: Tuple[int, ...] = None) -> ndarray: ...

where SupportsTranspose indicates an object with a .transpose method.
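For illustration only, here is a self-contained sketch of how such an ordered pair of overloads could sit on top of a runtime implementation. It is not NumPy's actual definition: the Protocol, the TypeVar `T`, the parameter name `axes`, and the use of `Any` as a stand-in for ArrayLike are all assumptions, and a type checker may complain that the two overloads overlap.

```python
from typing import Any, Optional, Protocol, Tuple, TypeVar, overload

import numpy as np


class SupportsTranspose(Protocol):
    """Hypothetical protocol: any object with a .transpose method."""

    def transpose(self, *axes: int) -> Any: ...


T = TypeVar("T", bound=SupportsTranspose)


@overload
def transpose(array: T, axes: Optional[Tuple[int, ...]] = None) -> T: ...
@overload
def transpose(array: Any, axes: Optional[Tuple[int, ...]] = None) -> np.ndarray: ...


def transpose(array, axes=None):
    # Duck arrays keep their own type by dispatching to their own method.
    if hasattr(array, "transpose"):
        return array.transpose() if axes is None else array.transpose(*axes)
    # Everything else is treated as an array-like and converted first.
    return np.transpose(np.asarray(array), axes)


result = transpose([[1, 2], [3, 4]])
print(type(result).__name__, result.tolist())  # ndarray [[1, 3], [2, 4]]
```

The first overload preserves the caller's own type for duck arrays, while the fallback promises only an ndarray.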

In practice, this gets pretty complex and I'm not sure it's worth the trouble. If someone is using type checking, they would probably be happy with slightly stricter functions, even if they aren't defined on everything. So the latter signature might be enough for now.

@kjyv
Author

kjyv commented Mar 22, 2017

@dmoisset
As an example, see this. arr passes fine (creation routines return ndarray) while arr2 gives a type error.

import numpy as np

def test(data):
    #type: (np.ndarray) -> None
    print(data)

arr = np.ones(5)
test(arr)  # fine

arr2 = arr[0:3]
test(arr2)  # type error: arr2 is inferred as np._ArrayLike, not np.ndarray

I wonder now why I didn't try setting np._ArrayLike as the expected input type, since that covers both cases. The Readme does not give that hint, but maybe the correct answer is simply to use that type.
However, even the methods defined within _ArrayLike should return ndarray, as there are no corresponding methods in numpy that return an array-like.
