Any incorrectly inferred for a call to an overloaded function #1322

JukkaL · 2016-04-01T10:35:48Z

This was reported by Alex Allain.

Mypy doesn't give an error for this program even though it should:

from typing import TypeVar, overload, List

T = TypeVar('T')

@overload
def f(x: T) -> List[T]: ...
@overload
def f(x: T, y: int) -> int: pass

def not_checked(text):
    # type: (str) -> str
    return f(*[])  # no error reported!

Mypy infers type List[None] for [] (which is arguably okay, though a little confusing), and this results in all overload variants matching, which is again okay. Now since multiple overload variants match, mypy incorrectly gives up and infers Any as the return type of f(*[]), even though in this case we could infer a more precise type: a union of all return types of all overload variants, i.e. Union[List[None], int]. Alternatively, mypy could give up and say that the return type is ambiguous, but it's less clear when this would be an okay thing to do (if ever).

The text was updated successfully, but these errors were encountered:

gvanrossum · 2016-04-03T22:10:05Z

Why is it okay if multiple overloads match? I thought mypy had some check that warns if overlaps are possible when the overload is defined, so I'd expect that that would produce an error (and I commend Alex for having found an example that matches multiple overloads).

Given that you generally don't like unions I'm a little surprised you propose one here. But I suppose if both variants were to return the same type we should at least infer that type.

Also, once we implement strict Optional checking, List[None] no longer would match all overload variants, right?

JukkaL · 2016-04-04T12:51:48Z

Originally the idea was that overloads shouldn't be overlapping, but this got abandoned since it relied on multiple inheritance from built-in types being restricted so that you can't subclass int and str, for example. If almost any pair of classes can be subclassed, almost all pairs of types are potentially overlapping. Currently mypy doesn't keep track of which classes are built-in / extension classes so it can't do this sort of enforcement easily. Besides, this also seemed to be too restrictive, and the ambiguity due to overlapping overload variants doesn't seem to be a significant problem most of the time. The story is a bit more complex than this, as some forms of overlapping are always okay. Maybe I should write some more detailed docs about how overloaded functions works.

I still don't like union types, but here they just might be justified. However, let's start with an example where union types are plausible but problematic. Consider this fragment:

# stub for m
@overload
def f(x: str) -> str: ...
@overload
def f(x: str, y: str) -> int: ...

# program
from m import f
a = <something> # type: Iterable[str]
f(*a)  # Is this valid? What's the type?

Mypy currently assumes that either variant of f could be invoked on the last line -- we can't predict the number of items in an Iterable, and the call could target either of the variants. There are a few things we could do here:

Disallow the call, because the return type is ambiguous. It would have to rewritten, perhaps like this:
```
a = <something> # type: Iterable[str]
x, y = a
f(x, y)
```
Disallow the call, because the length of *args argument cannot be determined, and the call could fail at runtime if the length is not 1 or 2. The workaround would be the same as above.
Infer type Union[int, str] for the result, as either variant could be called. I don't like this very much because the programmer might know that a particular variant will always be called, and the union type would require a cast or similar.
Infer type Any for the result -- this is what mypy does right now, though unintentionally. This is bad because it can produce Any types from seemingly annotated code.

The above example is pretty academic and unlikely to happen often in the wild. The case of zip is more realistic. Let's forget about List[None] for a while, as that should be fixed once None is no longer special as you mentioned. A similar thing can happen without it:

a = <something> # type: Iterable[List[int]]
zip(*a)  # What's the type of this?

zip is defined as an overloaded function, as variadic type variables aren't supported. The correct type for zip(*a) would be Iterator[Tuple[int, ...]], but the stub file makes it hard to come up with this type, since it only supports cases with up to four arguments. If we'd added an extra overload variant that accepts and arbitrary number of iterables (via *args), we could have something like this (I'm simplifying a bit by only including 3 variants):

@overload
def zip(iter1: Iterable[_T1]) -> Iterator[Tuple[_T1]]: ...
@overload
def zip(iter1: Iterable[_T1], iter2: Iterable[_T2]) -> Iterator[Tuple[_T1, _T2]]: ...
@overload
def zip(iter1: Iterable[_T], 
        iter2: Iterable[_T], 
        iter3: Iterable[_T], 
        *iter: Iterable[_T]) -> Iterator[Tuple[_T, ...]]: ...
# The last variant is not included in the current stub

Now the call to zip(*a) could still match all of the variants, and the return type is ambiguous. However, if we'd use a union for the return type, we could actually simplify the union to just Iterator[Tuple[_T, ...]] which would be good. So here using union types would actually work.

A suitable rule to cover both of the above examples might be something like this:

If multiple overload variants match [+ some extra conditions that I'm leaving out for simplicity], the type of the call is the union of the return types of the matching variants, assuming this union can be simplified to a non-union type (if one the variants returns a union, then a union is fine after simplification). If the result would be a union, generate a error and ask the code to be refactored.

gvanrossum · 2016-04-04T16:31:39Z

OK, as long as the decision that multiple variants match happens rarely
enough, and as long as the zip(*a) example actually works as expected.

JukkaL · 2016-04-05T10:47:56Z

I'm not sure whether my proposal would work in practice, but it wouldn't too hard to run an experiment.

ilevkivskyi · 2017-08-14T21:00:53Z

This also affects protocols, I think we might want to raise priority of this to high.

This commit adds support for very basic and simple union math when calling overloaded functions, resolving python#4576. One thing led to another, and this ended up accidentally fixing or touching on several different overload-related issues. In particular, I believe this pull request: 1. Fixes the bug (?) where calling overloaded functions can sometimes silently infer a return type of 'Any' 2. Changes the semantics of how mypy handles overlapping functions, which I believe is currently under discussion in python/typing#253 Although this change is functional and mergable, I was planning on polishing it more -- adding more tests, fleshing out the union math behavior, etc. However, I think these are sort of big changes and wanted to check in and make sure this pull request is actually welcome/is a good idea. If not, let me know, and I'd be happy to abandon it. --- Details on specific changes made: 1. The new algorithm works by modifying checkexpr.overload_call_targets to return all possible matches, rather then just one. We start by trying the first matching signature. If there was some error, we (conservatively) attempt to union all of the matching signatures together and repeat the typechecking process. If it doesn't seem like it's possible to combine the matching signatures in a sound way, we end and just output the errors we obtained from typechecking the first match. The "signature-unioning" code is currently deliberately very conservative. I figured it was better to start small and attempt to handle only basic cases like python#1943 and relax the restrictions later as needed. For more details on this algorithm, see the comments in checkexpr.union_overload_matches. 2. This change incidentally resolves any bugs related to how calling an overloaded function can sometimes silently infer a return type of Any. Previously, if a function call caused an overload to be less precise then a previous one, we gave up and returned a silent Any. This change removes this case altogether and only infers Any if either (a) the caller arguments explicitly contains Any or (b) if there was some error. For example, see python#3295 and python#1322 -- I believe this pull request touches on and maybe resolves (??) those two issues. 3. As a result, I needed to fix a few parts of mypy that were relying on this "silently infer Any" behavior -- see the changes in checker.py and semanal.py. Both files were using expressions of the form `zip(*iterable)`, which ended up having a type of `Any` under the old algorithm. The new algorithm will instead infer `Iterable[Tuple[Any, ...]]` which actually matches the stubs in typeshed. 4. These changes cause the attr stubs in `test-data/unit/lib-stub` to no longer work. It seems that the stubs both here and in typeshed were both also falling prey to the 'silently infer Any' bug: code like `a = attr.ib()` typechecked not because they matched the signature of any of the overloads, but because that particular call caused one or more overloads to overlap, which made mypy give up and infer Any. I couldn't find a clean way of fixing the stubs to infer the correct thing under this new behavior, so just gave up and removed the overloads altogether. I think this is fine though -- it seems like the attrs plugin infers the correct type for us anyways, regardless of what the stubs say. If this pull request is accepted, I plan on submitting a similar pull request to the stubs in typeshed. 4. This pull request also probably touches on python/typing#253. We still require the overloads to be written from the most narrow to general and disallow overlapping signatures. However, if a *call* now causes overlaps, we try the "union" algorithm described above and default to selecting the first matching overload instead of giving up.

This commit adds support for very basic and simple union math when calling overloaded functions, resolving python#4576. As a side effect, this change also fixes a bug where calling overloaded functions can sometimes silently infer a return type of 'Any' and slightly modifies the semantics of how mypy handles overlaps in overloaded functions. Details on specific changes made: 1. The new algorithm works by modifying checkexpr.overload_call_targets to return all possible matches, rather then just one. We start by trying the first matching signature. If there was some error, we (conservatively) attempt to union all of the matching signatures together and repeat the typechecking process. If it doesn't seem like it's possible to combine the matching signatures in a sound way, we end and just output the errors we obtained from typechecking the first match. The "signature-unioning" code is currently deliberately very conservative. I figured it was better to start small and attempt to handle only basic cases like python#1943 and relax the restrictions later as needed. For more details on this algorithm, see the comments in checkexpr.union_overload_matches. 2. This change incidentally resolves any bugs related to how calling an overloaded function can sometimes silently infer a return type of Any. Previously, if a function call caused an overload to be less precise then a previous one, we gave up and returned a silent Any. This change removes this case altogether and only infers Any if either (a) the caller arguments explicitly contains Any or (b) if there was some error. For example, see python#3295 and python#1322 -- I believe this pull request touches on and maybe resolves (??) those two issues. 3. As a result, this caused a few errors in mypy where code was relying on this "silently infer Any" behavior -- see the changes in checker.py and semanal.py. Both files were using expressions of the form `zip(*iterable)`, which ended up having a type of `Any` under the old algorithm. The new algorithm will instead infer `Iterable[Tuple[Any, ...]]` which actually matches the stubs in typeshed. 4. Many of the attrs tests were also relying on the same behavior. Specifically, these changes cause the attr stubs in `test-data/unit/lib-stub` to no longer work. It seemed that expressions of the form `a = attr.ib()` were evaluated to 'Any' not because of a stub, but because of the 'silent Any' bug. I couldn't find a clean way of fixing the stubs to infer the correct thing under this new behavior, so just gave up and removed the overloads altogether. I think this is fine though -- it seems like the attrs plugin infers the correct type for us anyways, regardless of what the stubs say. If this pull request is accepted, I plan on submitting a similar pull request to the stubs in typeshed. 4. This pull request also probably touches on python/typing#253. We still require the overloads to be written from the most narrow to general and disallow overlapping signatures. However, if a *call* now causes overlaps, we try the "union" algorithm described above and default to selecting the first matching overload instead of giving up.

Michael0x2a · 2018-06-15T04:28:54Z

The original example now fails with an Incompatible return value type (got "List[<nothing>]", expected "str") error (almost certainly after f61c2ba landed). It's admittedly a bit more cryptic then I would like, but I think that's an unavoidable consequence of misusing generics and it at least indicates to the user that something is wrong. We've also worked out semantics I think we're reasonably happy with (though it ended up being less strict overall then what Jukka was proposing up above).

So, I'm thinking we can close this issue? Feel free to reopen if anybody disagrees.

JukkaL added the bug mypy got something wrong label Apr 1, 2016

gvanrossum added this to the 0.4.0 milestone Apr 7, 2016

JukkaL mentioned this issue Apr 18, 2016

Overloads with None don't work #1399

Closed

rwbarton mentioned this issue May 5, 2016

"No overload variant of sub" Issue with re.compile after upgrading to mypy 0.4 #1492

Closed

gvanrossum mentioned this issue Jul 26, 2016

Clarify what @overload means python/typing#253

Closed

gvanrossum added the priority-1-normal label Jan 6, 2017

gvanrossum removed this from the 0.5 milestone Mar 29, 2017

JukkaL mentioned this issue Apr 26, 2017

Any inferred when overloaded function called with None argument #3256

Closed

JukkaL mentioned this issue May 18, 2017

zip silently returns Any because of loose join with star types #3370

Open

ilevkivskyi mentioned this issue Mar 14, 2018

@overload with identical arguments in the signature provides unexpected results. #4694

Closed

Michael0x2a mentioned this issue Apr 2, 2018

Adds support for basic union math with overloads #4842

Closed

ilevkivskyi added the topic-overloads label May 20, 2018

Michael0x2a closed this as completed Jun 15, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Any incorrectly inferred for a call to an overloaded function #1322

Any incorrectly inferred for a call to an overloaded function #1322

JukkaL commented Apr 1, 2016

gvanrossum commented Apr 3, 2016

JukkaL commented Apr 4, 2016

gvanrossum commented Apr 4, 2016

JukkaL commented Apr 5, 2016

ilevkivskyi commented Aug 14, 2017

Michael0x2a commented Jun 15, 2018

Any incorrectly inferred for a call to an overloaded function #1322

Any incorrectly inferred for a call to an overloaded function #1322

Comments

JukkaL commented Apr 1, 2016

gvanrossum commented Apr 3, 2016

JukkaL commented Apr 4, 2016

gvanrossum commented Apr 4, 2016

JukkaL commented Apr 5, 2016

ilevkivskyi commented Aug 14, 2017

Michael0x2a commented Jun 15, 2018