Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Any incorrectly inferred for a call to an overloaded function #1322

Closed
JukkaL opened this issue Apr 1, 2016 · 6 comments
Closed

Any incorrectly inferred for a call to an overloaded function #1322

JukkaL opened this issue Apr 1, 2016 · 6 comments
Labels

Comments

@JukkaL
Copy link
Collaborator

JukkaL commented Apr 1, 2016

This was reported by Alex Allain.

Mypy doesn't give an error for this program even though it should:

from typing import TypeVar, overload, List

T = TypeVar('T')

@overload
def f(x: T) -> List[T]: ...
@overload
def f(x: T, y: int) -> int: pass

def not_checked(text):
    # type: (str) -> str
    return f(*[])  # no error reported!

Mypy infers type List[None] for [] (which is arguably okay, though a little confusing), and this results in all overload variants matching, which is again okay. Now since multiple overload variants match, mypy incorrectly gives up and infers Any as the return type of f(*[]), even though in this case we could infer a more precise type: a union of all return types of all overload variants, i.e. Union[List[None], int]. Alternatively, mypy could give up and say that the return type is ambiguous, but it's less clear when this would be an okay thing to do (if ever).

@JukkaL JukkaL added the bug mypy got something wrong label Apr 1, 2016
@gvanrossum
Copy link
Member

Why is it okay if multiple overloads match? I thought mypy had some check that warns if overlaps are possible when the overload is defined, so I'd expect that that would produce an error (and I commend Alex for having found an example that matches multiple overloads).

Given that you generally don't like unions I'm a little surprised you propose one here. But I suppose if both variants were to return the same type we should at least infer that type.

Also, once we implement strict Optional checking, List[None] no longer would match all overload variants, right?

@JukkaL
Copy link
Collaborator Author

JukkaL commented Apr 4, 2016

Originally the idea was that overloads shouldn't be overlapping, but this got abandoned since it relied on multiple inheritance from built-in types being restricted so that you can't subclass int and str, for example. If almost any pair of classes can be subclassed, almost all pairs of types are potentially overlapping. Currently mypy doesn't keep track of which classes are built-in / extension classes so it can't do this sort of enforcement easily. Besides, this also seemed to be too restrictive, and the ambiguity due to overlapping overload variants doesn't seem to be a significant problem most of the time. The story is a bit more complex than this, as some forms of overlapping are always okay. Maybe I should write some more detailed docs about how overloaded functions works.

I still don't like union types, but here they just might be justified. However, let's start with an example where union types are plausible but problematic. Consider this fragment:

# stub for m
@overload
def f(x: str) -> str: ...
@overload
def f(x: str, y: str) -> int: ...

# program
from m import f
a = <something> # type: Iterable[str]
f(*a)  # Is this valid? What's the type?

Mypy currently assumes that either variant of f could be invoked on the last line -- we can't predict the number of items in an Iterable, and the call could target either of the variants. There are a few things we could do here:

  1. Disallow the call, because the return type is ambiguous. It would have to rewritten, perhaps like this:

    a = <something> # type: Iterable[str]
    x, y = a
    f(x, y)
    
  2. Disallow the call, because the length of *args argument cannot be determined, and the call could fail at runtime if the length is not 1 or 2. The workaround would be the same as above.

  3. Infer type Union[int, str] for the result, as either variant could be called. I don't like this very much because the programmer might know that a particular variant will always be called, and the union type would require a cast or similar.

  4. Infer type Any for the result -- this is what mypy does right now, though unintentionally. This is bad because it can produce Any types from seemingly annotated code.

The above example is pretty academic and unlikely to happen often in the wild. The case of zip is more realistic. Let's forget about List[None] for a while, as that should be fixed once None is no longer special as you mentioned. A similar thing can happen without it:

a = <something> # type: Iterable[List[int]]
zip(*a)  # What's the type of this?

zip is defined as an overloaded function, as variadic type variables aren't supported. The correct type for zip(*a) would be Iterator[Tuple[int, ...]], but the stub file makes it hard to come up with this type, since it only supports cases with up to four arguments. If we'd added an extra overload variant that accepts and arbitrary number of iterables (via *args), we could have something like this (I'm simplifying a bit by only including 3 variants):

@overload
def zip(iter1: Iterable[_T1]) -> Iterator[Tuple[_T1]]: ...
@overload
def zip(iter1: Iterable[_T1], iter2: Iterable[_T2]) -> Iterator[Tuple[_T1, _T2]]: ...
@overload
def zip(iter1: Iterable[_T], 
        iter2: Iterable[_T], 
        iter3: Iterable[_T], 
        *iter: Iterable[_T]) -> Iterator[Tuple[_T, ...]]: ...
# The last variant is not included in the current stub

Now the call to zip(*a) could still match all of the variants, and the return type is ambiguous. However, if we'd use a union for the return type, we could actually simplify the union to just Iterator[Tuple[_T, ...]] which would be good. So here using union types would actually work.

A suitable rule to cover both of the above examples might be something like this:

If multiple overload variants match [+ some extra conditions that I'm leaving out for simplicity], the type of the call is the union of the return types of the matching variants, assuming this union can be simplified to a non-union type (if one the variants returns a union, then a union is fine after simplification). If the result would be a union, generate a error and ask the code to be refactored.

@gvanrossum
Copy link
Member

OK, as long as the decision that multiple variants match happens rarely
enough, and as long as the zip(*a) example actually works as expected.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Apr 5, 2016

I'm not sure whether my proposal would work in practice, but it wouldn't too hard to run an experiment.

@ilevkivskyi
Copy link
Member

This also affects protocols, I think we might want to raise priority of this to high.

Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 2, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

One thing led to another, and this ended up accidentally fixing or
touching on several different overload-related issues. In particular,
I believe this pull request:

1.  Fixes the bug (?) where calling overloaded functions can sometimes
    silently infer a return type of 'Any'

2.  Changes the semantics of how mypy handles overlapping functions,
    which I believe is currently under discussion in python/typing#253

Although this change is functional and mergable, I was planning on
polishing it more -- adding more tests, fleshing out the union math
behavior, etc.

However, I think these are sort of big changes and wanted to check in
and make sure this pull request is actually welcome/is a good idea.
If not, let me know, and I'd be happy to abandon it.

---

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, I needed to fix a few parts of mypy that were
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  These changes cause the attr stubs in `test-data/unit/lib-stub` to
    no longer work. It seems that the stubs both here and in typeshed
    were both also falling prey to the 'silently infer Any' bug: code
    like `a = attr.ib()` typechecked not because they matched the
    signature of any of the overloads, but because that particular call
    caused one or more overloads to overlap, which made mypy give up and
    infer Any.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 2, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

One thing led to another, and this ended up accidentally fixing or
touching on several different overload-related issues. In particular,
I believe this pull request:

1.  Fixes the bug (?) where calling overloaded functions can sometimes
    silently infer a return type of 'Any'

2.  Changes the semantics of how mypy handles overlapping functions,
    which I believe is currently under discussion in python/typing#253

Although this change is functional and mergable, I was planning on
polishing it more -- adding more tests, fleshing out the union math
behavior, etc.

However, I think these are sort of big changes and wanted to check in
and make sure this pull request is actually welcome/is a good idea.
If not, let me know, and I'd be happy to abandon it.

---

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, I needed to fix a few parts of mypy that were
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  These changes cause the attr stubs in `test-data/unit/lib-stub` to
    no longer work. It seems that the stubs both here and in typeshed
    were both also falling prey to the 'silently infer Any' bug: code
    like `a = attr.ib()` typechecked not because they matched the
    signature of any of the overloads, but because that particular call
    caused one or more overloads to overlap, which made mypy give up and
    infer Any.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 2, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

As a side effect, this change also fixes a bug where calling overloaded
functions can sometimes silently infer a return type of 'Any' and
slightly modifies the semantics of how mypy handles overlaps in
overloaded functions.

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, this caused a few errors in mypy where code was
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  Many of the attrs tests were also relying on the same behavior.
    Specifically, these changes cause the attr stubs in
    `test-data/unit/lib-stub` to no longer work. It seemed that expressions
    of the form `a = attr.ib()` were evaluated to 'Any' not because of a
    stub, but because of the 'silent Any' bug.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 11, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

As a side effect, this change also fixes a bug where calling overloaded
functions can sometimes silently infer a return type of 'Any' and
slightly modifies the semantics of how mypy handles overlaps in
overloaded functions.

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, this caused a few errors in mypy where code was
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  Many of the attrs tests were also relying on the same behavior.
    Specifically, these changes cause the attr stubs in
    `test-data/unit/lib-stub` to no longer work. It seemed that expressions
    of the form `a = attr.ib()` were evaluated to 'Any' not because of a
    stub, but because of the 'silent Any' bug.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Apr 23, 2018
This commit adds support for very basic and simple union math when
calling overloaded functions, resolving python#4576.

As a side effect, this change also fixes a bug where calling overloaded
functions can sometimes silently infer a return type of 'Any' and
slightly modifies the semantics of how mypy handles overlaps in
overloaded functions.

Details on specific changes made:

1.  The new algorithm works by modifying checkexpr.overload_call_targets
    to return all possible matches, rather then just one.

    We start by trying the first matching signature. If there was some
    error, we (conservatively) attempt to union all of the matching
    signatures together and repeat the typechecking process.

    If it doesn't seem like it's possible to combine the matching
    signatures in a sound way, we end and just output the errors we
    obtained from typechecking the first match.

    The "signature-unioning" code is currently deliberately very
    conservative. I figured it was better to start small and attempt to
    handle only basic cases like python#1943 and relax the restrictions later
    as needed. For more details on this algorithm, see the comments in
    checkexpr.union_overload_matches.

2.  This change incidentally resolves any bugs related to how calling
    an overloaded function can sometimes silently infer a return type
    of Any. Previously, if a function call caused an overload to be
    less precise then a previous one, we gave up and returned a silent
    Any.

    This change removes this case altogether and only infers Any if
    either (a) the caller arguments explicitly contains Any or (b) if
    there was some error.

    For example, see python#3295 and python#1322 -- I believe this pull request touches
    on and maybe resolves (??) those two issues.

3.  As a result, this caused a few errors in mypy where code was
    relying on this "silently infer Any" behavior -- see the changes in
    checker.py and semanal.py. Both files were using expressions of the
    form `zip(*iterable)`, which ended up having a type of `Any` under
    the old algorithm. The new algorithm will instead infer
    `Iterable[Tuple[Any, ...]]` which actually matches the stubs in
    typeshed.

4.  Many of the attrs tests were also relying on the same behavior.
    Specifically, these changes cause the attr stubs in
    `test-data/unit/lib-stub` to no longer work. It seemed that expressions
    of the form `a = attr.ib()` were evaluated to 'Any' not because of a
    stub, but because of the 'silent Any' bug.

    I couldn't find a clean way of fixing the stubs to infer the correct
    thing under this new behavior, so just gave up and removed the
    overloads altogether. I think this is fine though -- it seems like
    the attrs plugin infers the correct type for us anyways, regardless
    of what the stubs say.

    If this pull request is accepted, I plan on submitting a similar
    pull request to the stubs in typeshed.

4.  This pull request also probably touches on
    python/typing#253. We still require the
    overloads to be written from the most narrow to general and disallow
    overlapping signatures.

    However, if a *call* now causes overlaps, we try the "union"
    algorithm described above and default to selecting the first
    matching overload instead of giving up.
@Michael0x2a
Copy link
Collaborator

The original example now fails with an Incompatible return value type (got "List[<nothing>]", expected "str") error (almost certainly after f61c2ba landed). It's admittedly a bit more cryptic then I would like, but I think that's an unavoidable consequence of misusing generics and it at least indicates to the user that something is wrong. We've also worked out semantics I think we're reasonably happy with (though it ended up being less strict overall then what Jukka was proposing up above).

So, I'm thinking we can close this issue? Feel free to reopen if anybody disagrees.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants