Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Types for singleton objects #236

Closed
JukkaL opened this issue Jun 17, 2016 · 20 comments
Closed

Types for singleton objects #236

JukkaL opened this issue Jun 17, 2016 · 20 comments

Comments

@JukkaL
Copy link
Contributor

JukkaL commented Jun 17, 2016

I repeatedly encounter code that uses a singleton instance that is only used to mark some special condition. Here is an example:

_empty = object()

def f(x=_empty):
    if x is _empty:
        # default argument value
        return 0
    elif x is None:
        # argument was provided and it's None
        return 1
    else:
        # non-None argument provided
       return x * 2

Currently PEP 484 doesn't provide primitives for type checking such code precisely. Workarounds such as casts or declaring the type of the special object as Any are possible. We can't use the type object in a union type, since that would cover all possible values, as object is the common base class of everything.

We could refactor the code to use a separate class for the special object and use isinstance to check for the special value, but this may require changing the code in non-trivial ways for larger examples and carries some runtime overhead as isinstance checks are slower than is operations. Example:

from typing import Union

class Empty: pass

_empty = Empty()

def f(x: Union[int, None, Empty] = _empty) -> int:
    # use isinstance so that type checkers can recognize that we are decomposing a union
    if isinstance(x, Empty):
        # default argument value
        return 0
    elif x is None:
        # argument was provided and it's None
        return 1
    else:
        # non-None argument provided
       return x * 2

A potentially better way is to support user-defined types that only include a single instance. This can be seen as a generalization of the None type. Here is a potential way to use this to type check the example:

from typing import SingletonType, Union

_empty = object()

Empty = SingletonType('Empty', _empty)

def f(x: Union[int, None, Empty] = _empty) -> int:
    if x is _empty:
        # default argument value
        return 0
    elif x is None:
        # argument was provided and it's None
        return 1
    else:
        # non-None argument provided
       return x * 2

The final example is better than second example in a few ways:

  • It's closer to the original code.
  • It's as fast as the original code (modulo one-time overhead due to type annotations).
@ilevkivskyi
Copy link
Member

From the point of view of PEP 484 maybe it is better to have a more general construct that defines a type, variables of which can take only fixed set of values. In analogy with Type[C] it could be called Value, so that a variable of type Value[0, 1] could take only values 0 or 1. Then your example will be:

from typing import Value, Union

_empty = object()

def f(x: Union[int, None, Value[_empty]] = _empty) -> int:
    ...

Or one can define a type alias Empty = Value[_empty] if it is used more than few times.

Another argument for this is that I often use functions that can accept few such singletons, something like (this is almost my real code):

task = object()
result = object()

def response(message):
    if message is task:
        ...
    elif message is result:
        ...
    else:
        ...

here message could be typed with Union[Value[task, result], int].

@gvanrossum
Copy link
Member

gvanrossum commented Jun 18, 2016 via email

@ilevkivskyi
Copy link
Member

@gvanrossum As I understand Jukka's point, time or memory are not the main problems here (these probably could be solved by NewType) but the ability for mypy to detect the if branch that corresponds to _empty. If in your example mypy understands that by

if x is _empty:
    ...

we are decomposing a union, then this workaround would be OK. Currently, if I use your way

class _empty: pass

def f(x: Union[int, None, Type[_empty]] = _empty) -> int:
    if x is _empty:
        return 0
    elif x is None:
        return 1
    else:
       return x * 2

then mypy complains with

error: Unsupported operand types for * ("Union[int, Type[_empty]]" and "int")

Jukka's idea with isinstance works but this could require some substantial refactoring in more complex code, so that @JukkaL doesn't like it.

@markshannon
Copy link
Member

markshannon commented Jun 19, 2016

Why should we burden the user with specifying "internal" types?
The checker should be able to infer them (at least in this case).
In the example, you can only pass in an int or None, so the type should be Optional[int], as follows:

_empty = object()

def f(x: Optional[int]=_empty)->int:
    if x is _empty:
        # infer x is _empty
        return 0
    elif x is None:
        # infer x is None
        return 1
    else:
        # infer x is not empty and x is not None => x has type 'int'
       return x * 2

@JukkaL
Copy link
Contributor Author

JukkaL commented Jun 20, 2016

@gvanrossum The is _empty check wouldn't be exhaustive, as there could be subclasses of _empty defined elsewhere in a program, so we'd need something like a subclass check for the example, which would be a little odd-looking way to implement this. Otherwise mypy would be reasonable to give the error message that @ilevkivskyi encountered.

@ilevkivskyi Yes, my main concerns are 1) how type checkers can detect that we are decomposing a union and 2) how easy and error-prone will it be to annotate or refactor existing code that uses the idiom. Performance is still a potential issue, but it's probably less likely to be a blocker.

@markshannon Your suggestion would likely not work for more complex examples, and it's difficult for a type checker to decide if something is an internal thing that shouldn't be exposed in the type signature. Also, the special object doesn't need to be internal -- it could be part of a public API. Here are some use cases that I remember having seen in real code:

  • A special object used as the default argument value (similar to my original example).
  • A special object used to denote an uninitialized module-level variable, where None is a valid value for the variable.
  • A special value passed through a message queue to represent a shutdown message. Again, sometimes None was used as a regular value so it wasn't possible to use it as the special value.

@gvanrossum
Copy link
Member

gvanrossum commented Jun 20, 2016 via email

@ilevkivskyi
Copy link
Member

I like the idea with enums. It looks like it allows to not change the PEP and only change mypy.

@JukkaL
Copy link
Contributor Author

JukkaL commented Jun 29, 2016

That sounds reasonable. Basically any enum value would then also be valid as a singleton type, in addition to being a regular value.

It seems to be that this would still need a PEP change, as we'd need to declare enum values valid as types. Also, typing would have to be updated to accept enum values as types at runtime.

@markshannon
Copy link
Member

How does using enums allow the original example to be type annotated?

If you ask people to change their APIs to fit the typechecker, then they'll just use Any as the type.

@JukkaL
Copy link
Contributor Author

JukkaL commented Jun 29, 2016

The enum idea would still require a change to the original code, but it would be a more local change and wouldn't affect performance. However, if we'd still have to modify the PEP and typing, maybe we should also consider a more general feature that would support the original example with only typing-related code changes.

@gvanrossum
Copy link
Member

On Wed, Jun 29, 2016 at 11:29 AM, Mark Shannon notifications@github.com
wrote:

How does using enums allow the original example to be type annotated?

You would make the argument type the union of the "regular" type and the
"marker" enum.

If you ask people to change their APIs to fit the typechecker, then
they'll just use Any as the type.

Actually we have a lot of experience with this at Dropbox and people in
generally are often willing to change their code. Often the original code,
while working, was nothing to be proud of, and the desire to use the type
checker was a good motivator to clean it up. It's similar to refactoring
your code to be more testable -- most developers are willing to do that
because they see the benefits of tests.

@gvanrossum
Copy link
Member

On Wed, Jun 29, 2016 at 11:57 AM, Jukka Lehtosalo notifications@github.com
wrote:

The enum idea would still require a change to the original code, but it
would be a more local change and wouldn't affect performance. However, if
we'd still have to modify the PEP and typing, maybe we should also
consider a more general feature that would support the original example
with only typing-related code changes.

I'd like to draw a line between modifying the PEP and modifying typing.py.
The PEP is easily fungible; typing.py lives in the 3.5 stdlib and is a pain
to update.

If we write the code like I proposed, no change to typing.py should be
necessary (but we should update the PEP to clarify how it works and of
course we should implement the behavior in mypy).

--Guido van Rossum (python.org/~guido)

@markshannon
Copy link
Member

markshannon commented Jun 29, 2016

They may well be willing to change it at Dropbox. Not everywhere is like Dropbox 😄

Using a unique sentinel object to represent the absence of an argument is a standard pattern, and there is nothing wrong with it. It is hard to claim that

import enum 
class Empty(enum.Enum): 
     token = 0

SENTINEL = Empty.token

is an improvement over

SENTINEL = object()

The PEP is easily fungible; typing.py lives in the 3.5 stdlib and is a pain to update.

A good reason to keep typing as minimal as possible. IMO, the typing module should do nothing, apart from preventing NameErrors when using PEP 484 type annotations.
If typing does no checking, then adding a new type should be a trivial and backward compatible change.

@JukkaL
Copy link
Contributor Author

JukkaL commented Jun 29, 2016

Union[Empty.token, int] generates a runtime error for me, as Union checks whether the arguments are valid types.

@gvanrossum
Copy link
Member

gvanrossum commented Jun 29, 2016 via email

@ilevkivskyi
Copy link
Member

I think that the idea proposed by @gvanrossum is a reasonable compromise:

  1. It does not require changes to typing.py.
  2. The modifications needed in existing code are quite minimal.
  3. It could be generalized to situations with more than one special value in enum.
    If mypy could be taught to destructure a union with values of a "single-valued" type (like None or Empty.token), then probably it could be taught to destructure a union with any statically fixed number of values (which is the case for Enum). For example:
from enum import Enum
from typing import Union

class Reason(Enum):
    timeout = 1
    error = 2

def process(response: Union[int, Reason] = 0) -> int:
    if result is Reason.timeout:
        return 1
    elif result is Reason.error:
        return 2
    else:
        # at this point we know that result could be only int
       return x * 5

@JukkaL
Copy link
Contributor Author

JukkaL commented Jun 30, 2016

I agree that the final idea by @gvanrossum seems pretty reasonable, and not too difficult to add to type checkers. If it seems that users don't like it we can fall back to a more general solution, but it would be there for those who really need it without having to wait for a typing update.

However, I'd expect that this wouldn't be intuitive for users, as this does not follow the established convention. Tools that support the new convention should likely document this explicitly and explain how to refactor code that uses the special_object = object() idiom. I wonder if we should recommend the new convention also in PEP 484?

@gvanrossum
Copy link
Member

Yeah, this sounds like a good first step. @ilevkivskyi would you be interested in drafting something for the PEP?

@ilevkivskyi
Copy link
Member

OK, sure, I will make a PR by Monday.

@gvanrossum
Copy link
Member

Closed by #240. (We're still waiting for implementation in mypy: python/mypy#1803)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants