Protocols (a.k.a. structural subtyping) #11

Closed
gvanrossum opened this issue Oct 16, 2014 · 87 comments

Comments

@gvanrossum
Member

In mypy's typing.py there's a neat feature called Protocol. Perhaps we should add this to the PEP. Or perhaps it's similar to a Python ABC?

@JukkaL
Contributor

JukkaL commented Oct 16, 2014

It's meant to be used for structural subtyping. It is quite similar to ABCs, but neither is a replacement for the other. The details of how to use Protocol are still poorly defined, as the mypy type system does not know about structural subtyping yet. However, adding support for structural subtyping shouldn't be too difficult, once we figure out the semantics.

Here is an example of how Protocol could be used:

class SupportsFileno(Protocol):
    @abstractmethod
    def fileno(self) -> int: pass

def f(file: SupportsFileno) -> None:
    id = file.fileno()   # Okay
    ...

f(open('foo'))  # Okay, since file objects have fileno()!

There is no need to explicitly declare that file objects implement SupportsFileno (the type checker would infer this automatically). This is unlike ABCs, where base classes have to be explicitly defined/registered.
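
For readers on Python 3.8 or later, the idea can be tried with `typing.Protocol`, which was standardized later (PEP 544) and did not exist in this form when this comment was written. The following is a minimal sketch under that assumption; `FakeFile` and `get_fd` are hypothetical names used only for illustration.

```python
from typing import Protocol


class SupportsFileno(Protocol):
    def fileno(self) -> int: ...


class FakeFile:
    """Hypothetical class; note that it does not inherit from SupportsFileno."""

    def fileno(self) -> int:
        return 3


def get_fd(file: SupportsFileno) -> int:
    # A static checker accepts any argument that has a compatible fileno().
    return file.fileno()


fd = get_fd(FakeFile())
```

Nothing in `FakeFile` mentions the protocol; compatibility is decided purely from the shape of the class.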

The current implementation of Protocol is just a proof of concept, as is the implementation of overload. A production implementation would probably have to be optimized, and there are probably corner cases that aren't handled yet.

@mvitousek

Protocols look to me like a pretty reasonable way of implementing structural typing, which is definitely something that we're hoping to see in the PEP. This design is necessarily verbose; an in-line, anonymous definition using dictionaries, like

def f(file: {'fileno': Callable[[], int]}) -> None: ...

is a bit more concise when the protocol/structural type is small. This would conflict with the current proposal to use dictionaries in annotations to represent current non-type uses of annotations (https://github.com/ambv/typehinting/blob/master/pep-NNNN.txt#L194), however see #26 for thoughts on alternatives.

@JukkaL
Contributor

JukkaL commented Nov 9, 2014

I haven't seen any evidence for protocol types being used/useful all over the place, so my current working hypothesis is that having a heavy-weight syntax for protocols only adds a trivial amount of syntactic overhead for the vast majority of programs.

However, the absence of evidence doesn't prove anything -- if somebody finds a few counterexamples I'm happy to change my mind. :)

Maybe we could have a look at some Go code and see how many small, throwaway structural types are used there? Of course, having convenient syntax for a feature may nudge programmers into using the feature more often.

@ambv
Contributor

ambv commented Jan 7, 2015

ABCs have runtime instance checks so it's hard for the type checker to use them for structural sub-typing. If it could, that would be the simplest and most elegant solution. Otherwise, could we leave this out for now and look at how we could make it work with existing protocol solutions like Zope interfaces?

@gvanrossum
Member Author

Jukka, your typing.py has a Protocol class that is more complex than most other infrastructure, and it is used for those generic ABCs (e.g. Sized, Hashable) whose collections.abc counterpart does a runtime structural check. Does this mean that you have now implemented this in mypy? I kind of like the resulting syntax for defining a generic ABC that is supposed to use structural testing:

class Sized(Protocol):
    @abstractmethod
    def __len__(self) -> int: pass

class Container(Protocol[T]):
    @abstractmethod
    def __contains__(self, x) -> bool: pass

(Note that Sized is not generic, but Container is.)
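
As a rough illustration of the runtime structural check mentioned above: `collections.abc.Sized` recognizes any class that defines `__len__`, and a user-defined ABC can imitate this with `__subclasshook__`. The sketch below assumes nothing beyond the standard library; `Bucket`, `Sock` and this `SupportsFileno` ABC are hypothetical.

```python
from abc import ABC, abstractmethod
from collections.abc import Sized


class Bucket:
    """Hypothetical class with no explicit base classes."""

    def __len__(self) -> int:
        return 0


class SupportsFileno(ABC):
    """Hypothetical ABC that mimics the collections.abc approach."""

    @abstractmethod
    def fileno(self) -> int: ...

    @classmethod
    def __subclasshook__(cls, C):
        if cls is SupportsFileno:
            # Look for a fileno attribute anywhere in the MRO of C.
            if any("fileno" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented


class Sock:
    def fileno(self) -> int:
        return 7


sized_ok = isinstance(Bucket(), Sized)
fileno_ok = isinstance(Sock(), SupportsFileno)
int_ok = isinstance(1, SupportsFileno)
```

The hook runs arbitrary code at runtime, which is why a static checker cannot in general derive the same answer from it.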

Łukasz: I have no experience with zope interfaces. But perhaps they are easy enough to parse for a hypothetical static type checker, and they feel close enough to types/classes that we should try to support them? Can you sketch a simple example?

@gvanrossum
Member Author

Quoting Łukasz in python/mypy#539 (comment):
"""
As for explicit protocols, there are some competing (as in: not easily composable) standards here:

  1. ABCs with @abstractmethod
  2. Zope interfaces
  3. https://pypi.python.org/pypi/characteristic/

etc. etc.

As ABCs are built-in, it seems natural to suggest that they should be used for defining interfaces. However, I understand that static analysis (so, the type checker) might not be able to process every abstract class in general.
"""

@JukkaL
Contributor

JukkaL commented Jan 8, 2015

Guido, the machinery in mypy's typing.py has been there for a long time, but it was never fully implemented, and in particular, the type checker doesn't know anything about this. It shouldn't really be there -- everything should just use ABCs, until there is type system support for protocols.

I added a new task for removing Protocol:
python/mypy#552

If protocols (or something similar) will be included in the PEP, I'll update the above issue.

@gvanrossum
Member Author

So we have multiple competing ways to spell protocols, but no way to type-check them statically. This feels like something we'll have to leave out of the PEP and come back to later in a separate PEP.

@ambv
Contributor

ambv commented Jan 8, 2015

OK, let's remove Protocol from typing.py for now. We should come back to it in a separate PEP, especially since PEP 245 has been rejected and there are ABCs now.

As for Zope interfaces, an interface definition is easy to parse, except for invariants (runnable checks, similar to instance checks in ABCs). We have to bear in mind that existing interface definitions may sometimes be dynamic; it's Python. That's fine, I think. Whatever the type checker doesn't know, it assumes is correct.

I see two issues with Zope interfaces:

  • arbitrary classes can be externally registered as implementing interfaces, just like ABCs can have external classes registered to them (so we'd have the same problems with this)
  • adaptation (yes, Zope interfaces implement PEP 246) and runtime adapt hooks
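
The external-registration problem can be seen with a plain ABC. This is only a sketch; `Closeable` and `Legacy` are hypothetical names.

```python
from abc import ABC, abstractmethod


class Closeable(ABC):
    @abstractmethod
    def close(self) -> None: ...


class Legacy:
    """Hypothetical third-party class that lacks a close() method."""


# Registration is a runtime call and is not verified against the interface.
Closeable.register(Legacy)

registered = isinstance(Legacy(), Closeable)
has_close = hasattr(Legacy(), "close")
```

After registration `isinstance` reports a match even though `Legacy` has no `close()`, so a checker that trusted registrations would accept code that fails at runtime.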

@ambv ambv self-assigned this Jan 8, 2015
@ambv
Contributor

ambv commented Jan 14, 2015

Closing as this is being left out for now.

@gvanrossum
Member Author

Reopening this, as duck typing is important according to the BDFL-Delegate (Mark Shannon).

@JukkaL
Contributor

JukkaL commented May 19, 2015

I'm writing here a bunch of related issues that came to mind. These are intended to start a discussion -- I try not to propose any course of action yet (though I have some opinions about some of these already).

I'm sure there are other issues not covered by these.

1) What to call these types?

We could call them at least duck types, protocols, interfaces or structural types. I'll call them protocols below, but I'm not specifically advocating any particular term.

2) How to define an interface?

Here are a few possible ways:

@protocol
class Sized: ...

class Size(Protocol): ...

class ISize(Protocol): ...  # Zope interface naming convention

3) How to define a method in a protocol?

A few possible ways:

def __len__(self) -> int: pass

def __len__(self) -> int: ...   # Literal ellipsis; semantically same as above

@abstractmethod
def __len__(self) -> int: pass

def __len__(self) -> int:
    raise NotImplementedError   # I'm not actually advocating this, but it's an option

4) How to define an attribute in a protocol?

Some ideas:

class Foo(...):
    a = ???  # type: Foo    # Annotation optional, not sure what the initializer should be

    b = ... # type: Foo    # As in stubs; literal ellipsis

    @property
    def c(self) -> 'Foo': pass

    @abstractproperty
    def d(self) -> 'Foo': pass

    e = typing.attribute('''docstring''')   # type: Foo   # Similar to Zope interfaces

We could also disallow recursive types (see below).

5) How to explicitly declare that a class conforms to a protocol?

We may want the type checker to verify that a class actually conforms to a protocol. Some ideas for this:

class A(Sized):
    ...

@implements(Sized)
class A:
    ...

class A:
    implements(Sized)   # Inspired by Zope interfaces

Alternatively, this can always be implicit. In that case we can do something like this
to force a check:

if False:
    _dummy = A()  # type: Sized   # Complain if A doesn't implement Sized

We also need to decide whether the subclass will inherit the method implementations defined in the body of the protocol class in case we use regular inheritance (assuming they aren't abstract but regular methods). If we use a class decorator, we probably wouldn't inherit anything.

6) Should we support protocols extending protocols?

We can probably use the same approach for defining a subprotocol as in (5).

7) Should we support optional methods and attributes?

I think TypeScript supports the concept of optional methods. If we declare a method as optional, subtypes don't have to implement it, but if they implement, the signature should be compatible. This could also work with attributes, but coming up with a syntax may be tricky.

8) Should some ABCs in typing be protocols instead?

For example, maybe typing.Sized should be a protocol.

9) Should we support recursive protocols?

For example, when Foo was used in the definition of Foo in (4), it defined a recursive structural type. If we think that they are too tricky we can disallow them.

10) Should we support generic protocols?

For example, if typing.Iterable is a protocol, it would have to be generic.

11) Should these be interoperable with other similar implementations?

See Guido's remark above. Maybe we should interoperate with Zope interfaces (or just borrow the implementation and include it in typing), for example.

@gvanrossum
Member Author

I'm sorry I brought up Zope Interfaces. I'll respond to the rest when I
have a real keyboard.


--Guido van Rossum (on iPad)

@JukkaL
Contributor

JukkaL commented May 19, 2015

Another open issue:

12) How should isinstance work?

We could disallow isinstance or check for the presence of attributes in the protocol (similar to mypy's now-removed Protocol implementation).
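
To make the second option concrete: the `typing` module in Python 3.8+ (which postdates this comment) offers an opt-in `@runtime_checkable` decorator whose `isinstance` check looks only at attribute presence. The sketch below assumes that later API; the three test classes are hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...


class Good:
    def close(self) -> None:
        pass


class WrongSignature:
    # Present, but with a signature that a static checker would reject.
    def close(self, force: bool, retries: int) -> str:
        return "closed"


class Missing:
    pass


results = [
    isinstance(obj, SupportsClose)
    for obj in (Good(), WrongSignature(), Missing())
]
```

`WrongSignature` passes the runtime check because only the name `close` is looked up, which shows how much weaker a presence check is than static structural subtyping.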

@refi64

refi64 commented May 19, 2015

IMO, following the same conventions as stubs would be easier and less confusing.

@JukkaL
Contributor

JukkaL commented Sep 6, 2015

I promised offline to write up a proposal for structural subtyping. This is the first iteration of one. If we reach a consensus, I can write this into a PEP.

Motivation

Currently, typing defines ABCs for several common Python protocols such as Iterable and Sized. The problem with them is that a class has to be explicitly marked to support them, which is arguably unpythonic and unlike what you'd normally do in your non statically typed code. For example, this conforms to the current PEP:

from typing import Sized, Iterable, Iterator

class Bucket(Sized, Iterable[int]):
    ...
    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[int]: ...

My intention is that the above code could be written instead equivalently without explicit base classes in the class definition, and Bucket would still be implicitly considered a subtype of both Sized and Iterable[int] by using structural subtyping:

from typing import Iterator

class Bucket:
    ...
    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[int]: ...
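
A runnable version of the base-class-free `Bucket` might look like the sketch below. The constructor and the helper functions `describe` and `total` are hypothetical additions for illustration; whether a given type checker accepts the calls depends on it treating `Sized` and `Iterable` structurally, which is what this proposal asks for.

```python
from typing import Iterable, Iterator, List, Sized


class Bucket:
    def __init__(self, items: List[int]) -> None:
        self._items = items

    def __len__(self) -> int:
        return len(self._items)

    def __iter__(self) -> Iterator[int]:
        return iter(self._items)


def describe(container: Sized) -> int:
    # Only len() is needed, so any class with __len__ should qualify.
    return len(container)


def total(values: Iterable[int]) -> int:
    # Only iteration is needed, so any class with __iter__ should qualify.
    return sum(values)


bucket = Bucket([1, 2, 3])
size = describe(bucket)
subtotal = total(bucket)
```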

As I mentioned earlier, there are many individual design decisions that we need to agree on. I'm proposing answers to many of them here.

1) What to call these types?

Let's call them protocols. The reason is that the term iterator protocol, for example, is widely understood in the community, and coming up with a new term for this concept in a statically typed context would just create confusion.

This has the drawback that the term 'protocol' becomes overloaded with two subtly different meanings: the first is the traditional, well-known but slightly fuzzy concept of protocols such as iterable; the second is the more explicitly defined concept of protocols in statically typed code (or more generally in code that just uses the typing module). I argue that the distinction isn't important most of the time, and in other cases people can just add a qualifier such as "protocol classes" (for the new-style protocols) or "traditional/non-class/implicit protocols".

2) How to define and use a protocol?

There would be a new class typing.Protocol. If this is explicitly included in the base class list, the class is a protocol. Here is a simple example:

from typing import Protocol

class SupportsClose(Protocol):
    def close(self) -> None: ...   # See 3) for more about the '...'.

Now if we define a class Resource with a close method that has a suitable signature, it would implicitly be a subtype of SupportsClose, since we'd use structural subtyping for protocol types:

class Resource:
    ...

    def close(self) -> None:
        self.file.close()
        self.lock.release()

Protocol types can be used in annotations, of course, and for type checking:

def close_all(things: Iterable[SupportsClose]) -> None:
    for t in things:
        t.close()

f = open('foo.txt')
r = Resource(...)
close_all([f, r])  # OK!
close_all([1])  # Error: 'int' has no 'close' method

Note that both the user-defined class Resource and the IO type (the return type of open) would be considered subtypes of SupportsClose because they provide a suitable close method.
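
The pieces above can be combined into one self-contained sketch. It assumes the `typing.Protocol` that later shipped in Python 3.8, uses `io.StringIO` as a stand-in for a real file so that it runs without touching the file system, and gives `Resource` a hypothetical `closed` flag instead of the `file` and `lock` attributes.

```python
import io
from typing import Iterable, Protocol


class SupportsClose(Protocol):
    def close(self) -> None: ...


class Resource:
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def close_all(things: Iterable[SupportsClose]) -> None:
    for t in things:
        t.close()


f = io.StringIO("stand-in for open('foo.txt')")
r = Resource()
close_all([f, r])  # Both objects have a suitable close() method.
```

Passing `[1]` instead would be flagged by a static checker; at runtime it fails only when `close()` is looked up on the `int`.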

If using the current typing module, our only option to implement the above example would be to use an ABC (or type Any, but that would compromise type checking). If we'd use an ABC, we'd have to explicitly register the fact that these types are related, and this is generally difficult to do with library types as the type objects may be hidden deep in the implementation of the library. Besides, this would be uglier than how you'd actually write the code in straight, idiomatic dynamically typed Python. The code with a protocol class matches common Python conventions much better. It's also automatically extensible and works with additional, unrelated classes that happen to implement the required interface.

3) How to define a method in a protocol?

I propose that most of the regular rules for classes still apply to protocol classes (modulo a few exceptions that only apply to protocol classes). I'd like protocols to also be ABCs, so all of these would be valid within a protocol class:

# Variant 1
def __len__(self) -> int: ...

# Variant 2
def __len__(self) -> int: pass

# Variant 3
@abstractmethod
def __len__(self) -> int: pass

# Variant 4
def __len__(self): pass

# Variant 5
def __len__(self) -> int:
    return 0

# Variant 6
def __len__(self) -> int:
    raise NotImplementedError

For variants 1, 2 and 3, a type checker should probably always require an explicit implementation to be defined in a subclass that explicitly subclasses the protocol (see below for more about this), because the implementations return None which is not a valid return type for the method. For variants 4, 5 and 6, we can use the provided implementation as a default implementation. The default implementations won't be used if the subtype relationship is implicit and only via structural subtyping -- the semantics of inheritance won't be changed.

I also propose that a ... as the method body in a protocol type makes the method implicitly abstract. This would only be checked statically, and there won't be any runtime checking. The idea here is that most methods in a protocol will be abstract, and having to always use @abstractmethod is a little verbose and ugly, and has the issue of implicit None return types confusing things. This would be the recommended way of defining methods in a protocol that don't have an implementation, but the other approaches can be used for legacy code or if people just feel like it. The recommended syntax would mirror how methods are defined in stub files.
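
The difference between explicit and implicit subtypes with respect to default implementations can be sketched as follows, again assuming the later `typing.Protocol` from Python 3.8+. All class names are hypothetical.

```python
from typing import Protocol


class SupportsSize(Protocol):
    def size(self) -> int:
        # Non-abstract body: acts as a default for explicit subclasses.
        return 0


class Explicit(SupportsSize):
    """Explicit subclass: inherits the default size()."""


class Implicit:
    """Structurally compatible, but unrelated by inheritance."""

    def size(self) -> int:
        return 5


class Unrelated:
    """No size() at all; nothing is inherited from the protocol."""


explicit_size = Explicit().size()
implicit_size = Implicit().size()
unrelated_has_size = hasattr(Unrelated(), "size")
```

Ordinary inheritance is the only way the default reaches a class; structural compatibility never injects it.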

4) How to define a data attribute in a protocol?

Similar to 3), there will be multiple valid ways of defining data attributes (or properties). All of these will be valid:

class Foo(Protocol):
    a = ...  # type: int  # Variant 1
    b = 0  # Variant 2

    # Variant 3
    @property
    def c(self) -> int: ...

    # Variant 4
    @property
    def c(self) -> int:
        return 0

    # Variant 5
    @property
    def d(self) -> int:
        raise NotImplementedError

    # Variant 6
    @abstractproperty
    def e(self) -> int: ...

    # Variant 7
    @abstractproperty
    def f(self) -> int: pass

Properties with setters can also be defined. The first three variants would be the recommended ways of defining attributes or properties, but similar to 3), the others are possible and may be useful for supporting legacy code.

When using an ... initializer, @abstractproperty or pass/... as property body (and when the type does not include None), the data attribute is considered abstract and must always be explicitly implemented in a compatible class.

Attributes should not be defined in the body of a method by assignment via self. This restriction is a little arbitrary, but my idea is that the protocol class implementation is often not shared by subtypes so the interface should not depend on the default implementation. This is more of a style than a technical issue, as a type checker could infer attributes from assignment statements within methods as well.

When using the ... initializer, the ... initializer might leak into subclasses at runtime, which is unfortunate:

class A(Protocol):
    x = ...  # type: int

class B(A):
    def __init__(self) -> None:
        self.x = 1

b = B()
print(b.x)  # 1
print(B.x)  # Ellipsis

If we'd use a None initializer things wouldn't be any better. Maybe we can modify the metaclass to recognize ... initializers and translate them to something else. This needs to be documented, however. Also, in this proposal there is no way to distinguish between class and instance data attributes.
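
One way the leak can be avoided on newer interpreters is the variable annotation syntax (Python 3.6+), which declares an attribute without assigning a class-level value. That syntax was not available when this was written, so the sketch below is an assumption about how the example would look today, not part of this proposal.

```python
from typing import Protocol


class A(Protocol):
    x: int  # Annotation only; no class-level value is created.


class B(A):
    def __init__(self) -> None:
        self.x = 1


b = B()
instance_value = b.x
class_has_x = hasattr(B, "x")
```

Because nothing is assigned in the class body, there is no placeholder for `B.x` to inherit.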

I'm not sure what to do with __init__. I guess a protocol could provide a default implementation that could be used in explicit subclasses.

Overall, I'm probably the least happy about this part of the proposal.

5) How to explicitly declare that a class conforms to a protocol?

I propose that protocols can be used as regular base classes. I can see at least three things that support this decision. First, a protocol class could define default implementations for some methods (typing.Sequence would be an example if we decide to turn it into a protocol). Second, we want a way of statically enforcing that a class actually implements a protocol correctly (but there are other ways to achieve this effect -- see below for alternatives). Third, this makes it possible to turn an existing ABC into a protocol and just have things (mostly) work. This would be important for the existing ABCs in typing that we may want to change into protocols (see point 8 for more about this). The general philosophy would be that Protocols are mostly like regular ABCs, but a static type checker will handle them somewhat specially.

Note that subclassing a protocol class would not turn the subclass into a protocol unless it also has Protocol as an explicit base class. I assume that we can use metaclass trickery to get this to work correctly.

Some terminology could be useful here for clarity. If a class includes a protocol in its MRO, the class is an (explicit) subclass of the protocol. If a class is a structural subtype of a protocol, it is said to implement the protocol and to be compatible with the protocol. If a class is compatible with a protocol but the protocol is not included in the MRO, the class is an implicit subclass of the protocol.

We could also explicitly add an assignment for checking that a class implements a protocol. I've seen a similar pattern in some Go code that I've reviewed. Example:

class A:
    def __len__(self) -> float:
        return ...

_ = A()  # type: Sized  # Error: A.__len__ doesn't conform to 'Sized'
                        # (Incompatible return type 'float')

I don't much care for the above example, as it moves the check away from the class definition and it almost requires a comment as otherwise the code probably wouldn't make any sense to an average reader -- it looks like dead code. Besides, in the simplest form it requires us to construct an instance of A which could be problematic if this requires accessing or allocating some resources such as files or sockets. We could work around the latter by using a cast, for example, but then the code would be really ugly.

6) Should we support protocols extending protocols?

I think that we should support subprotocols. A subprotocol can be defined by having both one or more protocols as the explicit base classes and also having typing.Protocol as an immediate base class:

from typing import Sized, Protocol

class SizedAndCloseable(Sized, Protocol):
    def close(self) -> None: ...

Now the protocol SizedAndCloseable is a protocol with two methods, __len__ and close. Alternatively, we could have implemented it like this, assuming the existence of SupportsClose from an earlier example:

from typing import Sized

class SupportsClose(...): ...  # Like above

class SizedAndCloseable(Sized, SupportsClose, Protocol):
    pass

The two definitions of SizedAndCloseable would be equivalent. Subclass relationships between protocols aren't meaningful when considering subtyping, as we only use structural compatibility as the criterion, not the MRO.

If we omit Protocol in the base class list, this would be a regular (non-protocol) class that must implement Sized. If Protocol is included in the base class list, all the other base classes must be protocols. A protocol can't extend a regular class.
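
A subprotocol along these lines can be sketched with the later `typing.Protocol` (Python 3.8+). To keep the example self-contained it uses a hypothetical `SupportsLen` protocol instead of `Sized`, and `@runtime_checkable` is added only so that the result can be observed with `isinstance`.

```python
from typing import Protocol, runtime_checkable


class SupportsLen(Protocol):
    def __len__(self) -> int: ...


class SupportsClose(Protocol):
    def close(self) -> None: ...


# Protocol is listed explicitly, so the combined class is itself a protocol.
@runtime_checkable
class SizedAndCloseable(SupportsLen, SupportsClose, Protocol):
    pass


class Buffer:
    def __len__(self) -> int:
        return 0

    def close(self) -> None:
        pass


class OnlyCloseable:
    def close(self) -> None:
        pass


buffer_ok = isinstance(Buffer(), SizedAndCloseable)
partial_ok = isinstance(OnlyCloseable(), SizedAndCloseable)
```

`Buffer` matches because it provides both members; `OnlyCloseable` does not because `__len__` is missing.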

7) Should we support optional attributes?

We can come up with examples where it would be handy to be able to say that a method or data attribute does not need to be present in a class implementing a protocol, but if it's present, it must conform to a specific signature or type. One could use a hasattr check to determine whether they can use the attribute on a particular instance.

In the interest of simplicity, let's not support optional methods or attributes. We can always revisit this later if there is an actual need. The current realistic potential use cases for protocols that I've seen don't require these. However, other languages have similar features and apparently they are pretty commonly used. If I remember correctly, at least TypeScript and Objective-C support a similar concept.

8) Should some ABCs in typing be protocols instead?

I think that at least these classes in typing should be protocols:

  • Sized
  • Container
  • Iterable
  • Iterator
  • Reversible
  • SupportsAbs (and other Supports* classes)

These classes are small and conceptually simple. It's easy to see which of these protocols a class implements from the presence of a small number of magic methods, which immediately suggest a protocol.

I'm not sure about other classes such as Sequence, Set and IO. I believe that these are sufficiently complex that it makes sense to require code to be explicit about them, as there would not be any sufficiently obvious and small set of 'marker' methods that tell that a class implements this protocol. Also, it's too easy to leave some part of the protocol/interface unimplemented by accident, and explicitly marking the subclass relationship allows type checkers to pinpoint the missing implementations -- or they can be inherited from the ABC, in case it has default implementations. So I'd currently vote against making these classes protocols.

9) Should we support recursive protocols?

Sure, why not. They might be useful for representing self-referential data structures like trees in an abstract fashion, but I don't see them used commonly in production code.

10) Should we support generic protocols?

Generic protocols are important. For example, SupportsAbs, Iterable and Iterator would be generic. We could define them like this, similar to generic ABCs:

T = TypeVar('T', covariant=True)

class Iterable(Protocol[T]):
    def __iter__(self) -> Iterator[T]: ...
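
A self-contained sketch of a generic protocol, assuming the later `typing.Protocol` from Python 3.8+. `SupportsIter`, `Countdown` and `collect` are hypothetical names chosen to avoid shadowing `typing.Iterable`.

```python
from typing import Iterator, List, Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)
T = TypeVar("T")


class SupportsIter(Protocol[T_co]):
    def __iter__(self) -> Iterator[T_co]: ...


class Countdown:
    """Hypothetical class that is structurally a SupportsIter[int]."""

    def __init__(self, start: int) -> None:
        self.start = start

    def __iter__(self) -> Iterator[int]:
        return iter(range(self.start, 0, -1))


def collect(items: SupportsIter[T]) -> List[T]:
    # A checker can infer T = int from Countdown.__iter__.
    return [item for item in items]


values = collect(Countdown(3))
```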

11) Should these be interoperable with other similar implementations?

The protocols as described here are basically a small extension to the existing concept of ABCs. I argue that this is the way they should be understood, instead of as something that replaces Zope interfaces, for example.

12) How should isinstance work?

We shouldn't implement any magic isinstance machinery, as performing a runtime compatibility check is generally difficult: we might want to verify argument counts to methods, names of arguments and even argument types, depending on the kind of protocol we are talking about, but sometimes we wouldn't care about these, or we'd only care about some of these things.

My preferred semantics would be to make isinstance fail by default for protocol types. This would be in the spirit of duck typing -- protocols basically would be used to model duck typing statically, not explicitly at runtime.

However, it should be possible for protocol types to implement custom isinstance behavior when this would make sense, similar to how Iterable and other ABCs in collections.abc and typing already do it, but this should be specific to these particular classes. We need this fallback option anyway for backward compatibility.
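
For comparison, the `typing.Protocol` that later shipped in Python 3.8+ rejects `isinstance` for protocols that have not opted in to runtime checks. The sketch below only demonstrates that later behavior; treating it as the exact semantics intended here is an assumption.

```python
from typing import Protocol


class SupportsClose(Protocol):
    def close(self) -> None: ...


class Resource:
    def close(self) -> None:
        pass


try:
    isinstance(Resource(), SupportsClose)
except TypeError:
    # Protocols without @runtime_checkable refuse instance checks.
    isinstance_rejected = True
else:
    isinstance_rejected = False
```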

13) Should every class be a protocol by default?

Some languages such as Go make structural subtyping the only or the primary form of subtyping. We could achieve a similar result by making all classes protocols by default (or even always). I argue that this would be a bad idea and classes should need to be explicitly marked as protocols, as shown in my proposal above.

Here's my rationale:

  1. Protocols don't have some properties of regular classes. In particular, isinstance is not well-defined for protocols, whereas it's well-defined (and pretty commonly used) for regular classes.
  2. Protocol classes should generally not have (many) method implementations, as they describe an interface, not an implementation. Most classes have many implementations, making them bad protocol classes.
  3. Experience suggests that most classes aren't practical as protocols anyway, mainly because their interfaces are too large, complex or implementation-oriented (for example, they may include de facto private attributes and methods without a __ prefix). Most actually useful protocols in existing Python code seem to be implicit. The ABCs in typing and collections.abc are a kind-of exception, but even they are pretty recent additions to Python and most programmers do not use them yet.

14) Should protocols be introspectable?

The existing introspection machinery (dir, etc.) could be used with protocols, but typing would not include an implementation of additional introspection or runtime type checking capabilities for protocols.

As all attributes need to be defined in the class body based on this proposal, protocol classes would have better support for introspection than regular classes where attributes can be defined implicitly -- protocol attributes can't be initialized in ways that are not visible to introspection (using setattr, assignment via self, etc.). Still, some things like the types of attributes wouldn't be visible at runtime, so this would necessarily be somewhat limited.

15) How would Protocol be implemented?

We'd need to implement at least the following things:

  • Define class Protocol (this could be simple, and would be similar to Generic).
  • Implement metaclass functionality to detect whether a class is a protocol or not. Maybe add a class attribute such as __protocol__ = True if that's the case. Verify that a protocol class only has protocol base classes in the MRO (except for object).
  • Optionally, override isinstance.
  • Optionally, translate ... class attribute values to something else (properties?).
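
The first two bullets could be sketched roughly as below. This is a hypothetical toy, not the implementation that eventually landed in `typing`: `ProtocolMeta`, this `Protocol` class and the example classes are illustrative only, and the `isinstance` and `...` translation bullets are left out.

```python
class ProtocolMeta(type):
    """Hypothetical metaclass; not the actual typing implementation."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if not bases:
            # No bases: this is the root Protocol class itself.
            cls.__protocol__ = True
            return
        # A class is a protocol only if Protocol is an immediate base.
        cls.__protocol__ = any(base is Protocol for base in bases)
        if cls.__protocol__:
            for base in bases:
                if not getattr(base, "__protocol__", False):
                    raise TypeError(
                        f"protocol {name!r} cannot extend "
                        f"regular class {base.__name__!r}"
                    )


class Protocol(metaclass=ProtocolMeta):
    pass


class SupportsClose(Protocol):
    def close(self) -> None: ...


class Resource(SupportsClose):
    """Explicit subclass; not a protocol because Protocol is not a base."""

    def close(self) -> None:
        pass


class Regular:
    pass


try:
    class Broken(Regular, Protocol):
        pass
except TypeError:
    regular_base_rejected = True
else:
    regular_base_rejected = False
```

Checking only the immediate bases is enough here, because every protocol base has itself passed the same check when it was created.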

@gvanrossum
Member Author

I like almost all of this. Let's take this to python-ideas now! I have a few nits and questions, but they're not important enough to wait, and they're not very deep. (There's something niggling about making e.g. Sized a Protocol and not implementing isinstance(), since collections.abc.Sized does implement it.)

@gvanrossum
Member Author

FYI, I posted a link to the latest proposal to python-ideas.

Crosslink: https://mail.python.org/pipermail/python-ideas/2015-September/035859.html

@JukkaL
Contributor

JukkaL commented Sep 10, 2015

Sorry, the description of isinstance was unclear in the proposal. My idea was to preserve the isinstance support for Sized and all the other existing ABCs in typing that would be turned into protocols, but new protocol types wouldn't automatically get any isinstance machinery. This way the ABCs would remain compatible with existing code that uses them and might use isinstance with them.
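
A small illustration of the distinction described above. It assumes the typing.Protocol class as it eventually shipped in Python 3.8 and later, which postdates this comment; Box and SupportsClose are hypothetical names used only for the example.

```python
from collections.abc import Sized
from typing import Protocol


class Box:
    """Hypothetical class that never mentions Sized or SupportsClose."""

    def __len__(self) -> int:
        return 3

    def close(self) -> None:
        pass


class SupportsClose(Protocol):
    """A new protocol, defined without any isinstance machinery."""

    def close(self) -> None: ...


# The existing ABC keeps its structural isinstance support.
print(isinstance(Box(), Sized))  # True

# A plain new protocol does not get isinstance support automatically.
try:
    isinstance(Box(), SupportsClose)
except TypeError as exc:
    print("no isinstance support:", exc)
```

In the shipped typing module, a protocol opts in to isinstance checks with the typing.runtime_checkable decorator.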

@ilevkivskyi
Member

@lyschoening
I am not sure what you mean by

the interpreter would need to check whether the method body is an ellipsis and suppress the TypeError if it is not.

There will not be any changes in the interpreter; most of the things we discuss are only for the type checker. But I think I understand what you want: we are going to update the PEP draft so that you will not need to put @abstractmethod on keys(). Briefly, we will probably have all methods as protocol members, but some of them will be abstract, i.e. required for implementation even in the case of explicit subclassing.

@brettcannon
I think the only thing that we still need to keep special for "empty" bodies is to suppress type errors in the type checker on the definition:

class Proto(Protocol):
    @abstractmethod
    def method(self) -> Iterable[int]: # this should not be an error, although formally it is
        raise NotImplementedError      # since we don't return an 'Iterable'.
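
To make the distinction between abstract and non-abstract protocol members concrete, here is a minimal sketch. It assumes the typing.Protocol class from Python 3.8 and later; SupportsKeys, Settings and Incomplete are hypothetical names, and keys() stands in for the kind of method with a default implementation discussed above.

```python
from abc import abstractmethod
from typing import Iterator, List, Protocol


class SupportsKeys(Protocol):
    """Hypothetical protocol with one abstract and one implemented member."""

    @abstractmethod
    def __iter__(self) -> Iterator[str]:
        raise NotImplementedError

    def keys(self) -> List[str]:
        # Non-abstract member: explicit subclasses inherit this default.
        return list(self)


class Settings(SupportsKeys):
    """Explicit subclass: implements __iter__ and inherits keys()."""

    def __iter__(self) -> Iterator[str]:
        return iter(["debug", "verbose"])


class Incomplete(SupportsKeys):
    """Explicit subclass that forgets the abstract member."""


print(Settings().keys())  # ['debug', 'verbose']

try:
    Incomplete()
except TypeError as exc:
    print("cannot instantiate:", exc)
```

A class that does not subclass SupportsKeys would need to provide both __iter__ and keys() itself to be accepted by a type checker as an implementation.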

@lyschoening

lyschoening commented Mar 14, 2017

@ilevkivskyi

What I meant was that if all protocol members required @abstractmethod, then members with an implementation, such as keys(), would have to be reimplemented in every explicit subclass (with super() or otherwise). That would force protocols to be used differently from regular ABCs even though the runtime behavior might be the same, unless the bodies of protocol members were checked at runtime and only those methods whose body is an ellipsis were treated as abstract methods. If all methods are protocol members, that solves the problem.

(I am not advocating that they should be usable in the same way as other ABCs, but that appeared to be the goal.)

gvanrossum pushed a commit to python/peps that referenced this issue Mar 18, 2017
This adds static support for structural subtyping. Previous discussion is here python/typing#11

Fixes #222
@ilevkivskyi ilevkivskyi mentioned this issue Apr 26, 2017
@vlasovskikh
Member

I've reviewed the current draft of PEP 544 at the PyCon US 2017 sprints. It looks good from PyCharm's perspective. @JukkaL was kind enough to answer some questions during my review. The only remaining issue is that we would like protocols to be available earlier than Python 3.7, perhaps as part of typing_extensions; see #435.

@gvanrossum
Member Author

Yes on that last question. We also want it for Python 2.7.

@ilevkivskyi
Member

@vlasovskikh I just posted the latest version of the PEP on python-dev https://mail.python.org/pipermail/python-dev/2017-May/148005.html

@chadrik
Contributor

chadrik commented Nov 1, 2017

This is an incredible feature and I'm very eager to use it. Just curious when the current "pseudo-protocols" like Iterable will be converted to typing_extensions.Protocols (mentioned here in the pep) so that I can begin using them with mypy. Will they land in typing or show up in typing_extensions first?

@ilevkivskyi
Member

@chadrik
Contributor

chadrik commented Nov 1, 2017

@ilevkivskyi I was searching the wrong repos! Thanks.

@gvanrossum
Member Author

gvanrossum commented Nov 1, 2017 via email

@ilevkivskyi
Member

@gvanrossum no changes are needed in typing (making Mapping a protocol is optional). Many classes there, like typing.Iterable, already behave as protocols at runtime. As you can see, the PR I referenced is in typeshed.

@gvanrossum
Member Author

I'm not sure how you can say that Iterable already behaves like a protocol. I tried this code:

from typing import Iterator

class C:
    def __iter__(self) -> Iterator[int]:
        yield 0

for x in C():  # error: Iterable expected
    print(x)

This complains that a C instance is not an Iterable.

@JukkaL
Contributor

JukkaL commented Nov 1, 2017

Iterable behaves like a protocol at runtime (it has a custom isinstance overload).
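
The runtime behavior referred to here can be checked directly. The sketch below reuses the class C from the example above; no assumptions beyond the standard library are involved.

```python
import collections.abc
import typing


class C:
    def __iter__(self) -> typing.Iterator[int]:
        yield 0


# C never subclasses or registers with Iterable, yet the runtime checks pass
# because Iterable only looks for an __iter__ method.
print(isinstance(C(), collections.abc.Iterable))  # True
print(isinstance(C(), typing.Iterable))           # True
print(issubclass(C, collections.abc.Iterable))    # True
```

The error reported earlier in the thread comes from the static view in the stubs, not from this runtime check.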

@gvanrossum
Member Author

gvanrossum commented Nov 1, 2017 via email

@ilevkivskyi
Member

So the concerns are more about protocols that don't do that, e.g. Sequence or Mapping.

Yes, there are four such classes: Sequence, Mapping, MutableSequence, and MutableMapping.
Actually, I am thinking maybe we can leave them out of python/typeshed#1220 and consider them later. I have already heard several times that people want mypy to recognize existing runtime protocols (with Iterable being the absolute winner), so we could merge python/typeshed#1220 without these four classes soon.
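
For contrast, a quick sketch of how one of these four classes behaves at runtime today. Pair is a hypothetical class used only for illustration.

```python
from collections.abc import Sequence, Sized


class Pair:
    """Hypothetical class with sequence-like methods but no Sequence base."""

    def __init__(self, first, second):
        self._items = (first, second)

    def __len__(self):
        return 2

    def __getitem__(self, index):
        return self._items[index]


pair = Pair(1, 2)
print(isinstance(pair, Sized))     # True: Sized checks structurally for __len__
print(isinstance(pair, Sequence))  # False: Sequence has no structural check

# Sequence membership has to be declared, by subclassing or by registering.
Sequence.register(Pair)
print(isinstance(pair, Sequence))  # True
```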

@gvanrossum
Member Author

gvanrossum commented Nov 1, 2017 via email

@chadrik
Contributor

chadrik commented Apr 23, 2018

Hi, are there any updates on the remaining collections which are not yet protocols -- Sequence, Mapping, MutableSequence, and MutableMapping?

@gvanrossum
Member Author

gvanrossum commented Apr 23, 2018 via email

@chadrik
Contributor

chadrik commented Apr 24, 2018

The plan is to keep them as they are.

What's the reasoning behind that? Even if the abc classes don't become protocols, it would be convenient to have Protocol variants of these in typing or mypy_extensions, so that users who wish to treat these as protocols within their code don't have to write their own Protocol classes for these common cases.

@ilevkivskyi
Member

What's the reasoning behind that?

We have no options now. The feature cut-off for Python 3.7 passed long ago, and this feature would require a runtime change in collections.abc.

Strong -1 on typing.Mapping and collections.abc.Mapping having different semantics; this will only confuse people.

We had a discussion over e-mail with @gvanrossum recently; we are both +0 on making these protocols (only +0 mostly because they are large, while protocols should be more compact), so we can consider this again in one year for Python 3.8.

@JukkaL
Contributor

JukkaL commented Apr 24, 2018

I have some concerns about turning Sequence etc. into protocols. Several of the method signatures in these ABCs are pretty subtle, so it's quite easy to write a class that almost implements, say, Sequence, but not quite because of some signatures being incompatible. With an explicit base class a mismatch will immediately be reported by a type checker -- otherwise it's quite possible to write something that looks like a sequence but actually isn't, and checking that by only reading the code can be hard. This isn't a major problem for most existing protocols in typing since the signatures are pretty obvious.

Also, currently isinstance works with them, so we'd have to support that in the future -- so these protocols would be "runtime" protocols. This brings the risk that the runtime and static views of subtyping become inconsistent because of minor signature differences.

Examples of somewhat tricky signatures in Sequence:

    @overload  # Overloads are a somewhat tricky feature
    def __getitem__(self, i: int) -> _T_co: ...
    @overload
    def __getitem__(self, s: slice) -> Sequence[_T_co]: ...

    def __contains__(self, x: object) -> bool: ...   # object argument type may be unexpected
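
As an illustration of the explicit-base-class approach, here is a sketch of a class that matches the signatures above. Names is a hypothetical class; because it subclasses Sequence explicitly, a type checker would compare each override against the base class and report any mismatch.

```python
from typing import List, Sequence, Union, overload


class Names(Sequence[str]):
    """Hypothetical read-only sequence of names, used only for illustration."""

    def __init__(self, names: List[str]) -> None:
        self._names = list(names)

    def __len__(self) -> int:
        return len(self._names)

    # Both overloads are needed: an int index returns an item,
    # a slice returns another sequence.
    @overload
    def __getitem__(self, i: int) -> str: ...
    @overload
    def __getitem__(self, s: slice) -> "Names": ...

    def __getitem__(self, index: Union[int, slice]) -> Union[str, "Names"]:
        if isinstance(index, slice):
            return Names(self._names[index])
        return self._names[index]

    # The parameter is `object`, not `str`. Narrowing it to `str` would be an
    # incompatible override of Sequence.__contains__.
    def __contains__(self, x: object) -> bool:
        return x in self._names


names = Names(["ada", "bob", "cy"])
print(names[0])         # ada
print(list(names[1:]))  # ['bob', 'cy']
print(3 in names)       # False
```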

@ilevkivskyi
Member

@JukkaL actually yes, I have seen someone on gitter recently confused by __contains__.

@gvanrossum
Member Author

gvanrossum commented Apr 17, 2019

@jakebailey (MS Language Server representative)

@ilevkivskyi
Member

PEP 544 is now accepted, mypy (and some other type checkers) have good support for the PEP, and all necessary infrastructure updates have been made. So I am finally closing this issue as fixed.

@geryogam

Also, currently isinstance works with them, so we'd have to support that in the future -- so these protocols would be "runtime" protocols. This brings the risk that the runtime and static views of subtyping become inconsistent because of minor signature differences.

@JukkaL I had the same concerns here.
