
Allow subclassing without supertyping #241

Open
dmoisset opened this issue Jul 8, 2016 · 58 comments
Labels
topic: feature Discussions about new features for Python's type annotations

Comments

@dmoisset
Contributor

dmoisset commented Jul 8, 2016

This issue comes from python/mypy#1237. I'll try to summarize the discussion here.

Someone reported an issue with an artificial example (python/mypy#1237 (comment)):

class Foo:
    def factory(self) -> str:
        return 'Hello'

class Bar(Foo):
    def factory(self) -> int:
        return 10

and then with the following stub files (python/mypy#1237 (comment)):

class QPixmap(QPaintDevice):
    def swap(self, other: 'QPixmap') -> None: ...

class QBitmap(QPixmap):
    def swap(self, other: 'QBitmap') -> None: ...

which mypy currently reports as erroneous: Argument 1 of "swap" incompatible with supertype "QPixmap"

These were initially argued to be correct errors because they violate the Liskov Substitution Principle (the first through a change of return type, the second through a covariant argument change). The problem in this scenario is that the actual classes are not meant to be substituted (the first example is actually a wrapper for a non-virtual C++ function, so the classes aren't substitutable, and the second pair isn't supposed to be mixed together, just to reuse part of the code). (see python/mypy#1237 (comment))

There was also a suggestion to allow covariant arguments explicitly (in the same way that Eiffel allows them, adding a runtime check) with a decorator:

class QBitmap(QPixmap):
    @covariant_args
    def swap(self, other: 'QBitmap') -> None: ...

My proposal instead was to add what some dialects of Eiffel call "non-conforming inheritance" ("conforms" is the Eiffel term for what PEP 483 calls "is-consistent-with"). It is essentially a mechanism for using subclassing just as a way to get the implementation from the superclass, without creating a subtyping relation between them. See python/mypy#1237 (comment) for a detailed explanation.

The proposal is to have an Implementation marker in typing so you can write:

class QBitmap(Implementation[QPixmap]):
    def swap(self, other: 'QBitmap') -> None: ...

which just defines a QBitmap class with all the methods copied from QPixmap, but without making one a subtype of the other. At runtime we can just make Implementation[QPixmap] be QPixmap.
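A minimal runnable sketch of the proposed runtime behavior (the metaclass here is a hypothetical stand-in; the proposal only requires that Implementation[C] evaluate to C at runtime):

```python
# Hypothetical runtime stand-in for the proposed typing.Implementation:
# subscripting simply returns the wrapped class, so normal inheritance,
# super(), and isinstance() behave exactly as before.
class _ImplementationMeta(type):
    def __getitem__(cls, base: type) -> type:
        return base

class Implementation(metaclass=_ImplementationMeta):
    pass

class QPixmap:
    def swap(self, other: 'QPixmap') -> None: ...

class QBitmap(Implementation[QPixmap]):
    # A checker aware of Implementation would allow this narrowed signature.
    def swap(self, other: 'QBitmap') -> None: ...

assert Implementation[QPixmap] is QPixmap   # runtime identity
assert issubclass(QBitmap, QPixmap)         # runtime inheritance unchanged
```

Only the static view changes: a checker would refuse to treat QBitmap as a QPixmap.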

@dmoisset
Contributor Author

dmoisset commented Jul 8, 2016

As a follow-up to the discussion, @gvanrossum suggested using a class decorator. The problem with this approach is that the class level is not the right granularity to decide this; each individual inheritance relation is. In a multiple-inheritance context, you could want to inherit the interface of SuperClassA and the implementation of SuperClassB. I think that might be quite common with mixins (actually, I think most mixins could be used with this form of inheritance).

Looking deeper into what Eiffel does (this is a bit of a side comment, not part of the main proposal): there it was also possible to define a superclass such that only implementation inheritance from it was allowed, and that could work well with mixins (and be implemented as a class decorator). However, you still need to be able to say "I just want the implementation of this superclass" when inheriting, because the writer of the superclass may not have foreseen that you wanted that.

@dmoisset
Contributor Author

dmoisset commented Jul 8, 2016

Also note that this idea makes self types more viable. Things like TotalOrderingMixin where you have

    def __lt__(self: T, other: T) -> bool: ...

can be defined more accurately (currently this is in the stdlib), and with no errors (note that the < operator / __lt__ method is actually covariant in its argument in Python 3). TotalOrderingMixin is something that would produce an error in every case with the current mypy implementation if self types were added as is.
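A sketch of the kind of mixin meant here (the names and bodies are illustrative, not the stdlib's functools.total_ordering):

```python
from typing import TypeVar

T = TypeVar('T', bound='TotalOrderingMixin')

# Illustrative mixin whose methods use a self type T for both arguments.
# Under the proposal, subclasses could inherit it via implementation-only
# inheritance without their covariant `other` parameters being flagged
# as Liskov violations.
class TotalOrderingMixin:
    def __lt__(self: T, other: T) -> bool:
        raise NotImplementedError

    def __le__(self: T, other: T) -> bool:
        return self < other or self == other

    def __gt__(self: T, other: T) -> bool:
        return not (self < other) and self != other

class Money(TotalOrderingMixin):
    def __init__(self, cents: int) -> None:
        self.cents = cents

    def __lt__(self, other: 'Money') -> bool:  # covariant: Money, not T
        return self.cents < other.cents

assert Money(1) <= Money(2)
assert Money(3) > Money(2)
```

At runtime this works today; the point of the proposal is that a checker could accept Money's narrowed `__lt__` without treating Money as substitutable for the mixin.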

@ilevkivskyi
Member

I think this is a very nice feature. In general, it is good that we separate classes (code reuse) from types (debugging/documentation). If it is added, I would also mention this feature in PEP 483 (theory) as an example of differentiating classes and types.

Between a decorator and a special class I chose the second, since, as @dmoisset mentioned, it is more flexible and suitable for mixin use cases. Also, it is very much in the spirit of gradual typing: one can mark particular bases as Implementation while still typechecking with respect to the other bases.

@gvanrossum
Member

gvanrossum commented Jul 10, 2016 via email

@ilevkivskyi
Member

I think it is better to have a minimalistic solution rather than try to make a complex proposal that covers all use cases. Later, more features for the most popular use cases could be added.

Here is a detailed proposal with examples:

  1. At runtime, Implementation[MyClass] is MyClass, so the separation of control between subclass and superclass is as usual. Also, all normal mechanisms work at runtime:
class Base:
    def __init__(self, number: int) -> None:
        self.number = number

class AnotherBase:
    ...

class Derived(Implementation[Base], AnotherBase):
    def __init__(self) -> None:
        super().__init__(42) # This works since Implementation[Base] is just Base
  2. Use cases for Implementation are situations where a user wants to reuse some code in another class without creating excessive "type contracts". Of course, one could use Any or # type: ignore, but Implementation will allow more precise type checking.

  3. The derived class is not considered a subtype of its implementation base classes. An explicit cast might be used, under the user's responsibility. Examples:

from typing import cast

def func(x: Base) -> None:
    ...
def another_func(x: AnotherBase) -> None:
    ...

func(Base(42))              # this is OK
another_func(Derived())     # this is also OK
func(Derived())             # Marked as error by a typechecker
func(cast(Base, Derived())) # Passes typecheck
  4. Methods of an implementation base class could be redefined in an incompatible way:
class Base:
    def method(self, x: int) -> None:
        ...

class AnotherBase:
    def another_method(self, y: int) -> None:
        ...

class Derived(Implementation[Base], AnotherBase):
    def another_method(self, y: str) -> None: ... # This is prohibited by a typechecker
    def method(self, x: str) -> None: ...         # Although, this is OK
  5. As @JukkaL proposed, a typechecker checks that the redefined methods do not "spoil" the already checked code by re-type-checking the bodies of all base class methods as if they were written in the subclass (after suitably translating type variables if the base class is generic).

I think the re-check should be recursive, i.e. include bases of bases, etc. Of course, this could slow down the type-check, but not the runtime.

  6. A typechecker uses the normal MRO to find the signature of a method. For example, in this code:
class Base:
    def method(self, x: int) -> None: ...

class AnotherBase:
    def another_method(self, y: int) -> None: ...

class Derived(Implementation[Base], AnotherBase):
    def method(self, x: str) -> None: ...

a typechecker infers that Derived().method has signature (str) -> None, and Derived().another_method has signature (int) -> None.

  7. Clearly, Implementation could be used only with a proper class. Implementation[Union[t1, t2]], Implementation[Callable[[t1, t2, ...], tr]], etc. raise TypeError.

If we agree on this, then I will submit PRs for typing.py, PEP484, and PEP483.
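The re-type-checking of base-class bodies proposed above can be illustrated with a runnable sketch (Implementation is stubbed to its proposed runtime semantics, and all names are illustrative). The runtime failure below is exactly what re-checking Base.helper in the context of Derived would catch statically:

```python
# Implementation stubbed so that Implementation[C] is C at runtime.
class _ImplMeta(type):
    def __getitem__(cls, base: type) -> type:
        return base

class Implementation(metaclass=_ImplMeta):
    pass

class Base:
    def method(self, x: int) -> None:
        self.double = x * 2        # valid when x is an int

    def helper(self) -> None:
        self.method(21)            # checked once, against Base.method

class Derived(Implementation[Base]):
    def method(self, x: str) -> None:  # incompatible redefinition (allowed)
        self.double = x.upper()        # now assumes x is a str

Base().helper()                    # fine
try:
    Derived().helper()             # Base.helper's body is now invalid here
except AttributeError:             # 'int' object has no attribute 'upper'
    print("re-checking Base.helper in Derived would catch this")
```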

@JukkaL
Contributor

JukkaL commented Jul 14, 2016

This has the problem that a type checker doesn't see inside the implementations of stubbed classes, and you could break some of the base class methods without a type checker having any way of detecting it.

Maybe implementation inheritance would only be allowed from within your own code somehow. For example, behavior would be undefined if you inherit the implementation of a stdlib or 3rd party library class, unless you include the implementation of the library module in the set of checked files. object would be fine as a base class, of course, but you probably shouldn't override any methods defined in object in an incompatible manner, since this can break a lot of things. This would only be enforced by static checkers, not at runtime.

Not sure if the latter would be too restrictive. To generalize a bit, it would be okay if the implementation base class has library classes in its MRO, but methods defined in those classes can't be overridden arbitrarily.

@ilevkivskyi
Member

@JukkaL This is a good point. We need to prohibit incompatible overriding of methods everywhere except in classes accessible in the set of checked files. I don't think that is too restrictive. Alternatively, we could allow it with a disclaimer that complete typechecking is not guaranteed in this case.

@gvanrossum
Member

gvanrossum commented Jul 14, 2016 via email

@ilevkivskyi
Member

@gvanrossum Indeed, there is not much that could be done. But I don't think this is bad. As with Any or # type: ignore, people who are going to use this feature know what they are doing. More importantly, Implementation provides more precise typechecking, more "granularity", and probably a bit more safety than Any.

@JukkaL
Contributor

JukkaL commented Jul 14, 2016

The QPixmap example shouldn't be a problem, since it seems to be for a library stub file. We don't verify that stubs are correct anyway, so a stub would be able to use implementation inheritance without restrictions but there are no guarantees that things are safe -- we expect the writer of the stub to reason about whether implementation inheritance is a reasonable thing to use.

For user code, I think that we can provide a reasonable level of safety if we enforce the restrictions that I mentioned.

@JukkaL
Contributor

JukkaL commented Jul 14, 2016

I have a related proposal. What about having a magic class decorator (with no runtime or minimal effect) or a base class that declares a class as a "pure mixin"? A pure mixin could only be used for implementation inheritance, and it wouldn't be type checked by itself at all -- it would only be type checked as part of implementation inheritance. I think that this would allow type checkers to check some mixin use cases that are tricky right now with minimal code changes. The alternative would be to use ABCs to declare the things that the mixin expects to be defined elsewhere, but they are harder to use and arguably add boilerplate.

Example:

@puremixin
class MyMixin:
    def do_stuff(self) -> int:
        self.one_thing()  # ok, even though not defined in this class
        return self.second_thing(1)  # also ok

class Thing(Implementation[MyMixin]):
    def one_thing(self) -> None: pass  # ok
    def second_thing(self, x: int) -> str:    # error: return type not compatible with do_stuff
        return 'x'

@gvanrossum
Member

gvanrossum commented Jul 14, 2016

But is that the way mixins are typically used? It would be good to look
carefully at some examples of mixins (e.g. one of our internal code bases
uses them frequently).

@ilevkivskyi
Member

@JukkaL How about adding the following disclaimer?

"""
Note that implementation inheritance could be used with stubbed extension modules, but there are no guarantees of safety in this case (type checkers could not verify correctness of extension modules). The writer of the stub is expected to reason about whether implementation inheritance is appropriate.

In user code, implementation inheritance is only allowed with classes defined in the set of checked files. Built-in classes could be used as implementation base classes, but their methods could not be redefined in an incompatible way.
"""

I like your proposal of a @puremixin decorator (maybe we could choose another name). It looks like it is independent of the initial proposal and could be easily added. I propose that @gvanrossum decide whether we need this.

@JukkaL
Contributor

JukkaL commented Jul 14, 2016

@gvanrossum I've seen mixins like that (that use things that aren't defined anywhere in the mixin class). I'm not sure how common they are.

@ilevkivskyi Looks good. I'd make some small changes (feel free to edit at will):

Note that implementation inheritance could be used with stubbed extension modules, but there are no guarantees of safety in this case (type checkers could not verify correctness of extension modules). The writer of the stub is expected to reason about whether implementation inheritance is appropriate.

->

Note that implementation inheritance could be used with stubbed extension modules, but there are no guarantees of safety in this case (type checkers could not verify correctness of extension modules). As a special case, if a stub uses implementation inheritance within the stub, the writer of the stub is expected to reason about whether implementation inheritance is appropriate. Stubs don't contain implementations, so a type checker can't verify the correctness of implementation inheritance in stubs.

Built-in classes could be used ...

->

Built-in and library classes could be used ...

@JukkaL
Contributor

JukkaL commented Jul 14, 2016

Also, I'm not at all attached to the puremixin name.

@gvanrossum
Member

gvanrossum commented Jul 14, 2016 via email

@JukkaL
Contributor

JukkaL commented Jul 14, 2016

Created #246 for pure mixins.

@gvanrossum
Member

@gvanrossum I've seen mixins like that (that use things that aren't defined anywhere in the mixin class). I'm not sure how common they are.

We're probably talking about the same mixin classes. :-) However, I'm not convinced that those would deserve being marked as Implementation[] -- the class that inherits from the mixin typically also inherits from some other class and the mixin should not conflict with any methods defined there (though it may override them). (OTOH maybe I didn't read the examples all the way through.)

@JukkaL
Contributor

JukkaL commented Jul 14, 2016

Let's move the discussion of mixins to #246?

@dmoisset
Contributor Author

I was writing a proposal, and it may be a bit messy to post it now because it duplicates some things already said but phrases others differently. Let me summarize the different or subtle points that I think should be discussed:

  • Using a generic-like syntax such as Implementation[Foo] looks too similar to generics, while what we have is somewhat different. I've been thinking of a functional syntax, i.e.:
class Base: ...
class Derived(Foo, implementation_only(Base), Bar): ...

so it doesn't make sense to pass it a TypeVar or anything that has no supporting class (like unions or protocols). I think the different syntax helps.

  • other names I considered for implementation_only() are reuse(), include(), or insert() (the latter ones make sense if you think of the mechanism it implements). "implementation" sounds a bit vague (even if it was my first proposal)
  • probably the type checker should verify that implementation_only is used only in the context of a class definition in the base class list.
  • the typechecker should ignore the Base methods if they are overridden in Derived or in another parent (Foo, but not Bar, in the example above)
  • inherited bodies should be checked if available; if only stubs are present, no checking is done (this is consistent with the rest of mypy, which ignores bodies when stubs are present)
  • super() needs some special care. The simplest approach is that a super() object does not expose methods whose signature comes from an implementation_only parent; otherwise you could get mistyped calls through the MRO chain. If you're doing implementation-only inheritance, following the MRO makes no sense anyway. We should allow some way to get the parent methods, maybe through an explicit class reference (allowing some leeway on the self parameter when a method on Derived calls Base.method(self, ...))
  • A class defined with all its parents as implementation_only is interpreted by the typechecker as a direct descendant of object.

As a side note, using functional syntax would make it trivial to use @implementation_only as a class decorator too, which could be used for the mixin mechanism proposed by @JukkaL. However, for that example I'd prefer the mixin to actually have the signatures of its "abstract methods"; if I want to implement a mixin, one of the things I want from a static type system is to give me some hints about what these methods should take. For me, the decorator would only make subclasses of the mixin behave like the rest of this proposal.

@dmoisset
Contributor Author

Btw, for further reference there's a paper on the idea at http://smarteiffel.loria.fr/papers/insert2005.pdf

@dmoisset
Contributor Author

dmoisset commented Jul 14, 2016

There are some things that could be added to this mechanism (I left these out because they aren't part of the core proposal, to keep the above simple).

  • Allow classes to be always implementation_only (like @JukkaL said):
@implementation_only
class SomeMixin:
    def method(self, x: int) -> str: ...
    def other_method(self, x: int) -> str: ...

class C(SomeMixin):
    def method(self) -> None: ... # OK to change the signature
  • Allow to selectively include or exclude parts of the parent class:
class C(
        implementation_only(
            SomeMixin, include=('method', 'attribute')
        )
       ): ...

class D(
        implementation_only(
            SomeMixin, exclude=('other_method',)
        )
       ): ...
  • Allow renames of methods
class E(
        implementation_only(
            SomeMixin, renames={'method': '_d_method'}
        )
       ): ...

The exclusion/renaming mechanisms are present in Eiffel, and renaming allows one to resolve issues related to super() unambiguously by giving explicit different names to the overridden methods, although the mechanism has some complexities that might be problematic in Python.

I don't think the first proposal should include any of these, but I wanted to mention them to clarify that the functional syntax gives more flexibility for future growth of the idea.

@ilevkivskyi
Member

@dmoisset Indeed, some things you propose have already been said. This is rather a good sign.
Some comments:

  • I don't think that we need functional syntax; square brackets are used not only for generics, they are used almost everywhere in the type-checking context.
  • I like the name Implementation. It reminds me that I am inheriting only the "implementation" of the class, not the "interface" (i.e. type contracts).
  • I am not sure that the restriction about super() is really necessary.

@dmoisset
Contributor Author

dmoisset commented Jul 14, 2016

The possible problem with super happens in the following snippet:

def implementation_only(C): return C

class Base:
    def f(self, a: int) -> None:
        print("Base")
        print(a)

class Left(Base):
    def f(self, a: int) -> None:
        print("Left.f")
        super().f(a)

class Right(implementation_only(Base)):
    def f(self, a: str) -> None:
        print("Right.f")
        super().f(len(a))

class C(Left, implementation_only(Right)):
    def f(self, a: int) -> None:
        print("C.f")
        super().f(a)

C().f(42)

This code fails with a TypeError when run with Python, but the rules defined above don't detect a problem:

  1. Base and Left are correctly typed
  2. Right redefines the signature of f (which is ok, given that it's using implementation inheritance).
  3. Right.f calls super.f using an int argument (which looks ok, given that the inherited f accepts an int)
  4. C redefines the signature of Right.f (which is ok, given that it is using implementation inheritance). It keeps the signature of Left.f so no problem there
  5. C.f calls super.f with an int. The type checker is only looking at the Left.f signature, so this is considered valid.

But even if everything looks OK by itself, the program has a type error, so the rules are unsound. Some of the rules above should therefore reject this scenario. Rule 1 seems OK (valid under the current semantics), and 2 is our proposal, so let's assume it is valid (if we reach a contradiction, the idea itself is unsound). So 3, 4, or 5 must be wrong, and we should reject (at least) one of them, under some conditions (which should cover this scenario). My proposal was "3 is wrong": you can't call super() from an implementation_only redefinition of a method. This would make the call that produces the runtime TypeError invalid.

Another option is making this scenario impossible by forbidding the redefinition in some multiple-inheritance cases (4 would be invalid), but that's probably harder to explain to users.

@ilevkivskyi
Member

Thank you for elaborating; I understand your point. However, it looks like a more sophisticated typechecker could catch the TypeError. By re-type-checking the bodies of methods in C's parent classes, a typechecker could detect that the super().f(a) call in Left has an incompatible signature, since super() for Left is now Right, and f in Right requires a str.

Prohibiting 3 is an absolutely valid option (although it feels a bit like a false positive to me), and prohibiting 4 is also a valid option. What I propose is to leave this choice (sophisticated recursive re-checking vs. prohibiting 3 vs. prohibiting 4) up to the implementer of a particular typechecker.

Restrictions mentioned by Jukka are necessary because it is extremely difficult (probably impossible in principle) to check types inside extension modules, so we:
a) "forgive" typecheckers all uncaught TypeErrors related to this;
b) propose some restrictions for user code that nevertheless allow maintaining safety.
At the same time, typecheckers should be able to catch an error like the one you show in whichever way they choose (i.e., if a typechecker is not able to, that could be considered a bug in that checker).

@gvanrossum
Member

gvanrossum commented Jul 15, 2016 via email

@JukkaL
Contributor

JukkaL commented Jul 15, 2016

The point about incremental checking is a good one. General performance could also be an important concern if people started using this more widely. My expectation is that this would mostly be used in fairly simple cases, but of course any feature given to users tends to be used in unforeseen ways, and once the genie is out of the bottle it's difficult to put it back. There could also be other interactions with particular language features that we haven't thought of yet.

Due to all the concerns and tricky interactions that we've uncovered, I'd like to see a prototype implementation of this before deciding whether to add this to the PEP -- I have a feeling that we don't yet understand this well enough.

@ilevkivskyi
Member

@gvanrossum

But honestly I would rather hear from people who find they need this for their own code that they are annotating

The situation described by @PetterS is quite widespread in the scientific community. I have done such a thing several times. If one adds # type: ignore to

class Expr:
    def __eq__(self, other: 'Expr') -> 'Expr':
        ...

then this method will be typed as Callable[[Any], Any]. That will lead to two kinds of errors:

  1. If Expr is used where object is expected:
def eq_one(x: object) -> bool:
    return x == 1
eq_one(Expr()) # error here
  2. If something other than Expr is passed to its __eq__ method:
Expr() == 'boom!' # error here

Both kinds of errors are not caught by mypy now, but will be caught with Implementation:

class Expr(Implementation[object]):
    def __eq__(self, other: 'Expr') -> 'Expr':
        ...

Of course, I cannot judge how widespread such situations are in general. Maybe we could ask others?
@vlasovskikh Have you heard about such situations from users of PyCharm?

@pkch

pkch commented Apr 3, 2017

I wanted to add class constructors to the list of issues to clear up in this discussion. At the moment, they receive a confusing treatment:

# example 1
class A:
    def f(self, x: Sequence) -> None: ...

class B(A):
    # error: Argument 1 of "f" incompatible with supertype "A"
    def f(self, x: List) -> None: ...

but

# example 2
class A:
    def __init__(self, x: Sequence) -> None: ...

class B(A):
    # passes type check
    def __init__(self, x: List) -> None: ...

OTOH, class creation passes type check:

# example 3
class A: 
    def __init__(self, a: int) -> None:
        self.a = a
def f(a: Type[A]) -> A:
    return a(0)

The only difference between constructors and other methods is that they affect Liskov substitutability of different objects: constructors affect the classes, while other methods affect the instances of those classes.

My preference would be to do what this thread is proposing: let the programmer easily choose whether the Liskov constraint (= subtyping) should apply, regardless of subclassing; so __init__ could be either constrained or not. If it's constrained, then example (2) should fail the type check but (3) should pass; if it's not constrained, then (2) should pass but (3) should fail.

BTW, another argument in favor of separating subtyping from subclassing is that substitutability only makes sense if the semantics are compatible -- it's not enough that the methods have matching types; for example:

  • it's rarely OK to pass iterators in place of iterables (despite the subclass relationship) because they might be exhausted (leading to incorrect behavior)
  • it's dangerous to pass an x: defaultdict where an x: dict is expected, because the code might contain try: dostuff(x[k]) except KeyError: ...
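The defaultdict hazard can be demonstrated with a small sketch (function name is illustrative): code written against a plain dict may rely on KeyError to detect missing keys, and a defaultdict silently defeats that.

```python
from collections import defaultdict

# Code written against a plain dict, relying on KeyError for absent keys.
def describe(d: dict, key: str) -> str:
    try:
        return "found: " + d[key]
    except KeyError:
        return "not found"

plain: dict = {}
assert describe(plain, "k") == "not found"   # KeyError path taken

dd: defaultdict = defaultdict(str)
assert describe(dd, "k") == "found: "        # no KeyError; '' returned instead
assert "k" in dd                             # the lookup even mutated the mapping
```

Even though defaultdict is a subclass of dict (and thus a subtype to the checker), substituting one changes the program's behavior.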

@pkch

pkch commented Apr 7, 2017

Let's say Y inherits from X (through regular class Y(X) inheritance, not through ABCMeta.register).

Currently:

  • Automatically, Y is a subtype of X, and type safety is ensured (i.e., subclass method compatibility is checked).
  • Automatically, Type[Y] is a subtype of Type[X], but type safety is not ensured (i.e., subclass __init__ compatibility is not checked).
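The second point can be demonstrated at runtime (class and function names here are illustrative): mypy accepts Y's incompatible __init__, yet still lets Type[Y] flow where Type[X] is expected.

```python
from typing import List, Sequence, Type

class X:
    def __init__(self, x: Sequence) -> None:
        self.x = x

class Y(X):
    def __init__(self, x: List) -> None:  # incompatible override, not flagged
        super().__init__(x)
        x.append(None)                    # relies on list-only API

def make(cls: Type[X]) -> X:
    # Statically fine for any Type[X]; a tuple is a Sequence.
    return cls((1, 2))

make(X)                    # works
try:
    make(Y)                # passes the type checker, fails at runtime
except AttributeError:     # 'tuple' object has no attribute 'append'
    print("Type[Y] accepted but unsound")
```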

The proposal so far is:

  • Programmer can choose whether or not Y should be a subtype of X. If he chooses to create this subtyping relationship, then type safety is ensured (same as at present).

I think for consistency, the second part of the proposal should be:

  • Programmer can choose whether or not Type[Y] should be a subtype of Type[X]. If he chooses to create this subtyping relationship, then type safety is ensured (i.e., subclass __init__ compatibility is checked, unlike at present).

@gvanrossum
Member

Can you please use words instead of symbols? I can never remember which way <: or :> goes. (It's fine to edit the comment in place if GitHub lets you.)

@pkch

pkch commented Apr 7, 2017

Done. To summarize, I think X.__init__ is relevant for type checking of class objects (type Type[X]) and irrelevant for type checking of instance objects (type X). My rationale is that:

  • X.__init__ signature defines the details of the Callable type that is used to represent class X, and that Callable type is nearly equivalent to Type[X]
  • while technically a programmer can call __init__ on an instance: X().__init__(), there are no good use cases for that

@gvanrossum
Member

Ah, that summary is very helpful. I agree that one shouldn't call __init__ manually (and in fact Python always reserves the right to have "undefined" behavior when you define or use dunder names other than documented).

But of course Type[X] has a bunch of behaviors that Callable[..., X] does not (e.g. it has class methods).

@mitar
Contributor

mitar commented Sep 26, 2017

I have tried to read through this whole thread and I am maybe missing something, but I think I have a slight variation/use case to share to add to this discussion.

If you are familiar with sklearn, you know that it provides many different machine learning classes. It is great because many classes share the same interface, so it is easy for programmers to use them. The issue is that while the method names are the same, the arguments they receive vary greatly. If I wanted to define types for them in a meaningful way, there would be a problem. The base class might be something like:

class Estimator(Generic[Inputs, Outputs]):
    @abstractmethod
    def fit(self, *, inputs: Inputs, outputs: Outputs) -> None:
        pass

    @abstractmethod
    def predict(self, *, inputs: Inputs) -> Outputs:
        pass

class SupervisedEstimator(Estimator[Inputs, Outputs]):
    pass

class UnsupervisedEstimator(Estimator[Inputs, Outputs]):
    @abstractmethod
    def fit(self, *, inputs: Inputs) -> None:
        pass

class GeneratorEstimator(Estimator[List[None], Outputs]):
    @abstractmethod
    def fit(self, *, outputs: Outputs) -> None:
        pass

class LupiSupervisedEstimator(SupervisedEstimator[Inputs, Outputs]):
    @abstractmethod
    def fit(self, *, inputs: Inputs, privileged_inputs: Inputs, outputs: Outputs) -> None:
        pass

So, the idea is that Estimator defines which methods are available, and the general arguments, but subclasses can then define their own combination of arguments for their methods. Subclasses of those, in turn, have to uphold the Liskov Substitution Principle. But the relation between Estimator and its direct subclasses should not matter: if you are expecting a base Estimator, you know which methods you can use and what their semantics are, but you have to inspect the required arguments at runtime for the instance you have.

It seems this is not possible to express with current Python typing in a way that mypy would not complain about?

@JukkaL
Contributor

JukkaL commented Sep 26, 2017

@mitar Can you give a more complete example which illustrates the errors generated by mypy?

@gvanrossum
Member

It seems this is not possible to express with current Python typing so that mypy would not complain?

I came up with an example that seems to work:

from typing import *
T = TypeVar('T')

class Base:
    fit: Callable

class Foo(Base):
    def fit(self, arg1: int) -> Optional[str]:
        pass

class Bar(Foo):
    def fit(self, arg1: float) -> str:
        pass

Maybe this pattern helps? This seems to be about the best you can do given that you don't want the base class to specify the method signatures.

@mitar
Contributor

mitar commented Oct 12, 2017

Interesting. I never thought of just specifying Callable. This is very useful.

My ideal typing semantics would be: this is the method signature for fit with arguments inputs and outputs. If a subclass's fit uses these arguments, they have to match the base fit types, but they can also be specified as omitted (by not listing them among the subclass fit arguments).

Sadly, Optional is not useful here, because it is not that a subclass is saying you may pass an argument or not, but that you either must always pass it or can never pass it. (So different subclasses would choose one or the other.)

Then consumers of those objects could check and say: does this method accept inputs? If yes, then I know the type and semantics of that argument. And the same for outputs.

@danmou

danmou commented Oct 18, 2019

I've come across a simple case which hasn't been mentioned yet. Variadic arguments can be overridden without (technically) violating the Liskov principle. For example:

class A:
    def foo(self, *args: Any, **kwargs: Any) -> int:
        ...

class B(A):
    def foo(self, *args: Any, bar: int, **kwargs: Any) -> int:
        ...

is (to my understanding) in principle valid, but still gives an incompatible with supertype error with mypy.

The reason I want to do this is: Suppose a library defines the following classes

class A:
    def foo(self, *args: Any, **kwargs: Any) -> int:
        raise NotImplementedError

class B:
    def __init__(self, obj: A) -> None:
        self.obj = obj
    def get_foo(self) -> int:
        return self.obj.foo()

And user code could be something like the following:

class MyA(A):
    def foo(self, *args: Any, bar: int, **kwargs: Any) -> int:
        return bar

class MyB(B):
    def __init__(self, obj: MyA) -> None:
        super().__init__(obj)
    def get_foo(self) -> int:
        return self.obj.foo(bar=2)

Importantly, MyA and MyB could both be defined in different third-party libraries. The original library just defines the general interface and basic functionality. Should this example be accepted by the type-checker? If not, how could it be type annotated?

@JelleZijlstra
Member

JelleZijlstra commented Oct 18, 2019

Your example is actually unsafe, because a call to x.foo(bar="string") where x: A would be invalid if x is actually an instance of B.
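The unsafety is also observable at runtime, since `B` adds a required keyword-only argument. A minimal sketch reusing the `A`/`B` shapes from the example above (`use` is a hypothetical caller typed against `A`):

```python
from typing import Any


class A:
    def foo(self, *args: Any, **kwargs: Any) -> int:
        return 0


class B(A):
    def foo(self, *args: Any, bar: int, **kwargs: Any) -> int:
        return bar


def use(x: A) -> int:
    # This call type-checks against A's signature...
    return x.foo()


print(use(A()))  # prints 0
try:
    use(B())  # ...but B requires the keyword-only argument 'bar'
except TypeError as e:
    print("runtime failure:", e)
```

So the override narrows what callers may do, which is exactly the Liskov violation being described.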

@danmou

danmou commented Oct 20, 2019

@JelleZijlstra that's true. I guess then a better solution would be using Callable without specifying the arguments in the base class, e.g.:

class A:
    foo: Callable[..., int]

But of course this still means mypy won't flag something like A().foo() as invalid.
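At runtime the attribute-style declaration works out, because the annotation on the base class is erased and the subclass simply defines a normal method. A small sketch of the trade-off (the `MyA` name is illustrative):

```python
from typing import Any, Callable


class A:
    foo: Callable[..., int]  # declared only; A itself provides no implementation


class MyA(A):
    def foo(self, *args: Any, bar: int = 0, **kwargs: Any) -> int:
        return bar


print(MyA().foo(bar=2))  # prints 2
# The cost: any call through A is accepted by the checker, and calling
# foo on a bare A() fails only at runtime with AttributeError.
```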

@ziima

ziima commented Oct 22, 2019

@gvanrossum

Maybe this pattern helps? This seems to be about the best you can do given that you don't want the base class to specify the method signatures.

Declaring the methods/attributes in mixins is fine if there are just a few of them, but you don't really want to do that when there are a lot of them, for example a mixins for TestCase.

Having

class MyTestMixin:
    assertEqual: Callable
    assertRaises: Callable
    ... # 10 other asserts I used in the mixin

is not really a way I want to go.

@leonardoramirezr

Try with:

from typing import Type, TYPE_CHECKING, TypeVar

T = TypeVar('T')


def with_typehint(baseclass: Type[T]) -> Type[T]:
    """
    Useful function to make mixins with baseclass typehint

    ```
    class ReadonlyMixin(with_typehint(BaseAdmin)):
        ...
    ```
    """
    if TYPE_CHECKING:
        return baseclass
    return object

Example tested in Pyright:

class ReadOnlyInlineMixin(with_typehint(BaseModelAdmin)):
    def get_readonly_fields(self,
                            request: WSGIRequest,
                            obj: Optional[Model] = None) -> List[str]:

        if self.readonly_fields is None:
            readonly_fields = []
        else:
            readonly_fields = self.readonly_fields # self get is typed by baseclass

        return self._get_readonly_fields(request, obj) + list(readonly_fields)

    def has_change_permission(self,
                              request: WSGIRequest,
                              obj: Optional[Model] = None) -> bool:
        return (
            request.method in ['GET', 'HEAD']
            and super().has_change_permission(request, obj) # super is typed by baseclass
        )

>>> ReadOnlyInlineMixin.__mro__
(<class 'custom.django.admin.mixins.ReadOnlyInlineMixin'>, <class 'object'>)
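As the MRO above shows, `with_typehint` is purely a type-checking device: at runtime the pretend base class is swapped for `object`. A self-contained sketch of the trick, with hypothetical `Base`/`Mixin` names:

```python
from typing import Type, TYPE_CHECKING, TypeVar

T = TypeVar("T")


def with_typehint(baseclass: Type[T]) -> Type[T]:
    # Pretend to inherit from baseclass for the type checker only;
    # at runtime the mixin inherits from plain object.
    if TYPE_CHECKING:
        return baseclass
    return object


class Base:
    def greet(self) -> str:
        return "hi"


class Mixin(with_typehint(Base)):
    pass


print(Base in Mixin.__mro__)    # prints False
print(object in Mixin.__mro__)  # prints True
```

The upside is full attribute typing inside the mixin; the downside is that the checker believes an inheritance relationship that does not exist at runtime.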

@srittau srittau added topic: feature Discussions about new features for Python's type annotations and removed enhancement labels Nov 4, 2021
@minrk

minrk commented Jan 11, 2022

A pattern I've been struggling with is async subclasses of a concrete base class, following the pattern:

class Thing:
    def count(self) -> int: ...
    def name(self) -> str: ...

class AsyncThing(Thing):
    def count(self) -> Awaitable[int]: ...
    def name(self) -> Awaitable[str]: ...

where several methods on the subclass return a wrapper of the same type as the parent (in this case, Awaitable). This feels like it should be easy. It certainly is easy to implement, but I don't understand enough yet. In particular, it seems like Awaitable should be something like a TypeVar, e.g.

WT = TypeVar("WT")  # 'wrapper' type, may be Null for no wrapping
class Thing(Generic[WT]):
    def __init__(self: Thing[Null]): ...  # default: no wrapper
    def count(self) -> WT[int]: ...  # actually int
    def name(self) -> WT[str]: ...

    def not_wrapped(self) -> int: ...  # always int, not wrapped

    def derivative_method(self) -> WT[str]:
        "Not overridden in subclass, but should match return type of name"
        return self.name()

class AsyncThing(Thing[Awaitable]):
    def count(self) -> Awaitable[int]: ...
    def name(self) -> Awaitable[str]: ...

but you can't use getitem on TypeVars, so it seems they can't actually be used in this way. Derivative methods that didn't previously need implementations now also need empty implementations, just to get the updated types.

Is there any way to accomplish something like this, or is it better to just forego types altogether if one follows this kind of pattern?

@JelleZijlstra
Member

@minrk that pattern is unsafe, because you're violating the Liskov principle. (If you do thing.count() where thing is of type Thing, you may get an Awaitable if thing is actually an AsyncThing.)

I'd suggest not using inheritance and making AsyncThing instead an independent class that wraps Thing. You could use __getattr__ in that case to implement it, though for type checking purposes you'd probably need to explicitly implement every method.

An alternative is to just # type: ignore everything. That's what we do in typeshed for the redis library, which has a similar pattern where the pipeline class inherits from the normal client but wraps everything in a pipeline.
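The composition alternative suggested above might look like this minimal sketch (class and method names are placeholders, not from any library):

```python
import asyncio


class Thing:
    def count(self) -> int:
        return 3

    def name(self) -> str:
        return "thing"


class AsyncThing:
    """Wraps a Thing rather than subclassing it, so no Liskov issue arises."""

    def __init__(self, thing: Thing) -> None:
        self._thing = thing

    async def count(self) -> int:
        return self._thing.count()

    async def name(self) -> str:
        return self._thing.name()


async def main() -> None:
    t = AsyncThing(Thing())
    print(await t.count())  # prints 3
    print(await t.name())   # prints thing


asyncio.run(main())
```

Each wrapped method must be written out explicitly for type checking, which is the verbosity cost mentioned in the follow-up.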

@minrk

minrk commented Jan 11, 2022

you're violating the Liskov principle

Sorry for misunderstanding, I thought that was the point of this Issue: subclassing without preserving Liskov is valid (and in my experience, hugely valuable).

I'd suggest not using inheritance and making AsyncThing instead an independent class that wraps Thing

Thanks for the suggestion. Using a HasA wrapper would be significantly more complex because, as you say, I'd have to provide empty implementations of all methods, just to declare the type, when the only concrete benefit would be satisfying mypy. So far, type: ignore seems to be the best compromise, so I'll go with that.
