Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iterate is broken with subclasses of Interval #75

Closed
adm271828 opened this issue Jan 28, 2023 · 12 comments
Closed

iterate is broken with subclasses of Interval #75

adm271828 opened this issue Jan 28, 2023 · 12 comments
Labels
bug Something isn't working

Comments

@adm271828
Copy link

Hi Alexandre,

in version 2.3.0, when using subclasses of Interval, iterate will lead to infinite recursion.

Seems to be because, in iterate(...), exclude(value, i) has become exclude(singleton(value),i) where the singleton is an instance of Interval and not of the subclass being used. And the infinite recursion takes place in __lt__ operator that keeps calling itself it __gt__ has not been overwriten in the subclass.

Possible fix: replace singleton(value) calls in exclude/include loops with interval.__class__.from_atomic(Bound.CLOSED, value, value, Bound.CLOSED)

Great library BTW.

Best regards.

Antoine

@AlexandreDecan AlexandreDecan added the bug Something isn't working label Jan 28, 2023
@AlexandreDecan
Copy link
Owner

Hi Antoine,

Thanks for reporting this issue! Would it be possible to have the code of your subclass? I don't see why __lt__ should keep calling __gt__ even if the latter is not overwritten in the subclass (since, in that case, it will implicitly call Interval.__gt__). Also, having the code causing the issue would be nice, so I can test and confirm the bug locally ;)

@AlexandreDecan
Copy link
Owner

That said, the proposed fix is worth to implement anyway, since it allows to avoid mixing object types anyway ;-)

AlexandreDecan added a commit that referenced this issue Jan 28, 2023
@AlexandreDecan
Copy link
Owner

Could you check whether e49898c fixes the issue? I bypassed creating singleton instances (since the only reason to create singletons comes from a deprecated "feature"), so it should work :)

@adm271828
Copy link
Author

Hi Antoine,

Thanks for reporting this issue! Would it be possible to have the code of your subclass? I don't see why __lt__ should keep calling __gt__ even if the latter is not overwritten in the subclass (since, in that case, it will implicitly call Interval.__gt__). Also, having the code causing the issue would be nice, so I can test and confirm the bug locally ;)

Yes, this is what I said myself when I woke up this morning. The recursion is clearly in the comparaison operator. But not quite sure if/how it could be related on overwritting it...

Anyway, my class is specialized to deal with datetime intervals. Here is the smallest simplified code I can provide that triggers the bug:

import portion as P
import datetime as dt

class TimeExtent(P.Interval):
  def __init__(self):
    t0 = dt.datetime(2022,1,1)
    t1 = dt.datetime(2022,1,2)
    super().__init__(P.closedopen(t0,t1))

t=TimeExtent()
next(P.iterate(t, dt.datetime.resolution))

Antoine

@AlexandreDecan
Copy link
Owner

AlexandreDecan commented Jan 28, 2023

Thanks! I can reproduce the bug, and it is indeed fixed by the last commit. I don't know exactly what happened. I looked at Python's documentation for rich comparisons, and discovered something specific for subclasses I wasn't aware of:

There are no swapped-argument versions of these methods (to be used when the left argument does not support the operation but the right argument does); rather, lt() and gt() are each other’s reflection, le() and ge() are each other’s reflection, and eq() and ne() are their own reflection. If the operands are of different types, and right operand’s type is a direct or indirect subclass of the left operand’s type, the reflected method of the right operand has priority, otherwise the left operand’s method has priority. Virtual subclassing is not considered.

That means when singleton(v) < i (where i is a TimeExtent object), that's not singleton(v).__lt__(i) that is called, but rather i.__gt__(singleton(v)). Since __gt__ is defined in terms of __lt__, it returns other < self where other is v and self is i , hence again v < i that is again translated to i > v and so on.

I'll change our definition of __gt__ so that it does no longer rely on __lt__ for doing its job.

@AlexandreDecan
Copy link
Owner

Version 2.3.1 will be available with the corresponding fix on PyPI in a few minutes.

@adm271828
Copy link
Author

Good!
Explanation is subtle indeed.

BTW, this is completly unrelated, but you could also add a __format__ method to Interval, such as (or something equivalent, idea is easy to grasp):

  def __format__(self, spec):
    return P.to_string(self, lambda v: ('{:'+spec+'}').format(v))

Best regards,

Antoine

@AlexandreDecan
Copy link
Owner

That's something I already considered but I don't remember exactly why I decided not to implement it. :-)

Can you open a new issue for this, so that we'll keep the related discussion in it and, depending on the fate of this issue, either it will be implemented or reasons for not doing so will be described somewhere? Thanks :-)

@AlexandreDecan
Copy link
Owner

Oh I remember : that's because format can also be used to define some padding/alignment and there's no easy way to apply this to infinities, leading to unexpected string rendering and "inconsistencies".

@AlexandreDecan
Copy link
Owner

AlexandreDecan commented Feb 11, 2023

Hi @adm271828

I've made several recent (and experimental) changes to portion to ease defining and using subclasses of Interval. The "public side" of these changes is documented here: https://github.com/AlexandreDecan/portion#specialize--customize-intervals
In particular, the new create_api function might interest you :) (the documentation only scratches the surface about it, more details are available on https://github.com/AlexandreDecan/portion/blob/master/portion/api.py#L20)

Since you're the most recent user I know who already subclassed Interval, may I ask you to have a look at those changes and to provide feedback? :-)

@adm271828
Copy link
Author

Hi Alexandre,

I had a quick look. I don't know exactly what to say.

On one hand the new feature is a step forward handling what you call discrete intervals (the use case you selected) and, as a side effect, exposes the from_atomic method (although the need does for from_atomic not come from this specific use case) which is good. From a syntax point of view, the so called API looks good and keeps looking like the 'functional' design style you chose.

On the other hand (please don't consider what follows as an offense), I have the feeling there is an initial design flaw and I'm not sure the choices and the used vocabulary is not a continuation of it.

Let me try to explain this (from a library user perspective).

Basicaly I see an instance of Interval as a set of values taken from a domain where there is a strict order. I expect all set operations such as union, intersection, test for inclusion (value included into an interval, or between intervals)... to be available and be efficient. I dont really care about the internal representation: whether this is a list of sorted non overlapping atomic invervals or not. If this is the best way to do it, fine. If another internal representation is more efficient, even better! My code does not use / need atomic intervals. From my point of view this is an implementation detail of portion.

Of course the fact that their external (printed) representation is simplified is a bonus (much more readable for a user). And the fact I can add values to the set using comprehension (all values between a and b) is nice to have. Using a specific object (i.e an atomic interval) in order to insert many values at once make it easier, but I consider this object to be transient. I also expect Interval to provide every values only once when iterating over values in the set, which implies interval are simplified when enumerating.

It seems to me this was how the version 1.x was designed (I already used it) and I'm not sure I understood the rationale toward a new design in version 2.x.

The design in portion implies that a Interval is no longer a set of values but a set of atomic intervals, that now becomes objects exposed to the caller, to be partially managed by the caller and to which Interval applies some transformations (for instance in order to keep a list of non overlapping intervals). Hence the need to cooperate with the user if atomic intervals are made more specific, and the need for an API to manage this interaction. from_atomic is part of this API, though it is not really documented (it forces subclasses of Interval to become factories of themselfves for some restricted cases, i.e. atomic, which I consider to be an akward pattern).

I agree the description of the library that you provide is ambiguous enough so that it can be claimed it was explicit: "The library provides data structure and operations for intervals". I'm not sure this is how that vast majority of users unterstood it.

Back to the use case you selected, the underlying problem is to restrict the domain from which values can be taken, and to let Interval know what are the constrains on this domain. What I would do is to have the domain parameter in the constructor of Interval that would provide this knowledge (in C++ I would make Interval a template class and domain a template parameter). The domain could provide a class (all values in interval must be instances of - for intance int), a prev / next callable for iterate in order over value (I don't like incr / decr that have a numeric connotation). This could also be an iterator class (backward and forward), and perhaps an includes method that check if a value is in the domain.

Of course, open, closed, openclosed and their friends shall become methods of the Interval class in order to know the domain. And operations between Intervals would be performed only when they have compatible domain...

Just my two cents...

Best regards,

Antoine

@AlexandreDecan
Copy link
Owner

Thanks for this feedback! ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants