-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more precise but flexible typing for statistics, time, and datetime. #12175
base: main
Are you sure you want to change the base?
Conversation
Most of the functions in the module include fairly straightforward arithmetic operations that can be executed on any type of Real number, not just floats or fractions. Decimal still needs to be included explicitly, as it is not a subtype of Real.
…ent it. While the C standard does not define the type for time_t, all of the APIs using it can only provide resolution of one second. While `SupportsInt` should describe this correctly, passing a Decimal object does appear to cause a `DeprecationWarning` right now: >>> time.localtime(decimal.Decimal(1)) <stdin>:1: DeprecationWarning: an integer is required (got type decimal.Decimal). Implicit conversion to integers using __int__ is deprecated, and may be removed in a future version of Python. time.struct_time(tm_year=1969, tm_mon=12, tm_mday=31, tm_hour=16, tm_min=0, tm_sec=1, tm_wday=2, tm_yday=365, tm_isdst=0)
The `fromtimestamp()` API differs in behaviour between `date` and `datetime`. In the case of `date`, the parameter is passed directly to `time.localtime()` or `time.gmtime()`, which only accept integral values, so it is obvious for it to take `SupportsInt` (despite the `DeprecationWarning` when passing, for instance, `Decimal` numbers.) Due to the nature of the type, sub-second precision is not necessary. In the case of `datetime`, sub-second precision *is* necessary, so at least up to Python 3.12, it's this implementation that extracts the decimal part of the parameter. But it then take the remainder integral part and passes it to the same `time.localtime()`, which requires a conversion to integer. This means that while the function takes `float` parameters just fine, it requires them to also support `int` conversion — so it will not accept e.g. `Fraction` parameters. >>> datetime.date.fromtimestamp(fractions.Fraction(1, 1)) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: an integer is required (got type Fraction) >>> datetime.datetime.fromtimestamp(decimal.Decimal(3)) datetime.datetime(1969, 12, 31, 16, 0, 3) It also doesn't quite always preserve the sub-second precision for non-float types: >>> t = time.time() >>> t 1718884427.5039804 >>> d = decimal.Decimal(t) >>> d Decimal('1718884427.5039803981781005859375') >>> datetime.datetime.fromtimestamp(t) datetime.datetime(2024, 6, 20, 4, 53, 47, 503980) >>> datetime.datetime.fromtimestamp(d) datetime.datetime(2024, 6, 20, 4, 53, 47)
@@ -32,8 +32,8 @@ if sys.version_info >= (3, 13): | |||
__all__ += ["kde", "kde_random"] | |||
|
|||
# Most functions in this module accept homogeneous collections of one of these types | |||
_Number: TypeAlias = float | Decimal | Fraction | |||
_NumberT = TypeVar("_NumberT", float, Decimal, Fraction) | |||
_Number: TypeAlias = Real | Decimal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid type checkers don't understand float to be a subtype of numbers.Real
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit scared now — I know pyre does, and ISTR mypy did as well?
Is there an easy-to-use playground to run the same typing information across all the supported checkers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mypy has never properly supported the ABCs in the numbers module, and I don't believe pyright has either. There's no cross-typechecker playground, but MyPy's playground is at https://mypy-play.net/?mypy=latest&python=3.12 and pyright's is at https://pyright-play.net/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see evidence of this in the new openlibrary
hit reported by mypy_primer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay this is worse than I expected :(
Not only mypy doesn't support it (I might be confusing it with pytype, but it's been years, I'm lost), but neither of them seem to even think that a * 2
maintains the type.
Do you have a suggestion on how to handle this? Just add Real
without removing float
? Replace it with SupportsFloat
even if it's not quite the right type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The numbers ABCs are sort-of fundamentally broken when it comes to static typing, I'm afraid. It's impossible to add proper annotations to many of the methods for those ABCs in numbers.pyi here in typeshed, and so as a result we deliberately don't have the builtin numeric classes inherit from them in builtins.pyi; this is why type checkers do not understand them.
I'll take a look in a bit and see if I can think of what to do here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask what the problem you're trying to solve is here? I feel like our current versions of _Number
and _NumberT
here are going to work well for the vast majority of uses of the statistics
functions. Is there a specific false positive mypy emitted on some code that you're trying to fix?
Diff from mypy_primer, showing the effect of this PR on open source code: pandas (https://github.com/pandas-dev/pandas)
+ pandas/_libs/tslibs/timestamps.pyi:88: error: Signature of "fromtimestamp" incompatible with supertype "date" [override]
+ pandas/_libs/tslibs/timestamps.pyi:88: note: Superclass:
+ pandas/_libs/tslibs/timestamps.pyi:88: note: @classmethod
+ pandas/_libs/tslibs/timestamps.pyi:88: note: def fromtimestamp(cls, SupportsInt, /) -> Timestamp
+ pandas/_libs/tslibs/timestamps.pyi:88: note: Subclass:
+ pandas/_libs/tslibs/timestamps.pyi:88: note: @classmethod
+ pandas/_libs/tslibs/timestamps.pyi:88: note: def fromtimestamp(cls, ts: float, tz: str | tzinfo | int | None = ...) -> Timestamp
+ pandas/_libs/tslibs/timestamps.pyi:88: error: Argument 1 of "fromtimestamp" is incompatible with supertype "datetime"; supertype defines the argument type as "SupportsInt" [override]
+ pandas/_libs/tslibs/timestamps.pyi:88: note: This violates the Liskov substitution principle
+ pandas/_libs/tslibs/timestamps.pyi:88: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
+ pandas/_libs/tslibs/timestamps.pyi:90: error: Argument 1 of "utcfromtimestamp" is incompatible with supertype "datetime"; supertype defines the argument type as "SupportsInt" [override]
+ pandas/_libs/tslibs/timestamps.pyi:90: note: This violates the Liskov substitution principle
+ pandas/_libs/tslibs/timestamps.pyi:90: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
openlibrary (https://github.com/internetarchive/openlibrary)
+ openlibrary/solr/updater/work.py: note: In member "number_of_pages_median" of class "WorkSolrBuilder":
+ openlibrary/solr/updater/work.py:377: error: Value of type variable "_NumberT" of "median" cannot be "int" [type-var]
pandas-stubs (https://github.com/pandas-dev/pandas-stubs)
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: error: Signature of "fromtimestamp" incompatible with supertype "date" [override]
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: note: Superclass:
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: note: @classmethod
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: note: def fromtimestamp(cls, SupportsInt, /) -> Timestamp
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: note: Subclass:
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: note: @classmethod
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: note: def fromtimestamp(cls, t: float, tz: tzinfo | str | None = ...) -> Timestamp
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: error: Argument 1 of "fromtimestamp" is incompatible with supertype "datetime"; supertype defines the argument type as "SupportsInt" [override]
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: note: This violates the Liskov substitution principle
+ pandas-stubs/_libs/tslibs/timestamps.pyi:107: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
+ pandas-stubs/_libs/tslibs/timestamps.pyi:115: error: Argument 1 of "utcfromtimestamp" is incompatible with supertype "datetime"; supertype defines the argument type as "SupportsInt" [override]
+ pandas-stubs/_libs/tslibs/timestamps.pyi:115: note: This violates the Liskov substitution principle
+ pandas-stubs/_libs/tslibs/timestamps.pyi:115: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to use SupportsInt
don't look correct either, unfortunately. Possibly SupportsIndex
would be a better fit than SupportsInt
:
@@ -51,7 +51,7 @@ class date: | |||
resolution: ClassVar[timedelta] | |||
def __new__(cls, year: SupportsIndex, month: SupportsIndex, day: SupportsIndex) -> Self: ... | |||
@classmethod | |||
def fromtimestamp(cls, timestamp: float, /) -> Self: ... | |||
def fromtimestamp(cls, timestamp: SupportsInt, /) -> Self: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't correct:
>>> class HasInt:
... def __int__(self): return 42
...
>>> import datetime as dt
>>> dt.date.fromtimestamp(HasInt())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'HasInt' object cannot be interpreted as an integer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you meant typing.SupportsIndex
?
>>> class HasIndex:
... def __index__(self): return 42
...
>>> dt.date.fromtimestamp(HasIndex())
datetime.date(1970, 1, 1)
@@ -259,14 +259,14 @@ class datetime(date): | |||
# meaning it is only *safe* to pass it as a keyword argument on 3.12+ | |||
if sys.version_info >= (3, 12): | |||
@classmethod | |||
def fromtimestamp(cls, timestamp: float, tz: _TzInfo | None = ...) -> Self: ... | |||
def fromtimestamp(cls, timestamp: SupportsInt, tz: _TzInfo | None = ...) -> Self: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nor is this:
>>> dt.datetime.fromtimestamp(HasInt())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'HasInt' object cannot be interpreted as an integer
|
||
@classmethod | ||
@deprecated("Use timezone-aware objects to represent datetimes in UTC; e.g. by calling .fromtimestamp(datetime.UTC)") | ||
def utcfromtimestamp(cls, t: float, /) -> Self: ... | ||
def utcfromtimestamp(cls, t: SupportsInt, /) -> Self: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nor is this:
>>> dt.datetime.utcfromtimestamp(HasInt())
<stdin>:1: DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.fromtimestamp(timestamp, datetime.UTC).
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'HasInt' object cannot be interpreted as an integer
def ctime(seconds: SupportsInt | None = None, /) -> str: ... | ||
def gmtime(seconds: SupportsInt | None = None, /) -> struct_time: ... | ||
def localtime(seconds: SupportsInt | None = None, /) -> struct_time: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all incorrect as well:
>>> import time
>>> class HasInt:
... def __int__(self): return 42
...
>>> time.ctime(HasInt())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'HasInt' object cannot be interpreted as an integer
>>> time.gmtime(HasInt())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'HasInt' object cannot be interpreted as an integer
>>> time.localtime(HasInt())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'HasInt' object cannot be interpreted as an integer
No description provided.