timedate.apply_year_offset causes datetime overflow for sufficiently large start year #96

Closed
spencerahill opened this issue Oct 24, 2016 · 2 comments

Comments

@spencerahill
Owner

Due to the nanosecond resolution of pandas timestamps, the lower and upper bounds on dates are 1677-09-21 and 2262-04-11 (e.g. here). TimeManager.apply_year_offset (in timedate.py) adds the constant TimeManager.YEAROFFSET, which is currently set to 1899, if the input datetime.datetime object starts before the minimum year.

Adding the offset causes an overflow and a Pandas OutOfBoundsDatetime error if any of the input years is greater than 2262 - 1899 = 363, because the offset year then exceeds the 2262 bound:

In [1]: import aospy
In [2]: import datetime

In [3]: aospy.TimeManager.apply_year_offset(datetime.datetime(363,1,1))
Out[3]: Timestamp('2262-01-01 00:00:00')

In [4]: aospy.TimeManager.apply_year_offset(datetime.datetime(364,1,1))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/spencer/miniconda3/lib/python3.5/site-packages/pandas/tseries/tools.py in _convert_listlike(arg, box, format, name)
    408             try:
--> 409                 values, tz = tslib.datetime_to_datetime64(arg)
    410                 return DatetimeIndex._simple_new(values, name=name, tz=tz)

pandas/tslib.pyx in pandas.tslib.datetime_to_datetime64 (pandas/tslib.c:29768)()

TypeError: Unrecognized value type: <class 'datetime.date'>

During handling of the above exception, another exception occurred:

OutOfBoundsDatetime                       Traceback (most recent call last)
<ipython-input-4-17b710b49c96> in <module>()
----> 1 aospy.TimeManager.apply_year_offset(datetime.datetime(364,1,1))

/Users/spencer/Dropbox/py/aospy/aospy/timedate.py in apply_year_offset(cls, date)
    126             offset = 0
    127         return pd.to_datetime(cls.ymd_to_numpy(date.year + offset,
--> 128                                                date.month, date.day))
    129
    130

/Users/spencer/miniconda3/lib/python3.5/site-packages/pandas/util/decorators.py in wrapper(*args, **kwargs)
     89                 else:
     90                     kwargs[new_arg_name] = new_arg_value
---> 91             return func(*args, **kwargs)
     92         return wrapper
     93     return _deprecate_kwarg

/Users/spencer/miniconda3/lib/python3.5/site-packages/pandas/tseries/tools.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, coerce, unit, infer_datetime_format)
    289                         yearfirst=yearfirst,
    290                         utc=utc, box=box, format=format, exact=exact,
--> 291                         unit=unit, infer_datetime_format=infer_datetime_format)
    292
    293

/Users/spencer/miniconda3/lib/python3.5/site-packages/pandas/tseries/tools.py in _to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, freq, infer_datetime_format)
    427         return _convert_listlike(arg, box, format)
    428
--> 429     return _convert_listlike(np.array([arg]), box, format)[0]
    430
    431 # mappings for assembling units

/Users/spencer/miniconda3/lib/python3.5/site-packages/pandas/tseries/tools.py in _convert_listlike(arg, box, format, name)
    410                 return DatetimeIndex._simple_new(values, name=name, tz=tz)
    411             except (ValueError, TypeError):
--> 412                 raise e
    413
    414     if arg is None:

/Users/spencer/miniconda3/lib/python3.5/site-packages/pandas/tseries/tools.py in _convert_listlike(arg, box, format, name)
    396                     yearfirst=yearfirst,
    397                     freq=freq,
--> 398                     require_iso8601=require_iso8601
    399                 )
    400

pandas/tslib.pyx in pandas.tslib.array_to_datetime (pandas/tslib.c:41972)()

pandas/tslib.pyx in pandas.tslib.array_to_datetime (pandas/tslib.c:40843)()

pandas/tslib.pyx in pandas.tslib.array_to_datetime (pandas/tslib.c:39338)()

pandas/tslib.pyx in pandas.tslib.array_to_datetime (pandas/tslib.c:39232)()

pandas/tslib.pyx in pandas.tslib._check_dts_bounds (pandas/tslib.c:29245)()

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2263-01-01 00:00:00
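
The 363-year threshold can be checked without pandas. The following is a minimal sketch that assumes the bounds come from counting nanoseconds since 1970-01-01 in a signed 64-bit integer; the helper name `offset_year_in_bounds` is hypothetical and not part of aospy.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest nanosecond count a signed 64-bit integer holds

# Microsecond precision is enough here; datetime cannot hold nanoseconds.
SPAN = datetime.timedelta(microseconds=INT64_MAX // 1000)
LOWER, UPPER = EPOCH - SPAN, EPOCH + SPAN

YEAR_OFFSET = 1899  # value of TimeManager.YEAROFFSET quoted in the issue


def offset_year_in_bounds(year, offset=YEAR_OFFSET):
    """Return True if 1 January of (year + offset) lies inside the window."""
    return LOWER <= datetime.datetime(year + offset, 1, 1) <= UPPER


print(LOWER.date(), UPPER.date())
print(offset_year_in_bounds(363), offset_year_in_bounds(364))
```

Year 363 maps to 2262-01-01, which is still before the upper bound, while year 364 maps to 2263-01-01 and falls outside it, matching the behaviour shown in the traceback.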

@spencerkclark, what if, rather than applying a constant offset, we just directly set the start year to be 1678? The catch is then retaining the original year values. But looking through the code and remembering our most recent conversation (although I can't find it on GH), right now we aren't even bothering to change the years back anyway. Am I remembering that right?
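
One way the proposal could look, as a rough sketch rather than the eventual aospy implementation: shift all dates so the earliest lands in 1678, and return the offset so the original years can be recovered later. The names `MIN_YEAR` and `shift_to_min_year` are hypothetical.

```python
import datetime

MIN_YEAR = 1678  # first full year inside the pandas nanosecond window


def shift_to_min_year(dates):
    """Shift dates so the earliest year becomes MIN_YEAR.

    Returns the shifted dates and the offset that was added, so the
    original years can be restored with `year - offset`. Dates that
    already start at or after MIN_YEAR are left unchanged.
    Note: a 29 February input raises ValueError if the shifted year
    is not a leap year; this sketch does not handle that case.
    """
    first_year = min(d.year for d in dates)
    offset = MIN_YEAR - first_year if first_year < MIN_YEAR else 0
    shifted = [d.replace(year=d.year + offset) for d in dates]
    return shifted, offset


dates = [datetime.datetime(364, 1, 1), datetime.datetime(365, 6, 15)]
shifted, offset = shift_to_min_year(dates)
print(shifted, offset)
```

Because the offset depends on the data rather than a fixed constant, the full valid window is available regardless of the start year.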

@spencerkclark
Collaborator

@spencerahill I'm all for this 👍

We should design things such that one can use the full valid Pandas window, regardless of the start date of the analysis. You are correct that we currently do not convert dates back to their original values (see #94).

Obviously this still doesn't solve the issue of how to handle time ranges greater than 585 years; that's a tricky one, but perhaps if it came to it, we could think about some way to systematically break the long time range into manageable chunks, perform some operation on each chunk, and bring the chunks back together for some final reduction. This seems complicated, but I'm not sure if there is an easier alternative.
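
The chunking idea could start from something like the sketch below, which only splits a long year range into pieces that each fit in the window. The 584-year limit assumes whole years 1678 through 2261, and `year_chunks` is a hypothetical helper; the per-chunk operation and final reduction are left out.

```python
MAX_SPAN_YEARS = 584  # assumption: whole years 1678..2261 inclusive


def year_chunks(start_year, end_year, max_span=MAX_SPAN_YEARS):
    """Yield inclusive (first, last) year pairs covering the full range."""
    first = start_year
    while first <= end_year:
        last = min(first + max_span - 1, end_year)
        yield first, last
        first = last + 1


print(list(year_chunks(1, 1000)))
```

Each chunk could then be shifted into the valid window independently, processed, and combined, for example by summing per-chunk totals and counts before taking a mean.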

@spencerahill
Owner Author

Thanks! Ha, not sure how I missed #94...literally the most recent Issue before this one.

Ok cool, as a quick fix I'll implement setting the first year directly to (one year after) the minimum.

Longer term, yes that might be how we have to go if it doesn't get fixed upstream.


2 participants