API: Remove CalendarDay (pandas-dev#24330)
mroeschke authored and Pingviinituutti committed Feb 28, 2019
1 parent ceba41c commit 82f0958
Showing 13 changed files with 52 additions and 252 deletions.
25 changes: 2 additions & 23 deletions doc/source/timeseries.rst
@@ -408,7 +408,7 @@ In practice this becomes very cumbersome because we often need a very long
index with a large number of timestamps. If we need timestamps on a regular
frequency, we can use the :func:`date_range` and :func:`bdate_range` functions
to create a ``DatetimeIndex``. The default frequency for ``date_range`` is a
**day** while the default for ``bdate_range`` is a **business day**:
**calendar day** while the default for ``bdate_range`` is a **business day**:

.. ipython:: python
@@ -927,26 +927,6 @@ in the operation).
.. _relativedelta documentation: https://dateutil.readthedocs.io/en/stable/relativedelta.html

.. _timeseries.dayvscalendarday:

Day vs. CalendarDay
~~~~~~~~~~~~~~~~~~~

:class:`Day` (``'D'``) is a timedelta-like offset that respects absolute time
arithmetic and is an alias for 24 :class:`Hour`. This offset is the default
argument to many pandas time related function like :func:`date_range` and :func:`timedelta_range`.

:class:`CalendarDay` (``'CD'``) is a relativedelta-like offset that respects
calendar time arithmetic. :class:`CalendarDay` is useful preserving calendar day
semantics with date times with have day light savings transitions, i.e. :class:`CalendarDay`
will preserve the hour before the day light savings transition.

.. ipython:: python
ts = pd.Timestamp('2016-10-30 00:00:00', tz='Europe/Helsinki')
ts + pd.offsets.Day(1)
ts + pd.offsets.CalendarDay(1)

Parametric Offsets
~~~~~~~~~~~~~~~~~~
@@ -1243,8 +1223,7 @@ frequencies. We will refer to these aliases as *offset aliases*.

"B", "business day frequency"
"C", "custom business day frequency"
"D", "day frequency"
"CD", "calendar day frequency"
"D", "calendar day frequency"
"W", "weekly frequency"
"M", "month end frequency"
"SM", "semi-month end frequency (15th and end of month)"
42 changes: 0 additions & 42 deletions doc/source/whatsnew/v0.24.0.rst
@@ -591,48 +591,6 @@ that the dates have been converted to UTC
pd.to_datetime(["2015-11-18 15:30:00+05:30",
"2015-11-18 16:30:00+06:30"], utc=True)
.. _whatsnew_0240.api_breaking.calendarday:

CalendarDay Offset
^^^^^^^^^^^^^^^^^^

:class:`Day` and associated frequency alias ``'D'`` were documented to represent
a calendar day; however, arithmetic and operations with :class:`Day` sometimes
respected absolute time instead (i.e. ``Day(n)`` and acted identically to ``Timedelta(days=n)``).

*Previous Behavior*:

.. code-block:: ipython
In [2]: ts = pd.Timestamp('2016-10-30 00:00:00', tz='Europe/Helsinki')
# Respects calendar arithmetic
In [3]: pd.date_range(start=ts, freq='D', periods=3)
Out[3]:
DatetimeIndex(['2016-10-30 00:00:00+03:00', '2016-10-31 00:00:00+02:00',
'2016-11-01 00:00:00+02:00'],
dtype='datetime64[ns, Europe/Helsinki]', freq='D')
# Respects absolute arithmetic
In [4]: ts + pd.tseries.frequencies.to_offset('D')
Out[4]: Timestamp('2016-10-30 23:00:00+0200', tz='Europe/Helsinki')
*New Behavior*:

:class:`CalendarDay` and associated frequency alias ``'CD'`` are now available
and respect calendar day arithmetic while :class:`Day` and frequency alias ``'D'``
will now respect absolute time (:issue:`22274`, :issue:`20596`, :issue:`16980`, :issue:`8774`)
See the :ref:`documentation here <timeseries.dayvscalendarday>` for more information.

Addition with :class:`CalendarDay` across a daylight savings time transition:

.. ipython:: python
ts = pd.Timestamp('2016-10-30 00:00:00', tz='Europe/Helsinki')
ts + pd.offsets.Day(1)
ts + pd.offsets.CalendarDay(1)
.. _whatsnew_0240.api_breaking.period_end_time:

Time values in ``dt.end_time`` and ``to_timestamp(how='end')``
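
For reference, the behavior this commit restores can be sketched as follows. With ``'CD'`` removed, ``date_range`` with ``freq='D'`` is expected to keep local midnight across the Helsinki DST transition used in the removed example, while ``pd.Timedelta(days=1)`` stays a fixed 24 hours. This is a minimal sketch, assuming a pandas version at or after this commit with a tz database available; the variable names are illustrative and not part of the commit.

```python
import pandas as pd

# Midnight on the day Helsinki leaves daylight saving time (a 25-hour local day).
ts = pd.Timestamp("2016-10-30 00:00:00", tz="Europe/Helsinki")

# 'D' in date_range follows calendar days: every entry stays at local midnight.
idx = pd.date_range(start=ts, freq="D", periods=3)
range_hours = list(idx.hour)

# A Timedelta is absolute time: exactly 24 hours later is 23:00 on the same date.
shifted = ts + pd.Timedelta(days=1)

print(idx)
print(shifted)
```

``pd.Timedelta`` is used for the absolute-time side of the comparison because its 24-hour meaning does not depend on how ``Day`` is defined in a given pandas release.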
28 changes: 14 additions & 14 deletions pandas/core/arrays/datetimes.py
@@ -27,7 +27,7 @@
import pandas.core.common as com

from pandas.tseries.frequencies import get_period_alias, to_offset
from pandas.tseries.offsets import Tick, generate_range
from pandas.tseries.offsets import Day, Tick, generate_range

_midnight = time(0, 0)

@@ -255,7 +255,8 @@ def _from_sequence(cls, data, dtype=None, copy=False,

@classmethod
def _generate_range(cls, start, end, periods, freq, tz=None,
normalize=False, ambiguous='raise', closed=None):
normalize=False, ambiguous='raise',
nonexistent='raise', closed=None):

periods = dtl.validate_periods(periods)
if freq is None and any(x is None for x in [periods, start, end]):
@@ -285,7 +286,7 @@ def _generate_range(cls, start, end, periods, freq, tz=None,
start, end, _normalized = _maybe_normalize_endpoints(start, end,
normalize)

tz, _ = _infer_tz_from_endpoints(start, end, tz)
tz = _infer_tz_from_endpoints(start, end, tz)

if tz is not None:
# Localize the start and end arguments
@@ -295,22 +296,22 @@ def _generate_range(cls, start, end, periods, freq, tz=None,
end = _maybe_localize_point(
end, getattr(end, 'tz', None), end, freq, tz
)
if start and end:
# Make sure start and end have the same tz
start = _maybe_localize_point(
start, start.tz, end.tz, freq, tz
)
end = _maybe_localize_point(
end, end.tz, start.tz, freq, tz
)
if freq is not None:
# We break Day arithmetic (fixed 24 hour) here and opt for
# Day to mean calendar day (23/24/25 hour). Therefore, strip
# tz info from start and day to avoid DST arithmetic
if isinstance(freq, Day):
if start is not None:
start = start.tz_localize(None)
if end is not None:
end = end.tz_localize(None)
# TODO: consider re-implementing _cached_range; GH#17914
index = _generate_regular_range(cls, start, end, periods, freq)

if tz is not None and index.tz is None:
arr = conversion.tz_localize_to_utc(
index.asi8,
tz, ambiguous=ambiguous)
tz, ambiguous=ambiguous, nonexistent=nonexistent)

index = cls(arr)

@@ -1878,7 +1879,6 @@ def _infer_tz_from_endpoints(start, end, tz):
Returns
-------
tz : tzinfo or None
inferred_tz : tzinfo or None
Raises
------
@@ -1901,7 +1901,7 @@ def _infer_tz_from_endpoints(start, end, tz):
elif inferred_tz is not None:
tz = inferred_tz

return tz, inferred_tz
return tz


def _maybe_normalize_endpoints(start, end, normalize):
21 changes: 13 additions & 8 deletions pandas/core/resample.py
@@ -1403,7 +1403,9 @@ def _get_time_bins(self, ax):
start=first,
end=last,
tz=tz,
name=ax.name)
name=ax.name,
ambiguous='infer',
nonexistent='shift')

# GH 15549
# In edge case of tz-aware resapmling binner last index can be
@@ -1607,7 +1609,7 @@ def _get_timestamp_range_edges(first, last, offset, closed='left', base=0):
Adjust the `first` Timestamp to the preceeding Timestamp that resides on
the provided offset. Adjust the `last` Timestamp to the following
Timestamp that resides on the provided offset. Input Timestamps that
already reside on the offset will be adjusted depeding on the type of
already reside on the offset will be adjusted depending on the type of
offset and the `closed` parameter.
Parameters
@@ -1627,18 +1629,21 @@ def _get_timestamp_range_edges(first, last, offset, closed='left', base=0):
-------
A tuple of length 2, containing the adjusted pd.Timestamp objects.
"""
if not all(isinstance(obj, pd.Timestamp) for obj in [first, last]):
raise TypeError("'first' and 'last' must be instances of type "
"Timestamp")

if isinstance(offset, Tick):
is_day = isinstance(offset, Day)
day_nanos = delta_to_nanoseconds(timedelta(1))

# #1165 and #24127
if (is_day and not offset.nanos % day_nanos) or not is_day:
return _adjust_dates_anchored(first, last, offset,
closed=closed, base=base)
first, last = _adjust_dates_anchored(first, last, offset,
closed=closed, base=base)
if is_day and first.tz is not None:
# _adjust_dates_anchored assumes 'D' means 24H, but first/last
# might contain a DST transition (23H, 24H, or 25H).
# Ensure first/last snap to midnight.
first = first.normalize()
last = last.normalize()
return first, last

else:
first = first.normalize()
17 changes: 4 additions & 13 deletions pandas/tests/indexes/datetimes/test_date_range.py
@@ -359,18 +359,18 @@ def test_range_tz_pytz(self):
Timestamp(datetime(2013, 11, 6), tz='US/Eastern')]
])
def test_range_tz_dst_straddle_pytz(self, start, end):
dr = date_range(start, end, freq='CD')
dr = date_range(start, end, freq='D')
assert dr[0] == start
assert dr[-1] == end
assert np.all(dr.hour == 0)

dr = date_range(start, end, freq='CD', tz='US/Eastern')
dr = date_range(start, end, freq='D', tz='US/Eastern')
assert dr[0] == start
assert dr[-1] == end
assert np.all(dr.hour == 0)

dr = date_range(start.replace(tzinfo=None), end.replace(
tzinfo=None), freq='CD', tz='US/Eastern')
tzinfo=None), freq='D', tz='US/Eastern')
assert dr[0] == start
assert dr[-1] == end
assert np.all(dr.hour == 0)
@@ -604,14 +604,6 @@ def test_mismatching_tz_raises_err(self, start, end):
with pytest.raises(TypeError):
pd.date_range(start, end, freq=BDay())

def test_CalendarDay_range_with_dst_crossing(self):
# GH 20596
result = date_range('2018-10-23', '2018-11-06', freq='7CD',
tz='Europe/Paris')
expected = date_range('2018-10-23', '2018-11-06',
freq=pd.DateOffset(days=7), tz='Europe/Paris')
tm.assert_index_equal(result, expected)


class TestBusinessDateRange(object):

@@ -766,8 +758,7 @@ def test_cdaterange_weekmask_and_holidays(self):
holidays=['2013-05-01'])

@pytest.mark.parametrize('freq', [freq for freq in prefix_mapping
if freq.startswith('C')
and freq != 'CD']) # CalendarDay
if freq.startswith('C')])
def test_all_custom_freq(self, freq):
# should not raise
bdate_range(START, END, freq=freq, weekmask='Mon Wed Fri',
4 changes: 2 additions & 2 deletions pandas/tests/indexes/datetimes/test_timezones.py
@@ -436,7 +436,7 @@ def test_dti_tz_localize_utc_conversion(self, tz):

@pytest.mark.parametrize('idx', [
date_range(start='2014-01-01', end='2014-12-31', freq='M'),
date_range(start='2014-01-01', end='2014-12-31', freq='CD'),
date_range(start='2014-01-01', end='2014-12-31', freq='D'),
date_range(start='2014-01-01', end='2014-03-01', freq='H'),
date_range(start='2014-08-01', end='2014-10-31', freq='T')
])
@@ -1072,7 +1072,7 @@ def test_date_range_span_dst_transition(self, tzstr):

dr = date_range('2012-11-02', periods=10, tz=tzstr)
result = dr.hour
expected = Index([0, 0, 0, 23, 23, 23, 23, 23, 23, 23])
expected = Index([0] * 10)
tm.assert_index_equal(result, expected)

@pytest.mark.parametrize('tzstr', ['US/Eastern', 'dateutil/US/Eastern'])
4 changes: 0 additions & 4 deletions pandas/tests/indexes/timedeltas/test_timedelta_range.py
@@ -49,10 +49,6 @@ def test_timedelta_range(self):
result = df.loc['0s':, :]
tm.assert_frame_equal(expected, result)

with pytest.raises(ValueError):
# GH 22274: CalendarDay is a relative time measurement
timedelta_range('1day', freq='CD', periods=2)

@pytest.mark.parametrize('periods, freq', [
(3, '2D'), (5, 'D'), (6, '19H12T'), (7, '16H'), (9, '12H')])
def test_linspace_behavior(self, periods, freq):
8 changes: 4 additions & 4 deletions pandas/tests/resample/test_datetime_index.py
@@ -1279,7 +1279,7 @@ def test_resample_dst_anchor(self):
# 5172
dti = DatetimeIndex([datetime(2012, 11, 4, 23)], tz='US/Eastern')
df = DataFrame([5], index=dti)
assert_frame_equal(df.resample(rule='CD').sum(),
assert_frame_equal(df.resample(rule='D').sum(),
DataFrame([5], index=df.index.normalize()))
df.resample(rule='MS').sum()
assert_frame_equal(
@@ -1333,14 +1333,14 @@ def test_resample_dst_anchor(self):

df_daily = df['10/26/2013':'10/29/2013']
assert_frame_equal(
df_daily.resample("CD").agg({"a": "min", "b": "max", "c": "count"})
df_daily.resample("D").agg({"a": "min", "b": "max", "c": "count"})
[["a", "b", "c"]],
DataFrame({"a": [1248, 1296, 1346, 1394],
"b": [1295, 1345, 1393, 1441],
"c": [48, 50, 48, 48]},
index=date_range('10/26/2013', '10/29/2013',
freq='CD', tz='Europe/Paris')),
'CD Frequency')
freq='D', tz='Europe/Paris')),
'D Frequency')

def test_downsample_across_dst(self):
# GH 8531
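
The resample side of the change can be illustrated with the data from ``test_resample_dst_anchor``: a single observation late on a US/Eastern DST-transition day should fall into a daily bin labelled at that day's local midnight. This is a rough sketch under the assumption of a pandas version at or after this commit; ``daily`` and ``bin_label`` are illustrative names, not identifiers from the test suite.

```python
from datetime import datetime

import pandas as pd

# One observation at 23:00 on 2012-11-04, the day US/Eastern leaves DST.
dti = pd.DatetimeIndex([datetime(2012, 11, 4, 23)], tz="US/Eastern")
df = pd.DataFrame([5], index=dti)

# Resampling with 'D' bins by calendar day, so the label snaps to local midnight.
daily = df.resample("D").sum()
bin_label = daily.index[0]

print(daily)
```

The label matches ``df.index.normalize()``, which is what the updated test asserts with ``rule='D'``.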
7 changes: 4 additions & 3 deletions pandas/tests/resample/test_period_index.py
@@ -289,10 +289,11 @@ def test_resample_nonexistent_time_bin_edge(self):
index = date_range(start='2017-10-10', end='2017-10-20', freq='1H')
index = index.tz_localize('UTC').tz_convert('America/Sao_Paulo')
df = DataFrame(data=list(range(len(index))), index=index)
result = df.groupby(pd.Grouper(freq='1D'))
result = df.groupby(pd.Grouper(freq='1D')).count()
expected = date_range(start='2017-10-09', end='2017-10-20', freq='D',
tz="America/Sao_Paulo")
tm.assert_index_equal(result.count().index, expected)
tz="America/Sao_Paulo", nonexistent='shift',
closed='left')
tm.assert_index_equal(result.index, expected)

def test_resample_ambiguous_time_bin_edge(self):
# GH 10117
2 changes: 1 addition & 1 deletion pandas/tests/series/test_timezones.py
@@ -343,7 +343,7 @@ def test_getitem_pydatetime_tz(self, tzstr):

def test_series_truncate_datetimeindex_tz(self):
# GH 9243
idx = date_range('4/1/2005', '4/30/2005', freq='CD', tz='US/Pacific')
idx = date_range('4/1/2005', '4/30/2005', freq='D', tz='US/Pacific')
s = Series(range(len(idx)), index=idx)
result = s.truncate(datetime(2005, 4, 2), datetime(2005, 4, 4))
expected = Series([1, 2, 3], index=idx[1:4])