Fix Series[timedelta64]+DatetimeIndex[tz] bugs #18884

Merged
merged 23 commits into pandas-dev:master on Jan 2, 2018

Conversation

@jbrockmendel
Member

jbrockmendel commented Dec 20, 2017

ser + index lost the timezone
index + ser retained the timezone but returned a DatetimeIndex (instead of a Series)

closes pandas-dev#13905
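
A minimal, illustrative sketch of the behavior being fixed (the data and timezone here are made up, not taken from the linked issue):

import pandas as pd

dti = pd.date_range('2016-01-01', periods=3, tz='US/Eastern')
ser = pd.Series(pd.to_timedelta(['1 day', '2 days', '3 days']))

# Previously: ser + dti dropped the timezone, while dti + ser kept the
# timezone but returned a DatetimeIndex rather than a Series.
# With the fix, both return a tz-aware datetime64 Series.
result = ser + dti
result2 = dti + ser
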
result = op(pd.TimedeltaIndex(left), right)
return construct_result(left, result,
index=left.index, name=left.name,
dtype=result.dtype)
Member Author

should name be passed using com._maybe_match_name here? Not sure what the convention is.

Contributor

yes
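
A small sketch of that helper, assuming com._maybe_match_name behaves as it did at the time of this PR (returning the shared name, or None on a mismatch); the construct_result call above would then pass name=com._maybe_match_name(left, right) instead of name=left.name:

import pandas as pd
import pandas.core.common as com

a = pd.Series([1, 2], name='x')
b = pd.TimedeltaIndex(['1 day', '2 days'], name='x')

com._maybe_match_name(a, b)  # 'x'; returns None if the names differ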

@jreback added the Bug, Timedelta (Timedelta data type), Datetime (Datetime data dtype), and Timezones (Timezone data dtype) labels Dec 21, 2017
@@ -709,6 +709,15 @@ def wrapper(left, right, name=name, na_op=na_op):

left, right = _align_method_SERIES(left, right)

if is_timedelta64_dtype(left) and isinstance(right, pd.DatetimeIndex):
Contributor

don't special case things here, this is the point of the _TimeOp class.

Member Author

This is the appropriate place. Adding additional wrapping/unwrapping in _TimeOp obscures what's going on. Even if it weren't a third(!) level of wrapping, because of namespacing/closure issues it is essentially impossible to handle overflow checks correctly with the current setup.

Eventually this is going to have to look like:

def wrapper(...):
    if isinstance(right, ABCDataFrame):
        return NotImplemented
    elif is_timedelta64_dtype(left):
        [dispatch to TimedeltaIndex op]
    elif is_datetime64_dtype(left) or is_datetime64tz_dtype(left):
        [dispatch to DatetimeIndex op]

    [everything else]

Everything _TimeOp does is done (better) in the index classes. Anything other than dispatching to those methods is unnecessary duplication and an invitation to inconsistency.

Contributor

so let's untangle that first. This bug fix just adds more things to undo later.

Member Author

This bug fix just adds more things to undo later.

This won't need to be undone. Eventually the "and isinstance(right, pd.DatetimeIndex)" part of "if is_timedelta64_dtype(left) and isinstance(right, pd.DatetimeIndex):" will be removed and the timedelta64_dtype case will be complete.

I've given the order of edits quite a bit of thought. Transitioning towards the dispatch approach case-by-case and implementing corresponding tests along the way is the way to go.

so let's untangle that first.

Untangle which first? The mess of closures that prevents overflow checks?

I guess if you want to untangle _TimeOp independently, #18832 is a step in that direction and is orthogonal to the other outstanding PRs.

Contributor

well, until #18832 this needs to change as I have indicated.

Member Author

#18832 is orthogonal.

What change have you indicated? Not clear on what "untangle this first" refers to.

Contributor

well I don't want this here. put it where the other conversions are. that can be refactored at some point if it's worthwhile.

Member Author

I guess if I squint it kind of makes sense to avoid fixing the overflow-check bug in this PR and instead do it separately. Will change.

index=index, name=names[1])
expected = pd.Series(index + pd.Timedelta(seconds=5),
index=index, name=names[2])
# passing name arg isn't enough when names[2] is None
Contributor

pls add blank lines before comments

expected.name = names[2]
assert expected.dtype == index.dtype
res = ser + index
tm.assert_series_equal(res, expected)
Contributor

pls use result=

@codecov

codecov bot commented Dec 21, 2017

Codecov Report

Merging #18884 into master will increase coverage by <.01%.
The diff coverage is 93.33%.


@@            Coverage Diff             @@
##           master   #18884      +/-   ##
==========================================
+ Coverage   91.57%   91.57%   +<.01%     
==========================================
  Files         150      150              
  Lines       48941    48948       +7     
==========================================
+ Hits        44817    44824       +7     
  Misses       4124     4124
Flag        Coverage Δ
#multiple   89.93% <93.33%> (ø) ⬆️
#single     41.75% <26.66%> (-0.01%) ⬇️

Impacted Files                        Coverage Δ
pandas/core/ops.py                    90.41% <100%> (+0.17%) ⬆️
pandas/core/indexes/datetimelike.py   97.08% <100%> (+0.02%) ⬆️
pandas/core/indexes/datetimes.py      95.17% <50%> (-0.28%) ⬇️
pandas/util/testing.py                84.95% <0%> (+0.21%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c19bdc9...190197c. Read the comment docs.

@jbrockmendel
Member Author

well I don't want this here. put it where the other conversions are. that can be refactored at some point if it's worthwhile.

I'm in the process of implementing this, but please reconsider.

Moving this into _TimeOp requires implementing logic to figure out what the output dtype is. While this is doable, it is entirely unnecessary because it is essentially re-implementing (worse) the logic in the TimedeltaIndex operation.
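
For illustration, the index operation already produces the correct tz-aware dtype, so the Series op can simply delegate to it and re-wrap the result (a minimal sketch, not the code in this PR):

import pandas as pd

tdi = pd.TimedeltaIndex(['1 day', '2 days'])
dti = pd.date_range('2016-01-01', periods=2, tz='US/Eastern')

result = tdi + dti           # DatetimeIndex, timezone preserved
ser = pd.Series(result)      # re-wrapped; dtype is datetime64[ns, US/Eastern]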

@jbrockmendel
Member Author

The most recent attempt to kludge this into _TimeOp still got the dtypes wrong. The way presented is the right way to do this.

@jreback
Contributor

jreback commented Dec 24, 2017

then let’s refactor this first
i don’t want to add a sub-optimal solution which will then be removed

@jreback jreback closed this Dec 24, 2017
@jbrockmendel
Member Author

Huh? The change here is in the direction of the long-term solution, is not going to be removed, and is not sub-optimal.

@jbrockmendel
Member Author

Pls reopen.

@jbrockmendel
Member Author

Got the requested (but much worse) version working. Please reopen.

@jreback
Contributor

jreback commented Dec 26, 2017

ok let's see what you have.

@jreback jreback reopened this Dec 26, 2017
@jbrockmendel
Member Author

ok let's see what you have.

Pushed yesterday, looks like CI just finished.

dtype=dtype,
)

return wrapper


def _get_series_result_name(left, right):
Contributor

rather than write new code, this should be threaded into the existing routine

Member Author

OK. Threading it in separates name-convention logic into multiple places and makes the existing kludge kludgier, but will change.

@@ -515,7 +515,8 @@ def _convert_to_array(self, values, name=None, other=None):

# a datelike
elif isinstance(values, pd.DatetimeIndex):
values = values.to_series()
# TODO: why are we casting to_series in the first place?
Contributor

and if you change this does it work?

Member Author

Tentative yes, but only with the change made by _get_series_result_name below.

Member Author

Update: tinkering with this and doing the threading requested below don't play nicely together. We should keep this as-is and address separately. This fixes a bug and we should call it a win. There are a bunch of these bug-fixes to get in before we can clean up the mess that is _TimeOp.

@@ -515,7 +515,8 @@ def _convert_to_array(self, values, name=None, other=None):

# a datelike
elif isinstance(values, pd.DatetimeIndex):
Contributor

change to use ABCDatetimeIndex
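
A minimal sketch of the ABC-based check, assuming the generic classes in pandas.core.dtypes.generic (illustrative, not the exact diff):

import pandas as pd
from pandas.core.dtypes.generic import ABCDatetimeIndex

values = pd.date_range('2016-01-01', periods=3, tz='UTC')

# the ABC check avoids referencing the concrete DatetimeIndex class directly
if isinstance(values, ABCDatetimeIndex):
    values = values.to_series()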

dtype=dtype,
)

return wrapper


def _get_series_op_result_name(left, right):
# `left` is always a Series object
if isinstance(right, (pd.Series, pd.Index)):
Contributor

use generic instance checks, though why can't you just duck type, e.g.

name = _maybe_match_name(left, getattr(right, 'name', None))

Member Author

On the generics, sure. On the ducktype:

  1. pretty sure what you have in mind is name = _maybe_match_name(left, right)

  2. That would represent a non-trivial change in pandas convention/policy for name propagation. Up until a few days ago even Index names were ignored (except for unintentional corner cases caused by conversion within _TimeOp). Allowing through anything with a name attribute would include e.g. most DateOffset subclasses, which I don't think is desired. If this is something you'd like to see changed, I'd ask you to open an issue or something and consider it out of scope for this bug-fixing PR.
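
A hedged sketch of the generic-check version being discussed, assuming the generic ABCs and com._maybe_match_name available at the time of this PR (illustrative; the check could equally be inlined rather than kept as a helper):

import pandas.core.common as com
from pandas.core.dtypes.generic import ABCSeries, ABCIndexClass

def _get_series_op_result_name(left, right):
    # `left` is always a Series; only propagate names from Series/Index
    # operands, not from arbitrary objects that happen to have a `name`.
    if isinstance(right, (ABCSeries, ABCIndexClass)):
        return com._maybe_match_name(left, right)
    return left.name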

@jreback
Contributor

jreback commented Dec 29, 2017

@jbrockmendel FYI I cancelled a couple of your builds on travis. trying to get 0.22 out. most of these PRs have comments anyhow.

@jbrockmendel
Member Author

FYI I cancelled a couple of your builds on travis. trying to get 0.22 out. most of these PRs have comments anyhow.

Sounds good. I'll flag any comments that need clarification.

BTW appveyor usually seems to be the constraining factor when the pipeline gets clogged. In cases where I screw up, is there a way to cancel builds for my own PRs there?

@jreback
Contributor

jreback commented Dec 29, 2017

when you push again appveyor (and travis) both cancel the previous build. so it doesn't matter really. travis does it as soon as you push, while appveyor won't cancel until it actually runs (so it looks like it's taking longer).

@jbrockmendel
Member Author

Just pushed fixes, including for #18989. You may need to cancel the build again.

@jbrockmendel
Member Author

Repush or hang tight?

@jreback
Contributor

jreback commented Dec 31, 2017

all good now
go ahead and repush

@@ -535,6 +536,11 @@ def _convert_to_array(self, values, name=None, other=None):
elif inferred_type in ('timedelta', 'timedelta64'):
# have a timedelta, convert to ns here
values = to_timedelta(values, errors='coerce', box=False)
if isinstance(other, pd.DatetimeIndex):
Contributor

use ABC here

Member Author

Will change.

dtype=dtype,
)

return wrapper


def _get_series_op_result_name(left, right):
# `left` is always a Series object
Contributor

i suggested a simplification

Member Author

i suggested a simplification

I'm guessing you're referring to this, which I responded to here.

Contributor

ok, rather than creating another function like this, just inline it as it's only used once.

tz=tz, name=names[0])
ser = pd.Series([pd.Timedelta(seconds=5)] * 2,
index=index, name=names[1])
expected = pd.Series(index + pd.Timedelta(seconds=5),
Contributor

convention is not to use pd.

Member Author

Will change.


# passing name arg isn't enough when names[2] is None
expected.name = names[2]
assert expected.dtype == index.dtype
Contributor

can you also test adding with ser.values as well (obviously it will be an index result). It was part of the OP as an example.
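
One way the requested check might look (variable names follow the test fragment above; the exact assertions in the merged test may differ):

# ser.values drops the Series wrapper, so the result should be an Index
result = index + ser.values
assert isinstance(result, pd.DatetimeIndex)
assert result.dtype == index.dtype  # timezone preserved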

@jreback
Contributor

jreback commented Jan 1, 2018

rebase and push again, fixed some hanging builds on Travis CI

@jbrockmendel
Member Author

This should be ready, and is orthogonal to #19024.

@jreback jreback added this to the 0.23.0 milestone Jan 2, 2018
@jreback jreback merged commit 04beec7 into pandas-dev:master Jan 2, 2018
@jreback
Contributor

jreback commented Jan 2, 2018

thanks!

Labels
Bug · Datetime (Datetime data dtype) · Timedelta (Timedelta data type) · Timezones (Timezone data dtype)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DatetimeIndex + TimeDelta gives wrong results when timezone is set
2 participants