Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maintain inplace operation on series #36498

1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.1.3.rst
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ Fixed regressions
- Fixed regression in :meth:`Series.__getitem__` incorrectly raising when the input was a frozenset (:issue:`35747`)
- Fixed regression in :meth:`read_excel` with ``engine="odf"`` caused ``UnboundLocalError`` in some cases where cells had nested child nodes (:issue:`36122`, :issue:`35802`)
- Fixed regression in :class:`DataFrame` and :class:`Series` comparisons between numeric arrays and strings (:issue:`35700`, :issue:`36377`)
- Fixed regression where inplace arithmetic operation on :class:`Series` no longer updated parent :class:`DataFrame` (:issue:`36373`)
- Fixed regression in :meth:`DataFrame.apply` with ``raw=True`` and user-function returning string (:issue:`35940`)
- Fixed regression when setting empty :class:`DataFrame` column to a :class:`Series` in preserving name of index in frame (:issue:`36527`)
- Fixed regression in :class:`Period` incorrect value for ordinal over the maximum timestamp (:issue:`36430`)
Expand Down
8 changes: 6 additions & 2 deletions pandas/core/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -3373,8 +3373,12 @@ def _maybe_cache_changed(self, item, value) -> None:
"""
The object has called back to us saying maybe it has changed.
"""
loc = self._info_axis.get_loc(item)
self._mgr.iset(loc, value)
try:
loc = self._info_axis.get_loc(item)
except KeyError:
pass
else:
self._mgr.iset(loc, value)

@property
def _is_cached(self) -> bool_t:
Expand Down
2 changes: 0 additions & 2 deletions pandas/core/ops/methods.py
Original file line number Diff line number Diff line change
Expand Up @@ -93,8 +93,6 @@ def _wrap_inplace_method(method):

def f(self, other):
result = method(self, other)
# Delete cacher
self._reset_cacher()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added to fix an existing bug (see #30501) so removing it would likely cause a regression

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this have been the incorrect location for the original fix? it appears there were a couple of iterations in #30501.

# this makes sure that we are aligned like the input
# we are updating inplace so we want to ignore is_copy
self._update_inplace(
Expand Down
10 changes: 10 additions & 0 deletions pandas/tests/arithmetic/test_numeric.py
Original file line number Diff line number Diff line change
Expand Up @@ -1317,3 +1317,13 @@ def test_dataframe_div_silenced():
)
with tm.assert_produces_warning(None):
pdf1.div(pdf2, fill_value=0)


def test_inplace_arithmetic_operation_on_series_updating_parent_dataframe():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test passes for me on master. is there another case that fails?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I think need to test df["A"] is s which is also True on 1.1.2

>>> pd.__version__
'1.1.2'
>>>
>>> import pandas.testing as tm
>>>
>>> df = pd.DataFrame({"A": [1, 2, 3]})
>>> s = df["A"]
>>> s += 1
>>> tm.assert_series_equal(df["A"], s)
>>>
>>> s
0    2
1    3
2    4
Name: A, dtype: int64
>>>
>>> df["A"]
0    2
1    3
2    4
Name: A, dtype: int64
>>>
>>> df
   A
0  1
1  2
2  3
>>> df["A"] is s
True
>>>

but the output of __repr__ and also other ops is incorrect

so should also compare using assert_frame_equal

>>> tm.assert_frame_equal(df, pd.DataFrame({"A": [2, 3, 4]}))
Traceback (most recent call last):
...
AssertionError: DataFrame.iloc[:, 0] (column name="A") are different

DataFrame.iloc[:, 0] (column name="A") values are different (100.0 %)
[index]: [0, 1, 2]
[left]:  [1, 2, 3]
[right]: [2, 3, 4]
>>>

but this should pass with the fix here

# https://github.com/pandas-dev/pandas/issues/36373
df = pd.DataFrame({"A": [1, 2, 3]})
s = df["A"]
s += 1
assert s is df["A"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also assert df['A'].to_numpy().base is s.to_numpy().base

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if we do this we no longer need the cacher machinery. At first glance his appears a big change so will open a separate PR as an alternative to this one.

expected = pd.DataFrame({"A": [2, 3, 4]})
tm.assert_frame_equal(df, expected)
16 changes: 0 additions & 16 deletions pandas/tests/indexing/test_loc.py
Original file line number Diff line number Diff line change
Expand Up @@ -888,22 +888,6 @@ def test_identity_slice_returns_new_object(self):
original_series[:3] = [7, 8, 9]
assert all(sliced_series[:3] == [7, 8, 9])

def test_loc_copy_vs_view(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

umm pls revert this test.

# GH 15631
x = DataFrame(zip(range(3), range(3)), columns=["a", "b"])

y = x.copy()
q = y.loc[:, "a"]
q += 2

tm.assert_frame_equal(x, y)

z = x.copy()
q = z.loc[x.index, "a"]
q += 2

tm.assert_frame_equal(x, z)

def test_loc_uint64(self):
# GH20722
# Test whether loc accept uint64 max value as index.
Expand Down