BUG: Inserting array of same size with Series.loc raises ValueError #38266

Merged
merged 10 commits into from
Dec 5, 2020

Conversation

ma3da
Contributor

@ma3da ma3da commented Dec 3, 2020

closes #37748
closes #37486
closes #38271

  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@ma3da ma3da changed the title from "Gh 37748 loc error same size" to "BUG: Error for Series.loc on array of same size" Dec 3, 2020
@ma3da ma3da changed the title from "BUG: Error for Series.loc on array of same size" to "BUG: Inserting array of same size with Series.loc raises ValueError" Dec 3, 2020
@ma3da ma3da marked this pull request as ready for review December 3, 2020 17:45
@jbrockmendel
Member

cc @phofl

# GH37748
ser = Series(0, index=range(5), dtype="object")

expected = np.zeros(size)
Member

Would you get the same bug with a Python list of length size?

Contributor Author

Yes

@@ -677,6 +677,7 @@ Indexing
- Bug in :meth:`DataFrame.loc` and :meth:`DataFrame.__getitem__` raising ``KeyError`` when columns were :class:`MultiIndex` with only one level (:issue:`29749`)
- Bug in :meth:`Series.__getitem__` and :meth:`DataFrame.__getitem__` raising blank ``KeyError`` without missing keys for :class:`IntervalIndex` (:issue:`27365`)
- Bug in setting a new label on a :class:`DataFrame` or :class:`Series` with a :class:`CategoricalIndex` incorrectly raising ``TypeError`` when the new label is not among the index's categories (:issue:`38098`)
- Bug in :meth:`Series.loc` and :meth:`Series.iloc` raising ``ValueError`` when inserting an array in a ``object`` Series of equal length (:issue:`37748`)
Member

array seems vague. Do you mean NumPy array, list, ExtensionArray (or a selection of those)?

Member

all of those i guess :)

Contributor Author

Indeed, np.array, list, and tuple all cause the raise. Going by your example, ExtensionArrays do not raise, but they also behave unexpectedly.
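
For readers following along, here is a minimal sketch of the case discussed in this thread, assuming pandas 1.2 or later (the milestone this fix was tagged for). The helper name `set_single_cell` is hypothetical and only used for illustration. Each value has the same length as the Series, which is the situation that used to raise `ValueError`.

```python
import numpy as np
import pandas as pd


def set_single_cell(value):
    """Assign a list-like to ONE label of a 5-element object Series.

    Hypothetical helper for illustration; not part of pandas or of this PR.
    """
    ser = pd.Series(0, index=list("abcde"), dtype="object")
    # Before the fix, this line raised ValueError when len(value) == len(ser).
    ser.loc["a"] = value
    return ser


# NumPy array, list and tuple, each of length 5 (the length of the Series).
for value in (np.zeros(5), [0] * 5, (0,) * 5):
    ser = set_single_cell(value)
    # The Series keeps its length: the list-like lands in a single cell.
    assert len(ser) == 5
    assert list(ser.iloc[0]) == [0] * 5
    # The other cells are untouched.
    assert ser.loc["b"] == 0
```

The sketch only checks the contents of the first cell, since the discussion here does not pin down the exact stored type for every input.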

@arw2019
Member

arw2019 commented Dec 3, 2020

Also does this solve #38271

@ma3da
Contributor Author

ma3da commented Dec 3, 2020

@arw2019

Also does this solve #38271

It seems so:

In [1]: import numpy as np
   ...: import pandas as pd
   ...:
   ...: ser = pd.Series(1, index=list("abcde"), dtype="object")
   ...:
   ...: expected = pd.array([0,0,3,0,0])
   ...: ser.loc["a"] = expected
   ...: result = ser[0]

In [2]: ser
Out[2]: 
a    [0, 0, 3, 0, 0]
b                  1
c                  1
d                  1
e                  1
dtype: object

Member

@arw2019 arw2019 left a comment

Okay, great. Can you add tests for the list and ExtensionArray cases and update the whatsnew?

Comment on lines 1758 to 1767
def assert_python_equal(left, right):
    """
    Check left and right are equal w.r.t the ``==`` operator.

    Parameters
    ----------
    left : object
    right : object
    """
    assert left == right
Contributor Author

ok, I'm guessing this is too much, but I failed to find helper funcs to assert list/tuple equality.
I'll welcome advice on how to write the tests :)

Contributor

Yeah, don't do this. Just do assert left == right; pytest already handles this.

Contributor Author

ok
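
As a small illustration of the suggestion in this thread: a bare `assert` is enough for built-in containers, and when the test runs under pytest a failing comparison is reported with the differing elements. The test name below is hypothetical.

```python
def test_list_and_tuple_equality():
    # Plain ``assert`` with ``==`` works for lists and tuples; no helper needed.
    assert [0, 0, 3, 0, 0] == [0, 0, 3, 0, 0]
    assert (0, 0, 3, 0, 0) == (0, 0, 3, 0, 0)
    # A list and a tuple holding the same items do NOT compare equal,
    # so the container type is checked as well.
    assert [0, 0, 3, 0, 0] != (0, 0, 3, 0, 0)


# Calling the function directly also works outside of pytest.
test_list_and_tuple_equality()
```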


ser.loc["a"] = expected
result = ser[0]
assert_fn(result, expected)
Member

Is there a reason for not checking the whole Series? This would also avoid your helper function

Contributor Author

good point, thanks
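
A sketch of what checking the whole Series could look like, assuming pandas 1.2 or later and the public `pandas.testing.assert_series_equal` helper. The values are borrowed from the session earlier in this conversation; this is not necessarily the exact test that was merged.

```python
import pandas as pd
import pandas.testing as tm

value = [0, 0, 3, 0, 0]

ser = pd.Series(1, index=list("abcde"), dtype="object")
ser.loc["a"] = value

# Build the full expected Series instead of comparing one element,
# so no list/tuple equality helper is needed.
expected = pd.Series([value, 1, 1, 1, 1], index=list("abcde"), dtype="object")
tm.assert_series_equal(ser, expected)
```

Comparing the whole Series also verifies that the index, the dtype and the remaining elements were left unchanged by the assignment.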

@jreback jreback added the Indexing Related to indexing on series/frames, not to indexes themselves label Dec 3, 2020
@jreback jreback added the Bug label Dec 4, 2020
@jreback jreback added this to the 1.2 milestone Dec 4, 2020
@@ -677,6 +677,8 @@ Indexing
- Bug in :meth:`DataFrame.loc` and :meth:`DataFrame.__getitem__` raising ``KeyError`` when columns were :class:`MultiIndex` with only one level (:issue:`29749`)
- Bug in :meth:`Series.__getitem__` and :meth:`DataFrame.__getitem__` raising blank ``KeyError`` without missing keys for :class:`IntervalIndex` (:issue:`27365`)
- Bug in setting a new label on a :class:`DataFrame` or :class:`Series` with a :class:`CategoricalIndex` incorrectly raising ``TypeError`` when the new label is not among the index's categories (:issue:`38098`)
- Bug in :meth:`Series.loc` and :meth:`Series.iloc` raising ``ValueError`` when inserting a listlike ``np.array``, ``list`` or ``tuple`` in an ``object`` Series of equal length (:issue:`37748`)
Contributor

you list 3 issues in the top of PR, what's missing?

Contributor Author

Ah, I added one whatsnew note for #38271 and one for #37748.
In the PR, I listed #37486 because it is a particular case of #37748 (iloc on a Series with a list, both of size 1).

Member

Pls add the issue number to the note for #37748 then

Contributor Author

ok

@jreback
Contributor

jreback commented Dec 4, 2020

cc @phofl and @jbrockmendel if any comments.

@phofl
Member

phofl commented Dec 5, 2020

lgtm otherwise

@jreback jreback merged commit 5399c6d into pandas-dev:master Dec 5, 2020
@jreback
Contributor

jreback commented Dec 5, 2020

thanks @ma3da

@ma3da ma3da deleted the GH_37748_loc_error_same_size branch December 6, 2020 00:28
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves
5 participants