Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pandas deprecates Index.get_loc with method #5721

Closed
mathause opened this issue Aug 20, 2021 · 7 comments · Fixed by #6195
Closed

pandas deprecates Index.get_loc with method #5721

mathause opened this issue Aug 20, 2021 · 7 comments · Fixed by #6195

Comments

@mathause
Copy link
Collaborator

pandas deprecates the method keyword in Index.get_loc, see pandas-dev/pandas#42269. Therefore we end up with about 5000 warnings in our upstream tests:

FutureWarning: Passing method to Index.get_loc is deprecated and will raise in a future version. Use index.get_indexer([item], method=...) instead

We should fix this before pandas releases because the warning will not be silent (FutureWarning) or ask pandas to give us more time and use a DeprecationWarning at the moment.

We use this here:

indexer = self.index.get_loc(
label_value, method=method, tolerance=tolerance
)

Is this only ever called with one item? Then we might be able to use

 indexer = self.index.get_indexer( 
     [label_value], method=method, tolerance=tolerance 
 ).item()
if indexer == -1:
    raise KeyError(label_value)

imin = index.get_loc(minval, method="nearest")
imax = index.get_loc(maxval, method="nearest")

This one could be easy to fix (replace with imin = index.get_indexer([minval], method="nearest").item())


It is also defined in CFTimeIndex, which complicates things:

def get_loc(self, key, method=None, tolerance=None):
"""Adapted from pandas.tseries.index.DatetimeIndex.get_loc"""
if isinstance(key, str):
return self._get_string_slice(key)
else:
return pd.Index.get_loc(self, key, method=method, tolerance=tolerance)

because get_indexer expects an iterable and thus the if isinstance(key, str) test no longer works.

@benbovy @spencerkclark

@spencerkclark
Copy link
Member

Thanks for the heads up @mathause. We'll need to think carefully about this with respect to partial datetime string indexing. DatetimeIndex.get_indexer and DatetimeIndex.get_loc behave differently with respect to datetime strings. get_indexer interprets strings as specific dates, while get_loc interprets them as ranges:

>>> import pandas as pd
>>> times = pd.date_range("2000", periods=5)
>>> times.get_indexer(["2000"])
array([0])
>>> times.get_loc("2000")
slice(0, 5, None)

In other words -- at least for partial datetime string indexing -- it may not be as simple as swapping in get_indexer for get_loc.

Perhaps @jbrockmendel has thoughts on how we should approach this?

@shoyer
Copy link
Member

shoyer commented Aug 20, 2021

These lines seem to be the problematic place?

indexer = self.index.get_loc(
label_value, method=method, tolerance=tolerance
)

We might be able to do something like:

if method is not None:
    indexer = get_indexer_nd(self.index, label, method, tolerance)
else:
    indexer = self.index.get_loc(label_value)

@spencerkclark
Copy link
Member

Exactly, yeah, those are the problematic lines. That's an elegant solution. I think it will work with the minor modification to raise a KeyError if any of the indices returned by get_indexer_nd are less than zero:

if method is not None:
    indexer = get_indexer_nd(self.index, label, method, tolerance)
    if np.any(indexer < 0):
        raise KeyError(f"not all values found in index {coord_name!r}")
else:
    indexer = self.index.get_loc(label_value)

@jbrockmendel
Copy link

The "if method is not None" approach seems reasonable.

@benbovy
Copy link
Member

benbovy commented Aug 23, 2021

👍 I can take care of it in #5692 or in a separate PR.

@mathause
Copy link
Collaborator Author

👍 I can take care of it in #5692 or in a separate PR.

I don't have a strong opinion but would be good to do this before the next release.

@benbovy
Copy link
Member

benbovy commented Aug 30, 2021

Not sure that #5692 will be ready for the next release so I'll do this in a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants