BUG: isin numeric vs string #38279

Merged: 1 commit into pandas-dev:master on Dec 4, 2020

jbrockmendel
Member

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Using the new hashtables!

In [3]: arr = np.arange(10**7).astype(np.int32)

In [4]: %timeit isin(arr, arr[:10])
88.8 ms ± 497 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)   # <-- master
42.7 ms ± 385 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)   # <-- PR
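For anyone who wants to rerun a comparison like the one above outside IPython, here is a minimal sketch. It assumes the public `Series.isin` rather than the internal `isin` helper used in the session, and `time_isin` is a hypothetical helper name, so absolute timings will not match the numbers quoted above.

```python
import timeit

import numpy as np
import pandas as pd


def time_isin(n=10**7, n_values=10, repeat=3):
    """Time Series.isin on an int32 array against its first few elements.

    Hypothetical helper for illustration; not part of pandas.
    Returns the boolean mask and the best wall-clock time in seconds.
    """
    arr = np.arange(n).astype(np.int32)
    values = arr[:n_values]
    ser = pd.Series(arr)

    # Compute the mask once so the caller can sanity-check the result.
    mask = ser.isin(values)

    # Take the best of several single runs to reduce timing noise.
    best = min(timeit.repeat(lambda: ser.isin(values), number=1, repeat=repeat))
    return mask, best


if __name__ == "__main__":
    mask, best = time_isin()
    print(f"matches: {int(mask.sum())}, best time: {best * 1e3:.1f} ms")
```

Running the same script on two checkouts (before and after the change) gives a rough before/after comparison; `%timeit` or asv remain the better tools for careful measurements.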

@jreback added the Performance and Dtype Conversions labels Dec 4, 2020
@jreback added this to the 1.2 milestone Dec 4, 2020
@jreback merged commit 4699c2b into pandas-dev:master Dec 4, 2020
@jreback
Contributor

jreback commented Dec 4, 2020

nice!

@jreback
Contributor

jreback commented Dec 4, 2020

I don't think a whatsnew note is needed here

@jbrockmendel deleted the ref-isin-hashtables branch December 4, 2020 15:12
@jorisvandenbossche
Member

@jbrockmendel can you check if this regression is related? https://pandas.pydata.org/speed/pandas/#series_methods.IsIn.time_isin?python=3.8&Cython=0.29.21&p-dtype='int64'&commits=11d0176c-5b91febe

(this commit seems the most likely in the range 11d0176...5b91feb)

@jbrockmendel
Member Author

Good catch, looks like we're casting [1, 2] to ndarray[object] instead of ndarray[int]. I'll see if there's an easy fix.
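To make the dtype difference concrete, here is a small sketch of the two casts described above. It only shows how NumPy treats the list `[1, 2]` under each conversion and that `Series.isin` returns the same mask either way; it is not the pandas-internal code path, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = [1, 2]

# Letting NumPy infer the dtype yields a native integer array.
as_int = np.asarray(values)

# Forcing object dtype keeps each element as a boxed Python int.
as_obj = np.asarray(values, dtype=object)

ser = pd.Series(np.arange(10, dtype=np.int64))

# Both lookups select the same rows; only the dtype of the lookup
# values (and hence which comparison path is taken) differs.
res_int = ser.isin(as_int)
res_obj = ser.isin(as_obj)

print(as_int.dtype.kind, as_obj.dtype)
print(res_int.equals(res_obj))
```

Since the results agree, a cast like this would show up only in benchmarks, not in test failures, which fits how the regression was spotted here.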
