BUG/Internals: maybe_upcast_putmask #23823

Closed
h-vetinari opened this issue Nov 20, 2018 · 1 comment · Fixed by #25431
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Bug Dtype Conversions Unexpected or buggy dtype conversions
@h-vetinari
Contributor

In the context of #23192 (and #23604 / #23606), I want to use pandas.core.dtypes.cast.maybe_upcast_putmask, because it solves exactly the problem I need it to solve.

Unfortunately, it does not work as advertised (and I already found the culprit).
The docstring says:

def maybe_upcast_putmask(result, mask, other):
    """
    A safe version of putmask that potentially upcasts the result

    Parameters
    ----------
    result : ndarray
        The destination array. This will be mutated in-place if no upcasting is
        necessary.
    mask : boolean ndarray
    other : ndarray or scalar
        The source array or value

In other words, it expects result and other to be ndarrays. Curiously enough, in some branches it only works for a Series and produces wrong results for an ndarray, e.g.

>>> import pandas as pd
>>> import numpy as np
>>> s = pd.Series([10, 11, 12])
>>> t = pd.Series([np.nan, 61, np.nan])
>>> from pandas.core.dtypes.cast import maybe_upcast_putmask
>>> result, _ = maybe_upcast_putmask(s, np.array([False, True, False]), t)
>>> result  # correct
0    10
1    61
2    12
dtype: int64
>>> result, _ = maybe_upcast_putmask(s.values, np.array([False, True, False]), t.values)
>>> result  # incorrect
array([10., nan, 12.])

This is because the code does

try:
    [...]
    new_result = result.values.copy()
    [...]
    return [...]
except:
    # do something else

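which actually expects a Series: an ndarray has no .values attribute, so the lookup raises an AttributeError that the bare except swallows, and the function silently falls through to the other branch.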
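
To make the failure mode concrete, here is a minimal sketch of the pattern, using only NumPy. The helpers `buggy_copy` and `safer_copy` are hypothetical names made up for illustration. They are not the pandas source, and `safer_copy` is only one possible way to avoid the problem, not necessarily the fix that was eventually merged.

```python
import numpy as np


def buggy_copy(result):
    """Hypothetical reduction of the try/except pattern, not pandas code."""
    try:
        # Only objects that expose `.values` (e.g. a Series) get past this line.
        return result.values.copy(), "try branch"
    except:  # noqa: E722  (bare except kept on purpose to mirror the issue)
        # For an ndarray the attribute lookup raises AttributeError,
        # which is swallowed here without any warning.
        return result, "fallback branch"


def safer_copy(result):
    """One possible alternative: unwrap `.values` only if it exists."""
    values = getattr(result, "values", result)
    return np.array(values, copy=True), "try branch"


arr = np.array([10, 11, 12])
print(hasattr(arr, "values"))   # False
print(buggy_copy(arr)[1])       # fallback branch
print(safer_copy(arr)[1])       # try branch
```

Catching a specific exception instead of using a bare except would also have surfaced the AttributeError right away instead of hiding it.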

@gfyoung gfyoung added Bug Dtype Conversions Unexpected or buggy dtype conversions Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff labels Nov 21, 2018
@gfyoung
Member

gfyoung commented Nov 21, 2018

How odd! Some investigation and cleaning up would be great.
