BUG: Change in behavior of replace with integer series and float to_replace #40371

lbittarello · 2021-03-11T12:43:42Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

import numpy as np
import pandas as pd

pd.Series([1]).replace(np.array([1.]), [0])

Problem description

Up to and including version 1.1.5, the code snippet above returned pd.Series([0]) (i.e. the replacement took place). From version 1.2.0 onward, it returns pd.Series([1]) (i.e. the replacement does not take place). In both versions, the replacement takes place if we pass either an integer array (np.array([1])) or a list ([1.]) to replace instead of a float array (np.array([1.])). Is this change in behavior intentional?
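To make the comparison above easy to rerun, here is a small sketch that tries the three forms of to_replace described in this report against the same integer Series and prints what comes back. The `candidates` and `results` names are only illustrative; no expected output is given for the float-array case because, as described above, it depends on the installed pandas version.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1])

# The three ways of spelling "replace 1 with 0" discussed in the report.
candidates = {
    "float array": np.array([1.0]),
    "int array": np.array([1]),
    "float list": [1.0],
}

print("pandas", pd.__version__)
results = {}
for label, to_replace in candidates.items():
    # Collect the resulting values so the three cases can be compared directly.
    results[label] = ser.replace(to_replace, [0]).tolist()
    print(f"{label:12} -> {results[label]}")
```

If the float-array row differs from the other two, the installed version shows the behavior reported here.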

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit           : f2c8480af2f25efdbd803218b9d87980f416563e
python           : 3.8.8.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.4.0-1038-aws
Version          : #40~18.04.1-Ubuntu SMP Sat Feb 6 01:56:56 UTC 2021
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : C.UTF-8
LANG             : C.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.2.3
numpy            : 1.19.5
pytz             : 2020.5
dateutil         : 2.8.1
pip              : 21.0.1
setuptools       : 49.6.0.post20210108
Cython           : None
pytest           : 6.2.2
hypothesis       : None
sphinx           : 3.5.2
blosc            : None
feather          : None
xlsxwriter       : 1.3.7
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : 2.8.6 (dt dec pq3 ext lo64)
jinja2           : 2.11.3
IPython          : 7.21.0
pandas_datareader: None
bs4              : None
bottleneck       : None
fsspec           : 0.8.7
fastparquet      : None
gcsfs            : None
matplotlib       : 3.3.4
numexpr          : 2.7.3
odfpy            : None
openpyxl         : 3.0.6
pandas_gbq       : None
pyarrow          : 3.0.0
pyxlsb           : None
s3fs             : None
scipy            : 1.6.0
sqlalchemy       : 1.3.23
tables           : None
tabulate         : 0.8.9
xarray           : 0.17.0
xlrd             : 2.0.1
xlwt             : None
numba            : 0.52.0


dsaxton · 2021-03-12T00:57:01Z

Looks like this may have been due to #38097

cc @jbrockmendel

hasan-yaman · 2021-03-21T14:20:43Z

take

hasan-yaman · 2021-03-21T15:16:20Z

A similar error also exists for DataFrame.

pd.DataFrame([1]).replace(np.array([1.]), [0])

returns pd.DataFrame([1]).

I attempted to fix the issue with #40555

jbrockmendel · 2021-03-31T22:38:39Z

best guess is we need to either patch can_hold_element for integer dtype so it can tell that 1.0 may be present, or early-on coerce the np.array([1.0]) to np.array([1])
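The second option above (coercing the float array early) can be sketched roughly as follows. This is only an illustration written at the user level, not the change made inside pandas: `coerce_float_keys_to_int` is a hypothetical helper, and it assumes that a float key should be cast only when the Series has an integer dtype and every key is finite with no fractional part.

```python
import numpy as np
import pandas as pd


def coerce_float_keys_to_int(ser, to_replace):
    """Hypothetical helper: cast float keys to the Series' integer dtype when lossless."""
    keys = np.asarray(to_replace)
    if ser.dtype.kind in "iu" and keys.dtype.kind == "f":
        # Only cast when every key is finite and has no fractional part.
        if np.isfinite(keys).all() and (keys == np.floor(keys)).all():
            return keys.astype(ser.dtype)
    # Otherwise leave the keys untouched (e.g. 1.5 can never match an integer).
    return keys


ser = pd.Series([1, 2, 3])
keys = coerce_float_keys_to_int(ser, np.array([1.0, 3.0]))
result = ser.replace(keys, [0, 0])
print(keys.dtype, result.tolist())
```

Keys such as 1.5 are deliberately left as floats, since they cannot be present in an integer Series. The sketch does not guard against float values too large for the integer dtype.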

simonjayhawkins · 2021-04-12T11:23:22Z

> Looks like this may have been due to #38097

can confirm first bad commit: [45ac7da] PERF: replace_list (#38097)

lbittarello added the Bug and Needs Triage labels Mar 11, 2021

lbittarello changed the title ~~BUG: Change in behavior ofreplace with integer series and float to_replace~~ BUG: Change in behavior of replace with integer series and float to_replace Mar 11, 2021

dsaxton added the Regression and replace labels and removed the Bug and Needs Triage labels Mar 12, 2021

github-actions bot assigned hasan-yaman Mar 21, 2021

hasan-yaman mentioned this issue Mar 21, 2021 in "BUG: Fix behavior of replace_list with mixed types." #40555 (merged)

simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue Apr 12, 2021

code sample for pandas-dev#40371

f749f92

simonjayhawkins added this to the 1.2.5 milestone Apr 12, 2021

simonjayhawkins closed this as completed in #40555 Jun 1, 2021
