Simple operation unexpectedly changes dtype #10503

Closed
jeggleston opened this issue Jul 3, 2015 · 3 comments
Labels: Bug, Dtype Conversions (unexpected or buggy dtype conversions)

@jeggleston

Hi all,
I can't find any documentation that says this should happen, so I think it's a bug. But maybe something's happening that I don't understand. When I do a simple operation (adding 1 to a slice), suddenly the dtype of the columns changes from uint32 to int64.
Any ideas why this is happening? Bug?
Thanks

Make a sample dataframe. Columns are dtype uint32.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a':[0, 1, 1], 'b':[100, 200, 300]}, dtype='uint32')

In [3]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 0 to 2
Data columns (total 2 columns):
a    3 non-null uint32
b    3 non-null uint32
dtypes: uint32(2)
memory usage: 48.0 bytes

Take a slice of a column. Adding 1 to that slice still results in dtype uint32.

In [4]: ix = df['a'] == 1

In [5]: z = df.loc[ix, 'b']

In [6]: z + 1
Out[6]: 
1    201
2    301
Name: b, dtype: uint32

But, if I modify that slice in the original dataframe, suddenly both columns of the dataframe are int64.

In [7]: df.loc[ix, 'b'] = z + 1

In [8]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 0 to 2
Data columns (total 2 columns):
a    3 non-null int64
b    3 non-null int64
dtypes: int64(2)
memory usage: 72.0 bytes

I've seen this in 0.16, 0.16.1, and 0.16.2.

In [9]: pd.__version__
Out[9]: '0.16.2'
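
A possible stopgap, sketched here and not something proposed in this thread, is to record the column dtypes before the masked assignment and cast back afterwards. The helper name `assign_keeping_dtypes` is hypothetical and not part of pandas; it works on a copy so the original frame is left untouched.

```python
import pandas as pd


def assign_keeping_dtypes(df, mask, column, values):
    """Assign `values` to df.loc[mask, column], then restore the original dtypes.

    Hypothetical helper for illustration only; it is not a pandas API.
    """
    original_dtypes = df.dtypes.to_dict()  # remember each column's dtype
    out = df.copy()                        # leave the caller's frame unchanged
    out.loc[mask, column] = values
    # Cast back in case the assignment upcast any column (e.g. uint32 -> int64).
    return out.astype(original_dtypes)


df = pd.DataFrame({'a': [0, 1, 1], 'b': [100, 200, 300]}, dtype='uint32')
ix = df['a'] == 1
fixed = assign_keeping_dtypes(df, ix, 'b', df.loc[ix, 'b'] + 1)
print(fixed.dtypes)
```

The cast back is only safe when the new values still fit in the original dtype, which holds here because the values were computed from a uint32 column.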

@jreback
Contributor

jreback commented Jul 3, 2015

The buggy code is here: https://github.com/pydata/pandas/blob/master/pandas/core/common.py#L1289

This shouldn't downcast when dtype='infer' if it's already of the same type.

Want to do a pull request? (This may need a few tests to validate properly, as this code is used in a number of places.)
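
As a rough idea of what such tests might look like (a sketch only, not the tests that were eventually written), the check below repeats the masked assignment from the report for several integer dtypes and reports whether every column keeps its dtype. The function names are made up for illustration, and smaller values are used so they fit in uint8.

```python
import numpy as np
import pandas as pd


def dtypes_after_masked_assignment(dtype):
    """Repeat the assignment from the report and return the resulting dtypes."""
    df = pd.DataFrame({'a': [0, 1, 1], 'b': [10, 20, 30]}, dtype=dtype)
    ix = df['a'] == 1
    df.loc[ix, 'b'] = df.loc[ix, 'b'] + 1
    return df.dtypes


def check_dtype_preserved(dtype):
    """Return True if every column still has `dtype` after the assignment."""
    result = dtypes_after_masked_assignment(dtype)
    return all(t == np.dtype(dtype) for t in result)


# Hypothetical dtype list; a real test suite may want a different selection.
for dtype in ['uint8', 'uint16', 'uint32', 'uint64', 'int32', 'int64']:
    print(dtype, check_dtype_preserved(dtype))
```

On a pandas build affected by this issue, the uint32 entry would be expected to report False; on a build where the downcast logic is corrected, every entry should report True.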

jreback added the Bug, Dtype Conversions, and Difficulty Intermediate labels Jul 3, 2015
jreback added this to the Next Major Release milestone Jul 3, 2015
@kemingts

I just started working on this. If somebody else is also looking at this, please let me know. Thanks.

@kemingts

Pull request #12477 (Gh10503) has been submitted.

jreback modified the milestones: 0.18.0, Next Major Release Feb 27, 2016