pd.read_csv()/dtype and index_col combo #32930

sbwiecko · 2020-03-23T14:10:16Z

Code Sample, a copy-pastable example if possible

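import pandas as pd

# dtype is not applied to the index column: the index dtype IS NOT str
data = pd.read_csv(file, dtype={'donor': str}, index_col='donor')

# Workaround: read with the dtype first, then set the index
data = pd.read_csv(file, dtype={'donor': str})
data.set_index('donor', drop=True, inplace=True)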

Problem description

When calling pd.read_csv(), if I set a dtype for a column and use that same column as index_col, the dtype is not applied to the index.
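A self-contained sketch of the situation may make the report easier to reproduce. The CSV text and the `value` column below are made up for illustration (only the `donor` column name comes from the report), and the leading zeros are there so that a lost str dtype is visible in the index values. What the direct call prints is likely to depend on the pandas version, so no output is claimed for it here.

```python
import io

import pandas as pd

# Hypothetical sample data (not from the report); leading zeros make a
# lost str dtype visible because integer parsing would drop them.
CSV_TEXT = "donor,value\n001,10\n002,20\n"

# Direct call from the report: dtype and index_col name the same column.
direct = pd.read_csv(
    io.StringIO(CSV_TEXT), dtype={"donor": str}, index_col="donor"
)

# Workaround from the report: apply the dtype first, then set the index.
workaround = pd.read_csv(
    io.StringIO(CSV_TEXT), dtype={"donor": str}
).set_index("donor")

print("direct:    ", direct.index.tolist())
print("workaround:", workaround.index.tolist())
```

With the workaround the index labels stay strings, so the leading zeros survive; comparing the two printed lists shows whether the installed pandas honors dtype for the index column.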

Expected Output

When both dtype and index_col are used with the same column, I want the index to have the dtype specified in the pd.read_csv() call, i.e. str in my example.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : None
python : 3.7.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 69 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.0
pip : 20.0.2
setuptools : 45.2.0
Cython : None
pytest : 5.0.1
hypothesis : 4.27.0
sphinx : 2.1.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.1
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.13.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.3
numexpr : None
odfpy : None
openpyxl : 3.0.0
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.0.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.13
tables : None
tabulate : None
xarray : 0.12.3
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
numba : None


simonjayhawkins · 2020-08-25T12:43:30Z

closing as duplicate of #9435

jbrockmendel added the IO CSV read_csv, to_csv label Apr 1, 2020

simonjayhawkins mentioned this issue Jul 28, 2020

BUG: index_col in read_csv ignores dtype #35431

Closed

3 tasks

simonjayhawkins added the Bug label Jul 28, 2020

simonjayhawkins added this to the Contributions Welcome milestone Jul 28, 2020

simonjayhawkins mentioned this issue Aug 25, 2020

BUG: read_excel doesn't honor dtype for index #35816

Open

3 tasks

simonjayhawkins closed this as completed Aug 25, 2020

simonjayhawkins added the Duplicate Report (Duplicate issue or pull request) label and removed the Bug and IO CSV (read_csv, to_csv) labels Aug 25, 2020

simonjayhawkins removed this from the Contributions Welcome milestone Aug 25, 2020
