ENH: Support mangle_dupe_cols=False in pd.read_csv() #13262

gfyoung · 2016-05-23T21:59:20Z

#12935 added full support for duplicate column names (in header or in names) by mangling them. While this has been considered acceptable by users, ideally, we would like to not have to mangle them.
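For readers unfamiliar with the mangling described above, a minimal sketch of the default behavior follows. The sample CSV text is made up for illustration; the point is only that the second `a` column is renamed rather than kept as a true duplicate.

```python
from io import StringIO

import pandas as pd

# Illustrative input: the header repeats the name "a".
csv_text = "a,b,a\n1,2,3\n"

# With default settings, read_csv renames the repeated column
# by appending a numeric suffix instead of keeping two "a" columns.
df = pd.read_csv(StringIO(csv_text))
print(list(df.columns))  # ['a', 'b', 'a.1']
```

The request in this issue is for an option that would leave both columns named `a`.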

gfyoung · 2017-07-27T16:15:27Z

@jreback: Given what you said in #17060, is this something we should still pursue?

jorisvandenbossche · 2017-07-31T19:28:29Z

Depending on how difficult this is, I would personally still have it as our goal to have mangle_dupe_cols=False implemented some time.

caniko · 2018-09-18T22:02:33Z

What is the ETA on this issue?

jreback · 2018-09-18T23:02:59Z

when / if a community pull request happens

gfyoung · 2018-09-19T04:15:56Z

@caniko2 : This is quite a tricky one given that duplicate column names have unusual behavior in pandas. You are more than welcome to submit a PR to implement it if you like.

jackzhenguo · 2019-05-22T03:48:21Z

Could anyone help me check whether the current pandas 0.24.2 supports "mangle_dupe_cols=False"?

The docs at http://pandas.pydata.org/pandas-docs/stable/user_guide/io.html state: "Passing in False will cause data to be overwritten if there are duplicate names in the columns."

Thanks so much!

gfyoung · 2019-05-22T05:20:36Z

Still no support, as behavior of data handling has proven to be quite non-trivial when there are duplicate column names. You are welcome to give it a shot though!

grisaitis · 2021-08-09T20:04:50Z

> Still no support, as behavior of data handling has proven to be quite non-trivial when there are duplicate column names. You are welcome to give it a shot though!

Is this issue still difficult to resolve?

hepcat72 · 2021-08-20T18:42:44Z

I cannot set it to False, and I cannot otherwise check for duplicates using df.columns.duplicated() on the dataframe returned by read_excel, because the names have already been mangled by then. How do I raise an exception when a duplicate is found? It definitely causes a problem with other code, and the user needs to rectify it.
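One possible workaround for the question above, offered as a sketch rather than an official recommendation: read only the header row with `header=None`, so pandas treats it as data and leaves the names untouched, then check that row for repeats before doing the real read. The helper name `read_csv_unique` is hypothetical, and the example uses CSV text for self-containment; `read_excel` also accepts `header=None` and `nrows`, so the same idea should carry over, though that is not verified here.

```python
from io import StringIO

import pandas as pd


def read_csv_unique(csv_text: str) -> pd.DataFrame:
    """Hypothetical helper: parse CSV text, raising if the header repeats a name."""
    # header=None makes pandas treat the header row as ordinary data,
    # so the original names are returned without mangling.
    header = pd.read_csv(
        StringIO(csv_text),
        header=None,
        nrows=1,
        dtype=str,
        keep_default_na=False,  # keep empty header cells as "" instead of NaN
    ).iloc[0]

    # duplicated() flags every repeat after the first occurrence.
    dupes = sorted(set(header[header.duplicated()]))
    if dupes:
        raise ValueError(f"duplicate column names: {dupes}")

    # The header is clean, so the normal read cannot mangle anything.
    return pd.read_csv(StringIO(csv_text))


try:
    read_csv_unique("a,b,a\n1,2,3\n")
except ValueError as exc:
    print(exc)
```

This reads the header twice, which is usually cheap since only one row is parsed on the first pass.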

Jaakkonen · 2022-03-21T09:09:59Z

The documentation really shouldn't have this option if it in reality doesn't exist. Or the docs should say that this is to-be-implemented (and has been in that state for almost 6 years already)

phofl · 2023-04-09T16:02:34Z

The argument was removed, so closing

jreback changed the title ~~ENH: Support 'mangle_dupe_cols=False' in parsers.py~~ ENH: Support mangle_dupe_cols=False in pd.read_csv() May 23, 2016

jreback added IO CSV read_csv, to_csv Compat pandas objects compatability with Numpy or Python functions Difficulty Intermediate labels May 23, 2016

jreback added this to the Next Major Release milestone May 23, 2016

rahulporuri mentioned this issue Oct 6, 2016

BUG/ENH : Column name mangling doesn't strip white space #14367

Closed

jorisvandenbossche mentioned this issue Oct 20, 2016

Strip columns/column names in data frame of white spaces #14460

Open

gfyoung mentioned this issue Jul 25, 2017

BUG: Thoroughly dedup columns in read_csv #17060

Merged

chris-b1 mentioned this issue Jan 25, 2018

FR: Allow duplicate column names in pandas.read_csv #19383

Closed

gfyoung mentioned this issue Nov 13, 2018

Reading from Excel mangles columns #10523

Closed

jbrockmendel removed Difficulty Intermediate labels Oct 21, 2019

zedomel mentioned this issue Dec 3, 2019

Add function to check duplicated names in DWCA fields and rename them BelgianBiodiversityPlatform/python-dwca-reader#81

Closed

mroeschke added Enhancement and removed Compat pandas objects compatability with Numpy or Python functions labels Apr 10, 2020

hepcat72 mentioned this issue Aug 20, 2021

Sample name uniqueness check added to accucor data loader Princeton-LSI-ResearchComputing/tracebase#170

Merged

JosephShyFang mentioned this issue May 17, 2022

Update mangle_dupe_cols documentation to reflect actual state of implementation #47046

Closed

datapythonista mentioned this issue Jul 14, 2022

API: Consistent handling of duplicate input columns #47718

Open

mroeschke mentioned this issue Aug 11, 2022

CLN: Remove mangle_dupe_cols argument #48037

Closed

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022

phofl closed this as completed Apr 9, 2023
