Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Don't parse index column as numeric when parse_dates=True #14077

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented Aug 24, 2016

When a thousands parameter is specified, if the index column data contains that thousands value for date purposes (e.g. '.'), do not interpret those characters as the thousands parameter.

Closes #14066.

@gfyoung gfyoung force-pushed the thousands-parse-date-index-col branch from b7cfef0 to 68ecee3 Compare August 25, 2016 03:35
@jreback
Copy link
Contributor

jreback commented Aug 25, 2016

this is related to #9435 (though don't think this fixes)

@jreback jreback added Bug IO CSV read_csv, to_csv labels Aug 25, 2016
@@ -1474,6 +1474,13 @@ def _set(x):
else:
_set(val)

elif self.parse_dates:
if isinstance(self.index_col, list):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_list_like? (I don't know if its coerced to a list before this), same below

Copy link
Member Author

@gfyoung gfyoung Aug 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We expect parse_dates to a be bool, list, or dict per the docs. This is explicitly validated as well (see here), so is_list_like is unnecessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need to handle dict at this point? (or is that already transformed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index_col can never be a dict per the docs.

If you're referring to parse_dates, parse_dates being a dict has a completely different meaning that is independent of the index_col.

@gfyoung gfyoung force-pushed the thousands-parse-date-index-col branch from 68ecee3 to 235d624 Compare August 26, 2016 05:10
@codecov-io
Copy link

codecov-io commented Aug 26, 2016

Current coverage is 85.26% (diff: 75.00%)

Merging #14077 into master will decrease coverage by <.01%

@@             master     #14077   diff @@
==========================================
  Files           139        139          
  Lines         50492      50504    +12   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43053      43061     +8   
- Misses         7439       7443     +4   
  Partials          0          0          

Powered by Codecov. Last update 185fcbe...f2ed334

@gfyoung gfyoung force-pushed the thousands-parse-date-index-col branch 3 times, most recently from bce56d0 to 676c606 Compare August 26, 2016 07:31
When a thousands parameter is specified, if the index column data
contains that thousands value for date purposes (e.g. '.'), do not
interpret those characters as the thousands parameter.

Closes pandas-devgh-14066.
@gfyoung gfyoung force-pushed the thousands-parse-date-index-col branch from 676c606 to f2ed334 Compare August 26, 2016 07:35
tm.assert_frame_equal(result, expected)

expected = DataFrame([[datetime(2016, 4, 15),
datetime(2013, 9, 16)]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a dict for parse_dates as well

@jreback jreback added this to the 0.19.0 milestone Aug 26, 2016
@gfyoung
Copy link
Member Author

gfyoung commented Aug 27, 2016

@jreback : Travis is passing, so this is ready to merge if there are no other concerns.

@jorisvandenbossche jorisvandenbossche merged commit 9d10b76 into pandas-dev:master Aug 27, 2016
@jorisvandenbossche
Copy link
Member

@gfyoung Thanks!

@gfyoung gfyoung deleted the thousands-parse-date-index-col branch August 27, 2016 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants