Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Axis slicer #4994

Closed
wants to merge 2 commits into from
Closed

Axis slicer #4994

wants to merge 2 commits into from

Conversation

dalejung
Copy link
Contributor

The internal axis slicer. I overloaded indexer to be start when stop is passed in instead of having a separate indexer kwargs.

@jreback
Copy link
Contributor

jreback commented Sep 26, 2013

looks good....once travis finishes...can merge

@dalejung
Copy link
Contributor Author

Weird. One of the tests failed on plotting. All I did was have Travis rebuild and it worked. Shrug.

Sent from my iPhone

On Sep 26, 2013, at 10:04 AM, jreback notifications@github.com wrote:

looks good....once travis finishes...can merge


Reply to this email directly or view it on GitHub.

@jreback
Copy link
Contributor

jreback commented Sep 26, 2013

ok...

I think @jtratner has a couple of question about some of the excetions..

2 21 23 25
"""
if not isinstance(axis, int):
raise TypeError("axis paramter must be an int and not {axis}".format(axis=axis))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please change this to assert isinstance(axis, int), "Expected 'axis' to be an int. Got: %r" % axis (since we're assuming that this check is made already by callers - so it's a completely internal error.

@jtratner
Copy link
Contributor

@dalejung just left a bunch of comments. Basically, I want it to be clear that this is an internal function and callers should be checking for all those things. assert statements help make that clear that caller code should check this. (also, generally can phrase positively instead of negatively which is nice). Finally, please note all the requirements you have in the docstring (ndim < 2, axis < ndim, must pass stop if indexer is None).

@@ -997,10 +997,9 @@ def drop(self, labels, axis=0, level=None):
else:
indexer = -axis.isin(labels)

slicer = [slice(None)] * self.ndim
slicer[self._get_axis_number(axis_name)] = indexer
slicer = _axis_slicer(indexer, axis=self._get_axis_number(axis_name), ndim=self.ndim)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make sure that indexer and ndim will always be correct for passing here? I know _get_axis_number checks and raises an apporpriate error.

@dalejung
Copy link
Contributor Author

@jtratner I'm confused. I thought we had removed all asserts from core code a year ago since assert statements are removed when python is run with the optimize flag.

@jtratner
Copy link
Contributor

@dalejung there were a lot of places using asserts incorrectly (like checking that input to apply was correct or something). But if this function is actually internal (i.e., everything that calls it is supposed to check that the arguments are correct), then asserts make sense because they're doing exactly what they're supposed to be doing: checking that the internal logic is what you expect. (and if they ever bubble up to the user, that's an error in programming). Thus, it's okay if they are removed with python -O because they should not be necessary for effective running of the program.

Moreover, any exception that this could raise wouldn't make sense to the user. What the heck does it mean to have this wrong start or stop or ndim or indexer? The caller needs to raise a more useful error to the user.

@dalejung
Copy link
Contributor Author

How is an AssertionError, with the same message, any more understandable by an end user?

If python -O is run, doesn't it make more sense to raise than to silently return bad data and have it possibly error in some less useful place? Or even worse, follow the chain and return bad data? There is one assert statement in core and it was put in there 10 days ago.

@jtratner
Copy link
Contributor

@dalejung it's not, that's why caller functions have to check for these conditions first. The point is that none of those checks are necessary, because callers need to check for it - you're just establishing the logic that you think should be the case when this function is called. The reason that python -O strips out asserts is that those statements are supposed to be unnecessary. (which, again, they are going to be, because callers have to check it).

@jtratner
Copy link
Contributor

the entirety of the change with -O is that asserts are stripped - http://stackoverflow.com/questions/2830358/what-are-the-implications-of-running-python-with-the-optimize-flag , there's no benefit to it other than that.

@jtratner
Copy link
Contributor

@dalejung last comment on this from me: the reason that all the asserts were changed to raise AssertionError is because we weren't clear which were "internal" asserts (i.e., all callers check that the input is correct) vs. errors validating user input (which need to stay in regardless). Anything that validates user input will eventually be changed to a different kind of error because AssertionError is the wrong type. Those that still remain we still haven't gotten to yet. AssertionErrors that have a # pragma: no cover with them could be changed to asserts because it it (should be) impossible to get to that state from the public API.

@jtratner
Copy link
Contributor

Anyways, if you don't agree with me on this, then I'll just agree to disagree with you. Then you'd just need to choose the right exceptions to raise (can't raise Exception - it's bad style and hard to catch later) and, for each exception, it would be good to add a test case that triggers it from the public API.

@dalejung
Copy link
Contributor Author

@jtratner Well, I think I can actual resolve this. From the original comment about it on #4874:

It's just a small util I've been using when generating tests, but I'm not sure if it should go into pandas and if so where.

I actually use it external to pandas core development. It came out of a private repo. So it's only quasi-internal. Personally, I use a lot of "internal" functions from pandas like the low level binning. And outside of me, who's to say other people might not use a convenience function in their own libs.

I'll update the exceptions.

@jreback
Copy link
Contributor

jreback commented Sep 29, 2013

@dalejung @jtratner mergeable?

@dalejung
Copy link
Contributor Author

@jreback Not atm. Need to update exceptions. Got sidetracked on other work.

@jreback
Copy link
Contributor

jreback commented Sep 29, 2013

@dalejung ok...np...I believe you need this before the Panel.shift.

ping when ready

@jreback
Copy link
Contributor

jreback commented Oct 4, 2013

@dalejung ?

1 similar comment
@jreback
Copy link
Contributor

jreback commented Oct 11, 2013

@dalejung ?

@jreback jreback mentioned this pull request Oct 14, 2013
7 tasks
@jreback jreback closed this Jan 3, 2014
@jreback jreback reopened this Jan 3, 2014
@jreback
Copy link
Contributor

jreback commented Jan 3, 2014

@dalejung was accid closed, but needs to revisit this in any event

@jreback
Copy link
Contributor

jreback commented Feb 16, 2014

pls rebase this....this would be nice as an internal function

@jreback
Copy link
Contributor

jreback commented Feb 18, 2014

@dalejung this is actually a nice, PR, pls reopen/resubmit when ready

@jreback jreback closed this Feb 18, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants