Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: ensure name and cname are always str #29692

Merged
merged 5 commits into from
Nov 20, 2019

Conversation

jbrockmendel
Copy link
Member

Once this is assured, there are a lot of other things (including in core.computation!) that we can infer in follow-ups.

return False
return getattr(self.table.cols, self.cname).is_indexed # type: ignore
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins I expected the hasattr check above to be enough to make this # type: ignore unnecessary, but it wasn't. any ideas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see python/mypy#1424

best to add the mypy error message as a comment and also a link to the mypy issue.

might also be best to split the return statement to isolate the ignore to the result of the getattr expression.

then casting as suggested in the mypy issue would still have the return type of is_indexed of the casted type checked against the return type of this method.
This could result in less clear code so will leave it to you to decide whether you think it warrants changing the code here.

This could also be done my error codes, see #29197, but needs more discussion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I couldn't look in detail yesterday and just gave a generic answer.

the error message is error: "None" has no attribute "cols".

reveal_type(self.table) gives note: Revealed type is 'None'

in IndexCol.__init__, self.table = None

I think if you add the type annotation for self.table and add assert self.table is not None before the return statement, we can avoid the ignore.

(in general if you can include the exact mypy error message as a comment makes it easier/quicker to review. We do this in other places)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you add the type annotation for self.table and add assert self.table is not None before the return statement, we can avoid the ignore.

AFAICT this still doesnt work because the Table class still doesnt have a "cols" attr.

(in general if you can include the exact mypy error message as a comment makes it easier/quicker to review. We do this in other places)

Will update. Thanks for being patient with me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT this still doesnt work because the Table class still doesnt have a "cols" attr.

fair enough. the generic comment above is valid. a type ignore is likely required in many places where we are using hasattr.

I thought that maybe the type of self.table would be a type with a "cols" attr. and that the hasattr was used to allow duck typing of other classes with a "cols" attr.

If this were the case mypy would only be checking the nominal type defined for self.table ( and None)

@@ -1691,29 +1691,37 @@ class IndexCol:
is_data_indexable = True
_info_fields = ["freq", "tz", "index_name"]

name: str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think had this discussion on another PR can't remember where we landed, but do these declarations need to be in the class scope? Not sure every reader would know why these are there and may in fact infer that they should be class variables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This topic has definitely come up and the conclusion may have just gone over my head. When I see this, I read it as "self.name is going to be defined in __init__, and is always going to be a str. This is helpful to me, the reader, since this class seems to set its attributes all over the place"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Actually reading through PEP 526 this looks to be correct, so nice find

https://www.python.org/dev/peps/pep-0526/#class-and-instance-variable-annotations

@simonjayhawkins

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -4649,12 +4661,15 @@ def _set_tz(values, tz, preserve_UTC: bool = False, coerce: bool = False):
return values


def _convert_index(index, encoding=None, errors="strict", format_type=None):
def _convert_index(name: str, index, encoding=None, errors="strict", format_type=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encoding and errors are I believe str if wanted to add here; I assume index might be an instance of Index though if non-trivial to verify certainly leave out

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ive got another branch going that i think gets these. tried to keep this one pretty narrow since its not just annotations

@WillAyd WillAyd added the Refactor Internal refactoring of code label Nov 18, 2019
@jreback jreback added this to the 1.0 milestone Nov 20, 2019
@@ -3867,6 +3877,9 @@ def get_blk_items(mgr, blocks):
if data_columns and len(b_items) == 1 and b_items[0] in data_columns:
klass = DataIndexableCol
name = b_items[0]
if not (name is None or isinstance(name, str)):
# TODO: should the message here be more specifically non-str?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this actually hit anywhere? I think this is by-definition not possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is. we have a test that gets here with an np.int64

@jreback jreback merged commit eddd9f0 into pandas-dev:master Nov 20, 2019
@jreback
Copy link
Contributor

jreback commented Nov 20, 2019

thanks

@jbrockmendel jbrockmendel deleted the cln-pytables5 branch November 20, 2019 17:47
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants