Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mypy errors in testcommon py #29179

Merged
merged 114 commits into from
Nov 7, 2019

Conversation

HawkinsBA
Copy link
Contributor

@HawkinsBA HawkinsBA commented Oct 23, 2019

  • xref #28926
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Blake Hawkins added 2 commits October 19, 2019 13:55
Apply black formatting to test_common.py
@jreback jreback added the Typing type annotations, mypy/pyright type checking label Oct 23, 2019
@WillAyd
Copy link
Member

WillAyd commented Oct 23, 2019

Can you post the error this solves? Does annotating these instead of unpacking change anything?

@HawkinsBA
Copy link
Contributor Author

HawkinsBA commented Oct 23, 2019

pandas\tests\dtypes\test_common.py:327: error: Unsupported operand types for + ("List[Series]" and "List[str]")
pandas\tests\dtypes\test_common.py:355: error: Unsupported operand types for + ("List[Series]" and "List[object]")
pandas\tests\dtypes\test_common.py:387: error: Unsupported operand types for + ("List[Series]" and "List[str]")

In the original file it looks like both lists of Strings and Dtypes (ALL_INT_DTYPES,and to_numpy_dtypes(ALL_INT_DTYPES)) are are concatenated to a List[Series], so my rationale behind unpacking these is to preserve the resulting list of both of these list types being concatenated without the "+" operator.

Would it be better to approach this problem preserving concatenation with type annotations?

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK might just have to live with the unpacking then

+ ALL_EA_INT_DTYPES
+ to_ea_dtypes(ALL_EA_INT_DTYPES),
[
type(pd.Series([1, 2])),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change to type intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. If I understand the list correctly then a Series is a data type that should be accounted for in these tests, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see where the naming is confusing but you should keep that the same - is_integer_dtype works on types and arrays

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I reverted that change.

Removed type(pd.Series[...])
@jbrockmendel
Copy link
Member

@HawkinsBA can you rebase

jbrockmendel and others added 21 commits October 28, 2019 19:13
* xfail on numpy 1.18

* CI: try using numpy wheel
* API: Restore getting name from MultiIndex level

xref https://issues.apache.org/jira/browse/ARROW-6922 /
pandas-dev#27242 (comment)
/ pandas-dev#29032

No docs yet, since it isn't clear how this will eventually sort out. But
we at least want to preserve this behavior for 1.0

* fixups
@pep8speaks
Copy link

pep8speaks commented Nov 3, 2019

Hello @HawkinsBA! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-11-07 01:14:18 UTC

@HawkinsBA
Copy link
Contributor Author

HawkinsBA commented Nov 3, 2019

So without Any in the type annotation I get the following MyPy errors while the Azure Pipelines run:

pandas/tests/dtypes/test_common.py:336: error: Unsupported operand types for + ("List[Union[Series, str, str, Any, ExtensionDtype, ExtensionDtype]]" and "List[str]") pandas/tests/dtypes/test_common.py:404: error: Unsupported operand types for + ("List[Union[Series, str, str, Any, ExtensionDtype, ExtensionDtype]]" and "List[str]")

However when Any is included in the type annotation I get the following errors:

pandas/tests/dtypes/test_common.py:336: error: Unsupported operand types for + ("List[Union[Series, str, str, Any, ExtensionDtype, Any, ExtensionDtype]]" and "List[str]") pandas/tests/dtypes/test_common.py:404: error: Unsupported operand types for + ("List[Union[Series, str, str, Any, ExtensionDtype, Any, ExtensionDtype]]" and "List[str]")

I'm not sure what to make of this, but I'm looking into trying to figure out what the Any might be so I can put it into the type annotation.

@WillAyd
Copy link
Member

WillAyd commented Nov 3, 2019

Does changing from List to Sequence help? Haven't reviewed thoroughly but the invariant nature of the List might be the issue

@@ -322,9 +326,13 @@ def test_is_datetimelike():
assert com.is_datetimelike(s)


integer_dtypes = [] # type: Sequence[Union[pd.Series, str, Dtype, ExtensionDtype]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I think just type: List will be fine for these

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. @simonjayhawkins care to take a look as well?

@HawkinsBA the CI failure is unrelated. If you merge master and re-push should go away

@simonjayhawkins
Copy link
Member

lgtm. @simonjayhawkins care to take a look as well?

@WillAyd

noting @jbrockmendel comment regarding unpacking, which mypy accepts in an expression. I'm not convinced that this is more clear. Especially, since the use of generic List without type parameters is the same as List[Any] so does not add value to the type checking.

I know we have different opinions on the use of type: ignore but could potentially be used here and reference the mypy issue on list concatenation in expressions?

alternatively, perhaps assign all the values to the list in the initial assignment? could still use #type: List but would not need an expression concatenating items to an empty list. I think this would be clearer to a newcomer. Maybe use block capitals for the variable names. wdyt?

@WillAyd
Copy link
Member

WillAyd commented Nov 7, 2019

I don't have a strong preference; whatever works for me

@simonjayhawkins
Copy link
Member

both these suggestions are effectively silencing mypy, whereas at least with unpacking the inferred type should have been more precise.

@jbrockmendel considering the above, are you against unpacking? This problem is likely to surface again?

@WillAyd
Copy link
Member

WillAyd commented Nov 7, 2019

Misinterpreted your earlier comment but I would prefer not to change to unpacking. These are just arguments to parameterize; we aren't mapping any kind of operations against the entire container so we don't need to fine-grain inference; either List or ignore would be OK I think

@simonjayhawkins
Copy link
Member

I'm OK with just List but think the assignment should be in one statement for clarity.

@jbrockmendel
Copy link
Member

considering the above, are you against unpacking? This problem is likely to surface again?

If you're referring to the *args pattern I'd consider it near-last-resort, but not a strong enough opinion to hold up a PR if there is consensus.

@simonjayhawkins
Copy link
Member

yeah, so I think we're agreed.

summarising since this (mypy) issue will crop up again...

the option that works for mypy, i.e. unpacking, means changing the code and is not newcomer friendly.

type:ignore may mask other issues

so refactoring out list concatenation in an expression into a variable and adding a variable annotation is the preferred approach.

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so refactoring out list concatenation in an expression into a variable and adding a variable annotation is the preferred approach.

it appears that to do this in one statement would require a cast...

integer_dtypes = (
    cast(List, [pd.Series([1, 2])])
    + ALL_INT_DTYPES
    + to_numpy_dtypes(ALL_INT_DTYPES)
    + ALL_EA_INT_DTYPES
    + to_ea_dtypes(ALL_EA_INT_DTYPES)
)


@pytest.mark.parametrize("dtype", integer_dtypes)

otherwise would still need two statements...

integer_dtypes = []  # type: List
integer_dtypes = (
    integer_dtypes
    + [pd.Series([1, 2])]
    + ALL_INT_DTYPES
    + to_numpy_dtypes(ALL_INT_DTYPES)
    + ALL_EA_INT_DTYPES
    + to_ea_dtypes(ALL_EA_INT_DTYPES)
)


@pytest.mark.parametrize("dtype", integer_dtypes)

@WillAyd given that this is still not much clearer, merge as is if happy.

@WillAyd
Copy link
Member

WillAyd commented Nov 7, 2019

Yea pretty tricky with these...I suppose the real intent is tough to realize from a static analysis. Curious to see how many we come across going forward

@WillAyd WillAyd added this to the 1.0 milestone Nov 7, 2019
@WillAyd WillAyd merged commit b313f0c into pandas-dev:master Nov 7, 2019
@WillAyd
Copy link
Member

WillAyd commented Nov 7, 2019

Thanks @HawkinsBA for the PR and @simonjayhawkins for review

Reksbril pushed a commit to Reksbril/pandas that referenced this pull request Nov 18, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.