Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: #22899, Fixed docstring of itertuples in pandas/core/frame.py #22902

Merged
merged 3 commits into from
Oct 7, 2018
Merged

DOC: #22899, Fixed docstring of itertuples in pandas/core/frame.py #22902

merged 3 commits into from
Oct 7, 2018

Conversation

leeyspaul
Copy link
Contributor

@leeyspaul leeyspaul commented Sep 30, 2018

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

#22899
I went ahead and followed the docstring conventions mentioned here:

https://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html

The fixes I made were for the following errors:

screen shot 2018-09-29 at 19 17 59

screen shot 2018-09-29 at 19 18 08

  • Fixed whitespace between closing line and end of docstring
  • Converted summary into single line
  • Added Returns section (Not sure if done correctly!)
  • Fixed examples to pass test by adding ... at line 914
  • Fixed namedtuple return values at 922 and 923

Please let me know if there are any issues and I will try my best to fix them ASAP.

Thank you!~

@pep8speaks
Copy link

Hello @leeyspaul! Thanks for submitting the PR.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @leeyspaul, great fixes. I added some comments that I think should make the documentation even better, and to respect some other conventions not mentioned in the issue.

pandas/core/frame.py Show resolved Hide resolved
@@ -894,6 +893,10 @@ def itertuples(self, index=True, name="Pandas"):
The name of the returned namedtuples or None to return regular
tuples.

Returns
-------
Returns namedtuples.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case instead of Returns, can you use Yields, as the function is a generator, and the namedtuples are yielded from it.

The exact syntax would be something like:

Yields
------
collections.namedtuple
    A description of what is being yielded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! Working on it now.

pandas/core/frame.py Show resolved Hide resolved
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]},
index=['a', 'b'])
... index=['a', 'b'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I think it's better to have data that somehow looks real. I think it helps understand better, as you don't need to check in the DataFrame constructor that the row a, in the col2, has a 0.1. It's self-contained that a cat has 4 legs (see this example: https://github.com/pandas-dev/pandas/blob/master/pandas/core/frame.py#L7580).

Also, if you can also one examples with index=False, and one with name='Animal (or whatever concept you think it's better), that would also be valuable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. I really like the animal ideas, it is super intuitive!

I made the changes as such and will push it up now.

        Examples
        --------
        >>> df = pd.DataFrame({'num_legs': [4,2], 'num_wings': [0,2]},
        ...                   index=['dog','hawk'])
        >>> df
              num_legs  num_wings
        dog          4          0
        hawk         2          2
        >>> for row in df.itertuples():
        ...     print(row)
        ...
        Pandas(Index='dog', num_legs=4, num_wings=0)
        Pandas(Index='hawk', num_legs=2, num_wings=2)

        By setting the `index` parameter to False we can remove the index
        as the first element of the tuple:

        >>> for row in df.itertuples(index=False):
        ...     print(row)
        ...
        Pandas(num_legs=4, num_wings=0)
        Pandas(num_legs=2, num_wings=2)

        With the `name` parameter set we set a custom name for the yielded
        namedtuples:

        >>> for row in df.itertuples(name='Animal'):
        ...     print(row)
        ...
        Animal(Index='dog', num_legs=4, num_wings=0)
        Animal(Index='hawk', num_legs=2, num_wings=2)

@datapythonista datapythonista added Docs DataFrame DataFrame data structure labels Sep 30, 2018
@codecov
Copy link

codecov bot commented Sep 30, 2018

Codecov Report

Merging #22902 into master will increase coverage by 0.88%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22902      +/-   ##
==========================================
+ Coverage   92.18%   93.07%   +0.88%     
==========================================
  Files         169      169              
  Lines       50830    58302    +7472     
==========================================
+ Hits        46860    54262    +7402     
- Misses       3970     4040      +70
Flag Coverage Δ
#multiple 91.46% <ø> (+0.86%) ⬆️
#single 44.46% <ø> (+2.08%) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 97.76% <ø> (+0.56%) ⬆️
pandas/core/dtypes/base.py 100% <0%> (ø) ⬆️
pandas/core/strings.py 98.66% <0%> (+0.02%) ⬆️
pandas/core/panel.py 97.93% <0%> (+0.03%) ⬆️
pandas/core/dtypes/cast.py 88.9% <0%> (+0.32%) ⬆️
pandas/core/resample.py 97.29% <0%> (+0.32%) ⬆️
pandas/core/sparse/series.py 95.77% <0%> (+0.55%) ⬆️
pandas/core/arrays/base.py 94.8% <0%> (+0.56%) ⬆️
pandas/core/indexes/datetimelike.py 98.68% <0%> (+0.58%) ⬆️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f771ef6...064cf33. Read the comment docs.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good now. Just added couple of comments, mainly formatting things.

@@ -910,17 +911,35 @@ def itertuples(self, index=True, name="Pandas"):

Examples
--------
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]},
... index=['a', 'b'])
>>> df = pd.DataFrame({'num_legs': [4,2], 'num_wings': [0,2]},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add spaces after the commas? You can use flake8 --doctests frame.py. It will generate many messages from other docstrings, but you can check the line numbers of yours, and make sure there is nothing wrong between them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry totally forgot about that second test. Fixed the whitespacing between the commas.

-------
Returns namedtuples.
collections.namedtuple
Yields namedtuples of corresponding index and column values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean here, but I think it may be difficult to understand for a beginner. May be something like Yields a namedtuple for each row in the DataFrame, which its values, and possibly the index..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the description just now! I think it's more intuitive now, please do let me know and if anything needs to be added I'll try again.

@@ -910,17 +911,35 @@ def itertuples(self, index=True, name="Pandas"):

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the See Also, can you make the A in the title capital? And then, the items, if you can prefix iterrows and iteritems with DataFrame.iterrows...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just prefixed both iterrows and iteritems.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Excellent docstring, I think it's very clear. Thanks for the contribution @leeyspaul, looking forward for your next one. Will merge this when checks are green.

@leeyspaul
Copy link
Contributor Author

@datapythonista awesome! Thanks for all your guidance - super helpful! I'd love to contribute more, I saw issue #22896 was still not picked up. I'd like to work on it after this gets merged if possible.

@datapythonista
Copy link
Member

@leeyspaul you're working on a branch, so... git checkout master && git checkout -b <other-docstring-branch> is the only thing you need to start, even if it takes a bit to merge this. :)

Just add a comment to the issue you start working on, so nobody else works on it. Thanks!

@leeyspaul
Copy link
Contributor Author

Hey @datapythonista ! I saw that the Travis CI build failed I was wondering if this was something on my end that I need to fix. Thanks, and I'll get working on that other issue #22896 .

@datapythonista
Copy link
Member

They seem unrelated to your changes, a url that was not available. Travis is running again, let's see if this time we're lucky.

@leeyspaul
Copy link
Contributor Author

Roger! :)

@leeyspaul
Copy link
Contributor Author

leeyspaul commented Oct 4, 2018

Still no luck, I'm wondering what the issue is. I haven't done much with travis-ci so I'm not sure. @datapythonista Please let me know if there is a way I can help. :)

@JustinZhengBC
Copy link
Contributor

Same thing is happening with some of my PR's. If you look in the travis log you can see this line in the feedback from the failing tests:

Exception requests.exceptions.ConnectionError: ConnectionError(u'Connection refused: PUT https://pandas-test.s3.amazonaws.com/pyarrow.parquet',) in <bound method S3File.__del__ of <S3File pandas-test/pyarrow.parquet>> ignored

It looks like an issue with an url in the tests, so not related to our code.

@datapythonista
Copy link
Member

@leeyspaul @JustinZhengBC the problem with travis is unrelated your PRs, the issue is being tracked in #22934. Just ignore the error, we'll take care of it.

@jreback jreback added this to the 0.24.0 milestone Oct 7, 2018
@jreback jreback merged commit 66c2e5f into pandas-dev:master Oct 7, 2018
@jreback
Copy link
Contributor

jreback commented Oct 7, 2018

thanks @leeyspaul

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DataFrame DataFrame data structure Docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants