Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pandas casting issues #1734

Merged
merged 6 commits into from
Jan 11, 2018
Merged

pandas casting issues #1734

merged 6 commits into from
Jan 11, 2018

Conversation

0x0L
Copy link
Contributor

@0x0L 0x0L commented Nov 21, 2017

Added a comment about constructing pandas objects from xarray objects

@fmaussion
Copy link
Member

Thanks for the PR! Doc improvements are always appreciated.

Maybe this would be better placed in the computations section? (http://xarray.pydata.org/en/latest/computation.html)

Also, I don't understand what you mean with Constructing `pandas` objects sometimes require explicit casting. Maybe we should refer to the working with pandas page then? (http://xarray.pydata.org/en/latest/pandas.html)

@0x0L
Copy link
Contributor Author

0x0L commented Nov 21, 2017

@fmaussion

In a nutshell,

x = xr.DataArray([1,2,3])
pd.Series({'mean': x.mean()})

does not work as expected. You'll want one of

pd.Series({'mean': float(x.mean())})
pd.Series({'mean': x.mean()}, dtype=float)
pd.Series({'mean': x.mean()}).astype(float)

@shoyer
Copy link
Member

shoyer commented Nov 29, 2017

Can you merge master? #1747 should fix the build failure

@0x0L
Copy link
Contributor Author

0x0L commented Nov 30, 2017

@shoyer all good

doc/faq.rst Outdated
@@ -58,6 +58,15 @@ fundamentally multi-dimensional. If your data is unstructured or
one-dimensional, stick with pandas.


Gotchas
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you frame this as a question, so it looks more similar to the other headings?

e.g., "Why don't aggregations return Python scalars?"

doc/faq.rst Outdated
-------

xarray tries hard to be self-consistent. Operations on xarray objects return
other xarray objects. In particular, operations like `mean` or `sum` - even
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also mention indexing here?

doc/faq.rst Outdated
xarray tries hard to be self-consistent. Operations on xarray objects return
other xarray objects. In particular, operations like `mean` or `sum` - even
when applied to all axes - will return xarray objects. Constructing `pandas`
objects sometimes require explicit casting.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand what this last comment means, maybe an example would help?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add something similar to my reply to @fmaussion above

doc/faq.rst Outdated
Gotchas
-------

xarray tries hard to be self-consistent. Operations on xarray objects return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would note that the self-consistency is that the DataArray methods already return DataArrays (and the same for Dataset), i.e., it preserves the type. This isn't true for NumPy or pandas, which return scalars.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed but numpy aggregation funcs return boxed objects that look like an ndarray

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, NumPy and pandas return boxed scalar types (subclasses of np.generic).

Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, I have one more suggestion to add to this entry.

doc/faq.rst Outdated
other xarray objects. In particular, operations like `mean` or `sum` - even
when applied to all axes - will return xarray objects. Constructing `pandas`
objects sometimes require explicit casting.
pd.Series({'x': arr[0], 'mean': arr.mean(), 'std': arr.std()}).astype(float)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also note float(arr[0]) and arr[0].item() as two others ways to convert scalars to builtin types?

Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'll leave this open for a little while before merging in case anyone else has suggestions.

Actually, one thing that could be nice is to call out a brief note in "what's new" pointing users to this new FAQ page entry.

@0x0L
Copy link
Contributor Author

0x0L commented Dec 4, 2017

@shoyer Thanks for the help for the writing, most of these are your own words ^^

Documentation
~~~~~~~~~~~~~

- New entry `Why don’t aggregations return Python scalars?` in the :doc:`faq`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider giving yourself credit here :)

@jhamman
Copy link
Member

jhamman commented Jan 11, 2018

@0x0L - I gave you credit on the whats new page and synced with master. I'll merge once the tests pass here again.

@0x0L
Copy link
Contributor Author

0x0L commented Jan 11, 2018

@jhamman Thank you

@jhamman jhamman merged commit 502a988 into pydata:master Jan 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants