Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keep attrs in reset_index #4103

Merged
merged 5 commits into from
Jun 5, 2020
Merged

Conversation

OriolAbril
Copy link
Contributor

@OriolAbril OriolAbril commented May 28, 2020

Modifies the code in reset_index to keep attributes when converting an indexing coordinate to non indexing coordinate. I have added tests for single index for both dataarray and dataset. Not sure if both are needed as they end up calling the same base function.

Regarding multiindex, I think it is not possible to keep the metadata as it is removed when creating the multiindex/stacking.

@dcherian
Copy link
Contributor

Nice job @OriolAbril! you seem to have found a bug where renaming doesn't create an IndexVariable when necessary. I will look into this.

__________________ TestDataArray.test_reset_index_keep_attrs ___________________

self = <xarray.tests.test_dataarray.TestDataArray object at 0x7f266d36c4d0>

    def test_reset_index_keep_attrs(self):
        coord_1 = xr.DataArray([1, 2], dims=["coord_1"], attrs={"attrs": True})
        da = xr.DataArray([1, 0], [coord_1])
        obj = da.reset_index("coord_1").rename({"coord_1_": "coord_1"})
>       assert_identical(da, obj)

xarray/tests/test_dataarray.py:1837: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
xarray/testing.py:267: in _assert_internal_invariants
    _assert_dataarray_invariants(xarray_obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

da = <xarray.DataArray (coord_1: 2)>
array([1, 0])
Coordinates:
  * coord_1  (coord_1) int64 1 2

    def _assert_dataarray_invariants(da: DataArray):
        assert isinstance(da._variable, Variable), da._variable
        _assert_variable_invariants(da._variable)
    
        assert isinstance(da._coords, dict), da._coords
        assert all(isinstance(v, Variable) for v in da._coords.values()), da._coords
        assert all(set(v.dims) <= set(da.dims) for v in da._coords.values()), (
            da.dims,
            {k: v.dims for k, v in da._coords.items()},
        )
        assert all(
            isinstance(v, IndexVariable) for (k, v) in da._coords.items() if v.dims == (k,)
>       ), {k: type(v) for k, v in da._coords.items()}
E       AssertionError: {'coord_1': <class 'xarray.core.variable.Variable'>}

@OriolAbril
Copy link
Contributor Author

Thanks @dcherian! If there is anything I can do to help please say so. I don't really know where to start searching for this new error but I can run some tests or look into it if given some pointers. Whatever means less work for you.

Regarding current test, I can modify it so it does not trigger the bug and open an issue for this second bug. Is this ok or do yo prefer to tackle both in this PR?

@dcherian
Copy link
Contributor

Modified test sounds good to me. I'll open another issue.

@OriolAbril
Copy link
Contributor Author

Fixed tests. Now single index coordinates will keep their attributes when converted to non indexing coordinates.

I think changes in code would also make multi index keep their attributes, but I don't think multiindex can have attributes so it does not make any difference. I was wondering is this would be enough to close the original issue or if there is extra work to be done with multiindex.

@dcherian
Copy link
Contributor

Thanks @OriolAbril . This looks good to me. Can you add a note under enhancements in whats-new.rst?

@dcherian dcherian merged commit c07160d into pydata:master Jun 5, 2020
@dcherian
Copy link
Contributor

dcherian commented Jun 5, 2020

Thanks @OriolAbril . I see this is your first PR. Thanks!

@OriolAbril OriolAbril deleted the reset_index_attrs branch June 5, 2020 19:41
@OriolAbril
Copy link
Contributor Author

Yep, I hope I'll have time for more to come :)

dcherian added a commit to TomNicholas/xarray that referenced this pull request Jun 24, 2020
…o-combine

* 'master' of github.com:pydata/xarray: (81 commits)
  use builtin python types instead of the numpy alias (pydata#4170)
  Revise pull request template (pydata#4039)
  pint support for Dataset (pydata#3975)
  drop eccodes in docs (pydata#4162)
  Update issue templates inspired/based on dask (pydata#4154)
  Fix failing upstream-dev build & remove docs build (pydata#4160)
  Improve typehints of xr.Dataset.__getitem__ (pydata#4144)
  provide a error summary for assert_allclose (pydata#3847)
  built-in accessor documentation (pydata#3988)
  Recommend installing cftime when time decoding fails. (pydata#4134)
  parameter documentation for DataArray.sel (pydata#4150)
  speed up map_blocks (pydata#4149)
  Remove outdated note from datetime accessor docstring (pydata#4148)
  Fix the upstream-dev pandas build failure (pydata#4138)
  map_blocks: Allow passing dask-backed objects in args (pydata#3818)
  keep attrs in reset_index (pydata#4103)
  Fix open_rasterio() for WarpedVRT with specified src_crs (pydata#4104)
  Allow non-unique and non-monotonic coordinates in get_clean_interp_index and polyfit (pydata#4099)
  update numpy's intersphinx url (pydata#4117)
  xr.infer_freq (pydata#4033)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

reset_index does not keep attributes
2 participants