Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combining attrs of member DataArrays of Datasets #4017

Closed
wants to merge 2 commits into from

Conversation

johnomotani
Copy link
Contributor

While looking at #4009, I noticed that the combine_attrs kwarg to concat or merge only affects the global attrs of the Dataset, not the attrs of DataArray or Variable members of the Datasets being combined. I think this is a bug.

So far this PR adds tests that reproduce the issue in #4009, and the issue described above. Fixing should be fairly simple: for #4009 pass combine_attrs="drop" to combine_by_coords and combine_nested in open_mfdataset; for this issue insert the combine_attrs handling in an appropriate place - possibly in merge.unique_variable. I'll update with fixes when I get a chance.

Need to pass combine_attrs="drop", to allow attrs_file to set the attrs.
Comment on lines +2670 to +2683
def test_open_mfdataset_dataarray_attr_by_coords(self):
"""
Case when an attribute of a member DataArray differs across the multiple files
"""
with self.setup_files_and_datasets() as (files, [ds1, ds2]):
# Give the files an inconsistent attribute
for i, f in enumerate(files):
ds = open_dataset(f).load()
ds["v1"].attrs["test_dataarray_attr"] = i
ds.close()
ds.to_netcdf(f)

with xr.open_mfdataset(files, combine="by_coords", concat_dim="t") as ds:
assert ds["v1"].test_dataarray_attr == 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomNicholas do you think we should make this test pass (i.e. support combining attrs on DataArrays contained in the Dataset being loaded)?

Proposal for a fix:

  • Support open_mfdataset-style attribute combining by upgrading the combine_attrs argument to combine_nested/combine_by_coords (and probably the other functions that have a combine_attrs argument) so it can be passed one of the Datasets or DataArrays being combined/merged, and will then take the attrs from that.
  • In open_mfdataset use the attrs_file argument to get the Dataset to take the attrs from, and pass that to the combine_attrs argument of combine_nested/combine_by_coords.

@dcherian dcherian mentioned this pull request May 26, 2020
23 tasks
@johnomotani johnomotani mentioned this pull request Jun 24, 2020
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incoherencies between docs in open_mfdataset and combine_by_coords and its behaviour.
1 participant