FIX: h5py>=3 string decoding #4893

Merged: 8 commits, Feb 17, 2021
2 changes: 1 addition & 1 deletion ci/requirements/environment-windows.yml
@@ -11,7 +11,7 @@ dependencies:
   - dask<2021.02.0
   - distributed
   - h5netcdf
-  - h5py=2
+  - h5py
   - hdf5
   - hypothesis
   - iris
2 changes: 1 addition & 1 deletion ci/requirements/environment.yml
@@ -12,7 +12,7 @@ dependencies:
   - dask<2021.02.0
   - distributed
   - h5netcdf
-  - h5py=2
+  - h5py
   - hdf5
   - hypothesis
   - iris
2 changes: 1 addition & 1 deletion ci/requirements/py38-all-but-dask.yml
@@ -12,7 +12,7 @@ dependencies:
   - cftime
   - coveralls
   - h5netcdf
-  - h5py=2
+  - h5py
   - hdf5
   - hypothesis
   - lxml # Optional dep of pydap
7 changes: 7 additions & 0 deletions xarray/backends/h5netcdf_.py
@@ -131,6 +131,7 @@ def open(
         autoclose=False,
         invalid_netcdf=None,
         phony_dims=None,
+        decode_strings=None,
     ):

         if isinstance(filename, bytes):
@@ -157,6 +158,10 @@ def open(
                     "h5netcdf backend keyword argument 'phony_dims' needs "
                     "h5netcdf >= 0.8.0."
                 )
+        if LooseVersion(h5netcdf.__version__) >= LooseVersion("0.9.0") and LooseVersion(
+            h5netcdf.core.h5py.__version__
+        ) >= LooseVersion("3.0.0"):
+            kwargs["decode_strings"] = decode_strings

         if lock is None:
             if mode == "r":
@@ -358,6 +363,7 @@ def open_dataset(
         lock=None,
         invalid_netcdf=None,
         phony_dims=None,
+        decode_strings=True,
     ):

         store = H5NetCDFStore.open(
@@ -367,6 +373,7 @@ def open_dataset(
             lock=lock,
             invalid_netcdf=invalid_netcdf,
             phony_dims=phony_dims,
+            decode_strings=decode_strings,
         )

         store_entrypoint = StoreBackendEntrypoint()
7 changes: 5 additions & 2 deletions xarray/coding/strings.py
@@ -191,8 +191,11 @@ def _numpy_char_to_bytes(arr):
     """Like netCDF4.chartostring, but faster and more flexible."""
     # based on: http://stackoverflow.com/a/10984878/809705
     arr = np.array(arr, copy=False, order="C")
-    dtype = "S" + str(arr.shape[-1])
-    return arr.view(dtype).reshape(arr.shape[:-1])
+    shape = arr.shape
+    dtype = "S" + str(shape[-1])
+    if arr.dtype.kind == "O":
+        arr = arr.astype("S1")
Review comment (Member):

Hmm. We should probably just fix this newly introduced issue over in h5netcdf first :)

Reply (Contributor Author):

@shoyer Sure, I'll ping you over there.

+    return arr.view(dtype).reshape(shape[:-1])


 class StackedBytesArray(indexing.ExplicitlyIndexedNDArrayMixin):
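For readers following the `strings.py` hunk, here is a small standalone sketch of what the object-dtype branch does. It is an illustration, not the merged implementation: `char_to_bytes` is a hypothetical name, `np.asarray` replaces the `np.array(..., copy=False)` call from the diff, and the input is assumed to hold single-byte values. Note that the reviewer suggested fixing the underlying issue in h5netcdf instead, so treat this only as a reading aid for the change as shown.

```python
import numpy as np


def char_to_bytes(arr):
    """Hypothetical sketch: join single characters along the last axis."""
    arr = np.asarray(arr, order="C")
    shape = arr.shape
    # One fixed-width bytes item per row, as wide as the last axis.
    dtype = "S" + str(shape[-1])
    if arr.dtype.kind == "O":
        # An object array cannot be reinterpreted with .view(), so
        # convert it to single-byte strings first.
        arr = arr.astype("S1")
    # Reinterpret each row of S1 items as one S<n> item, then drop the
    # trailing length-1 axis that the view leaves behind.
    return arr.view(dtype).reshape(shape[:-1])


# Object-dtype input, the case the new branch handles.
object_chars = np.array([[b"a", b"b", b"c"], [b"d", b"e", b"f"]], dtype=object)
from_objects = char_to_bytes(object_chars)

# Plain S1 input, which worked before the change and still does.
s1_chars = np.array([[b"x", b"y"]], dtype="S1")
from_s1 = char_to_bytes(s1_chars)
```

Capturing `shape` before the conversion means the same value is used for both the target dtype and the final reshape, which matches the structure of the diff.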