xr.decode_cf() fails to replace FillValue with np.nan with numpy >= 2.0 when data array dtype is np.float32 #9381

Open
5 tasks done
ghiggi opened this issue Aug 19, 2024 · 3 comments

ghiggi commented Aug 19, 2024

What happened?

With numpy >= 2.0, xr.decode_cf() fails to replace values equal to _FillValue with np.nan when:

  • the data array dtype is np.float32
  • the _FillValue attribute is a Python float.

The issue does not occur for data arrays with dtype np.float64.
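
One plausible explanation, stated here as an assumption rather than a confirmed diagnosis, is that the fill value ends up as a float64 scalar before the equality check, and numpy >= 2.0 no longer downcasts such a scalar to match a float32 array. The sketch below reproduces that precision mismatch explicitly, without going through xarray, so it behaves the same on any numpy version:

```python
import numpy as np

fill_value = -99.9
data = np.full((2, 3), fill_value, dtype=np.float32)

# Compare in float64 precision: the float32 values are widened, and the
# widened value is no longer exactly -99.9, so nothing matches.
mask64 = data.astype(np.float64) == np.float64(fill_value)

# Compare in float32 precision: the fill value is rounded the same way
# as the data, so every element matches.
mask32 = data == np.float32(fill_value)

print(mask64.any(), mask32.all())
```

With float64 data the widening step is a no-op, which would be consistent with the float64 case working.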

This bug should not affect datasets opened with open_dataset because, with numpy >= 2.0, numeric attribute values are stored directly as np.int<...> or np.float32 types instead of the previously used Python int or float types.

To keep my code compatible with all numpy versions, I currently use:

import numpy as np
from packaging import version

# ds is an existing xr.Dataset
if version.parse(np.__version__) >= version.parse("2.0.0"):
    vars_and_coords = list(ds.data_vars) + list(ds.coords)
    for var in vars_and_coords:
        if "_FillValue" in ds[var].attrs:
            # Cast the fill value to the variable's own dtype
            ds[var].attrs["_FillValue"] = ds[var].data.dtype.type(ds[var].attrs["_FillValue"])

What did you expect to happen?

xr.decode_cf() should replace fill values with np.nan even when the _FillValue attribute is a Python float.

Minimal Complete Verifiable Example

import numpy as np
import xarray as xr

fill_value = -99.9
data = np.ones((2, 3)) * fill_value
data = data.astype(np.float32)  # The error does not occur with float64!

ds = xr.DataArray(data, dims=["x", "y"]).to_dataset(name="var")

ds["var"].attrs["_FillValue"] = fill_value  # Python float
xr.decode_cf(ds)["var"].data  # does not replace -99.9 values with np.nan

ds["var"].attrs["_FillValue"] = np.float32(fill_value)  # np.float32
xr.decode_cf(ds)["var"].data  # does replace -99.9 values with np.nan

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 6.2.0-33-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.7.0
pandas: 2.2.2
numpy: 2.0.1
scipy: 1.14.0
netCDF4: 1.7.1
pydap: None
h5netcdf: None
h5py: 3.11.0
zarr: None
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.8.1
distributed: 2024.8.1
matplotlib: 3.9.2
cartopy: 0.23.0
seaborn: None
numbagg: None
fsspec: 2024.6.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 72.1.0
pip: 24.2
conda: None
pytest: 8.3.2
mypy: None
IPython: 8.26.0
sphinx: 8.0.2

@ghiggi ghiggi added bug needs triage Issue that has not been reviewed by xarray team member labels Aug 19, 2024
@dcherian dcherian added topic-CF conventions and removed needs triage Issue that has not been reviewed by xarray team member labels Aug 23, 2024
dcherian added a commit to dcherian/xarray that referenced this issue Aug 23, 2024
@dcherian
Contributor

Thanks for the nice bug report. This seems to be broken with float16 too.

Are you able to investigate a fix?

@kmuehlbauer
Contributor

It seems we need to relax the check for equality in case of different dtype.
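
One way to read "relax the check" is a tolerance-based comparison. The following is only a sketch of that idea using plain numpy, not the approach taken in xarray; the default tolerances of np.isclose are an assumption and could also mask genuine data values that lie very close to the fill value:

```python
import numpy as np

data = np.array([1.0, -99.9], dtype=np.float32)
fill_value = np.float64(-99.9)  # deliberately a different dtype than data

# Tolerance-based match instead of exact equality
mask = np.isclose(data, fill_value)

# Replace matched elements with NaN, keep the rest unchanged
decoded = np.where(mask, np.nan, data)
print(mask, decoded)
```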

@kmuehlbauer
Contributor

Would it help to cast fv to dtype here?

https://github.com/pydata/xarray/blob/main/xarray/coding/variables.py#L228-L232
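
As a rough illustration of the casting idea, here is a standalone sketch. The helper name mask_fill_value is hypothetical and is not part of xarray, the code does not reproduce the linked lines in variables.py, and it assumes floating-point data so that NaN is representable:

```python
import numpy as np

def mask_fill_value(data, fill_value):
    """Hypothetical helper (not xarray API): cast the fill value to the
    data's dtype before the equality check, then replace matches with NaN."""
    data = np.asarray(data)
    fv = data.dtype.type(fill_value)  # e.g. np.float32(-99.9) for float32 data
    return np.where(data == fv, np.nan, data)

data = np.array([1.0, -99.9], dtype=np.float32)
decoded = mask_fill_value(data, -99.9)  # fill value given as a Python float
print(decoded)
```

This mirrors the per-variable workaround in the original report, but applies the cast at comparison time instead of rewriting the attribute.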

dcherian added a commit that referenced this issue Aug 23, 2024