Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hypothesis test for netCDF4 roundtrip #3283

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

takluyver
Copy link
Member

Part of #1846: add a property-based test for reading & writing netCDF4 files.

This is the first time I've played with Hypothesis, but it seems to be working - e.g. I got an error with float16, and the netCDF docs show that 16-bit floats are not a supported data type.

However:

  • This currently only tests a dataset with a single variable - it could be extended to multiple variables if that's useful.
  • It looks like netCDF4 should support unicode characters, but it failed when I didn't have max_codepoint=255 in there. I don't know if that's an expected limitation I'm not aware of, or a bug somewhere. But I thought I'd make the test pass for now.

@pep8speaks
Copy link

pep8speaks commented Sep 6, 2019

Hello @takluyver! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-10-12 09:28:07 UTC

@dcherian
Copy link
Contributor

Thanks @takluyver. LGTM but maybe @Zac-HD can review this and #3285


# Run for a while - arrays are a bigger search space than usual
settings.register_profile("ci", deadline=None)
settings.load_profile("ci")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd delete this part - it's at best redundant given the changes in #3285 and I'd prefer that config.

compatible_names = st.text(
alphabet=st.characters(
whitelist_categories=("Ll", "Lu", "Nd"),
# It looks like netCDF should allow unicode names, but removing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this specified? It would be useful to provide a link and prehaps explain what's going on with the categories too 😄

properties/test_netcdf_roundtrip.py Show resolved Hide resolved
properties/test_netcdf_roundtrip.py Outdated Show resolved Hide resolved
@jhamman jhamman mentioned this pull request Oct 7, 2019
9 tasks
@jhamman
Copy link
Member

jhamman commented Oct 10, 2019

@takluyver - just checking in on this one to see if there is interested in wrapping this up.

@jhamman jhamman closed this Oct 10, 2019
@jhamman jhamman reopened this Oct 10, 2019
@takluyver
Copy link
Member Author

Yes, sorry, it dropped off my radar. I've implemented all the changes suggested above.

The datetime64 & timedelta64 cases don't roundtrip completely, so there are test failures. I've added these in a separate commit at the end, so it's easy to cut that back out if you prefer.

@Zac-HD
Copy link
Contributor

Zac-HD commented Oct 11, 2019

The datetime64 & timedelta64 cases don't roundtrip completely, so there are test failures. I've added these in a separate commit at the end, so it's easy to cut that back out if you prefer.

The basic problem seems to be that Hypothesis is generating timedelta64[Y], but Xarray is dealing with timedelta64[ns]:

____________________________ test_netcdf_roundtrip _____________________________
tmp_path = PosixPath('/tmp/pytest-of-vsts/pytest-0/test_netcdf_roundtrip0')
data = data(...), arr = array([293], dtype='timedelta64[Y]')
        ...
        with xr.open_dataset(tmp_path / "test.nc") as roundtripped:
>           xr.testing.assert_identical(original, roundtripped)
E           AssertionError: Left and right Dataset objects are not identical
E           
E           Differing data variables:
E           L   data     (0) timedelta64[ns] -106488 days +01:41:02.290448
E           R   data     (0) timedelta64[ns] -106488 days +01:41:02.290449

properties/test_netcdf_roundtrip.py:51: AssertionError
---------------------------------- Hypothesis ----------------------------------
Falsifying example: test_netcdf_roundtrip(
    tmp_path=PosixPath('/tmp/pytest-of-vsts/pytest-0/test_netcdf_roundtrip0'), 
    data=data(...), 
    arr=array([293], dtype='timedelta64[Y]')
)

So either that's a pretty serious bug, or you should specify the max_period of the timedelta dtype strategy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants