
Bug loading netCDF files when there is a gap in time between some of the files #247

Closed
spencerahill opened this issue Dec 8, 2017 · 3 comments


@spencerahill
Owner

I have data for a simulation spanning years 1-100 and years 701-800. When both are present in the same directory, I get a failure:

%run ~/py/scripts/aospy_main_burls.py

Requested aospy calculations:
{'date_ranges': [(datetime.datetime(701, 1, 1, 0, 0),
                  datetime.datetime(800, 12, 31, 0, 0))],
 'input_time_datatypes': ['ts'],
 'input_time_intervals': ['monthly'],
 'input_time_offsets': [None],
 'input_vertical_datatypes': ['sigma'],
 'library': <module 'aospy_user' from '/home/Spencer.Hill/py/aospy_user/aospy_user/__init__.py'>,
 'models': [Model instance "cesm"],
 'output_time_intervals': ['ann'],
 'output_time_regional_reductions': ['av', 'ts'],
 'output_vertical_reductions': [None],
 'projects': [Project instance "burls"],
 'regions': [],
 'runs': [aospy.Run instance "abrupt4x_co2"],
 'variables': [Var instance "merid_total_energy_transport"]}

Perform these computations? [y/n] y
INFO:root:Connected to client: <Client: scheduler='tcp://127.0.0.1:32791' processes=1 cores=16>
INFO:root:Getting input data: Var instance "swdn_toa" (Fri Dec  8 18:33:38 2017)
/home/s1h/anaconda/envs/py36/lib/python3.6/site-packages/xarray/conventions.py:393: RuntimeWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using dummy netCDF4.datetime objects instead, reason: dates out of range
  result = decode_cf_datetime(example_value, units, calendar)
/home/s1h/anaconda/envs/py36/lib/python3.6/site-packages/xarray/conventions.py:412: RuntimeWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using dummy netCDF4.datetime objects instead, reason: dates out of range
  calendar=self.calendar)
WARNING:root:Skipping aospy calculation `<aospy.Calc instance: merid_total_energy_transport, burls, cesm, abrupt4x_co2>` due to error with the following traceback:
Traceback (most recent call last):
  File "/home/Spencer.Hill/py/aospy/aospy/automate.py", line 251, in _compute_or_skip_on_error
    return calc.compute(**compute_kwargs)
  File "/home/Spencer.Hill/py/aospy/aospy/calc.py", line 648, in compute
    self.end_date),
  File "/home/Spencer.Hill/py/aospy/aospy/calc.py", line 473, in _get_all_data
    for n, var in enumerate(self.variables)]
  File "/home/Spencer.Hill/py/aospy/aospy/calc.py", line 473, in <listcomp>
    for n, var in enumerate(self.variables)]
  File "/home/Spencer.Hill/py/aospy/aospy/calc.py", line 425, in _get_input_data
    **self.data_loader_attrs)
  File "/home/Spencer.Hill/py/aospy/aospy/data_loader.py", line 245, in load_variable
    np.datetime64(end_date_xarray)).load()
  File "/home/Spencer.Hill/py/aospy/aospy/utils/times.py", line 513, in sel_time
    _assert_has_data_for_time(da, start_date, end_date)
  File "/home/Spencer.Hill/py/aospy/aospy/utils/times.py", line 484, in _assert_has_data_for_time
    da_start, da_end)
AssertionError: Data does not exist for requested time range: 2378-01-01T00:00:00.000000 to 2477-12-31T00:00:00.000000; found data from time range: 1678-01-01T00:00:00.000000000 to 1768-01-01T00:00:00.000000000.

This traceback is from requesting the later period; essentially the same thing happens when requesting the earlier period.
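
A side note on the years in that error message: they are shifted relative to the raw model years. Judging from the ranges reported, the workaround appears to map model year 1 to 1678 (the earliest year representable as nanosecond-precision datetime64); a rough sketch of that arithmetic, assuming such an offset:

```python
# Inferred from the error message above, not from aospy source: the
# out-of-range workaround seems to shift the time axis so that model
# year 1 corresponds to calendar year 1678.
offset = 1678 - 1                  # hypothetical offset applied by the workaround
print(701 + offset, 800 + offset)  # 2378 2477 -> the "requested" years in the error
print(1 + offset)                  # 1678 -> start of the "found data" range
```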

I don't have time right now to properly debug this, nor likely will I for some time, so I have to leave it at that for now.

@spencerkclark
Collaborator

> I have data for a simulation spanning years 1-100 and years 701-800.

I'll need more information on the specifics here, but right off the bat it seems that, aospy workaround or not, it would not be possible to load this set of files into a single dataset and decode the time index into a pandas DatetimeIndex, since the data spans 800 years; the limit with the current aospy workaround is 2262 - 1678 = 584 years.
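
For reference, a minimal sketch of where that 584-year figure comes from: a pandas DatetimeIndex stores nanosecond-precision datetime64 values, which can only represent dates between roughly 1678 and 2262.

```python
import pandas as pd

# Bounds of nanosecond-precision timestamps, i.e. what a DatetimeIndex can hold.
print(pd.Timestamp.min)  # 1677-09-21 00:12:43.145224193
print(pd.Timestamp.max)  # 2262-04-11 23:47:16.854775807

# With the workaround shifting the decoded time axis to start at 1678-01-01,
# the longest representable span is about 2262 - 1678 = 584 years, so an
# 800-year simulation cannot fit in a single decoded dataset.
print(pd.Timestamp.max.year - 1678)  # 584
```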

Until pydata/xarray#1084 is resolved I'm afraid the only workaround would be to split this into separate Runs.

@spencerahill
Owner Author

Ohhh you're probably right. I was focused on the gap in time, since I only introduced the years 1-100 data today, and that's when the error started. I'll double-check and update this accordingly.

@spencerahill
Owner Author

Closing this, as I'm almost certain that @spencerkclark was right about the source of the bug.
