Slicing DataArray can take longer than not slicing #2004
Here's a simpler case that gets at the essence of the problem:

```python
import xarray as xr
import numpy as np

source = xr.DataArray(np.zeros((100, 12000)), dims=['time', 'x'])
source.to_netcdf('test.nc', format='NETCDF4')
reopened = xr.open_dataarray('test.nc')

%time reopened[::1, ::1].compute()
# CPU times: user 1.35 ms, sys: 6.77 ms, total: 8.12 ms

%time reopened[::1, ::10].compute()
# CPU times: user 371 ms, sys: 1.33 s, total: 1.7 s
```
Yeah, good example. It eliminates a lot of possible variables, such as problems with netcdf4 compression. We should probably check whether this happens in v0.10.0, to see whether the changes to the indexing system caused it.
The culprit appears to be netCDF4-python and/or netCDF-C:
When I try doing the same operation with h5netcdf, it runs very quickly:

```python
reopened = xr.open_dataarray('test.nc', engine='h5netcdf')

%time reopened[::1, ::10].compute()
# CPU times: user 6.11 ms, sys: 3.63 ms, total: 9.74 ms
```
My bet is netCDF4-python. I don't want to dig through the C code to confirm it, though. Sigh... this isn't going to be a fun one to track down. Shall I open a bug report over there?
This might be relevant: Unidata/netcdf4-python#680. Still reading through the thread.
netcdf4-python does
Dunno. I can't seem to get that engine working on my system. Reading through that thread, I wonder if the optimization they added only applies when there is just one stride greater than one?
Ah, nevermind, I see that our examples only had one greater-than-one stride.
H5py is doing all the hard work for this in h5netcdf.
Confirmed that the slow performance of netcdf4-python on strided access is due to the way that netcdf-c calls HDF5. There's now an issue on the netcdf-c issue tracker to implement fast strided access for HDF5 files (Unidata/netcdf-c#908).
From the corresponding netcdf-c pull request (re: github issue #908, also in reference to pydata/xarray#2004):

The netcdf-c library had implemented the nc_get_vars and nc_put_vars operations element at a time, which resulted in very slow operation. This PR attempts to improve the situation for netcdf-4/hdf5 files by using the slab operations provided by the hdf5 library. The new implementation passes the get/put vars stride information down to the hdf5 slab operations. The result appears to improve performance significantly: some simple tests on large 2-D arrays show speedups in excess of 150.

Misc. other changes:
1. Fix a bug in ncgen/semantics.c: a list's allocated length was being used instead of its actual length.
2. Added a temporary hook in the netcdf library plus a performance test case (tst_varsperf.c) to estimate the speedup. After users have had some experience with this, I will remove it, probably after the 4.7 release.
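The access-pattern difference the PR describes can be illustrated outside of netCDF entirely. Below is a minimal sketch in plain NumPy (not the actual netcdf-c code): one function copies a strided selection element by element, mimicking the old nc_get_vars behavior, while the other reads the same selection as a single strided slab, as the HDF5 hyperslab API effectively does. The function names are hypothetical, for illustration only.

```python
import numpy as np

def strided_read_elementwise(arr, stride):
    """Copy every `stride`-th column one element at a time,
    mimicking the old element-at-a-time nc_get_vars behavior."""
    rows, cols = arr.shape
    out = np.empty((rows, (cols + stride - 1) // stride))
    for i in range(rows):
        for j_out, j in enumerate(range(0, cols, stride)):
            out[i, j_out] = arr[i, j]
    return out

def strided_read_slab(arr, stride):
    """Read the same selection as one strided slab, the way the
    HDF5 hyperslab operations (and h5py) handle it."""
    return arr[:, ::stride]

# Both produce the same result; the slab version avoids the
# per-element call overhead that made strided nc_get_vars slow.
data = np.zeros((100, 12000))
assert np.array_equal(strided_read_elementwise(data, 10),
                      strided_read_slab(data, 10))
```

The speedup reported in the PR comes from replacing the inner per-element loop (which in netcdf-c meant one HDF5 call per element) with a single slab request carrying the stride information.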
netcdf-c master now includes the same mechanism for strided access of HDF5 files as h5py. If netCDF4-python is linked against netcdf-c >= 4.6.2, performance for strided access should be greatly improved.
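One way to check whether an installed netCDF4-python is linked against a fixed netcdf-c is to compare the linked library's version string against 4.6.2. A small sketch, assuming netCDF4-python's `__netcdf4libversion__` attribute (which reports the linked C library version); the version-parsing helper here is illustrative, not part of either library:

```python
import re

def parse_version(version):
    """Turn a version string like '4.6.2' (or '4.6.2-rc1')
    into a comparable tuple of ints."""
    parts = []
    for piece in version.split('.'):
        m = re.match(r'\d+', piece)
        if not m:
            break
        parts.append(int(m.group()))
    return tuple(parts)

def has_fast_strided_access(libversion):
    """netcdf-c >= 4.6.2 includes the HDF5 slab-based strided reads."""
    return parse_version(libversion) >= (4, 6, 2)

# Usage (requires netCDF4-python to be installed):
# import netCDF4
# print(has_fast_strided_access(netCDF4.__netcdf4libversion__))
```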
The performance difference here does indeed appear to have been fixed with netCDF-C 4.6.2 (but see also #2747).
Can this be closed?
I think so, at least in terms of my original problem.
Code Sample, a copy-pastable example if possible
So, without any slicing, it takes approximately 7.5 seconds for me to load this complete file into memory. Now, let's see what happens when I slice the DataArray and load it:

I killed this session after 17 minutes. `top` did not report any unusual I/O wait, and memory usage was not out of control. I am using v0.10.2 of xarray. My suspicion is that something is wrong with the indexing system that is causing xarray to read in the data in a bad order. Notice that if I slice all the data, the timing works out the same as reading it all in straight-up. Not shown here is a run where, if I slice every 100 lats and 100 longitudes, the timing is shorter again, but not down to the time of reading it all in at once. Let me know if you want a copy of the file. It is a compressed netcdf4, taking up only 1.7 MB.
I wonder if this is related to #1985?