module 'numcodecs' has no attribute 'Shuffle' #195

Closed
jonblower opened this issue Jul 18, 2022 · 10 comments

@jonblower

jonblower commented Jul 18, 2022

I'm a newbie here, so apologies if I'm missing something obvious! I'm trying to read a NetCDF4 file over http(s) and am getting the error message "module 'numcodecs' has no attribute 'Shuffle'". This has happened with a range of files.

I've done quite a lot of googling about this error but haven't found a solution (or at least, not one that I understood!). Perhaps this is the closest discussion: zarr-developers/numcodecs#260.

I'm not sure if this is an issue in kerchunk.hdf or whether it's deeper in numcodecs. In case it helps, I'm on an M1 Mac - sometimes this causes issues because there's a missing binary (ARM) dependency somewhere in the tree.

Here's some code that reproduces the error on my system. I'm using one of the free NetCDF4 files provided by Unidata as the example:

import json
import fsspec
import kerchunk.hdf

URL = "https://www.unidata.ucar.edu/software/netcdf/examples/test_echam_spectral-deflated.nc"

with fsspec.open(URL, "rb") as f:
    h5chunks = kerchunk.hdf.SingleHdf5ToZarr(f, URL, inline_threshold=100)
    print(json.dumps(h5chunks.translate()))

Traceback:

Traceback (most recent call last):
  File "/Users/me/Code/kerchunk-test/test2.py", line 12, in <module>
    print(json.dumps(h5chunks.translate()))
  File "/Users/me/opt/miniconda3/envs/kerchunk-test/lib/python3.10/site-packages/kerchunk/hdf.py", line 73, in translate
    self._h5f.visititems(self._translator)
  File "/Users/me/opt/miniconda3/envs/kerchunk-test/lib/python3.10/site-packages/h5py/_hl/group.py", line 613, in visititems
    return h5o.visit(self.id, proxy)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 355, in h5py.h5o.visit
  File "h5py/h5o.pyx", line 302, in h5py.h5o.cb_obj_simple
  File "/Users/me/opt/miniconda3/envs/kerchunk-test/lib/python3.10/site-packages/h5py/_hl/group.py", line 612, in proxy
    return func(name, self[name])
  File "/Users/me/opt/miniconda3/envs/kerchunk-test/lib/python3.10/site-packages/kerchunk/hdf.py", line 182, in _translator
    filters.append(numcodecs.Shuffle(elementsize=h5obj.dtype.itemsize))
AttributeError: module 'numcodecs' has no attribute 'Shuffle'
@peterm790
Collaborator

This runs with no error on my intel laptop or on a Qhub/aws instance, so I suspect you are right that it's a numcodecs issue specific to M1.

@martindurant
Member

It may be that the cython/build step failed for M1? What version of numcodecs do you have, and how was it installed? What does dir(numcodecs) or from numcodecs.registry import codec_registry; codec_registry give you?
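The checks asked for above can be gathered into one small script. This is only a sketch: the helper names `describe_module` and `registered_codecs` are hypothetical and not part of numcodecs or kerchunk. It relies only on attributes already mentioned in this thread (`numcodecs.Shuffle`, `numcodecs.registry.codec_registry`) plus the module's `__version__` and `__file__`.

```python
import importlib


def describe_module(module_name, attribute):
    """Report version, location, and whether `attribute` exists on a module.

    Hypothetical helper for illustration; returns a plain dict so the
    result can be pasted into a bug report.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {"installed": False, "version": None,
                "file": None, "has_attribute": False}
    return {
        "installed": True,
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
    }


def registered_codecs():
    """Return the sorted codec ids from numcodecs' registry, or [] if
    numcodecs cannot be imported."""
    try:
        from numcodecs.registry import codec_registry
    except ImportError:
        return []
    return sorted(codec_registry)


if __name__ == "__main__":
    # The failing call in the traceback is numcodecs.Shuffle(...),
    # so that is the attribute worth checking.
    print(describe_module("numcodecs", "Shuffle"))
    print(registered_codecs())
```

If `has_attribute` comes back `False`, the `version` and `file` fields should show which numcodecs build is actually being imported.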

@jonblower
Author

Thanks for the reply! dir(numcodecs) gives me:

['Adler32', 'AsType', 'BZ2', 'Base64', 'Blosc', 'CRC32', 'Categorize', 'Delta', 'FixedScaleOffset', 'GZip', 
'JSON', 'LZ4', 'LZMA', 'MsgPack', 'PackBits', 'Pickle', 'Quantize', 'VLenArray', 'VLenBytes', 'VLenUTF8',
 'Zlib', 'Zstd', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', 
'__path__', '__spec__', '__version__', 'abc', 'astype', 'atexit', 'base64', 'blosc', 'bz2', 'categorize', 
'checksum32', 'compat', 'compat_ext', 'delta', 'fixedscaleoffset', 'get_codec', 'gzip', 'json', 'lz4', 'lzma', 
'msgpacks', 'multiprocessing', 'ncores', 'packbits', 'pickles', 'quantize', 'register_codec', 'registry', 
'version', 'vlen', 'zlib', 'zstd']

I think I installed things like this (in a Python 3.10 conda environment):

pip install kerchunk
conda install h5py zarr s3fs

@martindurant
Member

OK, so numcodecs will have been installed by the first line, since kerchunk has it as an explicit dependency. Looking at https://pypi.org/project/numcodecs/#files , py310 only has an x86 build for mac, so I suppose your system attempted a compile and skipped anything that failed. You should report this on the numcodecs repo; however, checking https://anaconda.org/conda-forge/numcodecs/files , you can conda-install numcodecs from the conda-forge channel and expect things to work.

@martindurant
Member

What was your numcodecs version? It may be that the lack of a wheel on pypi meant you got an old version without Shuffle.

@jonblower
Author

Ah, numcodecs seems to be v0.7.3.

I found that the version of numcodecs in conda-forge worked though. I created my environment like this:

conda create --name kerchunk-test2 python=3.10 h5py zarr requests aiohttp
conda activate kerchunk-test2
pip install kerchunk

This installs numcodecs version 0.10.0. Then my program above prints out the JSON as expected.
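For anyone wanting to check this before running kerchunk, here is a minimal sketch that compares the installed numcodecs version against the two data points from this thread (0.7.3 lacked `Shuffle`, 0.10.0 worked). The helper names are hypothetical, and the exact release that introduced `Shuffle` is not established here, so 0.10.0 is used only as a known-good reference. Versions are compared as integer tuples because plain string comparison would sort "0.10.0" before "0.7.3".

```python
import re
from importlib import metadata

# Data points reported in this thread, not an official minimum version.
KNOWN_BROKEN = "0.7.3"
KNOWN_WORKING = "0.10.0"


def version_tuple(version):
    """Turn '0.10.0' into (0, 10, 0); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed, minimum):
    """True if `installed` is the same as or newer than `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("numcodecs")
    if found is None:
        print("numcodecs is not installed in this environment")
    elif is_at_least(found, KNOWN_WORKING):
        print(f"numcodecs {found}: at least the version that worked here")
    else:
        print(f"numcodecs {found}: older than {KNOWN_WORKING}, "
              "Shuffle may be missing")
```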

@martindurant
Member

OK, I'll close this then, but feel free to request the arm/mac wheel at numcodecs.

@jonblower
Author

Sorry, I know this is closed, but just for posterity: I should have made it clearer that before running the above commands to create my environment, I had to add conda-forge to my channels:

conda config --add channels conda-forge 
conda config --set channel_priority strict

If this isn't done, then numcodecs version 0.7.3 will be pulled in as a dependency of zarr.

@jonblower
Author

jonblower commented Jul 19, 2022

Or the following also works, if you don't want to permanently set conda-forge as the default channel in your ~/.condarc:

conda create --name kerchunk-test3
conda activate kerchunk-test3
conda install -c conda-forge python=3.10 h5py zarr requests aiohttp
pip install kerchunk

@martindurant
Member

Sounds like we should make a kerchunk conda-forge release :)
