Bump datasets package #536

Merged
merged 4 commits into from
Mar 4, 2024

shubhobm
Collaborator

The detector in packagehallucination is erroring out:

  File "/home/ec2-user/anaconda3/envs/py3.11/lib/python3.11/site-packages/garak/detectors/packagehallucination.py", line 52, in detect
    self._load_package_list()
  File "/home/ec2-user/anaconda3/envs/py3.11/lib/python3.11/site-packages/garak/detectors/packagehallucination.py", line 44, in _load_package_list
    pypi_dataset = datasets.load_dataset(self.pypi_dataset_name, split="train")
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/anaconda3/envs/py3.11/lib/python3.11/site-packages/datasets/load.py", line 2146, in load_dataset
    ds = builder_instance.as_dataset(split=split, verification_mode=verification_mode, in_memory=keep_in_memory)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/anaconda3/envs/py3.11/lib/python3.11/site-packages/datasets/builder.py", line 1173, in as_dataset
    raise NotImplementedError(f"Loading a dataset cached in a {type(self._fs).__name__} is not supported.")

Turns out this is due to a breaking change in fsspec, which is propagating to garak through the datasets dependency.

The fix is to bump the datasets version requirement.

bump datasets
@leondz leondz self-requested a review February 29, 2024 20:08
@leondz
Owner

leondz commented Feb 29, 2024

Can you sync up requirements.txt to this? The tests should pick it up (python3 -m pytest tests/test_reqs.py). Wrote a local fix then realised I don't have access to your branch
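
For illustration, a minimal sketch of the kind of consistency check this implies: comparing the specs in requirements.txt against a second declared dependency list. This is not the contents of tests/test_reqs.py; the helper names `normalize` and `out_of_sync` are hypothetical, and the thread does not say which file the second list lives in.

```python
def normalize(lines):
    """Drop comments, blank lines and internal spaces; return sorted, lowercased specs."""
    specs = []
    for line in lines:
        # Assumption: "#" only ever starts a comment (no URL fragments in the specs).
        spec = line.split("#", 1)[0].strip().replace(" ", "")
        if spec:
            specs.append(spec.lower())
    return sorted(specs)


def out_of_sync(requirements_txt_lines, declared_dependencies):
    """Return the specs that appear in one list but not the other."""
    left = set(normalize(requirements_txt_lines))
    right = set(normalize(declared_dependencies))
    return sorted(left ^ right)


if __name__ == "__main__":
    reqs = ["datasets>=2.14.6,<2.17", "# a comment", ""]
    declared = ["datasets >=2.14.6, <2.17"]
    print(out_of_sync(reqs, declared))  # empty list when both places agree
```

An empty result means the two lists agree; anything returned is a spec that was bumped in one place only.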

Owner

@leondz leondz left a comment

Also bump the datasets version in requirements.txt.

@leondz
Owner

leondz commented Mar 1, 2024

Looks like it recently broke again and the current magic version spec is

datasets>=2.14.6,<2.17
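
As a rough sketch of what that spec admits, the check below tests a version string against the range above (lower bound inclusive, upper bound exclusive). The helpers `parse` and `satisfies_spec` are hypothetical, and they handle only plain numeric release versions such as `2.16.1`, not pre-release or dev suffixes; a full PEP 440 comparison would need a dedicated parser.

```python
def parse(version, width=4):
    """Turn '2.14.6' into (2, 14, 6, 0), zero-padded so '2.17' equals '2.17.0'."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies_spec(version, lower="2.14.6", upper="2.17"):
    """True if lower <= version < upper, i.e. datasets>=2.14.6,<2.17 by default."""
    return parse(lower) <= parse(version) < parse(upper)


if __name__ == "__main__":
    for candidate in ["2.14.5", "2.14.6", "2.16.1", "2.17.0"]:
        print(candidate, satisfies_spec(candidate))
```

To check the version actually installed in an environment, the string returned by the standard library call `importlib.metadata.version("datasets")` could be passed to `satisfies_spec`, provided it is a plain numeric release.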

@shubhobm
Collaborator Author

shubhobm commented Mar 1, 2024

Updated versions in both places.

@leondz leondz merged commit 4138e85 into leondz:main Mar 4, 2024
1 check passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 4, 2024

2 participants