
ERROR: Failed building wheel for tokenizers #1050

Closed

outdoorblake opened this issue Aug 23, 2022 · 66 comments
Labels
bug Something isn't working

Comments

@outdoorblake

System Info

I can't seem to get past this error "ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects" when installing transformers with pip. An ML friend of mine also tried on their own instance, hit the same problem, and helped me troubleshoot, but we weren't able to move past it, so I think it's possibly a recent issue.

I am following the transformers README install instructions step by step, with a venv and pytorch ready to go. Pip is also fully up to date. One suggestion in the error output is to install a Rust compiler, but we both felt this didn't seem like the right next step, since a Rust compiler usually isn't required when installing the transformers package and the README makes no mention of needing one.

Thanks in advance!
-Blake

Full output below:

command: pip install transformers

Collecting transformers
Using cached transformers-4.21.1-py3-none-any.whl (4.7 MB)
Requirement already satisfied: tqdm>=4.27 in ./venv/lib/python3.9/site-packages (from transformers) (4.64.0)
Requirement already satisfied: huggingface-hub<1.0,>=0.1.0 in ./venv/lib/python3.9/site-packages (from transformers) (0.9.0)
Requirement already satisfied: pyyaml>=5.1 in ./venv/lib/python3.9/site-packages (from transformers) (6.0)
Requirement already satisfied: regex!=2019.12.17 in ./venv/lib/python3.9/site-packages (from transformers) (2022.8.17)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
Using cached tokenizers-0.12.1.tar.gz (220 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.17 in ./venv/lib/python3.9/site-packages (from transformers) (1.23.2)
Requirement already satisfied: packaging>=20.0 in ./venv/lib/python3.9/site-packages (from transformers) (21.3)
Requirement already satisfied: filelock in ./venv/lib/python3.9/site-packages (from transformers) (3.8.0)
Requirement already satisfied: requests in ./venv/lib/python3.9/site-packages (from transformers) (2.26.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in ./venv/lib/python3.9/site-packages (from huggingface-hub<1.0,>=0.1.0->transformers) (4.3.0)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in ./venv/lib/python3.9/site-packages (from packaging>=20.0->transformers) (3.0.9)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./venv/lib/python3.9/site-packages (from requests->transformers) (1.26.7)
Requirement already satisfied: idna<4,>=2.5 in ./venv/lib/python3.9/site-packages (from requests->transformers) (3.3)
Requirement already satisfied: certifi>=2017.4.17 in ./venv/lib/python3.9/site-packages (from requests->transformers) (2021.10.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in ./venv/lib/python3.9/site-packages (from requests->transformers) (2.0.7)
Building wheels for collected packages: tokenizers
Building wheel for tokenizers (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for tokenizers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [51 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-12-arm64-cpython-39
creating build/lib.macosx-12-arm64-cpython-39/tokenizers
copying py_src/tokenizers/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/models
copying py_src/tokenizers/models/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/models
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/decoders
copying py_src/tokenizers/decoders/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/decoders
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/normalizers
copying py_src/tokenizers/normalizers/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/normalizers
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/pre_tokenizers
copying py_src/tokenizers/pre_tokenizers/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/pre_tokenizers
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/processors
copying py_src/tokenizers/processors/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/processors
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/trainers
copying py_src/tokenizers/trainers/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/trainers
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/byte_level_bpe.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/sentencepiece_unigram.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/sentencepiece_bpe.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/base_tokenizer.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/char_level_bpe.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/bert_wordpiece.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/tools
copying py_src/tokenizers/tools/__init__.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/tools
copying py_src/tokenizers/tools/visualizer.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/tools
copying py_src/tokenizers/__init__.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers
copying py_src/tokenizers/models/__init__.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/models
copying py_src/tokenizers/decoders/__init__.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/decoders
copying py_src/tokenizers/normalizers/__init__.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/normalizers
copying py_src/tokenizers/pre_tokenizers/__init__.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/pre_tokenizers
copying py_src/tokenizers/processors/__init__.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/processors
copying py_src/tokenizers/trainers/__init__.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/trainers
copying py_src/tokenizers/tools/visualizer-styles.css -> build/lib.macosx-12-arm64-cpython-39/tokenizers/tools
running build_ext
running build_rust
error: can't find Rust compiler

  If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
  
  To update pip, run:
  
      pip install --upgrade pip
  
  and then retry package installation.
  
  If you did intend to build this package from source, try installing a Rust compiler from your system package manager and ensure it is on the PATH during installation. Alternatively, rustup (available at https://rustup.rs) is the recommended way to download and update the Rust compiler toolchain.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for tokenizers
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

command: pip install transformers

(Same full output as shown under System Info above.)

Expected behavior

I would expect the transformers library to install without throwing an error when all prerequisites for installation are met.

@outdoorblake outdoorblake added the bug Something isn't working label Aug 23, 2022
@outdoorblake
Author

outdoorblake commented Aug 23, 2022

I am aware of this past issue; it is very similar, but the suggested fixes there seem dated and are not working.

@LysandreJik
Member

Let me move this over to tokenizers, which should be in a better position to help.

@LysandreJik LysandreJik transferred this issue from huggingface/transformers Aug 24, 2022
@erik-dunteman

Also having this issue; hadn't run into it before.

@Narsil
Collaborator

Narsil commented Aug 25, 2022

Are you guys on M1?
If that's the case, it's expected unfortunately (#932).

If not, what platform are you on? (OS, hardware, Python version?)

Basically, for M1 you need to install from source for now (fixes coming soon, #1055).

Also, the error message says you're missing a Rust compiler; it might be enough to just install one (https://www.rust-lang.org/tools/install) and the install may then go through. (It's easier when we prebuild those wheels, but still.)
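A minimal sketch of that Rust-compiler route (assuming a Unix shell; the rustup one-liner is the standard installer from rustup.rs):

    # Install the Rust toolchain via rustup
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    # Put cargo/rustc on PATH for the current shell
    source "$HOME/.cargo/env"
    # Retry: pip should now be able to build the tokenizers wheel from source
    pip install transformers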

@alibrahimzada

alibrahimzada commented Aug 25, 2022

I'm on an Apple M2 and can't install tokenizers. The same thing works fine on my Linux machine. How can we install tokenizers on M1/M2 Macs?

@stephantul

M1 user here. I got the same error; installing the Rust compiler fixed it for me.

@Narsil
Collaborator

Narsil commented Aug 26, 2022

It's all the same: we couldn't prebuild the library for M1 (which is an arm64 chip) because GitHub didn't have an arm64 action runner. We did manually push some prebuilt binaries, but it seems they contained some issues. Since then, GitHub has enabled runners for M1 machines (so all macOS + arm64), so hopefully this will be fixed in the next release.

Since this is a "major" release (we're still not at 1.0), we're going to do a full sweep of the slow tests in transformers (which is our biggest user), and hopefully this should work out of the box on M1 onwards after that!

@alibrahimzada

@stephantul where did you get the Rust compiler? I installed it from https://www.rust-lang.org/tools/install and pip3 install tokenizers still fails.

@stephantul

@alibrahimzada I installed it with Homebrew.

@Narsil
Collaborator

Narsil commented Aug 29, 2022

@alibrahimzada you might also need pip install setuptools_rust, and your Python environment needs to be built with a shared libpython (it depends on how you installed Python; for pyenv, for instance, you will need this: pyenv/pyenv#392).

(Careful, it's now PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install ...)
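Put together, that route looks roughly like this (a sketch; the Python version is illustrative):

    # Rebuild the interpreter with a shared libpython so the Rust extension can link against it
    PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.9.13
    pyenv local 3.9.13
    pip install setuptools_rust
    pip install tokenizers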

@argonaut76

Having the same problem and none of the above suggestions worked. Any ETA on when we can expect the next release that fixes this bug?

@anibzlv

anibzlv commented Sep 20, 2022

I am on M1 and managed to get around this in the following way:
I installed a Rust compiler using brew and then initialized it:

brew install rustup
rustup-init

Then I restarted the console and checked that it was installed with rustc --version. It turned out you also have to set up the PATH:

export PATH="$HOME/.cargo/bin:$PATH"

@prabu-ssb

I have done everything @Narsil and @anibzlv suggested. Still no luck (on an M1, 2021).

@argonaut76

Oddly enough, the library works just fine inside a virtual environment on my MBP with the M1 chip. So for now, that's my approach.

@prabu-ssb

I could install from source and it seems to be working.

@Narsil
Collaborator

Narsil commented Sep 27, 2022

Has anyone tried to install the latest version on M1? The prebuilt binaries should be released now!

@manyu252

I tried installing on M1 just now in a Python 3.10 virtual environment. All I had to do was pip install setuptools_rust; then I could install all the required packages.
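For the record, that sequence is roughly (a sketch; versions and paths are illustrative):

    python3.10 -m venv venv && source venv/bin/activate
    pip install --upgrade pip setuptools_rust
    pip install transformers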

@TR-EIP

TR-EIP commented Oct 6, 2022

I'm running on an M2 with Python 3.8 and am still running into this problem.
Any workaround other than installing from source?

@Narsil
Collaborator

Narsil commented Oct 6, 2022

I thought Python 3.8 was never built for M1/M2... so this library cannot provide a build for it.

Are you sure you're not running in compatibility mode, i.e. not really using a native 3.8?
https://stackoverflow.com/questions/69511006/cant-install-pyenv-3-8-5-on-macos-big-sur-with-m1-chip

@Vargol

Vargol commented Nov 10, 2022

Try telling pip to prefer binary packages; it'll probably give you an older version of tokenizers, but you won't need to build from source. It does depend on the version requirements for tokenizers.

The proper fix would be for Hugging Face to create wheels for Apple Silicon.
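Concretely, the prefer-binary route is just (a sketch; the flag makes pip pick an older wheel over a newer source distribution):

    pip install --prefer-binary transformers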

@Narsil
Collaborator

Narsil commented Nov 10, 2022

We already build wheels for Apple Silicon! Just not for Python 3.8, which isn't supposed to exist on M1 (only 3.9, 3.10, and 3.11 now).

@Vargol

Vargol commented Nov 10, 2022

Where's the binary wheel for 0.12.1? PyPI can't find it.
I'm having to use 0.11.6 to avoid having "install Rust" as an instruction for installing user software.

@Narsil
Collaborator

Narsil commented Nov 10, 2022

GitHub did not provide an action runner for M1 at the time, so builds were manual (and infrequent).

Any reason you cannot upgrade to 0.13.2 or 0.12.6?

But yes, for some older versions the M1 builds are not present; we're not doing retroactive builds, unfortunately.
I'm basically the sole maintainer here, and I don't really have the time to figure out all the old versions for all platforms (but ensuring that once a platform is supported it keeps working is something we're committed to).

@Vargol

Vargol commented Nov 10, 2022

In the project there are a number of other third-party Python modules dependent on tokenizers; from yesterday's build I got the following version constraint from pip:

Collecting tokenizers!=0.11.3,<0.13,>=0.11.1

Not sure why it isn't picking up 0.12.6; setting pip to prefer binary installed 0.11.6.

EDIT: answering my own question:
https://pypi.org/simple/tokenizers/ goes straight from 0.12.1 to 0.13.0; there is no 0.12.6.

@Narsil
Collaborator

Narsil commented Nov 10, 2022

Hmm, interesting. Could you try force-installing 0.12.6 and see if that fixes it?

Could you also share your env (Python version + hardware (M1, I guess) + requirements.txt)?

I don't remember the command, but there's a way to make pip explain its decisions regarding versions.
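Something along these lines, presumably (a sketch; --force-reinstall pins the exact version, and -v makes pip print which candidate files it considers and why it skips them):

    pip install --force-reinstall tokenizers==0.12.6
    pip install -v tokenizers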

@Narsil
Collaborator

Narsil commented Nov 10, 2022

I got confused with 0.11.6, sorry!

And I don't see 0.12 builds for arm; I'm guessing we moved to 0.13 first.

TBH there "shouldn't" be any major differences between 0.12.1 and 0.13, so switching might just work (I was cautious because we updated the PyO3 bindings version, and that triggered a lot of code changes, even though we didn't intend any functional changes).

transformers is probably what's limiting tokenizers here (we do that to let tokenizers make eventual breaking changes).
Maybe you could try updating it?

@Vargol

Vargol commented Nov 10, 2022

It's a bit convoluted ATM, as different OSes currently require different versions of gfpgan unless you install torch up front.

So I do

pip install "torch<1.13" "torchvision<1.14"

Main requirements.txt:

-r requirements-base.txt

protobuf==3.19.6
torch<1.13.0
torchvision<0.14.0
-e .

requirements-base.txt (pip will resolve the versions that match torch):

albumentations
dependency_injector==4.40.0
diffusers
einops
eventlet
flask==2.1.3
flask_cors==3.0.10
flask_socketio==5.3.0
flaskwebgui==0.3.7
getpass_asterisk
gfpgan
huggingface-hub
imageio
imageio-ffmpeg
kornia
numpy
omegaconf
opencv-python
pillow
pip>=22
pudb
pyreadline3
pytorch-lightning==1.7.7
realesrgan
scikit-image>=0.19
send2trash
streamlit
taming-transformers-rom1504
test-tube
torch-fidelity
torchmetrics
transformers==4.21.*
git+https://github.com/openai/CLIP.git@main#egg=clip
git+https://github.com/Birch-san/k-diffusion.git@mps#egg=k-diffusion
git+https://github.com/invoke-ai/clipseg.git@models-rename#egg=clipseg

@Vargol

Vargol commented Nov 10, 2022

I'll have to see why we limit transformers, assuming the reasoning hasn't been lost to history.

@Narsil
Collaborator

Narsil commented Feb 9, 2023

Can you provide more information on your environment?

Python version
CPU version
tokenizers version
Type of Python install (conda, pip, pyenv, etc.)

Have you tried installing the Rust compiler?
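For example, to gather those (a sketch; the last line assumes tokenizers is at least partially installed):

    python --version
    python -c "import platform; print(platform.machine())"
    pip --version
    python -c "import tokenizers; print(tokenizers.__version__)"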

@Sandy4321

Still getting the error on Google Colab:

!git clone https://github.com/huggingface/transformers && cd transformers && git checkout a3085020ed0d81d4903c50967687192e3101e770
!pip install ./transformers/

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing ./transformers
Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (1.22.4)
Collecting tokenizers==0.0.11 (from transformers==2.3.0)
Downloading tokenizers-0.0.11.tar.gz (30 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting boto3 (from transformers==2.3.0)
Downloading boto3-1.26.144-py3-none-any.whl (135 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 135.6/135.6 kB 15.5 MB/s eta 0:00:00
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (3.12.0)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (2.27.1)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (4.65.0)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (2022.10.31)
Collecting sentencepiece (from transformers==2.3.0)
Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 51.6 MB/s eta 0:00:00
Collecting sacremoses (from transformers==2.3.0)
Downloading sacremoses-0.0.53.tar.gz (880 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 880.6/880.6 kB 53.7 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting botocore<1.30.0,>=1.29.144 (from boto3->transformers==2.3.0)
Downloading botocore-1.29.144-py3-none-any.whl (10.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 113.7 MB/s eta 0:00:00
Collecting jmespath<2.0.0,>=0.7.1 (from boto3->transformers==2.3.0)
Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Collecting s3transfer<0.7.0,>=0.6.0 (from boto3->transformers==2.3.0)
Downloading s3transfer-0.6.1-py3-none-any.whl (79 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 79.8/79.8 kB 10.1 MB/s eta 0:00:00
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (1.26.15)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (2022.12.7)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (2.0.12)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (3.4)
Requirement already satisfied: six in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (1.16.0)
Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (8.1.3)
Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (1.2.0)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/local/lib/python3.10/dist-packages (from botocore<1.30.0,>=1.29.144->boto3->transformers==2.3.0) (2.8.2)
Building wheels for collected packages: transformers, tokenizers, sacremoses
Building wheel for transformers (setup.py) ... done
Created wheel for transformers: filename=transformers-2.3.0-py3-none-any.whl size=458550 sha256=236e7cf5654e4cff65da41ee3a83e39d34fbea6396b8051e9243120a5cae5dde
Stored in directory: /tmp/pip-ephem-wheel-cache-wlywjaz5/wheels/7c/35/80/e946b22a081210c6642e607ed65b2a5b9a4d9259695ee2caf5
error: subprocess-exited-with-error

× Building wheel for tokenizers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for tokenizers (pyproject.toml) ... error
ERROR: Failed building wheel for tokenizers
Building wheel for sacremoses (setup.py) ... done
Created wheel for sacremoses: filename=sacremoses-0.0.53-py3-none-any.whl size=895241 sha256=099fd152876aa843c9f04a284c7f7c9260d266b181e672796d1619a0f7e2be76
Stored in directory: /root/.cache/pip/wheels/00/24/97/a2ea5324f36bc626e1ea0267f33db6aa80d157ee977e9e42fb
Successfully built transformers sacremoses
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

@ArthurZucker
Collaborator

I would recommend installing tokenizers version 0.11.6 instead of the 0.0.11 that is being fetched because of the commit you checked out.
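For instance, in Colab (a sketch; --no-deps sidesteps that old checkout's hard pin on tokenizers==0.0.11, so the remaining dependencies such as boto3 and sacremoses would need to be installed separately):

    !pip install tokenizers==0.11.6
    !pip install --no-deps ./transformers/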

@cleblain

cleblain commented Jun 3, 2023

(Quotes the original issue report in full.)
I got the same issue with Pydroid.

@Teofebano

Hi @Narsil,

I'm using a MacBook M2 with Python 3.11.5, but I still encounter the same problem.
Is there any related information or workaround I can try for now?

Thank you 🙏

@ArthurZucker
Collaborator

ArthurZucker commented Sep 21, 2023

@Teofebano can't you just install the wheels we released online? The following worked for me on an M1; not sure why M2 would be different:

conda create -n py3.11 python=3.11
conda activate py3.11
pip install tokenizers

@Vargol

Vargol commented Sep 21, 2023

@Teofebano do you need such an 'old' version of transformers?
The reason you're having this issue is that transformers requires a version of tokenizers for which there is no macOS wheel (the problem I had, if you scroll up), so it builds from source.

Alternatively, install Rust so it can be built (no, I didn't want to do that either).
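If the pin is the blocker, checking what pip can actually see and then upgrading usually sidesteps it (a sketch; pip index is an experimental command available since pip 21.2):

    # List the tokenizers versions pip can see for this environment
    pip index versions tokenizers
    # Or move to a transformers release whose tokenizers pin ships arm64 wheels
    pip install -U transformers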

@ArthurZucker
Collaborator

Especially if you are using a recent version of Python, it's highly possible that it won't be compatible with old versions of transformers.

@bruce2233

Any solution now?

@ArthurZucker
Collaborator

Hey @bruce2233, if you have an issue with building the wheel, make sure to share a reproducer, a full traceback, and the machine you are running this on, and make sure that none of the proposed solutions worked for you!

@78Alpha

78Alpha commented Oct 16, 2023

Spent 27 hours trying to get DeepSpeed working on a tool, only to run into this error and be blocked. Tokenizers is already installed, but installing anything else seems to make it try to reinstall, and it fails to compile due to a Rust issue.

error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
         --> tokenizers-lib/src/models/bpe/trainer.rs:526:47
          |
      522 |                     let w = &words[*i] as *const _ as *mut _;
          |                             -------------------------------- casting happend here
      ...
      526 |                         let word: &mut Word = &mut (*w);
          |                                               ^^^^^^^^^
          |
          = note: `#[deny(invalid_reference_casting)]` on by default

      warning: `tokenizers` (lib) generated 3 warnings
      error: could not compile `tokenizers` (lib) due to previous error; 3 warnings emitted

Albeit this was on WSL2, notorious for failures of a catastrophic degree.

@crummy

crummy commented Oct 23, 2023

error: could not compile tokenizers (lib) due to previous error; 3 warnings emitted

I get this same error on my M2 MBP on Sonoma.

@drbig

drbig commented Oct 24, 2023

      error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
         --> tokenizers-lib/src/models/bpe/trainer.rs:526:47
          |
      522 |                     let w = &words[*i] as *const _ as *mut _;
          |                             -------------------------------- casting happend here
      ...
      526 |                         let word: &mut Word = &mut (*w);
          |                                               ^^^^^^^^^
          |
          = note: `#[deny(invalid_reference_casting)]` on by default
      
      warning: `tokenizers` (lib) generated 3 warnings
      error: could not compile `tokenizers` (lib) due to previous error; 3 warnings emitted

Ubuntu 18 LTS, Rust installed via "pipe the internet to a shell"... So the compiler is too new? The Ubuntu-packaged one is too old for one of the deps...

@drbig

drbig commented Oct 24, 2023

For what it's worth: same setup as above, but with Rust version 1.67.1 and -A invalid_reference_casting (via RUSTFLAGS) it does compile (haven't yet gotten around to testing whether it actually works, though...).

@MarkAWard

To keep tokenizers from building with the latest stable Rust toolchain, which changed the invalid_reference_casting lint from allow-by-default to deny-by-default in version 1.73.0, I'm using this in the Dockerfile now:

RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain=1.72.1 -y
ENV PATH="/root/.cargo/bin:${PATH}"
ENV RUSTUP_TOOLCHAIN=1.72.1

@Iaotle

Iaotle commented Oct 28, 2023

For what it's worth: same setup as above, but with Rust version 1.67.1 and -A invalid_reference_casting (via RUSTFLAGS) it does compile (haven't yet gotten around to testing whether it actually works, though...).

Worked for me on WSL. Thanks! I'll give a bit more detail for this method:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain=1.67.1 -y
source "$HOME/.cargo/env"
export RUSTFLAGS="-A invalid_reference_casting"
python3 -m pip install -e ./modules/tortoise-tts/

@Narsil
Collaborator

Narsil commented Oct 30, 2023

Which version of tokenizers are you all using?

This was fixed in tokenizers 0.14.1, as soon as Rust 1.73.1 came out.
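So on a current Rust toolchain, the simplest fix is presumably just (a sketch):

    pip install -U "tokenizers>=0.14.1"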

@AsakusaRinne

I hit this problem and failed to resolve it with any of the approaches mentioned above. However, when I downgraded my Python version from 3.11 to 3.10, everything went fine. I hope this helps.
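A sketch of that downgrade with pyenv (any version manager works; 3.10.13 is illustrative):

    pyenv install 3.10.13
    pyenv local 3.10.13
    python -m venv venv && source venv/bin/activate
    pip install tokenizers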

@parsa-arjmand

We already build wheels for Apple Silicon! Just not for Python 3.8, which isn't supposed to exist on M1 (only 3.9, 3.10, and 3.11 now).

This was a crucial hint, thanks!

@Narsil
Collaborator

Narsil commented Jan 12, 2024

Or just upgrade your tokenizers versions :)

@Narsil
Collaborator

Narsil commented Jan 12, 2024

(We prebuild tokenizers, just like most precompiled Python libs, for the Python versions that are current at the time of building, so old releases are only built for old Pythons.)

@strategy155

Is it working now? Should this be closed?

@ArthurZucker
Collaborator

Closing as completed

@DominicBartel

The problem persists on Python 3.12 for me (Windows 11); reverting to Python 3.10 worked.

@deepanshh786

(Quotes @Sandy4321's Google Colab output from above in full.)

Did you get a solution for this?

@ck37

ck37 commented Apr 23, 2024

Downgrading to python 3.10 worked for me (Ubuntu 20.04, tokenizers 0.13.3).

@ArthurZucker
Collaborator

Yes, we were a bit slow to ship binaries for py311 and above.
