
Adding support for Ascend NPU #372

Merged · 4 commits · Nov 20, 2023
Conversation

statelesshz (Contributor) commented Nov 1, 2023

What does this PR do?

Thanks for creating this library. In our effort to streamline huggingface on Ascend devices, this PR is an important step.

What is Ascend NPU and torch_npu

  • Ascend NPU is an AI processor that supports AI frameworks such as PyTorch and TensorFlow.
  • torch_npu is an officially recognized PyTorch integration plugin that supports Ascend NPU within the PyTorch framework. Ref: Improved third-party device support.

cc @Narsil, @LysandreJik and @zhangsibo1129

statelesshz (Contributor, Author) commented Nov 1, 2023

This PR was verified locally, and the corresponding test results are as follows:
System info

(hf) [root@localhost /home/hf/safetensors/bindings/python]# accelerate env
Fail to import hypothesis in common_utils, tests are not derandomized

Copy-and-paste the text below in your GitHub issue

- `Accelerate` version: 0.24.1
- Platform: Linux-4.19.90-vhulk2211.3.0.h1543.eulerosv2r10.aarch64-aarch64-with-glibc2.26
- Python version: 3.8.17
- Numpy version: 1.23.5
- PyTorch version (GPU?): 2.1.0 (False)
- PyTorch XPU available: False
- PyTorch NPU available: True
- System RAM: 2015.11 GB
- `Accelerate` default config:
	- compute_environment: LOCAL_MACHINE
	- distributed_type: MULTI_NPU
	- mixed_precision: no
	- use_cpu: False
	- debug: False
	- num_processes: 8
	- machine_rank: 0
	- num_machines: 1
	- gpu_ids: all
	- rdzv_backend: static
	- same_network: True
	- main_training_function: main
	- downcast_bf16: no
	- tpu_use_cluster: False
	- tpu_use_sudo: False
	- tpu_env: []

(hf) [root@localhost /home/hf/safetensors/bindings/python]# pip3 show safetensors
Name: safetensors
Version: 0.4.1.dev0
Summary: 
Home-page: 
Author: 
Author-email: Nicolas Patry <patry.nicolas@protonmail.com>
License: 
Location: /root/anaconda3/envs/hf/lib/python3.8/site-packages
Editable project location: /home/hf/safetensors/bindings/python
Requires: 
Required-by: peft, timm, transformers

The output log

(hf) [root@localhost /home/hf/safetensors/bindings/python]# python3 -m pytest -v tests/test_pt_comparison.py 
========================================================= test session starts ==========================================================
platform linux -- Python 3.8.17, pytest-7.4.2, pluggy-1.3.0 -- /root/anaconda3/envs/hf/bin/python3
cachedir: .pytest_cache
rootdir: /home/hf/safetensors/bindings/python
configfile: setup.cfg
plugins: dash-2.13.0, hydra-core-1.3.2, odl-0.7.0, anyio-4.0.0
collected 19 items                                                                                                                     

tests/test_pt_comparison.py::TorchTestCase::test_bogus PASSED                                                                    [  5%]
tests/test_pt_comparison.py::TorchTestCase::test_disjoint_tensors_shared_storage PASSED                                          [ 10%]
tests/test_pt_comparison.py::TorchTestCase::test_gpu SKIPPED (Cuda is not available)                                             [ 15%]
tests/test_pt_comparison.py::TorchTestCase::test_in_memory PASSED                                                                [ 21%]
tests/test_pt_comparison.py::TorchTestCase::test_meta_tensor PASSED                                                              [ 26%]
tests/test_pt_comparison.py::TorchTestCase::test_multiple_zero_sized PASSED                                                      [ 31%]
tests/test_pt_comparison.py::TorchTestCase::test_npu PASSED                                                                      [ 36%]
tests/test_pt_comparison.py::TorchTestCase::test_odd_dtype PASSED                                                                [ 42%]
tests/test_pt_comparison.py::TorchTestCase::test_serialization PASSED                                                            [ 47%]
tests/test_pt_comparison.py::TorchTestCase::test_sparse PASSED                                                                   [ 52%]
tests/test_pt_comparison.py::TorchTestCase::test_zero_sized PASSED                                                               [ 57%]
tests/test_pt_comparison.py::LoadTestCase::test_deserialization_safe PASSED                                                      [ 63%]
tests/test_pt_comparison.py::LoadTestCase::test_deserialization_safe_device_1 SKIPPED (Only 1 device available)                  [ 68%]
tests/test_pt_comparison.py::LoadTestCase::test_deserialization_safe_gpu SKIPPED (Cuda is not available)                         [ 73%]
tests/test_pt_comparison.py::LoadTestCase::test_deserialization_safe_gpu_slice SKIPPED (Cuda is not available)                   [ 78%]
tests/test_pt_comparison.py::SliceTestCase::test_cannot_serialize_a_non_contiguous_tensor PASSED                                 [ 84%]
tests/test_pt_comparison.py::SliceTestCase::test_cannot_serialize_shared PASSED                                                  [ 89%]
tests/test_pt_comparison.py::SliceTestCase::test_deserialization_metadata PASSED                                                 [ 94%]
tests/test_pt_comparison.py::SliceTestCase::test_deserialization_slice PASSED                                                    [100%]

==================================================== 15 passed, 4 skipped in 15.30s ====================================================

Narsil (Collaborator) left a comment


Thanks! The PR looks good.

I left a few nits. The biggest blocker is the relatively odd cast to get the device back; if you have more details on why it's needed, it'd be appreciated (I don't feel comfortable merging with this hack right now).

bindings/python/tests/test_pt_comparison.py (outdated, resolved)
bindings/python/src/lib.rs (outdated, resolved)
statelesshz (Contributor, Author) commented Nov 18, 2023

Hi @Narsil, thanks for your review. I've updated my implementation. There are two main changes:

  • Added more dtypes (torch.float32 and torch.float16) to the NPU test case.

    torch.allclose cannot work with bf16 when using the NPU 😅

  • Removed the unnecessary hack code.

The updated test results are as follows:

(ds) [root@node-43 python]# python3 -m pytest -v tests/test_pt_comparison.py::TorchTestCase::test_npu
=========================================================================================================================== test session starts ============================================================================================================================
platform linux -- Python 3.8.18, pytest-7.4.3, pluggy-1.3.0 -- /home/miniconda3/envs/ds/bin/python3
cachedir: .pytest_cache
rootdir: /home/ds/safetensors/bindings/python
configfile: setup.cfg
collected 1 item

tests/test_pt_comparison.py::TorchTestCase::test_npu PASSED                                                                                                                                                                                                          [100%]

============================================================================================================================= warnings summary =============================================================================================================================
../../../../../miniconda3/envs/ds/lib/python3.8/site-packages/torch_npu/dynamo/__init__.py:18
  /home/miniconda3/envs/ds/lib/python3.8/site-packages/torch_npu/dynamo/__init__.py:18: UserWarning: Register eager implementation for the 'npu' backend of dynamo, as torch_npu was not compiled with torchair.
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================================================================================================================== 1 passed, 1 warning in 18.56s =======================================================================================================================

Could you please take a second look and trigger CI? Thanks :)

Narsil (Collaborator) commented Nov 20, 2023

LGTM, thanks for this!

Narsil merged commit 094e676 into huggingface:main on Nov 20, 2023.
9 of 10 checks passed
2 participants