
Adding MLX backend. #419

Merged · 4 commits into main · Jan 5, 2024
Conversation

@Narsil (Collaborator) commented Jan 4, 2024

What does this PR do?

Adds an MLX backend.
MLX already has native safetensors support, but this backend was trivial to add anyway.

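For illustration, a minimal sketch of the intended usage, assuming the new module mirrors the existing safetensors.numpy / safetensors.torch helpers (save_file / load_file); the exact names are an assumption based on those backends:

```python
# Minimal usage sketch, assuming safetensors.mlx mirrors the other backends.
import mlx.core as mx
from safetensors.mlx import save_file, load_file

tensors = {
    "embedding": mx.zeros((512, 1024)),
    "attention": mx.zeros((256, 256)),
}
save_file(tensors, "model.safetensors")

loaded = load_file("model.safetensors")
assert mx.array_equal(loaded["embedding"], tensors["embedding"])
```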

@Narsil (Collaborator, Author) commented Jan 4, 2024

We can safely ignore the CI failure; it is entirely unrelated to this PR (a more recent clippy detects an issue in PyO3 itself, which will most likely be fixed by simply upgrading PyO3).

@Vaibhavs10 (Member) left a comment

Dope! Thanks for putting this up! 🚀 Conceptually, this looks good to me.

That said, I'm not an expert in MLX, so I'd appreciate it if Awni could do a deeper review.

One quick question: can we also test persisting and loading quantised models? We don't need to be exhaustive here; just testing 4-bit should be okay IMO.

@Narsil (Collaborator, Author) commented Jan 4, 2024

Quantized weights are u8, so they should work out of the box (I'm not sure how MLX persists quantization information in npy/npz; I'm guessing it isn't saved).
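For reference, a sketch of the kind of 4-bit round-trip test being asked about. mx.quantize returns packed integer weights plus float scales/biases, all plain dtypes, so a round trip needs no extra quantization metadata; the safetensors.mlx helper names are assumed as above:

```python
# Sketch of a 4-bit quantized round-trip test (helper names assumed).
import mlx.core as mx
from safetensors.mlx import save_file, load_file

w = mx.random.normal((512, 512))
# Pack into 4-bit groups of 64; returns plain tensors, no special metadata.
w_q, scales, biases = mx.quantize(w, group_size=64, bits=4)

save_file({"w_q": w_q, "scales": scales, "biases": biases}, "quant.safetensors")
loaded = load_file("quant.safetensors")
assert mx.array_equal(loaded["w_q"], w_q)
```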

@pcuenca (Member) left a comment

Awesome! 🔥

bindings/python/py_src/safetensors/mlx.py (several review threads, outdated and resolved)
return numpy_dict


def _mx2np(mx_dict: Dict[str, mx.array]) -> Dict[str, np.array]:
A Member commented:

Does this handle bfloat16?

The Member followed up:

Oh, I see in the tests below that it does :)

@Narsil (Collaborator, Author) replied:

Yup, using the same trick as the flax backend, which is just a special named dtype.
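For the curious, a sketch of that named-dtype trick, assuming the bridge relies on the ml_dtypes package (which JAX/flax use to give NumPy a bfloat16 dtype); how mlx.py applies it exactly is an assumption here:

```python
# Sketch of the named-dtype trick (exact use inside mlx.py is assumed).
# NumPy has no native bfloat16; ml_dtypes registers one whose .name
# identifies it, so the raw bits round-trip without conversion.
import ml_dtypes
import numpy as np

bf16 = np.dtype(ml_dtypes.bfloat16)
x = np.array([1.0, 2.0, 3.0], dtype=np.float32).astype(bf16)

print(x.dtype.name)      # "bfloat16" -- the name keys the serialized dtype
print(len(x.tobytes()))  # 6: two bytes per element, stored as raw bits
```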

bindings/python/tests/test_mlx_comparison.py (review thread, outdated and resolved)
@awni left a comment

LGTM! Thanks for adding that!

I think using your built-in numpy save is probably a good call from a maintenance standpoint.

But just a thought: this will involve a copy (on the load side as well). Currently we do a copy to get in and out of NumPy.

Narsil and others added 2 commits on January 5, 2024 (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
@Narsil (Collaborator, Author) commented Jan 5, 2024

> But just a thought: this will involve a copy (on the load side as well). Currently we do a copy to get in and out of NumPy.

Good to know. The only thing really needed is a frombuffer equivalent in order to get partial-tensor loading working.
On the Rust side, I use the slice information given by the user to create a CPU buffer with the correct data from within the tensor, then use frombuffer on the given framework to send that data where it belongs.

If there's a copy from numpy -> MLX, that means two copies are currently created:

File -> Slice (local buffer inside Rust) -> numpy.frombuffer (should be zero-copy) -> MLX (new copy).

Is that correct?
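The chain above, as a sketch (function name hypothetical): np.frombuffer wraps the Rust-side bytes with zero copy, while constructing the MLX array copies the data into its own memory:

```python
# Sketch of the load chain described above (load_slice is hypothetical).
import mlx.core as mx
import numpy as np

def load_slice(raw: bytes, dtype, shape):
    # Zero copy: a NumPy view over the bytes produced on the Rust side.
    np_view = np.frombuffer(raw, dtype=dtype).reshape(shape)
    # One copy: MLX allocates its own buffer for the array's data.
    return mx.array(np_view)
```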

@Narsil merged commit 56659f4 into main on Jan 5, 2024 (5 of 11 checks passed) and deleted the add_mlx branch.
@awni commented Jan 6, 2024

> If there's a copy from numpy -> MLX, that means two copies are currently created.

Yes, that's correct. We can't avoid the copy from NumPy because we have to use specially allocated memory to make the data available to both the CPU and GPU.
