
mlx library integration (via mlx-lm) #918

Closed
pngwn opened this issue May 24, 2024 · 3 comments · Fixed by #956

Comments

pngwn commented May 24, 2024

An additional library integration for mlx.

Context

mlx is an ML framework that supports high-performance inference on Apple silicon (amongst other things). It now has a rich ecosystem and a vibrant community of users.

Request

The mlx-lm Python library provides some simple utilities for text generation, and it would be great if there were an integration with outlines.

Extra detail

mlx-lm supports a logit_bias param in its top-level generate function.
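
For reference, a minimal sketch of what that looks like, assuming the logit_bias parameter mentioned above; the model name and token ids are just placeholders:

# minimal sketch of biased sampling via mlx-lm's top-level API
# (model repo and token ids are placeholders)
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")

# bias the logits of specific token ids, analogous to the OpenAI logit_bias param
output = generate(
    model,
    tokenizer,
    prompt="The answer is",
    logit_bias={0: -100.0, 1: 100.0},
)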

Related to #806

pngwn commented May 24, 2024

There is a separate library that integrates outlines with mlx. I thought I would post it here in case it is useful:

https://github.com/sacha-ichbiah/outlines-mlx

I have also pinged the author.

namin commented Jun 8, 2024

In the meantime, note that it's possible to use MLX via the OpenAI API. Of course, one is then limited to the choice and text generators, but I guess this is better than nothing.

This is what I did to get it working:

# see https://github.com/ml-explore/mlx-examples/pull/810 which must be merged
# tested as follows
# mlx_lm.server --model mlx-community/gemma-1.1-7b-it-4bit --port 11435 --chat-template=CHAT_TEMPLATE
# where CHAT_TEMPLATE is as in the tokenizer_config below

from openai import AsyncOpenAI
from outlines.models.openai import OpenAI, OpenAIConfig

from mlx_lm.tokenizer_utils import load_tokenizer
from pathlib import Path
model_path = Path("mlx-community/gemma-1.1-7b-it-4bit")
tokenizer_config = {"chat_template": "{{ bos_token }}{% set ns = namespace(extra_system='') %}{% for message in messages %}{% set role = message['role'] %}{% if (message['role'] == 'assistant') %}{% set role = 'model' %}{% endif %}{% if (role == 'system') %}{% set ns.extra_system = ns.extra_system + message['content'] %}{% else %}{% set message_system = '' %}{% if (role == 'user') %}{% if (ns.extra_system == '') %}{% else %}{% set message_system = 'System: ' + ns.extra_system + '\\n' %}{% set ns.extra_system = '' %}{% endif %}{% endif %}{{ '<start_of_turn>' + role + '\\n' + message_system + message['content'] | trim + '<end_of_turn>\\n' }}{% endif %}{% endfor %}{% if add_generation_prompt %}{{'<start_of_turn>model\\n'}}{% endif %}"}

base_url = "http://localhost:11435/v1"
api_key = "not_needed"

config = OpenAIConfig(model="openai/mlx-gemma")
client = AsyncOpenAI(
    base_url=base_url,
    api_key=api_key,
)
tokenizer = load_tokenizer(model_path, tokenizer_config)

model = OpenAI(client, config, tokenizer)
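
With that model object, the restricted set of generators can then be used as usual. A minimal sketch, assuming the setup above; the prompts and choices are just illustrative:

from outlines import generate

# choice-constrained generation through the OpenAI-compatible endpoint
classifier = generate.choice(model, ["positive", "negative"])
label = classifier("Classify the sentiment of: 'I love this library!'")

# free-form text generation also works
writer = generate.text(model)
answer = writer("Write a one-line haiku about Apple silicon.")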

rlouf commented Jun 9, 2024

You will be able to use every generator once #926 is merged.
