[Pipeline Refactor][server][OpenAI] Enable OpenAI to use new text gen pipeline #1477

Merged
dsikka merged 11 commits into main on Dec 14, 2023

Conversation

Contributor

@dsikka dsikka commented Dec 13, 2023

Summary

  • Enable the OpenAI server to use the new pipeline
  • Disable streaming within the OpenAI server, since streaming is not yet available in v2
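Because streaming is disabled, callers may want to fail early on the client side rather than send a request with `stream=True`. The following is a minimal sketch of such a guard, not part of this PR: the helper name `completion_kwargs` is hypothetical, and how the server itself responds to `stream=True` is not stated here.

```python
def completion_kwargs(prompt, model, stream=False, **extra):
    """Build keyword arguments for a completions request, refusing streaming.

    Hypothetical client-side helper; it only validates arguments and does not
    contact the server.
    """
    if stream:
        raise ValueError("streaming is currently disabled; use stream=False")
    # Always send stream=False explicitly so the request is unambiguous.
    return {"prompt": prompt, "model": model, "stream": False, **extra}


kwargs = completion_kwargs(
    "The sun shined",
    "hf:neuralmagic/TinyLlama-1.1B-Chat-v0.4-pruned50-quant-ds",
    n=8,
    max_tokens=10,
)
print(kwargs)
```

The resulting dictionary can be unpacked into `client.completions.create(**kwargs)` once a client is available.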

Testing

Continuous batching can now be used with the OpenAI server, for example with the following config (new_sample_config.yaml):

num_cores: 2
num_workers: 2
endpoints:
  - task: text_generation
    model: hf:neuralmagic/TinyLlama-1.1B-Chat-v0.4-pruned50-quant-ds
    kwargs:
      continuous_batch_sizes: [2]
      internal_kv_cache: false

Starting this server:

deepsparse.server --config_file new_sample_config.yaml --integration openai

Using the API:

from openai import OpenAI

# Point the client at the locally running server; "EMPTY" is a placeholder key
client = OpenAI(base_url="http://localhost:5543/v1", api_key="EMPTY")

# List the models exposed by the server
models = client.models.list()

model = "hf:neuralmagic/TinyLlama-1.1B-Chat-v0.4-pruned50-quant-ds"
print(f"Accessing model API '{model}'")


# Completion API (streaming is disabled in this PR, so stream stays False)
stream = False
completion = client.completions.create(
    prompt="The sun shined",
    stream=stream,
    n=8,
    max_tokens=10,
    model=model,
)

print(completion)
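Since the request sets `n=8`, the response should carry eight choices. The sketch below shows one way to pull the generated texts out, assuming the server returns the standard OpenAI completions shape (a `choices` list whose items expose `index` and `text`). The helper name `choice_texts` is hypothetical, and the stand-in response uses made-up texts so the snippet runs without a server.

```python
from types import SimpleNamespace


def choice_texts(completion):
    """Return the generated texts, ordered by each choice's index."""
    ordered = sorted(completion.choices, key=lambda choice: choice.index)
    return [choice.text for choice in ordered]


# Stand-in for a server response; the texts are invented for illustration.
fake_completion = SimpleNamespace(
    choices=[
        SimpleNamespace(index=1, text=" second sample"),
        SimpleNamespace(index=0, text=" first sample"),
    ]
)

for i, text in enumerate(choice_texts(fake_completion)):
    print(f"[{i}] {text!r}")
```

With a live server, `choice_texts(completion)` can be called on the object returned by `client.completions.create(...)` in place of the stand-in.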

bfineran previously approved these changes Dec 13, 2023
dbogunowicz previously approved these changes Dec 14, 2023
Base automatically changed from update_config to main December 14, 2023 11:49
@dsikka dsikka dismissed stale reviews from dbogunowicz and bfineran December 14, 2023 11:49

The base branch was changed.

@dsikka dsikka merged commit 543199e into main Dec 14, 2023
13 checks passed
@dsikka dsikka deleted the update_openai branch December 14, 2023 15:56

3 participants