
Vectorize function calls #120

Merged — 4 commits, Jun 6, 2023
Conversation

@rlouf (Member) commented May 26, 2023

In this PR I introduce an outlines.vectorize function that works similarly to numpy.vectorize and turns functions into functions that take arrays as input. The major differences with numpy.vectorize are:

  1. The possibility to batch the execution. Launching too many concurrent calls at once, or running local inference with too large a batch size, may not be acceptable, so we need to batch the parallel calls.
  2. It executes coroutines concurrently for better performance.
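The batching idea in point 1 can be sketched as follows. This is an illustrative helper, not the actual `outlines.vectorize` implementation; the names `run_batched` and `echo` are hypothetical.

```python
import asyncio

async def run_batched(coroutine_fn, inputs, batch_size):
    """Run coroutine_fn over inputs, at most batch_size calls at a time.

    Calls within a batch run concurrently via asyncio.gather; batches
    themselves run sequentially, capping the number of in-flight calls.
    """
    results = []
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start : start + batch_size]
        results.extend(await asyncio.gather(*(coroutine_fn(x) for x in batch)))
    return results

async def echo(x):
    await asyncio.sleep(0)  # stand-in for a network or inference call
    return x.upper()

print(asyncio.run(run_batched(echo, ["hi", "bye", "ciao"], batch_size=2)))
# ['HI', 'BYE', 'CIAO']
```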

This function will be used internally for model calls, so we can pass arrays of arbitrary shapes as inputs and get an array of answers. Usual broadcasting rules apply:

import numpy as np

import outlines.models as models

complete = models.text_completion.openai("text-davinci-003")

prompts = np.array([["Hi", "Hello"], ["Bye", "Ciao"]])
answer = complete(prompts)
print(answer.shape)
# (2, 2)

answer_samples = complete(prompts, num_samples=10)
print(answer_samples.shape)
# (2, 2, 10)

I chose to execute the coroutines in a new event loop before returning. This lets us keep the synchronous user-facing API for now. Of course, it restricts the speedups we can get from asyncio, and we will eventually need to move towards an explicitly async API, but that would be too big a change for a single PR.
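The sync-over-async pattern described above boils down to the following sketch (the helper name `call_sync` is illustrative, not part of the library):

```python
import asyncio

def call_sync(coro_fn, *args):
    """Drive an async function to completion from synchronous code.

    Sketch of the PR's approach: create a fresh event loop, run the
    coroutine to completion, and close the loop before returning.
    """
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro_fn(*args))
    finally:
        loop.close()

async def double(x):
    return 2 * x

print(call_sync(double, 21))  # 42
```

Because the wrapper creates and closes its own loop, it cannot be called from code that is already running inside an event loop; that limitation is what an explicitly async API would remove.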

  • Vectorize scalar functions and coroutines (OpenAI API, simple tools)
  • Vectorize functions with an arbitrary signature (HF models)
  • Vectorize coroutines with an arbitrary signature (OpenAI with several samples)
  • Execute coroutines in a new event loop before returning
  • Replace the OpenAI calls with async ones and use vectorize
  • Use vectorize for HF model calls

@rlouf rlouf marked this pull request as draft June 1, 2023 08:27
@rlouf rlouf force-pushed the vectorize-model-calls branch 2 times, most recently from 52bec7b to 3927dce on June 1, 2023 13:28
@rlouf rlouf marked this pull request as ready for review June 1, 2023 14:12
@rlouf rlouf added the enhancement, text (Linked to text generation), transformers (Linked to the `transformers` integration), and openai labels on Jun 1, 2023
@rlouf rlouf force-pushed the vectorize-model-calls branch 3 times, most recently from 6fd21cb to cf981d6 on June 2, 2023 18:59
Review comments on outlines/base.py (resolved)
@rlouf rlouf force-pushed the vectorize-model-calls branch 2 times, most recently from 0c2deee to 6d41c13 on June 5, 2023 07:26
@rlouf (Member, Author) commented Jun 5, 2023

Update: vectorization is now implemented; it still needs more documentation and tests before the PR is mergeable.

@rlouf rlouf force-pushed the vectorize-model-calls branch 5 times, most recently from 8fa19c2 to 3d5d5c7 on June 5, 2023 16:16
@rlouf rlouf requested a review from brandonwillard June 5, 2023 16:25
@brandonwillard (Contributor) left a comment

Looks like we could use some tests for OpenAI, but I imagine that's not particularly straightforward.

Aside from that, just some questions/comments; otherwise, looks good to me.

Comment on lines +51 to +63

loop = asyncio.new_event_loop()
try:
    outputs = loop.run_until_complete(self.func())
finally:
    loop.close()
@brandonwillard (Contributor):
Do we want to provide an interface that allows the caller to manage the event loop (e.g. for use in external async applications)?

@rlouf (Member, Author):
I think the function that returns the new sequence should be async, but I didn't want to break the API just yet.
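One way to serve both kinds of caller, sketched below with hypothetical names (`complete_async`, `complete`): expose the coroutine directly for async applications and keep a thin synchronous wrapper that owns its own loop, as this PR does.

```python
import asyncio

async def complete_async(prompt):
    """Hypothetical async entry point an external application could await,
    letting the caller's own event loop schedule the call."""
    await asyncio.sleep(0)  # stand-in for an API call
    return f"completion of {prompt!r}"

def complete(prompt):
    """Synchronous wrapper that creates and manages its own event loop."""
    return asyncio.run(complete_async(prompt))

# Inside an async application, the caller would instead write:
#     result = await complete_async("Hi")
print(complete("Hi"))  # completion of 'Hi'
```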

Comment on lines +157 to +244

loop = asyncio.new_event_loop()
try:
    outputs = loop.run_until_complete(create_and_gather_tasks())
finally:
    loop.close()
@brandonwillard (Contributor):

Same consideration here.

Review comment on outlines/base.py (resolved)
@rlouf (Member, Author) commented Jun 5, 2023

> Looks like we could use some tests for OpenAI, but I imagine that's not particularly straightforward.

We need to refactor the code so that the parts that rely on the OpenAI API are easier to test. That's what annoys me the most right now.
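One common shape such a refactor could take, as a sketch: isolate the network call behind a small client object so tests can replace it with a mock. The `OpenAIClient` class and `complete` function below are hypothetical, not the library's actual API.

```python
from unittest import mock

class OpenAIClient:
    """Hypothetical thin wrapper; keeping the network call in one
    method makes it easy to replace in tests."""

    def call(self, prompt):
        raise RuntimeError("real network call, unavailable in tests")

def complete(client, prompt):
    # Code under test only touches the client interface.
    return client.call(prompt)

client = OpenAIClient()
with mock.patch.object(client, "call", return_value="mocked answer"):
    print(complete(client, "Hi"))  # mocked answer
```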

@rlouf (Member, Author) commented Jun 6, 2023

We might want slightly different behavior for HuggingFace models: in their case we're better off flattening the input arrays so inference can run in a single batch. This doesn't make the PR any less useful, but it probably means we'll need a different mechanism for these models. I will open an issue once this is merged.
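The flatten-then-reshape idea for HuggingFace models could look like this sketch; `batched_generate` and `fake_model` are hypothetical names, and the model function is assumed to take and return a flat list of strings.

```python
import numpy as np

def batched_generate(model_fn, prompts):
    """Flatten an array of prompts, run one batched call, restore the shape.

    Instead of one model call per element, the whole (flattened) array
    goes through a single forward pass, then the answers are reshaped
    back to the input's shape.
    """
    arr = np.asarray(prompts)
    answers = model_fn(list(arr.ravel()))  # single batched call
    return np.array(answers, dtype=object).reshape(arr.shape)

# Stand-in for a HuggingFace pipeline accepting a list of prompts:
fake_model = lambda batch: [p.lower() for p in batch]

out = batched_generate(fake_model, [["Hi", "Hello"], ["Bye", "Ciao"]])
print(out.shape)   # (2, 2)
print(out[1, 1])   # ciao
```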

@rlouf rlouf merged commit e3cdf0e into outlines-dev:main Jun 6, 2023
3 checks passed
@rlouf rlouf deleted the vectorize-model-calls branch June 6, 2023 08:59
@rlouf rlouf linked an issue Jun 21, 2023 that may be closed by this pull request
Successfully merging this pull request may close these issues.

Vectorize the model and function calls
2 participants