
Vectorize function calls #120

Merged — 4 commits, Jun 6, 2023
Conversation

@rlouf (Member) commented May 26, 2023

In this PR I introduce an outlines.vectorize function that works similarly to numpy.vectorize and turns functions into functions that take arrays as input. The major differences with numpy.vectorize are:

  1. The possibility to batch the execution. Launching too many concurrent calls at once, or running local inference with too large a batch size, may not be acceptable, so we need to batch the parallel calls.
  2. It executes coroutines concurrently for better performance.
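The batching idea in point 1 can be sketched as follows. This is an illustrative helper, not the actual `outlines.vectorize` implementation; the names `run_batched` and `echo` are hypothetical.

```python
import asyncio

async def run_batched(coroutine_fn, inputs, batch_size):
    """Run coroutine_fn over inputs, at most batch_size calls at a time.

    Calls within a batch run concurrently via asyncio.gather; batches
    themselves run sequentially, capping the number of in-flight calls.
    """
    results = []
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start : start + batch_size]
        results.extend(await asyncio.gather(*(coroutine_fn(x) for x in batch)))
    return results

async def echo(x):
    await asyncio.sleep(0)  # stand-in for a network or inference call
    return x.upper()

print(asyncio.run(run_batched(echo, ["hi", "bye", "ciao"], batch_size=2)))
# ['HI', 'BYE', 'CIAO']
```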

This function will be used internally for model calls, so we can pass arrays of arbitrary shapes as inputs and get an array of answers. Usual broadcasting rules apply:

import numpy as np

import outlines.models as models

complete = models.text_completion.openai("text-davinci-003")

prompts = np.array([["Hi", "Hello"], ["Bye", "Ciao"]])
answer = complete(prompts)
print(answer.shape)
# (2, 2)

answer_samples = complete(prompts, num_samples=10)
print(answer_samples.shape)
# (2, 2, 10)

I chose to execute the coroutines in a new event loop before returning. This lets us keep the synchronous user-facing API for now. Of course, it restricts the speedups we can get from asyncio, and we will eventually need to move towards an explicitly async API, but that would be too big a change for a single PR.
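The sync-over-async pattern described above boils down to the following sketch (the helper name `call_sync` is illustrative, not part of the library):

```python
import asyncio

def call_sync(coro_fn, *args):
    """Drive an async function to completion from synchronous code.

    Sketch of the PR's approach: create a fresh event loop, run the
    coroutine to completion, and close the loop before returning.
    """
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro_fn(*args))
    finally:
        loop.close()

async def double(x):
    return 2 * x

print(call_sync(double, 21))  # 42
```

Because the wrapper creates and closes its own loop, it cannot be called from code that is already running inside an event loop; that limitation is what an explicitly async API would remove.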

  • Vectorize scalar functions and coroutines (OpenAI API, simple tools)
  • Vectorize functions with an arbitrary signature (HF models)
  • Vectorize coroutines with an arbitrary signature (OpenAI with several samples)
  • Execute coroutines in a new event loop before returning
  • Replace the OpenAI calls with async ones and use vectorize
  • Use vectorize for HF model calls

@rlouf rlouf marked this pull request as draft June 1, 2023 08:27
@rlouf rlouf force-pushed the vectorize-model-calls branch 2 times, most recently from 52bec7b to 3927dce on June 1, 2023 13:28
@rlouf rlouf marked this pull request as ready for review June 1, 2023 14:12
@rlouf rlouf added the enhancement, text (Linked to text generation), transformers (Linked to the `transformers` integration), and openai labels on Jun 1, 2023
@rlouf rlouf force-pushed the vectorize-model-calls branch 3 times, most recently from 6fd21cb to cf981d6 on June 2, 2023 18:59
Review comments on outlines/base.py (resolved)
@rlouf rlouf force-pushed the vectorize-model-calls branch 2 times, most recently from 0c2deee to 6d41c13 on June 5, 2023 07:26
@rlouf (Member, Author) commented Jun 5, 2023

Update: vectorization is now implemented; it still needs more documentation and tests before the PR is mergeable.

@rlouf rlouf force-pushed the vectorize-model-calls branch 5 times, most recently from 8fa19c2 to 3d5d5c7 on June 5, 2023 16:16
@rlouf rlouf requested a review from brandonwillard June 5, 2023 16:25
@brandonwillard (Contributor) left a comment

Looks like we could use some tests for OpenAI, but I imagine that's not particularly straightforward.

Aside from that, just some questions/comments; otherwise, looks good to me.

Comment on lines +51 to +63

loop = asyncio.new_event_loop()
try:
    outputs = loop.run_until_complete(self.func())
finally:
    loop.close()
@brandonwillard (Contributor):
Do we want to provide an interface that allows the caller to manage the event loop (e.g. for use in external async applications)?

@rlouf (Member, Author):
I think the function that returns the new sequence should be async, but I didn't want to break the API just yet.
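One way to serve both kinds of caller, sketched below with hypothetical names (`complete_async`, `complete`): expose the coroutine directly for async applications and keep a thin synchronous wrapper that owns its own loop, as this PR does.

```python
import asyncio

async def complete_async(prompt):
    """Hypothetical async entry point an external application could await,
    letting the caller's own event loop schedule the call."""
    await asyncio.sleep(0)  # stand-in for an API call
    return f"completion of {prompt!r}"

def complete(prompt):
    """Synchronous wrapper that creates and manages its own event loop."""
    return asyncio.run(complete_async(prompt))

# Inside an async application, the caller would instead write:
#     result = await complete_async("Hi")
print(complete("Hi"))  # completion of 'Hi'
```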

Comment on lines +157 to +244

loop = asyncio.new_event_loop()
try:
    outputs = loop.run_until_complete(create_and_gather_tasks())
finally:
    loop.close()
@brandonwillard (Contributor):

Same consideration here.

Review comment on outlines/base.py (resolved)
@rlouf (Member, Author) commented Jun 5, 2023

> Looks like we could use some tests for OpenAI, but I imagine that's not particularly straightforward.

We need to refactor the code so that the parts that rely on the OpenAI API are easier to test. That's what annoys me the most right now.
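One common shape such a refactor could take, as a sketch: isolate the network call behind a small client object so tests can replace it with a mock. The `OpenAIClient` class and `complete` function below are hypothetical, not the library's actual API.

```python
from unittest import mock

class OpenAIClient:
    """Hypothetical thin wrapper; keeping the network call in one
    method makes it easy to replace in tests."""

    def call(self, prompt):
        raise RuntimeError("real network call, unavailable in tests")

def complete(client, prompt):
    # Code under test only touches the client interface.
    return client.call(prompt)

client = OpenAIClient()
with mock.patch.object(client, "call", return_value="mocked answer"):
    print(complete(client, "Hi"))  # mocked answer
```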

@rlouf (Member, Author) commented Jun 6, 2023

We might want slightly different behavior for HuggingFace models: in their case we're better off flattening the input arrays so inference can run in a single batch. This doesn't make the PR any less useful, but it probably means we'll need a different mechanism for these models. I will open an issue once this is merged.
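The flatten-then-reshape idea for HuggingFace models could look like this sketch; `batched_generate` and `fake_model` are hypothetical names, and the model function is assumed to take and return a flat list of strings.

```python
import numpy as np

def batched_generate(model_fn, prompts):
    """Flatten an array of prompts, run one batched call, restore the shape.

    Instead of one model call per element, the whole (flattened) array
    goes through a single forward pass, then the answers are reshaped
    back to the input's shape.
    """
    arr = np.asarray(prompts)
    answers = model_fn(list(arr.ravel()))  # single batched call
    return np.array(answers, dtype=object).reshape(arr.shape)

# Stand-in for a HuggingFace pipeline accepting a list of prompts:
fake_model = lambda batch: [p.lower() for p in batch]

out = batched_generate(fake_model, [["Hi", "Hello"], ["Bye", "Ciao"]])
print(out.shape)   # (2, 2)
print(out[1, 1])   # ciao
```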

@rlouf rlouf merged commit e3cdf0e into outlines-dev:main Jun 6, 2023
3 checks passed
@rlouf rlouf deleted the vectorize-model-calls branch June 6, 2023 08:59
@rlouf rlouf linked an issue Jun 21, 2023 that may be closed by this pull request
Successfully merging this pull request may close these issues.

Vectorize the model and function calls
2 participants