
Move model download from the Docker image build step to the run script #19

Merged (8 commits, Aug 21, 2023)

Conversation

@edgar971 (Contributor)

This PR addresses issue #8 by removing the model download step from the Dockerfile for the API service. It updates the API service in the compose files to use a Docker volume mounted at /models, and it extends the run.sh script with a model download manager that checks whether the model exists and downloads it if absent. As an optional improvement, this PR suggests exposing the model as an environment variable, reducing the need for multiple Dockerfiles; the run.sh download manager can then use that variable to select, optionally download, and launch the desired model.
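For context, here is a minimal sketch of what such a download manager in run.sh could look like. The MODEL and MODEL_DOWNLOAD_URL variable names, the download URL, and the launch command are illustrative assumptions, not the exact implementation in this PR:

```sh
#!/bin/sh
set -e

# Illustrative defaults; the real script may use different variable names.
MODEL="${MODEL:-/models/llama-2-7b-chat.bin}"
MODEL_DOWNLOAD_URL="${MODEL_DOWNLOAD_URL:-https://example.com/llama-2-7b-chat.bin}"

# Download the model into the mounted /models volume only if it isn't there yet.
if [ ! -f "$MODEL" ]; then
    echo "Model not found at $MODEL, downloading..."
    mkdir -p "$(dirname "$MODEL")"
    curl -L -o "$MODEL" "$MODEL_DOWNLOAD_URL"
fi

# Launch the API server against the selected model (command is a placeholder).
exec python3 -m llama_cpp.server --model "$MODEL"
```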

@edgar971 mentioned this pull request Aug 18, 2023
@edicristofaro

@edgar971 That looks pretty good. If it gets merged, I can rebase onto it and add the GPU container changes on top of it.

@mayankchhabra (Member)

Awesome! Thanks for working on this @edgar971. I'm testing this now.

@mayankchhabra (Member) left a review comment

Great work @edgar971! I left some suggestions.

Review threads: api/run.sh (resolved), api/Dockerfile (outdated, resolved), api/run.sh (resolved), docker-compose.yml (outdated, resolved)
edgar971 and others added 2 commits August 18, 2023 11:41
Co-authored-by: Mayank Chhabra <mayankchhabra9@gmail.com>
@todaywasawesome (Contributor)

@edgar971 @mayankchhabra how well does the model perform if it's available over the network vs. locally? I'm wondering if it's worth mounting a network share with lots of models, or if storage should always be local to the node where it's running.

@mayankchhabra (Member)

@todaywasawesome, I've not tried a setup like that, so I'm not sure. But as long as the model is locked into RAM on start (currently enabled for the 13B and 70B models in their docker-compose.yml files), then, in theory, only the initial loading time should be bandwidth-bound by the local network, and inference speed should be unaffected.
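As a rough sketch, that setting can be supplied through the environment for a llama-cpp-python based server; the image name, model path, and the USE_MLOCK variable name below are assumptions for illustration, not this repo's exact files:

```yaml
# Illustrative compose fragment only; names and values are assumptions.
services:
  llama-gpt-api:
    image: llama-gpt-api
    environment:
      MODEL: /models/llama-2-13b-chat.bin  # model selected via env var
      USE_MLOCK: "1"                       # lock the model into RAM after the initial load
    volumes:
      - ./models:/models
```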

@todaywasawesome (Contributor)

@mayankchhabra when it runs a model, does it always load it all into memory? In that case network storage should be fine.

@mayankchhabra (Member)

Yep!

@edgar971 (Contributor, Author)

The REST API is fast. I think llama.cpp has a streaming option instead of just waiting for the full response.

@notocwilson

> ...I'm wondering if it's worth mounting a network share with lots of models, or if storage should always be local to the node where it's running.

@todaywasawesome FWIW, if you're looking to do this, you should be able to modify the compose file to redefine the model volume as a network share. I was planning on exposing mine over NFS once this change is merged, as my compute nodes don't have a lot of local storage.
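As a rough illustration, a compose-level volume backed by NFS might look something like the fragment below; the server address and export path are placeholders, not values from this project:

```yaml
# Hypothetical fragment: back the /models mount with an NFS export.
services:
  llama-gpt-api:
    volumes:
      - models:/models

volumes:
  models:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.10,ro,nfsvers=4
      device: ":/export/models"
```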

@edgar971 (Contributor, Author)

> ...I'm wondering if it's worth mounting a network share with lots of models, or if storage should always be local to the node where it's running.
>
> @todaywasawesome FWIW, if you're looking to do this, you should be able to modify the compose file to redefine the model volume as a network share. I was planning on exposing mine over NFS once this change is merged, as my compute nodes don't have a lot of local storage.

Are you able to mount the EFS to ./models and then use that as the docker compose mount?

@todaywasawesome (Contributor)

@edgar971 For Kubernetes, yeah, we can just mount the external file share to that folder and we're done.
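For example, a rough sketch of an NFS-backed PersistentVolume and claim that a pod could then mount at /models; the server address, export path, and sizes are made-up examples, not from this project:

```yaml
# Hypothetical manifests for mounting a shared model store in Kubernetes.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: llama-models
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadOnlyMany
  nfs:
    server: 192.168.1.10
    path: /export/models
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: llama-models
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi
```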

@edgar971 (Contributor, Author)

Perfect, any changes for the PR @mayankchhabra?

@mayankchhabra (Member)

Awesome, just tested and everything looks good! Thanks again for your work on this @edgar971!

@mayankchhabra merged commit 4dc2369 into getumbrel:master on Aug 21, 2023