Merge pull request #578 from threefoldtech/development_llama
added ai llama cpu guide
Showing 9 changed files with 121 additions and 4 deletions.
6 changes: 6 additions & 0 deletions
6
...mentation/system_administrators/advanced/ai_ml_workloads/ai_ml_workloads_toc.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# AI & ML Workloads | ||
|
||
<h2>Table of Contents</h2> | ||
|
||
- [CPU and Llama](./cpu_and_llama.md) | ||
- [GPU and Pytorch](./gpu_and_pytorch.md) |
105 changes: 105 additions & 0 deletions
105
src/documentation/system_administrators/advanced/ai_ml_workloads/cpu_and_llama.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
<h1> AI & ML Workloads: CPU and Llama </h1> | ||
|
||
<h2>Table of Contents</h2> | ||
|
||
- [Introduction](#introduction) | ||
- [Prerequisites](#prerequisites) | ||
- [Deploy a Full VM](#deploy-a-full-vm) | ||
- [Preparing the VM](#preparing-the-vm) | ||
- [Setting OpenWebUI](#setting-openwebui) | ||
- [Pull a Model](#pull-a-model) | ||
- [Using Llama](#using-llama) | ||
- [References](#references) | ||
|
||
--- | ||
|
||
## Introduction | ||
|
||
This guide shows how to deploy a large language model (LLM) on the ThreeFold Grid using only CPU. We will deploy Llama on a full VM using OpenWebUI bundled with Ollama support. | ||
|
||
Llama is a large language model trained by Meta AI. It is an open-source model, meaning that it is free to use and customize for various applications. It is designed for conversational use, allowing users to engage in natural-sounding conversations. Llama is trained on a massive dataset of text from the internet and can generate responses on a wide range of topics and questions. | ||
|
||
Ollama is an open-source project that allows users to run large language models (LLMs) on their local machine. | ||
|
||
OpenWebUI is one of many front ends for Ollama, providing a convenient and user-friendly way to load weights and chat with the model. | ||
|
||
## Prerequisites | ||
|
||
- [A TFChain account](../../../dashboard/wallet_connector.md) | ||
- TFT in your TFChain account | ||
- [Buy TFT](../../../threefold_token/buy_sell_tft/buy_sell_tft.md) | ||
- [Send TFT to TFChain](../../../threefold_token/tft_bridges/tfchain_stellar_bridge.md) | ||
|
||
## Deploy a Full VM | ||
|
||
We start by deploying a full VM on the ThreeFold Dashboard. The more cores we assign to the machine, the faster the model will respond. | ||
|
||
* On the [ThreeFold Dashboard](https://dashboard.grid.tf/#/), go to the [full virtual machine deployment page](https://dashboard.grid.tf/#/deploy/virtual-machines/full-virtual-machine/) | ||
* Deploy a full VM (Ubuntu 22.04) with only `Wireguard` as the network | ||
* Vcores: 8 vcores | ||
* MB of RAM: 4096 MB | ||
* GB of storage: 100 GB | ||
* After deployment, [set the Wireguard configurations](../../getstarted/ssh_guide/ssh_wireguard.md) | ||
* Connect to the VM via SSH | ||
* ``` | ||
ssh root@VM_Wireguard_Address | ||
``` | ||
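Before opening the SSH session, it can help to confirm that the VM answers over the WireGuard tunnel. The following is a minimal sketch, not part of the original guide: `VM_IP`, `vm_reachable` and `ssh_command` are hypothetical names, the default address is the example address used later in this guide, and the `ping` flags assume a Linux workstation.

```shell
#!/usr/bin/env bash
# Sketch: check that the VM is reachable over WireGuard before using SSH.
# VM_IP is a placeholder; replace it with your VM's WireGuard address.
VM_IP="${VM_IP:-10.20.4.2}"

# Return 0 if the host answers one ping within 2 seconds (Linux ping flags).
vm_reachable() {
  ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Print the SSH command for the root user on the given address.
ssh_command() {
  printf 'ssh root@%s\n' "$1"
}

if vm_reachable "$VM_IP"; then
  echo "VM is reachable; connect with: $(ssh_command "$VM_IP")"
else
  echo "No reply from $VM_IP; check that the WireGuard tunnel is up (for example with 'wg show')."
fi
```

If the check fails, the usual suspects are a WireGuard configuration that was not activated or an address copied from a different deployment.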
|
||
## Preparing the VM | ||
|
||
We prepare the full VM to run Llama. | ||
|
||
* Install Docker | ||
* ``` | ||
wget -O docker.sh get.docker.com | ||
bash docker.sh | ||
``` | ||
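After the installation script finishes, a quick check confirms that the Docker CLI is present and that the daemon responds. This is a sketch under the assumption of a systemd-based Ubuntu VM; the function name `docker_status` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report whether Docker is missing, installed, or running.

docker_status() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "missing"      # the docker CLI is not on PATH
  elif docker info >/dev/null 2>&1; then
    echo "running"      # CLI present and the daemon answered
  else
    echo "installed"    # CLI present, but the daemon did not answer
  fi
}

case "$(docker_status)" in
  running)   docker --version ;;
  installed) echo "Docker is installed but the daemon is not responding; try: systemctl start docker" ;;
  missing)   echo "Docker was not found; re-run the installation script." ;;
esac
```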
|
||
## Setting OpenWebUI | ||
|
||
We now install OpenWebUI with bundled Ollama support. Note that you might need to use a port other than `3000` if this port is already in use on the machine running Docker. | ||
|
||
* For CPU only | ||
``` | ||
docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama | ||
``` | ||
* Once the container is fully loaded and running, go to your browser to access OpenWebUI using the Wireguard address: | ||
* ``` | ||
10.20.4.2:3000 | ||
``` | ||
|
||
You should now see the OpenWebUI page. You can register by entering your email and setting a password. This information will stay on the machine running OpenWebUI. | ||
|
||
<p align="center"> | ||
<img src="./img/openwebui_page.png" /> | ||
</p> | ||
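If the page does not load, or if port `3000` is taken, the sketch below may help. It prints the same `docker run` command with a configurable host port and checks whether anything answers over HTTP. The variables `HOST_PORT` and `VM_IP` and the two function names are hypothetical; the container port `8080` and the image name come from the command above.

```shell
#!/usr/bin/env bash
# Sketch: build the docker run command for a chosen host port and
# check whether OpenWebUI answers on that port.
HOST_PORT="${HOST_PORT:-3000}"   # host-side port; change it if 3000 is taken
VM_IP="${VM_IP:-10.20.4.2}"      # example WireGuard address from this guide

# Print the docker run command from this section with a configurable host port.
webui_run_command() {
  printf 'docker run -d -p %s:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama\n' "$1"
}

# Return 0 if an HTTP server answers at the given address and port.
webui_up() {
  curl -fsS -o /dev/null --max-time 5 "http://$1:$2"
}

webui_run_command "$HOST_PORT"
if webui_up "$VM_IP" "$HOST_PORT"; then
  echo "OpenWebUI is answering at http://$VM_IP:$HOST_PORT"
else
  echo "No answer yet; inspect the container with: docker logs open-webui"
fi
```

The first start can take a while because the image is large, so a failed check shortly after `docker run` does not necessarily indicate a problem.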
|
||
## Pull a Model | ||
|
||
Once you've accessed OpenWebUI, you need to download an LLM before using it. | ||
|
||
- Click on the bottom left button displaying your username | ||
- Click on `Settings`, then `Admin Settings` and `Models` | ||
- Under `Pull a model from Ollama.com`, enter the LLM model you want to use | ||
- In our case we will use `llama3` | ||
- Click on the button on the right to pull the model | ||
|
||
![](./img/openwebui_model.png) | ||
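The same pull can likely be done from the VM's terminal. The sketch below assumes that the bundled image ships the `ollama` CLI inside the container, which this guide does not confirm; `CONTAINER`, `MODEL` and `pull_command` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: pull a model from the command line instead of the web interface.
# Assumption: the `ollama` CLI is available inside the open-webui container.
CONTAINER="${CONTAINER:-open-webui}"
MODEL="${MODEL:-llama3}"

# Print the command that pulls the given model inside the given container.
pull_command() {
  printf 'docker exec %s ollama pull %s\n' "$1" "$2"
}

if command -v docker >/dev/null 2>&1 &&
   docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
  docker exec "$CONTAINER" ollama pull "$MODEL"   # download the model
  docker exec "$CONTAINER" ollama list            # list the models now available
else
  echo "Container $CONTAINER is not running; once it is, run: $(pull_command "$CONTAINER" "$MODEL")"
fi
```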
|
||
## Using Llama | ||
|
||
Let's now use Llama! | ||
|
||
- Click on `New Chat` in the top left corner | ||
- Click on `Select a model` and select the model you downloaded | ||
- You can click on `Set as default` for convenience | ||
|
||
![](./img/openwebui_set_model.png) | ||
|
||
- You can now `Send a Message` to Llama and interact with it! | ||
|
||
That's it. You now have a running LLM instance on the grid. | ||
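For a quick test without the browser, a single prompt can be sent from the VM's terminal. As before, this is a sketch that assumes the `ollama` CLI is available inside the container and that the model was already pulled; `PROMPT` and `ask_command` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: send one prompt to the model from the terminal.
# Assumption: the `ollama` CLI is available inside the open-webui container.
CONTAINER="${CONTAINER:-open-webui}"
MODEL="${MODEL:-llama3}"
PROMPT="${PROMPT:-Explain what a virtual machine is in one sentence.}"

# Print the command that sends a one-off prompt to the model.
ask_command() {
  printf 'docker exec %s ollama run %s "%s"\n' "$1" "$2" "$3"
}

if command -v docker >/dev/null 2>&1 &&
   docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
  # Response time on CPU depends on the number of cores assigned to the VM.
  command -v nproc >/dev/null 2>&1 && echo "vCPUs available: $(nproc)"
  docker exec "$CONTAINER" ollama run "$MODEL" "$PROMPT"
else
  echo "Container $CONTAINER is not running; once it is, run: $(ask_command "$CONTAINER" "$MODEL" "$PROMPT")"
fi
```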
|
||
## References | ||
|
||
For any advanced configurations, you may refer to the [OpenWebUI documentation](https://github.com/open-webui/open-webui). |
2 changes: 1 addition & 1 deletion
2
...dministrators/advanced/ai_ml_workloads.md → ...vanced/ai_ml_workloads/gpu_and_pytorch.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
<h1> AI & ML Workloads </h1> | ||
<h1> AI & ML Workloads: GPU and Pytorch</h1> | ||
|
||
<h2> Table of Contents </h2> | ||
|
||
|
Binary file added
BIN
+161 KB
...entation/system_administrators/advanced/ai_ml_workloads/img/openwebui_model.png
Binary file added
BIN
+23.1 KB
...mentation/system_administrators/advanced/ai_ml_workloads/img/openwebui_page.png
Binary file added
BIN
+67 KB
...tion/system_administrators/advanced/ai_ml_workloads/img/openwebui_set_model.png