Merge pull request #578 from threefoldtech/development_llama
added ai llama cpu guide
khaledyoussef24 committed Jun 27, 2024
2 parents 3a21151 + aed432e commit a1813cc
Showing 9 changed files with 121 additions and 4 deletions.
4 changes: 3 additions & 1 deletion src/SUMMARY.md
@@ -268,7 +268,9 @@
 - [IPFS on a Micro VM](documentation/system_administrators/advanced/ipfs/ipfs_microvm.md)
 - [MinIO Operator with Helm3](documentation/system_administrators/advanced/minio_helm3.md)
 - [Hummingbot](documentation/system_administrators/advanced/hummingbot.md)
-- [AI & ML Workloads](documentation/system_administrators/advanced/ai_ml_workloads.md)
+- [AI & ML Workloads](documentation/system_administrators/advanced/ai_ml_workloads/ai_ml_workloads_toc.md)
+- [CPU and Llama](documentation/system_administrators/advanced/ai_ml_workloads/cpu_and_llama.md)
+- [GPU and Pytorch](documentation/system_administrators/advanced/ai_ml_workloads/gpu_and_pytorch.md)
 - [Ecommerce](documentation/system_administrators/advanced/ecommerce/ecommerce.md)
 - [WooCommerce](documentation/system_administrators/advanced/ecommerce/woocommerce.md)
 - [nopCommerce](documentation/system_administrators/advanced/ecommerce/nopcommerce.md)
4 changes: 3 additions & 1 deletion src/documentation/system_administrators/advanced/advanced.md
@@ -14,7 +14,9 @@ In this section, we delve into sophisticated topics and powerful functionalities
 - [IPFS on a Full VM](./ipfs/ipfs_fullvm.md)
 - [IPFS on a Micro VM](./ipfs/ipfs_microvm.md)
 - [Hummingbot](./hummingbot.md)
-- [AI & ML Workloads](./ai_ml_workloads.md)
+- [AI & ML Workloads](./ai_ml_workloads/ai_ml_workloads_toc.md)
+- [CPU and Llama](./ai_ml_workloads/cpu_and_llama.md)
+- [GPU and Pytorch](./ai_ml_workloads/gpu_and_pytorch.md)
 - [Ecommerce](./ecommerce/ecommerce.md)
 - [WooCommerce](./ecommerce/woocommerce.md)
 - [nopCommerce](./ecommerce/nopcommerce.md)
6 changes: 6 additions & 0 deletions src/documentation/system_administrators/advanced/ai_ml_workloads/ai_ml_workloads_toc.md
@@ -0,0 +1,6 @@
# AI & ML Workloads

<h2>Table of Contents</h2>

- [CPU and Llama](./cpu_and_llama.md)
- [GPU and Pytorch](./gpu_and_pytorch.md)
105 changes: 105 additions & 0 deletions src/documentation/system_administrators/advanced/ai_ml_workloads/cpu_and_llama.md
@@ -0,0 +1,105 @@
<h1> AI & ML Workloads: CPU and Llama </h1>

<h2>Table of Contents</h2>

- [Introduction](#introduction)
- [Prerequisites](#prerequisites)
- [Deploy a Full VM](#deploy-a-full-vm)
- [Preparing the VM](#preparing-the-vm)
- [Setting OpenWebUI](#setting-openwebui)
- [Pull a Model](#pull-a-model)
- [Using Llama](#using-llama)
- [References](#references)

---

## Introduction

We present a simple guide on how to deploy large language models on the grid using only a CPU. For this guide, we will deploy Llama on a full VM using OpenWebUI bundled with Ollama support.

Llama is a large language model trained by Meta AI. Its weights are openly available, meaning the model is free to use and customize for various applications. Llama is designed for conversational use, allowing users to engage in natural-sounding exchanges. It is trained on a massive dataset of text from the internet and can generate responses on a wide range of topics and questions.

Ollama is an open-source project that allows users to run large language models (LLMs) on their local machine.

OpenWebUI is one of many front ends for Ollama, providing a convenient and user-friendly way to load model weights and chat with the bot.

## Prerequisites

- [A TFChain account](../../../dashboard/wallet_connector.md)
- TFT in your TFChain account
- [Buy TFT](../../../threefold_token/buy_sell_tft/buy_sell_tft.md)
- [Send TFT to TFChain](../../../threefold_token/tft_bridges/tfchain_stellar_bridge.md)

## Deploy a Full VM

We start by deploying a full VM on the ThreeFold Dashboard. The more cores we assign to the machine, the faster the model will run.

* On the [Threefold Dashboard](https://dashboard.grid.tf/#/), go to the [full virtual machine deployment page](https://dashboard.grid.tf/#/deploy/virtual-machines/full-virtual-machine/)
* Deploy a full VM (Ubuntu 22.04) with only `Wireguard` as the network
* Vcores: 8 vcores
* MB of RAM: 4096 MB
* GB of storage: 100 GB
* After deployment, [set the Wireguard configurations](../../getstarted/ssh_guide/ssh_wireguard.md)
* Connect to the VM via SSH
* ```
ssh root@VM_Wireguard_Address
```
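
The last two steps assume a working Wireguard tunnel on your local machine. Here is a minimal sketch, assuming you saved the configuration file shown by the dashboard as `wg.conf`:

```
# Copy the dashboard's Wireguard config into place (the file name is an example)
sudo cp wg.conf /etc/wireguard/wg.conf

# Bring the tunnel up
sudo wg-quick up wg

# Check that the VM answers over the tunnel (replace with your VM's Wireguard address)
ping 10.20.4.2
```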

## Preparing the VM

We prepare the full VM to run Llama.

* Install Docker
* ```
wget -O docker.sh https://get.docker.com
bash docker.sh
```
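
To confirm that Docker was installed properly before moving on, you can run a quick check:

```
# Print the installed Docker version
docker --version

# Run a disposable test container; it prints a greeting and exits
docker run --rm hello-world
```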

## Setting OpenWebUI

We now install OpenWebUI with bundled Ollama support. Note that you might need to use a port other than `3000` if that port is already in use on the VM. A short verification sketch follows the steps below.

* For CPU only
```
docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
```
* Once the container is fully loaded and running, go to your browser to access OpenWebUI using the Wireguard address:
* ```
10.20.4.2:3000
```
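
Before opening the browser, you can verify on the VM that the container is up and has finished loading:

```
# The open-webui container should show a status of "Up"
docker ps --filter name=open-webui

# Follow the logs until the web server reports that it is listening (Ctrl+C to exit)
docker logs -f open-webui
```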

You should now see the OpenWebUI page. You can register by entering your email and setting a password. This information will stay on the machine running OpenWebUI.

<p align="center">
<img src="./img/openwebui_page.png" />
</p>

## Pull a Model

Once you've accessed OpenWebUI, you need to download an LLM model before using it.

- Click on the bottom left button displaying your username
- Click on `Settings`, then `Admin Settings` and `Models`
- Under `Pull a model from Ollama.com`, enter the LLM model you want to use
- In our case we will use `llama3`
- Click on the button on the right to pull the model

![](./img/openwebui_model.png)
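
As an alternative to the web interface, you can pull models from the VM's command line. This is a sketch that assumes the `ollama` binary is available inside the bundled container, as is the case for the `:ollama` image tag used above:

```
# Pull the model through the Ollama instance bundled in the container
docker exec -it open-webui ollama pull llama3

# List the models now available locally
docker exec -it open-webui ollama list
```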

## Using Llama

Let's now use Llama!

- Click on `New Chat` on the top left corner
- Click on `Select a model` and select the model you downloaded
- You can click on `Set as default` for convenience

![](./img/openwebui_set_model.png)

- You can now `Send a Message` to Llama and interact with it!

That's it. You now have a running LLM instance on the grid.
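
Since the container was started with `--restart always`, OpenWebUI will come back up automatically if the VM reboots. The usual Docker commands manage its lifecycle, and the named volumes hold the model weights and your OpenWebUI data, so they survive a container restart or recreation:

```
# Stop and start OpenWebUI without losing data
docker stop open-webui
docker start open-webui

# The model weights and chat data live in the named volumes
docker volume ls
```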

## References

For advanced configuration, you may refer to the [OpenWebUI documentation](https://github.com/open-webui/open-webui).
2 changes: 1 addition & 1 deletion src/documentation/system_administrators/advanced/ai_ml_workloads/gpu_and_pytorch.md
@@ -1,4 +1,4 @@
-<h1> AI & ML Workloads </h1>
+<h1> AI & ML Workloads: GPU and Pytorch</h1>
 
 <h2> Table of Contents </h2>
 
[3 binary files, the guide's screenshots under `img/` (openwebui_page.png, openwebui_model.png, openwebui_set_model.png), cannot be displayed]
@@ -85,7 +85,9 @@ For complementary information on ThreeFold grid and its cloud component, refer to
 - [IPFS on a Full VM](./advanced/ipfs/ipfs_fullvm.md)
 - [IPFS on a Micro VM](./advanced/ipfs/ipfs_microvm.md)
 - [Hummingbot](./advanced/hummingbot.md)
-- [AI & ML Workloads](./advanced/ai_ml_workloads.md)
+- [AI & ML Workloads](./advanced/ai_ml_workloads/ai_ml_workloads_toc.md)
+- [CPU and Llama](./advanced/ai_ml_workloads/cpu_and_llama.md)
+- [GPU and Pytorch](./advanced/ai_ml_workloads/gpu_and_pytorch.md)
 - [Ecommerce](./advanced/ecommerce/ecommerce.md)
 - [WooCommerce](./advanced/ecommerce/woocommerce.md)
 - [nopCommerce](./advanced/ecommerce/nopcommerce.md)
