Edge LLM

Deploy large language models on low power, single-board computers.

Seamlessly deploy LLMs locally on your Jetson Nano, run small models such as TinyLlama, and interact with them through a web UI.


Quickstart

Run the following commands for a quickstart:

git clone https://github.com/n-vlahovic/edge-llm
cd edge-llm
make init

or just

make init

if you have already cloned the repository and changed into its directory.

The WebUI is available at http://<jetson-ip>:<WEBUI_PORT> and the LLM backend at http://<jetson-ip>:<LLM_PORT>.

You can now proceed to logging in.

Getting Started

Requirements

  • NVIDIA Jetson Nano
  • RAM >= 4 GB
  • Storage >= 32 GB (64 GB or more preferred)
  • JetPack SDK

Updating Docker

If you installed JetPack following the official getting-started docs, your docker and docker-compose versions may be somewhat dated.

Running make docker-update will update docker and docker-compose by invoking the ./scripts/update_docker.py script, which (see the sketch after this list):

  • Checks the currently installed docker version
  • Removes the existing docker and docker-compose installations
  • Adds Docker's official GPG key
  • Adds the Docker repository to the Apt sources
  • Installs the Docker packages
  • Enables the docker service
  • Adds the current $USER to the docker user group (a reboot will be necessary)
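
For reference, here is a minimal sketch of what such an update script could look like, assuming it shells out to apt following Docker's official Ubuntu install steps (the actual ./scripts/update_docker.py may differ):

#!/usr/bin/env python3
"""Hypothetical sketch of scripts/update_docker.py - the real script may differ."""
import os
import subprocess


def run(cmd: str) -> None:
    # Echo each command before executing it via the shell
    print(f"+ {cmd}")
    subprocess.run(cmd, shell=True, check=True)


def main() -> None:
    # Remove the existing docker / docker-compose installations
    run("sudo apt-get remove -y docker docker.io docker-compose containerd runc")
    # Add Docker's official GPG key
    run("sudo install -m 0755 -d /etc/apt/keyrings")
    run("curl -fsSL https://download.docker.com/linux/ubuntu/gpg"
        " | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg")
    # Add the Docker repository to the Apt sources
    run('echo "deb [arch=$(dpkg --print-architecture)'
        ' signed-by=/etc/apt/keyrings/docker.gpg]'
        ' https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"'
        " | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null")
    # Install the Docker packages
    run("sudo apt-get update")
    run("sudo apt-get install -y docker-ce docker-ce-cli containerd.io"
        " docker-buildx-plugin docker-compose-plugin")
    # Enable the docker service
    run("sudo systemctl enable docker")
    # Add the current $USER to the docker group (takes effect after a reboot)
    run(f"sudo usermod -aG docker {os.environ['USER']}")


if __name__ == "__main__":
    main()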

Creating a .env File

The docker-compose.yml file reads certain parameters from a .env file. A .env file with the following variables needs to be created:

  • LLM_PORT=<int>: The exposed port (on the Jetson Nano - not the container) of the LLM backend.
  • WEBUI_PORT=<int>: The exposed port (on the Jetson Nano - not the container) of the LLM WebUI.

Here is a possible way to create the .env file:

python3 scripts/check_dotenv.py

or

echo 'LLM_PORT=8000' >> .env && echo 'WEBUI_PORT=3000' >> .env
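
For reference, here is a minimal sketch of what scripts/check_dotenv.py might do, assuming it prompts for any missing variable and appends it to .env (the defaults below are assumptions; the actual script may differ):

#!/usr/bin/env python3
"""Hypothetical sketch of scripts/check_dotenv.py - the real script may differ."""
from pathlib import Path

# Required variables and assumed example defaults
REQUIRED = {"LLM_PORT": "8000", "WEBUI_PORT": "3000"}
ENV_FILE = Path(".env")


def main() -> None:
    # Parse the existing .env file (if any) into a dict
    existing = {}
    if ENV_FILE.exists():
        for line in ENV_FILE.read_text().splitlines():
            if "=" in line and not line.lstrip().startswith("#"):
                key, _, value = line.partition("=")
                existing[key.strip()] = value.strip()
    # Prompt for each missing variable and append it to .env
    with ENV_FILE.open("a") as f:
        for key, default in REQUIRED.items():
            if key not in existing:
                value = input(f"{key} [{default}]: ").strip() or default
                f.write(f"{key}={value}\n")
    print(".env contains all required variables.")


if __name__ == "__main__":
    main()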

Building the Services

To build and start the services, simply run make build, which runs docker compose up --build -d.

Stop or Kill

To stop or kill the services, run make stop or make kill, which run docker compose stop or docker compose down -v respectively.

Pulling Models

Our setup uses Ollama as the backend; see Ollama's model library for the models available out of the box.

Models can be pulled as such:

python3 scripts/pull_model.py -m <model>

alternatively

docker exec -it ollama ollama pull <model>
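
For reference, a minimal sketch of what scripts/pull_model.py might look like, assuming it simply wraps the docker exec invocation above (the actual script may differ):

#!/usr/bin/env python3
"""Hypothetical sketch of scripts/pull_model.py - the real script may differ."""
import argparse
import subprocess


def main() -> None:
    parser = argparse.ArgumentParser(description="Pull a model into the Ollama container.")
    parser.add_argument("-m", "--model", required=True, help="model name, e.g. tinyllama")
    args = parser.parse_args()
    # Delegate to `ollama pull` inside the running ollama container
    subprocess.run(
        ["docker", "exec", "-it", "ollama", "ollama", "pull", args.model],
        check=True,
    )


if __name__ == "__main__":
    main()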

Given the compute and memory limitations of our platform, we should aim to deploy small models, e.g. tinyllama.
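
Once a model is pulled, you can also query the backend directly over HTTP via Ollama's generate endpoint, independently of the WebUI. A minimal sketch, assuming LLM_PORT=8000 from the .env example above, the hostname jetson-nano.local, and a pulled tinyllama model:

#!/usr/bin/env python3
"""Query the Ollama backend directly (host, port and model are assumptions)."""
import json
import urllib.request

URL = "http://jetson-nano.local:8000/api/generate"  # <HOST>:<LLM_PORT>

payload = {
    "model": "tinyllama",             # any model you have pulled
    "prompt": "Why is the sky blue?",
    "stream": False,                  # return a single JSON object
}
req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])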

Logging into the Web-UI

Using any device connected to the same network as your Jetson Nano, navigate to http://<HOST>:<WEBUI_PORT> (HOST is typically the hostname of your Jetson Nano, e.g. jetson-nano.local, or its IP address, which can be determined via ip a).

You should be prompted with a login screen. If you didn't create an account yet, create one via the Sign up link. Please note that this creates a local account. For persistence, the data is saved via docker volumes onto the filesystem in the ./ollama-webui directory (which is excluded from git).

Once logged in, you should see the main UI.

You can now select a model and start prompting - enjoy!
