Localized Multimodal Large Language Model (MLLM) integrated with Streamlit and Ollama for text and image processing tasks.

Multimodal-Large-Language-Model (MLLM)

Thank you for checking out the Multimodal-Large-Language-Model project. Please note that this project was created for research purposes.

For a more robust and well-developed solution, you may consider using open-webui/open-webui with ollama/ollama.

Demo image

Documentation

You can access the project documentation at [GitHub Pages].

Host requirements

  • Docker: [Installation Guide]
  • Docker Compose: [Installation Guide]
  • Compatible with Linux and Windows hosts
  • Ensure ports 8501 and 11434 are not already in use (a quick check is shown after this list)
  • You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models. [Source]
  • The project can be run on either CPU or GPU
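
A minimal check on a Linux host to confirm nothing is already listening on those ports (Windows hosts can use netstat -an instead); no output means the ports are free:

ss -ltn | grep -E ':(8501|11434)'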

Running on GPU
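
GPU passthrough with Docker Compose generally assumes the NVIDIA driver and the NVIDIA Container Toolkit are installed on the host. A quick way to confirm Docker can see the GPU (the CUDA image tag below is only an example):

docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi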

Tested Model(s)

Model Name | Size  | Link
llava:7b   | 4.7GB | Link
llava:34b  | 20GB  | Link

Llava is pulled and loaded by default; other models from Ollama can be added in ollama/ollama-build.sh, as sketched below.
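
For example, assuming the build script issues ollama pull commands, another tested model could be added with a line like:

ollama pull llava:34b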

Usage

Note

The project runs on GPU by default. To run on CPU, use docker-compose.cpu.yml instead (see the example after this note).
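
A minimal sketch, assuming the standard Docker Compose -f file selection:

docker-compose -f docker-compose.cpu.yml build
docker-compose -f docker-compose.cpu.yml up -d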

  1. Clone this repository and navigate to the project folder:
git clone https://github.com/NotYuSheng/Multimodal-Large-Language-Model.git
cd Multimodal-Large-Language-Model
  2. Build the Docker images:
docker-compose build
  3. Run the images:
docker-compose up -d
  4. Access the Streamlit webpage from the host (a reachability check is shown after these steps):
<host-ip>:8501
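
If the page does not load, a quick reachability check from the host (replace <host-ip> with the machine's address):

curl -I http://<host-ip>:8501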

API calls to the Ollama server can be made to:

<host-ip>:11434
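
As a sketch, a simple generation request against the Ollama HTTP API could look like this (llava:7b is the model pulled by default; the images field accepts base64-encoded image data for multimodal prompts):

curl http://<host-ip>:11434/api/generate -d '{
  "model": "llava:7b",
  "prompt": "Describe what this project does.",
  "stream": false
}'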