
Self host LLM use case (initial code) #1442

Merged
2 commits merged into cloud-barista:main on Feb 19, 2024

Conversation

seokho-son
Member

  • Adds a use case for easily deploying a self-hosted LLM.
  • Runnable with the following spec and image set:
    • AWS, aws-us-east-2, ami-0c616d2c080a12072, Ubuntu 20.04
    • AWS, aws-us-east-2, g5.2xlarge

LLM Service Scripts

This document describes a set of scripts designed to manage an LLM (Large Language Model) service. These scripts facilitate starting the service (startServer.sh), checking its status (statusServer.sh), and stopping the service (stopServer.sh).

Prerequisites

  • A Linux-based system
  • Python 3 installed
  • pip (Python package manager)
  • sudo privileges or root access

Installation & Setup

1. Start Server

Starts the LLM service by installing the necessary Python packages and running a FastAPI-based service in the background.

sudo ./startServer.sh

To download the startServer.sh script and make it executable:

wget https://example.com/path/to/startServer.sh
chmod +x startServer.sh
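
As a rough illustration (not the actual script in this PR), startServer.sh could look like the sketch below. The package list and the llmServer:app module/app name are assumptions; port 5001 and the ~/llm_nohup.out log path come from the notes later in this description.

#!/bin/bash
# Hypothetical sketch of startServer.sh: install Python dependencies and
# launch the FastAPI app with Uvicorn in the background.

# Package list is an assumption; the real script may install different packages.
pip3 install fastapi uvicorn

# "llmServer:app" is a placeholder module/app name. Port 5001 and the log path
# ~/llm_nohup.out match the notes in this PR description.
nohup python3 -m uvicorn llmServer:app --host 0.0.0.0 --port 5001 > ~/llm_nohup.out 2>&1 &

echo "LLM service started. Logs: ~/llm_nohup.out"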

2. Check Server Status

Checks whether the LLM service is running. If it is, the script prints the recent log output.

./statusServer.sh

To download the statusServer.sh script and make it executable:

wget https://example.com/path/to/statusServer.sh
chmod +x statusServer.sh
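
A minimal sketch of what statusServer.sh might do, assuming the service runs under Uvicorn as described in the notes (the process match pattern is an assumption):

#!/bin/bash
# Hypothetical sketch of statusServer.sh: check for a running Uvicorn process
# and print recent log output if one is found.

LOG_FILE="$HOME/llm_nohup.out"   # default log path noted in this PR

if pgrep -f uvicorn > /dev/null; then
    echo "LLM service is running."
    echo "--- recent logs ($LOG_FILE) ---"
    tail -n 20 "$LOG_FILE"
else
    echo "LLM service is not running."
fi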

3. Stop Server

Stops the running LLM service by safely terminating all related processes.

./stopServer.sh

To download the stopServer.sh script and make it executable:

wget https://example.com/path/to/stopServer.sh
chmod +x stopServer.sh
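
A minimal sketch of what stopServer.sh might do, assuming the service runs under Uvicorn (the process match pattern and the graceful-then-forceful kill sequence are assumptions):

#!/bin/bash
# Hypothetical sketch of stopServer.sh: terminate the Uvicorn processes for
# the LLM service, trying a graceful SIGTERM before a forced kill.

pkill -f uvicorn && echo "Sent SIGTERM to LLM service processes."

# Give the processes a moment to exit, then force-kill any that remain.
sleep 3
pkill -9 -f uvicorn 2>/dev/null

echo "LLM service stopped."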

Testing the Server

Once the server is running, you can test the LLM service with the following curl command, which sends a text generation request to verify that the service is operational.

curl -X POST http://{PUBLICIP}:5001/v1/generateText \
     -H "Content-Type: application/json" \
     -d '{"prompt": "Who is president of US?"}'

Replace {PUBLICIP} with the public IP address of the server where the LLM service is running.

Notes

  • These scripts operate a Python-based LLM service using FastAPI and Uvicorn.
  • Service logs are saved to ~/llm_nohup.out by default.
  • Server testing uses the server's public IP address and port 5001.
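
For example, to follow the service log while the server is running (log path taken from the notes above):

tail -f ~/llm_nohup.out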

ACK

Thanks for the useful guide to deploying a self-hosted LLM.

@seokho-son
Member Author

/approve

@github-actions github-actions bot added the approved This PR is approved and will be merged soon. label Feb 19, 2024
@cb-github-robot cb-github-robot merged commit 9b3efeb into cloud-barista:main Feb 19, 2024
1 of 2 checks passed