This repository hosts a self-contained, OpenAI API-compatible inference server for open-source Large Language Models (LLMs), along with AutoGen agents capable of advanced function calling.
- Unix-based system for Ollama installation.
- Python environment.
- Clone the repository: `git clone git@github.com:Jawabreh0/CyprusCodes_LLM.git`
- Install dependencies: `pip install -r requirements.txt`
- Install Ollama (Unix only): see the Ollama Installation Guide.
- Install a preferred LLM (e.g., Mistral 7B): see the Mistral 7B installation instructions.
- Configure the LLM path: update the model path in `/CyprusCodes_LLM/inference_server/conversation.py` (line 18).
- Start the server: `python /CyprusCodes_LLM/inference_server/main.py`
- Access the server documentation: see the Server API Documentation.
- Load the model: make a request to load the model using the command in `/CyprusCodes_LLM/inference_server/load-model-command.txt`, making sure to change the model path first.
- Converse with the LLM without function calling: `python /CyprusCodes_LLM/inference_server/conversation.py`
- Converse with function calling: `python /CyprusCodes_LLM/agent_function_calling/main.py`
- Customize the external system adaptor: modify `/CyprusCodes_LLM/agents_function_calling/flight_adaptor.py` for your specific use case.
- Tailor the agent scripts: update `agent_engineer.py` and `agent_expert.py` to match your system adaptor requirements.
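Once the server is running and the model is loaded, any OpenAI-style client can talk to it. The sketch below builds a chat completion request with only the standard library; the host, port, endpoint path, and model name are assumptions about a typical local setup — check the Server API Documentation for the actual values.

```python
import json
import urllib.request

# Assumed server address and model name -- adjust to your setup.
SERVER_URL = "http://localhost:8000/v1/chat/completions"  # hypothetical endpoint
MODEL_NAME = "mistral"  # hypothetical model identifier


def build_chat_request(user_message: str) -> dict:
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": MODEL_NAME,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
    }


def send_chat_request(payload: dict) -> dict:
    """POST the payload to the inference server and return the parsed reply."""
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires the server from the steps above to be running.
    reply = send_chat_request(build_chat_request("Hello!"))
    print(reply["choices"][0]["message"]["content"])
```

Because the payload shape matches the OpenAI chat format, the same client code works whether it points at this server or at OpenAI itself.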
- Adaptor: an abstraction layer for external systems, e.g., fetching data from `flight_data.json`.
- Agent_Engineer: defines the functions to be executed, including their JSON schema signatures and system messages.
- Agent_Expert: determines the values to be passed into the functions.
- Agent_User: acts as a proxy for user interactions.
- Agent_Utils: manages termination protocols.
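The adaptor's role is easiest to see in a toy version. The sketch below mimics a `flight_adaptor.py`-style helper; the function names, record fields, and sample data are all hypothetical — the real adaptor in the repository may look different.

```python
import json
import tempfile


def load_flights(path: str) -> list:
    """Read flight records from a JSON file (the external system)."""
    with open(path) as f:
        return json.load(f)


def search_flights(path: str, destination: str) -> list:
    """Adaptor function an agent can call: filter flights by destination."""
    return [f for f in load_flights(path) if f["destination"] == destination]


# Hypothetical sample data standing in for flight_data.json.
sample = [
    {"flight": "CY101", "destination": "Larnaca"},
    {"flight": "CY202", "destination": "Istanbul"},
]
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(sample, tmp)
    data_path = tmp.name

print(search_flights(data_path, "Larnaca"))
```

Keeping the external-system details behind small functions like `search_flights` is what lets Agent_Engineer expose them as JSON-schema tools without the agents knowing where the data lives.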
AutoGen provides a multi-agent conversation framework as a high-level abstraction, with which one can conveniently build LLM workflows. It offers a collection of working systems spanning a wide range of applications, domains, and complexities, and it supports enhanced LLM inference APIs that can improve inference performance and reduce cost.

However, AutoGen only speaks the OpenAI API, so our inference server must be OpenAI API-compatible.
AutoGen enables next-gen LLM applications through a generic multi-agent conversation framework. It offers customizable, conversable agents that integrate LLMs, tools, and humans. By automating chat among multiple capable agents, one can easily make them perform tasks collectively, either autonomously or with human feedback.
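Pointing AutoGen's agents at the local server instead of OpenAI mostly comes down to the LLM config. A minimal sketch is below; the URL, port, model name, and placeholder API key are assumptions about your local setup, and the agent classes come from the `pyautogen` package (newer versions use `base_url`; older ones used `api_base`).

```python
# Assumed local endpoint served by the inference server -- adjust as needed.
config_list = [
    {
        "model": "mistral",                      # hypothetical local model name
        "base_url": "http://localhost:8000/v1",  # hypothetical server address
        "api_key": "not-needed",                 # a local server ignores the key
    }
]
llm_config = {"config_list": config_list, "temperature": 0.7}

if __name__ == "__main__":
    # Requires `pip install pyautogen` and a running inference server.
    from autogen import AssistantAgent, UserProxyAgent

    assistant = AssistantAgent("assistant", llm_config=llm_config)
    user = UserProxyAgent(
        "user", human_input_mode="NEVER", code_execution_config=False
    )
    user.initiate_chat(assistant, message="Hello!")
```

Because AutoGen routes all model calls through this config, no other agent code needs to change when swapping OpenAI for the local server.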