From ac428aae8a2755a4a1426ee15ea00c6a71a33d47 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 5 Sep 2024 14:04:52 +0700
Subject: [PATCH 01/21] Draft the Platform readme
---
README.md | 2 +-
platform/README.md | 244 +++++++++++++++++++++++++--------------------
2 files changed, 138 insertions(+), 108 deletions(-)
diff --git a/README.md b/README.md
index 6b21c4448..33a1c0e11 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@
Documentation - API Reference
- - Changelog - Bug reports - Discord
+ - Changelog - Bug reports - Discord
> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!
diff --git a/platform/README.md b/platform/README.md
index 660664159..10a191e8e 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -1,138 +1,168 @@
# Cortex
-
+
- Documentation - API Reference
- - Changelog - Bug reports - Discord
+ Documentation - API Reference
+ - Changelog - Bug reports - Discord
-> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!
+> ⚠️ **Cortex Platform is Coming Soon!**
## About
-Cortex is an OpenAI-compatible AI engine that developers can use to build LLM apps. It is packaged with a Docker-inspired command-line interface and client libraries. It can be used as a standalone server or imported as a library.
+Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through an API.
## Cortex Engines
-Cortex supports the following engines:
+Cortex Platform supports the following engines:
- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU.
- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs.
- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference.
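+
+Each engine serves the model variants built for it, and the variant suffix in a model ID selects the engine. For example (illustrative, using variants from the Model Library below):
+
+```bash
+cortex run llama3:gguf   # GGUF variant, served by cortex.llamacpp
+cortex run llama3:onnx   # ONNX variant, served by cortex.onnx
+```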
-## Quicklinks
-
-- [Homepage](https://cortex.so/)
-- [Docs](https://cortex.so/docs/)
-
-## Quickstart
-### Prerequisites
-- **OS**:
- - MacOSX 13.6 or higher.
- - Windows 10 or higher.
- - Ubuntu 22.04 and later.
-- **Dependencies**:
- - **Node.js**: Version 18 and above is required to run the installation.
- - **NPM**: Needed to manage packages.
- - **CPU Instruction Sets**: Available for download from the [Cortex GitHub Releases](https://github.com/janhq/cortex/releases) page.
- - **OpenMPI**: Required for Linux. Install by using the following command:
- ```bash
- sudo apt install openmpi-bin libopenmpi-dev
- ```
-
-> Visit [Quickstart](https://cortex.so/docs/quickstart) to get started.
-
-### NPM
-``` bash
-# Install using NPM
-npm i -g cortexso
-# Run model
-cortex run mistral
-# To uninstall globally using NPM
-npm uninstall -g cortexso
-```
-
-### Homebrew
-``` bash
-# Install using Brew
-brew install cortexso
-# Run model
-cortex run mistral
-# To uninstall using Brew
-brew uninstall cortexso
-```
-> You can also install Cortex using the Cortex Installer available on [GitHub Releases](https://github.com/janhq/cortex/releases).
-
-## Cortex Server
-```bash
-cortex serve
-
-# Output
-# Started server at http://localhost:1337
-# Swagger UI available at http://localhost:1337/api
-```
+## Installation
+### Docker
+**Coming Soon!**
-You can now access the Cortex API server at `http://localhost:1337`,
-and the Swagger UI at `http://localhost:1337/api`.
+### Helm
+**Coming Soon!**
-## Build from Source
+### Yarn
+**Coming Soon!**
-To install Cortex from the source, follow the steps below:
+### Libraries
+**Coming Soon!**
-1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev).
-2. Navigate to the `cortex-js` folder.
-3. Open the terminal and run the following command to build the Cortex project:
+### Build from Source
+**Coming Soon!**
+## Quickstart
+**Coming Soon!**
+
+## Model Library
+Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
+
+Here are examples of models that you can use based on each supported engine:
+### `llama.cpp`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|------------------|-------------------|------------------------------------|
+| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` |
+| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` |
+| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` |
+| llama3 | gguf | 8B | `cortex run llama3:gguf` |
+| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` |
+| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` |
+| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` |
+| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`|
+| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` |
+| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` |
+| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` |
+| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` |
+### `ONNX`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|------------------|-------------------|------------------------------------|
+| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` |
+| llama3 | onnx | 8B | `cortex run llama3:onnx` |
+| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` |
+| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`|
+| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` |
+| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` |
+### `TensorRT-LLM`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|-------------------------------|-------------------|------------------------------------|
+| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` |
+| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` |
+| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada`|
+| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` |
+| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`|
+| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` |
+| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`|
+| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` |
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`|
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`|
+| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`|
+
+> **Note**:
+> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
+
+## Cortex Platform API
+Cortex Platform provides a stateful API that runs at `localhost:1337`.
+
+### Create Message
```bash
-npx nest build
+curl --request POST \
+ --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages \
+ --header 'Content-Type: application/json' \
+ --data '{
+ "role": "user",
+ "content": "Tell me a joke"
+}'
```
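+
+For illustration, the messages in a thread could then be listed with the matching `GET` route (assuming the API mirrors OpenAI's List Messages endpoint):
+
+```bash
+curl --request GET \
+  --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages
+```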
-4. Make the `command.js` executable:
-
+### Create Assistant
```bash
-chmod +x '[path-to]/cortex/cortex-js/dist/src/command.js'
+curl --request POST \
+ --url http://127.0.0.1:1337/v1/assistants \
+ --header 'Content-Type: application/json' \
+ --data '{
+ "id": "jan",
+ "avatar": "",
+ "name": "Jan",
+ "description": "A default assistant that can use all downloaded models",
+ "model": "",
+ "instructions": "",
+ "tools": [],
+ "metadata": {},
+ "top_p": "0.7",
+ "temperature": "0.7"
+}'
```
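+
+The assistant could then be fetched back by its ID (a sketch, assuming an OpenAI-style Retrieve Assistant route):
+
+```bash
+curl --request GET \
+  --url http://127.0.0.1:1337/v1/assistants/jan
+```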
-5. Link the package globally:
-
+### Create Thread
```bash
-npm link
+curl --request POST \
+ --url http://127.0.0.1:1337/v1/threads \
+ --header 'Content-Type: application/json' \
+ --data '{
+ "assistants": [
+ {
+ "id": "thread_123",
+ "avatar": "https://example.com/avatar.png",
+ "name": "Virtual Helper",
+ "model": "mistral",
+ "instructions": "Assist with customer queries and provide information based on the company database.",
+ "tools": [
+ {
+ "name": "Knowledge Retrieval",
+ "settings": {
+ "source": "internal",
+ "endpoint": "https://api.example.com/knowledge"
+ }
+ }
+ ],
+ "description": "This assistant helps with customer support by retrieving relevant information.",
+ "metadata": {
+ "department": "support",
+ "version": "1.0"
+ },
+ "object": "assistant",
+ "temperature": 0.7,
+ "top_p": 0.9,
+ "created_at": 1622470423,
+ "response_format": {
+ "format": "json"
+ },
+ "tool_resources": {
+ "resources": [
+ "database1",
+ "database2"
+ ]
+ }
+ }
+ ]
+}'
```
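+
+Similarly, a created thread could be retrieved by its ID (again assuming the OpenAI-compatible shape):
+
+```bash
+curl --request GET \
+  --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__
+```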
-## Cortex CLI Commands
-
-The following CLI commands are currently available.
-See [CLI Reference Docs](https://cortex.so/docs/cli) for more information.
-
-```bash
-
- serve Providing API endpoint for Cortex backend.
- chat Send a chat request to a model.
- init|setup Init settings and download cortex's dependencies.
- ps Show running models and their status.
- kill Kill running cortex processes.
- pull|download Download a model. Working with HuggingFace model id.
- run [options] EXPERIMENTAL: Shortcut to start a model and chat.
- models Subcommands for managing models.
- models list List all available models.
- models pull Download a specified model.
- models remove Delete a specified model.
- models get Retrieve the configuration of a specified model.
- models start Start a specified model.
- models stop Stop a specified model.
- models update Update the configuration of a specified model.
- benchmark Benchmark and analyze the performance of a specific AI model using your system.
- presets Show all the available model presets within Cortex.
- telemetry Retrieve telemetry logs for monitoring and analysis.
- embeddings Creates an embedding vector representing the input text.
- engines Subcommands for managing engines.
- engines get Get an engine details.
- engines list Get all the available Cortex engines.
- engines init Setup and download the required dependencies to run cortex engines.
- configs Subcommands for managing configurations.
- configs get Get a configuration details.
- configs list Get all the available configurations.
- configs set Set a configuration.
-```
+> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available stateful endpoints.
## Contact Support
- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose).
From 3cca455528e8511d0d230b0136d52d0a3254d1d8 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 5 Sep 2024 14:16:39 +0700
Subject: [PATCH 02/21] Update the Model library table
---
platform/README.md | 74 ++++++++++++++++++++++++----------------------
1 file changed, 38 insertions(+), 36 deletions(-)
diff --git a/platform/README.md b/platform/README.md
index 10a191e8e..8c3c04d33 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -39,47 +39,49 @@ Cortex Platform supports the following engines:
**Coming Soon!**
## Model Library
-Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
+Cortex Platform supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
 Here are examples of models that you can use based on each supported engine:
### `llama.cpp`
-| Model ID | Variant (Branch) | Model size | CLI command |
-|------------------|------------------|-------------------|------------------------------------|
-| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` |
-| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` |
-| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` |
-| llama3 | gguf | 8B | `cortex run llama3:gguf` |
-| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` |
-| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` |
-| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` |
-| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`|
-| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` |
-| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` |
-| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` |
-| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` |
+| Model ID | Variant (Branch) | Model size |
+|------------------|------------------|-------------------|
+| codestral | 22b-gguf | 22B |
+| command-r | 35b-gguf | 35B |
+| gemma | 7b-gguf | 7B |
+| llama3 | gguf | 8B |
+| llama3.1 | gguf | 8B |
+| mistral | 7b-gguf | 7B |
+| mixtral | 7x8b-gguf | 46.7B |
+| openhermes-2.5 | 7b-gguf | 7B |
+| phi3 | medium-gguf | 14B - 4k ctx len |
+| phi3 | mini-gguf | 3.82B - 4k ctx len|
+| qwen2 | 7b-gguf | 7B |
+| tinyllama | 1b-gguf | 1.1B |
+
### `ONNX`
-| Model ID | Variant (Branch) | Model size | CLI command |
-|------------------|------------------|-------------------|------------------------------------|
-| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` |
-| llama3 | onnx | 8B | `cortex run llama3:onnx` |
-| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` |
-| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`|
-| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` |
-| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` |
+| Model ID | Variant (Branch) | Model size |
+|------------------|------------------|-------------------|
+| gemma | 7b-onnx | 7B |
+| llama3 | onnx | 8B |
+| mistral | 7b-onnx | 7B |
+| openhermes-2.5 | 7b-onnx | 7B |
+| phi3 | mini-onnx | 3.82B - 4k ctx len|
+| phi3 | medium-onnx | 14B - 4k ctx len |
+
### `TensorRT-LLM`
-| Model ID | Variant (Branch) | Model size | CLI command |
-|------------------|-------------------------------|-------------------|------------------------------------|
-| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` |
-| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` |
-| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada`|
-| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` |
-| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`|
-| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` |
-| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`|
-| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` |
-| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`|
-| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`|
-| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`|
+| Model ID | Variant (Branch) | Model size |
+|------------------|-------------------------------|-------------------|
+| llama3 | 8b-tensorrt-llm-windows-ampere | 8B |
+| llama3 | 8b-tensorrt-llm-linux-ampere | 8B |
+| llama3 | 8b-tensorrt-llm-linux-ada | 8B |
+| llama3 | 8b-tensorrt-llm-windows-ada | 8B |
+| mistral | 7b-tensorrt-llm-linux-ampere | 7B |
+| mistral | 7b-tensorrt-llm-windows-ampere | 7B |
+| mistral | 7b-tensorrt-llm-linux-ada | 7B |
+| mistral | 7b-tensorrt-llm-windows-ada | 7B |
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B |
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B |
+| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B |
> **Note**:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
From 25527451fe10bac0718d7a85de0ea083ae273ed4 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 5 Sep 2024 14:18:23 +0700
Subject: [PATCH 03/21] nits
---
platform/README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/platform/README.md b/platform/README.md
index 8c3c04d33..635dc3b18 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -11,7 +11,7 @@
> ⚠️ **Cortex Platform is Coming Soon!**
## About
-Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through an API.
+Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints.
## Cortex Engines
Cortex Platform supports the following engines:
From 0347b7023a9a5a9df1b54d35f7a1c53dc7eeea05 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 5 Sep 2024 15:08:51 +0700
Subject: [PATCH 04/21] nits
---
platform/README.md | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/platform/README.md b/platform/README.md
index 635dc3b18..37242f907 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -8,7 +8,7 @@
- Changelog - Bug reports - Discord
-> ⚠️ **Cortex Platform is Coming Soon!**
+> ⚠️ **Cortex Platform is under development**
## About
Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints.
@@ -87,8 +87,13 @@ Here are example of models that you can use based on each supported engine:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
 ## Cortex Platform API
-Cortex Platform provides a stateful API that runs at `localhost:1337`.
+Cortex Platform only supports the following stateful API endpoints:
+- Messages
+- Threads
+- Assistants
+
+Here are some examples of the available stateful endpoints:
### Create Message
```bash
curl --request POST \
From 2a6adbf71de0005b40cfbae02555af6d7ba51066 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Fri, 6 Sep 2024 08:07:54 +0700
Subject: [PATCH 05/21] Added the installation based on discord
---
README.md | 4 ++--
platform/README.md | 20 ++++++++++++++++----
2 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/README.md b/README.md
index 33a1c0e11..f2d80a258 100644
--- a/README.md
+++ b/README.md
@@ -36,8 +36,8 @@ sudo apt install cortex-engine
**Coming Soon!**
### Libraries
-- [cortex.js](https://github.com/janhq/cortex.js)
-- [cortex.py](https://github.com/janhq/cortex-python)
+- [cortex.cpp.js](https://github.com/janhq/cortex.js)
+- [cortex.cpp.py](https://github.com/janhq/cortex-python)
### Build from Source
diff --git a/platform/README.md b/platform/README.md
index 37242f907..c0949d9dd 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -20,17 +20,29 @@ Cortex Platform supports the following engines:
- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference.
## Installation
+
+> **Note**:
+> To install the Cortex Platform, clone our [repository](). It includes everything you need for installation using Docker and Helm.
+
### Docker
-**Coming Soon!**
+```bash
+docker compose up
+```
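+
+Once the containers are up, the API should answer locally. An illustrative check, assuming the OpenAI-compatible model-list route is exposed:
+
+```bash
+curl http://localhost:1337/v1/models
+```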
### Helm
-**Coming Soon!**
+```bash
+helm install cortex-platform ./charts/cortex-platform  # chart path is illustrative; the chart ships in the cloned repository
+```
### Yarn
-**Coming Soon!**
+```bash
+yarn global add cortex-platform  # assumes the cortex-platform package is published to npm
+```
### Libraries
-**Coming Soon!**
+- [cortex.js]()
+- [cortex.py]()
+
### Build from Source
**Coming Soon!**
From af7035ffb4c2946cf83d06cfc76fdb4f01c8643c Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Fri, 6 Sep 2024 08:10:08 +0700
Subject: [PATCH 06/21] nits
---
platform/README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/platform/README.md b/platform/README.md
index c0949d9dd..3a7569f9b 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -8,7 +8,7 @@
- Changelog - Bug reports - Discord
-> ⚠️ **Cortex Platform is under development**
+> ⚠️ **Cortex Platform is under development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.**
## About
Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints.
From 6be68492d9b99454721ee5a704553896d44b64a2 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Fri, 6 Sep 2024 09:05:12 +0700
Subject: [PATCH 07/21] cortex-engine->cortex.cpp
---
README.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/README.md b/README.md
index f2d80a258..f1c215842 100644
--- a/README.md
+++ b/README.md
@@ -22,15 +22,15 @@ Cortex supports the following engines:
## Installation
### MacOs
```bash
-brew install cortex-engine
+brew install cortex.cpp
```
### Windows
```bash
-winget install cortex-engine
+winget install cortex.cpp
```
### Linux
```bash
-sudo apt install cortex-engine
+sudo apt install cortex.cpp
```
### Docker
**Coming Soon!**
From 3b35dd9b33d8d298c7aedb4e86a1f0f2b913790f Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Fri, 6 Sep 2024 10:30:43 +0700
Subject: [PATCH 08/21] use the current banner instead
---
platform/README.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/platform/README.md b/platform/README.md
index 3a7569f9b..e3c63dfa4 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -1,7 +1,7 @@
# Cortex
-
+
Documentation - API Reference
From da195cc93205b8258e6d8e99db7308267a8edda7 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 13:52:40 +0700
Subject: [PATCH 09/21] Simplify and update the cortex.cpp readme
---
README.md | 105 +++++------------
platform/README.md | 282 +++++++++++++++++++++++++--------------------
2 files changed, 183 insertions(+), 204 deletions(-)
diff --git a/README.md b/README.md
index f1c215842..77b8605f1 100644
--- a/README.md
+++ b/README.md
@@ -8,16 +8,15 @@
- Changelog - Bug reports - Discord
-> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!
+> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.**
## About
-Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. It supports running AI models using `ONNX`, `TensorRT-LLM`, and `llama.cpp` engines. Cortex can function as a standalone server or be integrated as a library.
+Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library.
-## Cortex Engines
Cortex supports the following engines:
-- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU.
-- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs.
-- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference.
+- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
+- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx)
+- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
## Installation
### MacOs
@@ -36,8 +35,8 @@ sudo apt install cortex.cpp
**Coming Soon!**
### Libraries
-- [cortex.cpp.js](https://github.com/janhq/cortex.js)
-- [cortex.cpp.py](https://github.com/janhq/cortex-python)
+- [cortex.js](https://github.com/janhq/cortex.js)
+- [cortex.py](https://github.com/janhq/cortex-python)
### Build from Source
@@ -72,9 +71,6 @@ cortex
# Start a model
cortex run [model_id]
-
-# Chat with a model
-cortex chat [model_id]
```
## Model Library
Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
@@ -123,73 +119,30 @@ Here are example of models that you can use based on each supported engine:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
## Cortex CLI Commands
+
+| Command Description | Command Example |
+|------------------------------------|---------------------------------------------------------------------|
+| **Start Cortex Server** | `cortex` |
+| **Chat with a Model** | `cortex chat [options] [model_id] [message]` |
+| **Embeddings** | `cortex embeddings [options] [model_id] [message]` |
+| **Pull a Model**                   | `cortex pull <model_id>`                                              |
+| **Download and Start a Model**     | `cortex run [options] [model_id]:[engine]`                            |
+| **Get Model Details**              | `cortex models get <model_id>`                                        |
+| **List Models**                    | `cortex models list [options]`                                        |
+| **Delete a Model**                 | `cortex models delete <model_id>`                                     |
+| **Start a Model**                  | `cortex models start [model_id]`                                      |
+| **Stop a Model**                   | `cortex models stop <model_id>`                                       |
+| **Update a Model**                 | `cortex models update [options] <model_id>`                           |
+| **Get Engine Details**             | `cortex engines get <engine_name>`                                    |
+| **Install an Engine**              | `cortex engines install [options]`                                    |
+| **List Engines**                   | `cortex engines list [options]`                                       |
+| **Uninstall an Engine**            | `cortex engines uninstall [options]`                                  |
+| **Show Model Information** | `cortex ps` |
+| **Update Cortex** | `cortex update [options]` |
+
> **Note**:
> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli).
-### Start Cortex Server
-```bash
-cortex
-```
-### Chat with a Model
-```bash
-cortex chat [options] [model_id] [message]
-```
-### Embeddings
-```bash
-cortex embeddings [options] [model_id] [message]
-```
-### Pull a Model
-```bash
-cortex pull
-```
-> This command can also pulls Hugging Face's models.
-### Download and Start a Model
-```bash
-cortex run [options] [model_id]:[engine]
-```
-### Get a Model Details
-```bash
-cortex models get
-```
-### List Models
-```bash
-cortex models list [options]
-```
-### Remove a Model
-```bash
-cortex models remove
-```
-### Start a Model
-```bash
-cortex models start [model_id]
-```
-### Stop a Model
-```bash
-cortex models stop
-```
-### Update a Model Config
-```bash
-cortex models update [options]
-```
-### Get an Engine Details
-```bash
-cortex engines get
-```
-### Install an Engine
-```bash
-cortex engines install [options]
-```
-### List Engines
-```bash
-cortex engines list [options]
-```
-### Set an Engine Config
-```bash
-cortex engines set
-```
-### Show Model Information
-```bash
-cortex ps
-```
+
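+A typical first session chains these commands (the model ID is illustrative):
+
+```bash
+cortex pull mistral   # download the model
+cortex run mistral    # start it and chat
+cortex ps             # check what is running
+```
+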
## REST API
Cortex has a REST API that runs at `localhost:1337`.
diff --git a/platform/README.md b/platform/README.md
index e3c63dfa4..77b8605f1 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -4,184 +4,210 @@
- Documentation - API Reference
+ Documentation - API Reference
- Changelog - Bug reports - Discord
-> ⚠️ **Cortex Platform is under development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.**
+> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.**
## About
-Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints.
+Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library.
-## Cortex Engines
-Cortex Platform supports the following engines:
-- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU.
-- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs.
-- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference.
+Cortex supports the following engines:
+- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
+- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx)
+- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
## Installation
-
-> **Note**:
-> To install the Cortex Platform, clone our [repository](). It includes everything you need for installation using Docker and Helm.
-
-### Docker
+### macOS
+```bash
+brew install cortex.cpp
+```
+### Windows
```bash
-docker compose up
+winget install cortex.cpp
```
+### Linux
+```bash
+sudo apt install cortex.cpp
+```
+### Docker
+**Coming Soon!**
+
+### Libraries
+- [cortex.js](https://github.com/janhq/cortex.js)
+- [cortex.py](https://github.com/janhq/cortex-python)
+
+### Build from Source
+
+To install Cortex from source, follow the steps below:
+
+1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev).
+2. Navigate to the `platform` folder.
+3. Open the terminal and run the following command to build the Cortex project:
-### Helm
```bash
-helm install cortex-platform ./charts/cortex-platform  # chart path is illustrative; the chart ships in the cloned repository
+npx nest build
```
-### Yarn
+4. Make the `command.js` executable:
+
```bash
-yarn global add cortex-platform  # assumes the cortex-platform package is published to npm
+chmod +x '[path-to]/cortex/platform/dist/src/command.js'
```
-### Libraries
-- [cortex.js]()
-- [cortex.py]()
+5. Link the package globally:
+```bash
+npm link
+```
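+
+If the link succeeded, the CLI should now resolve globally (a quick sanity check; the standard `--help` flag is assumed):
+
+```bash
+cortex --help
+```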
-### Build from Source
-**Coming Soon!**
## Quickstart
-**Coming Soon!**
+To run and chat with a model in Cortex:
+```bash
+# Start the Cortex server
+cortex
+# Start a model
+cortex run [model_id]
+```
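+
+For instance, picking a concrete model from the library below:
+
+```bash
+# Downloads on first use, then starts the model
+cortex run mistral:7b-gguf
+```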
## Model Library
-Cortex Platform supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
+Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
 Here are examples of models that you can use based on each supported engine:
### `llama.cpp`
-| Model ID | Variant (Branch) | Model size |
-|------------------|------------------|-------------------|
-| codestral | 22b-gguf | 22B |
-| command-r | 35b-gguf | 35B |
-| gemma | 7b-gguf | 7B |
-| llama3 | gguf | 8B |
-| llama3.1 | gguf | 8B |
-| mistral | 7b-gguf | 7B |
-| mixtral | 7x8b-gguf | 46.7B |
-| openhermes-2.5 | 7b-gguf | 7B |
-| phi3 | medium-gguf | 14B - 4k ctx len |
-| phi3 | mini-gguf | 3.82B - 4k ctx len|
-| qwen2 | 7b-gguf | 7B |
-| tinyllama | 1b-gguf | 1.1B |
-
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|------------------|-------------------|------------------------------------|
+| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` |
+| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` |
+| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` |
+| llama3 | gguf | 8B | `cortex run llama3:gguf` |
+| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` |
+| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` |
+| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` |
+| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`|
+| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` |
+| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` |
+| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` |
+| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` |
### `ONNX`
-| Model ID | Variant (Branch) | Model size |
-|------------------|------------------|-------------------|
-| gemma | 7b-onnx | 7B |
-| llama3 | onnx | 8B |
-| mistral | 7b-onnx | 7B |
-| openhermes-2.5 | 7b-onnx | 7B |
-| phi3 | mini-onnx | 3.82B - 4k ctx len|
-| phi3 | medium-onnx | 14B - 4k ctx len |
-
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|------------------|-------------------|------------------------------------|
+| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` |
+| llama3 | onnx | 8B | `cortex run llama3:onnx` |
+| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` |
+| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`|
+| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` |
+| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` |
### `TensorRT-LLM`
-| Model ID | Variant (Branch) | Model size |
-|------------------|-------------------------------|-------------------|
-| llama3 | 8b-tensorrt-llm-windows-ampere | 8B |
-| llama3 | 8b-tensorrt-llm-linux-ampere | 8B |
-| llama3 | 8b-tensorrt-llm-linux-ada | 8B |
-| llama3 | 8b-tensorrt-llm-windows-ada | 8B |
-| mistral | 7b-tensorrt-llm-linux-ampere | 7B |
-| mistral | 7b-tensorrt-llm-windows-ampere | 7B |
-| mistral | 7b-tensorrt-llm-linux-ada | 7B |
-| mistral | 7b-tensorrt-llm-windows-ada | 7B |
-| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B |
-| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B |
-| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B |
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|-------------------------------|-------------------|------------------------------------|
+| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` |
+| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` |
+| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada`|
+| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` |
+| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`|
+| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` |
+| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`|
+| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` |
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`|
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`|
+| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`|
> **Note**:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
-## Cortex Platform API
-Cortex Platform only supports the following stateful API endpoints:
+## Cortex CLI Commands
+
+| Command Description | Command Example |
+|------------------------------------|---------------------------------------------------------------------|
+| **Start Cortex Server** | `cortex` |
+| **Chat with a Model** | `cortex chat [options] [model_id] [message]` |
+| **Embeddings** | `cortex embeddings [options] [model_id] [message]` |
+| **Pull a Model**                   | `cortex pull <model_id>`                                              |
+| **Download and Start a Model**     | `cortex run [options] [model_id]:[engine]`                            |
+| **Get Model Details**              | `cortex models get <model_id>`                                        |
+| **List Models**                    | `cortex models list [options]`                                        |
+| **Delete a Model**                 | `cortex models delete <model_id>`                                     |
+| **Start a Model**                  | `cortex models start [model_id]`                                      |
+| **Stop a Model**                   | `cortex models stop <model_id>`                                       |
+| **Update a Model**                 | `cortex models update [options] <model_id>`                           |
+| **Get Engine Details**             | `cortex engines get <engine_name>`                                    |
+| **Install an Engine**              | `cortex engines install [options]`                                    |
+| **List Engines**                   | `cortex engines list [options]`                                       |
+| **Uninstall an Engine**            | `cortex engines uninstall [options]`                                  |
+| **Show Model Information** | `cortex ps` |
+| **Update Cortex** | `cortex update [options]` |
-- Messages
-- Threads
-- Assistants
+> **Note**:
+> For more detailed CLI reference documentation, please see [here](https://cortex.so/docs/cli).
+
+## REST API
+Cortex has a REST API that runs at `localhost:1337`.
-Here are some examples of the available stateful endpoints:
-### Create Message
+### Pull a Model
```bash
curl --request POST \
- --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages \
- --header 'Content-Type: application/json' \
- --data '{
- "role": "user",
- "content": "Tell me a joke"
-}'
+ --url http://localhost:1337/v1/models/{model_id}/pull
```
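+
+For example, pulling the `mistral:7b-gguf` variant from the Model Library above (the model ID is illustrative):
+
+```bash
+curl --request POST \
+  --url http://localhost:1337/v1/models/mistral:7b-gguf/pull
+```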
-### Create Assistant
+### Start a Model
```bash
curl --request POST \
- --url http://127.0.0.1:1337/v1/assistants \
+ --url http://localhost:1337/v1/models/{model_id}/start \
--header 'Content-Type: application/json' \
--data '{
- "id": "jan",
- "avatar": "",
- "name": "Jan",
- "description": "A default assistant that can use all downloaded models",
- "model": "",
- "instructions": "",
- "tools": [],
- "metadata": {},
- "top_p": "0.7",
- "temperature": "0.7"
+ "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant",
+ "stop": [],
+ "ngl": 4096,
+ "ctx_len": 4096,
+ "cpu_threads": 10,
+ "n_batch": 2048,
+ "caching_enabled": true,
+ "grp_attn_n": 1,
+ "grp_attn_w": 512,
+ "mlock": false,
+ "flash_attn": true,
+ "cache_type": "f16",
+ "use_mmap": true,
+ "engine": "cortex.llamacpp"
}'
```
-### Create Thread
+### Chat with a Model
```bash
-curl --request POST \
- --url http://127.0.0.1:1337/v1/threads \
- --header 'Content-Type: application/json' \
- --data '{
- "assistants": [
+curl http://localhost:1337/v1/chat/completions \
+-H "Content-Type: application/json" \
+-d '{
+ "model": "",
+ "messages": [
{
- "id": "thread_123",
- "avatar": "https://example.com/avatar.png",
- "name": "Virtual Helper",
- "model": "mistral",
- "instructions": "Assist with customer queries and provide information based on the company database.",
- "tools": [
- {
- "name": "Knowledge Retrieval",
- "settings": {
- "source": "internal",
- "endpoint": "https://api.example.com/knowledge"
- }
- }
- ],
- "description": "This assistant helps with customer support by retrieving relevant information.",
- "metadata": {
- "department": "support",
- "version": "1.0"
- },
- "object": "assistant",
- "temperature": 0.7,
- "top_p": 0.9,
- "created_at": 1622470423,
- "response_format": {
- "format": "json"
- },
- "tool_resources": {
- "resources": [
- "database1",
- "database2"
- ]
- }
- }
- ]
+ "role": "user",
+ "content": "Hello"
+ }
+ ],
+ "stream": true,
+ "max_tokens": 1,
+ "stop": [
+ null
+ ],
+ "frequency_penalty": 1,
+ "presence_penalty": 1,
+ "temperature": 1,
+ "top_p": 1
}'
```
-> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available stateful endpoints.
+### Stop a Model
+```bash
+curl --request POST \
+ --url http://localhost:1337/v1/models/mistral/stop
+```
+
+
+> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.
## Contact Support
- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose).
From c4863ff28463850d74141493cb0129327471cb4b Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 14:49:50 +0700
Subject: [PATCH 10/21] Update the CortexCPP readme
---
README.md | 61 +++++++++++++++++++++++++------------------------------
1 file changed, 28 insertions(+), 33 deletions(-)
diff --git a/README.md b/README.md
index 77b8605f1..279927750 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# Cortex
+# CortexCPP
@@ -8,31 +8,22 @@
- Changelog - Bug reports - Discord
-> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.**
+> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.**
## About
-Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library.
+CortexCPP is a C++ AI engine featuring a Docker-like command-line interface and client libraries. It can run as a standalone server or be embedded as a library, allowing you to run AI locally on your computer.
-Cortex supports the following engines:
+CortexCPP supports the following engines:
- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx)
- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
## Installation
-### MacOs
-```bash
-brew install cortex.cpp
-```
-### Windows
-```bash
-winget install cortex.cpp
-```
-### Linux
-```bash
-sudo apt install cortex.cpp
-```
-### Docker
-**Coming Soon!**
+To install CortexCPP, download the installer for your operating system from the following options:
+- Stable Version
+- Beta Version
+- Nightly Version
+
### Libraries
- [cortex.js](https://github.com/janhq/cortex.js)
@@ -40,20 +31,24 @@ sudo apt install cortex.cpp
### Build from Source
-To install Cortex from the source, follow the steps below:
+To install CortexCPP from source, follow the steps below:
-1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev).
-2. Navigate to the `platform` folder.
-3. Open the terminal and run the following command to build the Cortex project:
+1. Clone the CortexCPP repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:
```bash
-npx nest build
+cd vcpkg
+./bootstrap-vcpkg.bat
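+# On Linux/macOS, run ./bootstrap-vcpkg.sh instead of the .bat script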
+vcpkg install
```
-
-4. Make the `command.js` executable:
+4. Use Visual Studio with the C++ development kit to build the project using the files generated in the `vcpkg` folder.
+5. Build CortexCPP inside the `engine` folder:
```bash
-chmod +x '[path-to]/cortex/platform/dist/src/command.js'
+mkdir build
+cd build
+cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
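+# Alternative to opening the generated files in Visual Studio: build directly
+cmake --build . --config Release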
```
5. Link the package globally:
@@ -64,16 +59,16 @@ npm link
## Quickstart
-To run and chat with a model in Cortex:
+To run and chat with a model in CortexCPP:
```bash
-# Start the Cortex server
+# Start the CortexCPP server
cortex
# Start a model
cortex run [model_id]
```
## Model Library
-Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
+CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
Here are example of models that you can use based on each supported engine:
### `llama.cpp`
@@ -118,11 +113,11 @@ Here are example of models that you can use based on each supported engine:
> **Note**:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
-## Cortex CLI Commands
+## CortexCPP CLI Commands
| Command Description | Command Example |
|------------------------------------|---------------------------------------------------------------------|
-| **Start Cortex Server** | `cortex` |
+| **Start CortexCPP Server** | `cortex` |
| **Chat with a Model** | `cortex chat [options] [model_id] [message]` |
| **Embeddings** | `cortex embeddings [options] [model_id] [message]` |
 | **Pull a Model**                   | `cortex pull <model_id>`                                              |
@@ -138,13 +133,13 @@ Here are example of models that you can use based on each supported engine:
| **List Engines** | `cortex engines list [options]` |
 | **Uninstall an Engine**            | `cortex engines uninstall [options]`                                  |
| **Show Model Information** | `cortex ps` |
-| **Update Cortex** | `cortex update [options]` |
+| **Update CortexCPP** | `cortex update [options]` |
> **Note**:
> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli).
## REST API
-Cortex has a REST API that runs at `localhost:1337`.
+CortexCPP has a REST API that runs at `localhost:3928`.
### Pull a Model
```bash
From 8671572ba5a8016304344fdcc39d8a77986daa6b Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 14:55:42 +0700
Subject: [PATCH 11/21] Update Overview and Remove Cortex Platform readme
---
README.md | 4 +-
platform/README.md | 217 ---------------------------------------------
2 files changed, 2 insertions(+), 219 deletions(-)
delete mode 100644 platform/README.md
diff --git a/README.md b/README.md
index 279927750..995ad62ce 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@
> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.**
## About
-CortexCPP is a C++ AI engine featuring a Docker-like command-line interface and client libraries. It can run as a standalone server or be embedded as a library, allowing you to run AI locally on your computer.
+CortexCPP is a local AI engine used to run and customize LLMs. CortexCPP can be deployed as a standalone server or integrated into apps like [Jan.ai](https://jan.ai/).
CortexCPP supports the following engines:
- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
@@ -67,7 +67,7 @@ cortex
# Start a model
cortex run [model_id]
```
-## Model Library
+## Built-in Model Library
CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
Here are example of models that you can use based on each supported engine:
diff --git a/platform/README.md b/platform/README.md
deleted file mode 100644
index 77b8605f1..000000000
--- a/platform/README.md
+++ /dev/null
@@ -1,217 +0,0 @@
-# Cortex
-
-
-
-
-
- Documentation - API Reference
- - Changelog - Bug reports - Discord
-
-
-> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.**
-
-## About
-Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library.
-
-Cortex supports the following engines:
-- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
-- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx)
-- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
-
-## Installation
-### macOS
-```bash
-brew install cortex.cpp
-```
-### Windows
-```bash
-winget install cortex.cpp
-```
-### Linux
-```bash
-sudo apt install cortex.cpp
-```
-### Docker
-**Coming Soon!**
-
-### Libraries
-- [cortex.js](https://github.com/janhq/cortex.js)
-- [cortex.py](https://github.com/janhq/cortex-python)
-
-### Build from Source
-
-To install Cortex from source, follow the steps below:
-
-1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev).
-2. Navigate to the `platform` folder.
-3. Open the terminal and run the following command to build the Cortex project:
-
-```bash
-npx nest build
-```
-
-4. Make the `command.js` executable:
-
-```bash
-chmod +x '[path-to]/cortex/platform/dist/src/command.js'
-```
-
-5. Link the package globally:
-
-```bash
-npm link
-```
-
-
-## Quickstart
-To run and chat with a model in Cortex:
-```bash
-# Start the Cortex server
-cortex
-
-# Start a model
-cortex run [model_id]
-```
-## Model Library
-Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
-
-Here are examples of models that you can use based on each supported engine:
-### `llama.cpp`
-| Model ID | Variant (Branch) | Model size | CLI command |
-|------------------|------------------|-------------------|------------------------------------|
-| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` |
-| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` |
-| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` |
-| llama3 | gguf | 8B | `cortex run llama3:gguf` |
-| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` |
-| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` |
-| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` |
-| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`|
-| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` |
-| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` |
-| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` |
-| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` |
-### `ONNX`
-| Model ID | Variant (Branch) | Model size | CLI command |
-|------------------|------------------|-------------------|------------------------------------|
-| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` |
-| llama3 | onnx | 8B | `cortex run llama3:onnx` |
-| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` |
-| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`|
-| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` |
-| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` |
-### `TensorRT-LLM`
-| Model ID | Variant (Branch) | Model size | CLI command |
-|------------------|-------------------------------|-------------------|------------------------------------|
-| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` |
-| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` |
-| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada`|
-| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` |
-| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`|
-| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` |
-| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`|
-| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` |
-| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`|
-| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`|
-| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`|
-
-> **Note**:
-> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
-
-## Cortex CLI Commands
-
-| Command Description | Command Example |
-|------------------------------------|---------------------------------------------------------------------|
-| **Start Cortex Server** | `cortex` |
-| **Chat with a Model** | `cortex chat [options] [model_id] [message]` |
-| **Embeddings** | `cortex embeddings [options] [model_id] [message]` |
-| **Pull a Model**                   | `cortex pull <model_id>`                                              |
-| **Download and Start a Model**     | `cortex run [options] [model_id]:[engine]`                            |
-| **Get Model Details**              | `cortex models get <model_id>`                                        |
-| **List Models**                    | `cortex models list [options]`                                        |
-| **Delete a Model**                 | `cortex models delete <model_id>`                                     |
-| **Start a Model**                  | `cortex models start [model_id]`                                      |
-| **Stop a Model**                   | `cortex models stop <model_id>`                                       |
-| **Update a Model**                 | `cortex models update [options] <model_id>`                           |
-| **Get Engine Details**             | `cortex engines get <engine_name>`                                    |
-| **Install an Engine**              | `cortex engines install [options]`                                    |
-| **List Engines**                   | `cortex engines list [options]`                                       |
-| **Uninstall an Engine**            | `cortex engines uninstall [options]`                                  |
-| **Show Model Information** | `cortex ps` |
-| **Update Cortex** | `cortex update [options]` |
-
-> **Note**:
-> For more detailed CLI reference documentation, please see [here](https://cortex.so/docs/cli).
-
-## REST API
-Cortex has a REST API that runs at `localhost:1337`.
-
-### Pull a Model
-```bash
-curl --request POST \
- --url http://localhost:1337/v1/models/{model_id}/pull
-```
-
-### Start a Model
-```bash
-curl --request POST \
- --url http://localhost:1337/v1/models/{model_id}/start \
- --header 'Content-Type: application/json' \
- --data '{
- "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant",
- "stop": [],
- "ngl": 4096,
- "ctx_len": 4096,
- "cpu_threads": 10,
- "n_batch": 2048,
- "caching_enabled": true,
- "grp_attn_n": 1,
- "grp_attn_w": 512,
- "mlock": false,
- "flash_attn": true,
- "cache_type": "f16",
- "use_mmap": true,
- "engine": "cortex.llamacpp"
-}'
-```
-
-### Chat with a Model
-```bash
-curl http://localhost:1337/v1/chat/completions \
--H "Content-Type: application/json" \
--d '{
- "model": "",
- "messages": [
- {
- "role": "user",
- "content": "Hello"
- }
- ],
- "stream": true,
- "max_tokens": 1,
- "stop": [
- null
- ],
- "frequency_penalty": 1,
- "presence_penalty": 1,
- "temperature": 1,
- "top_p": 1
-}'
-```
-
-### Stop a Model
-```bash
-curl --request POST \
- --url http://localhost:1337/v1/models/mistral/stop
-```
-
-
-> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.
-
-## Contact Support
-- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose).
-- For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH).
-- For long-form inquiries, please email [hello@jan.ai](mailto:hello@jan.ai).
-
-
From af180e485088099ca2649469cf961d10a42a8cc3 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 14:56:40 +0700
Subject: [PATCH 12/21] nits
---
README.md | 6 ------
1 file changed, 6 deletions(-)
diff --git a/README.md b/README.md
index 995ad62ce..1965ce377 100644
--- a/README.md
+++ b/README.md
@@ -51,12 +51,6 @@ cd build
cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
```
-5. Link the package globally:
-
-```bash
-npm link
-```
-
## Quickstart
To run and chat with a model in CortexCPP:
From f38f6780427394204f48f81666ac6a00f2cd7f4f Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 15:01:27 +0700
Subject: [PATCH 13/21] Separate installer for each version
---
README.md | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/README.md b/README.md
index 1965ce377..5c736716c 100644
--- a/README.md
+++ b/README.md
@@ -21,8 +21,20 @@ CortexCPP supports the following engines:
## Installation
To install CortexCPP, download the installer for your operating system from the following options:
- Stable Version
+ - Windows
+ - Mac
+ - Linux (Debian)
+ - Linux (Fedora)
- Beta Version
+ - Windows
+ - Mac
+ - Linux (Debian)
+ - Linux (Fedora)
- Nightly Version
+ - Windows
+ - Mac
+ - Linux (Debian)
+ - Linux (Fedora)
### Libraries
From 425f641dddcb154cc5599a94ea59ca6e377a1136 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 15:03:47 +0700
Subject: [PATCH 14/21] Update the build from source steps
---
README.md | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/README.md b/README.md
index 5c736716c..e7a034cd8 100644
--- a/README.md
+++ b/README.md
@@ -54,15 +54,14 @@ cd vcpkg
./bootstrap-vcpkg.bat
vcpkg install
```
-4. Use Visual Studio with the C++ development kit to build the project using the files generated in the `vcpkg` folder.
-5. Build CortexCPP inside the `engine` folder:
+4. Build CortexCPP inside the `build` folder:
```bash
mkdir build
cd build
cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
```
-
+5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
## Quickstart
To run and chat with a model in CortexCPP:
From 7bcd47ba3b9a34d7ff902593ad43bc5e9207995c Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 15:07:20 +0700
Subject: [PATCH 15/21] nits
---
README.md | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/README.md b/README.md
index e7a034cd8..7ab60c9d0 100644
--- a/README.md
+++ b/README.md
@@ -20,21 +20,21 @@ CortexCPP supports the following engines:
## Installation
To install CortexCPP, download the installer for your operating system from the following options:
-- Stable Version
- - Windows
- - Mac
- - Linux (Debian)
- - Linux (Fedora)
-- Beta Version
- - Windows
- - Mac
- - Linux (Debian)
- - Linux (Fedora)
-- Nightly Version
- - Windows
- - Mac
- - Linux (Debian)
- - Linux (Fedora)
+- **Stable Version**
+ - [Windows]()
+ - [Mac]()
+ - [Linux (Debian)]()
+ - [Linux (Fedora)]()
+- **Beta Version**
+ - [Windows]()
+ - [Mac]()
+ - [Linux (Debian)]()
+ - [Linux (Fedora)]()
+- **Nightly Version**
+ - [Windows]()
+ - [Mac]()
+ - [Linux (Debian)]()
+ - [Linux (Fedora)]()
### Libraries
From ba71e2bcd94fad62169106ad945c271ebed3b17d Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 15:14:42 +0700
Subject: [PATCH 16/21] nits
---
README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index 7ab60c9d0..612824ef7 100644
--- a/README.md
+++ b/README.md
@@ -15,7 +15,7 @@ CortexCPP is a Local AI engine that is used to run and customize LLMs. CortexCPP
CortexCPP supports the following engines:
- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
-- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx)
+- [`cortex.onnx`](https://github.com/janhq/cortex.onnx)
- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
## Installation
From fb145214116335e79c448d49595f58586b631acd Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Mon, 9 Sep 2024 16:26:19 +0700
Subject: [PATCH 17/21] 1337 -> 3928
---
README.md | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/README.md b/README.md
index 612824ef7..c34655bdd 100644
--- a/README.md
+++ b/README.md
@@ -149,13 +149,13 @@ CortexCPP has a REST API that runs at `localhost:3928`.
### Pull a Model
```bash
curl --request POST \
- --url http://localhost:1337/v1/models/{model_id}/pull
+ --url http://localhost:3928/v1/models/{model_id}/pull
```
### Start a Model
```bash
curl --request POST \
- --url http://localhost:1337/v1/models/{model_id}/start \
+ --url http://localhost:3928/v1/models/{model_id}/start \
--header 'Content-Type: application/json' \
--data '{
"prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant",
@@ -177,7 +177,7 @@ curl --request POST \
### Chat with a Model
```bash
-curl http://localhost:1337/v1/chat/completions \
+curl http://localhost:3928/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "",
@@ -203,7 +203,7 @@ curl http://localhost:1337/v1/chat/completions \
### Stop a Model
```bash
curl --request POST \
- --url http://localhost:1337/v1/models/mistral/stop
+ --url http://localhost:3928/v1/models/mistral/stop
```
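With every endpoint now on port 3928, a quick smoke test is to list models over the same API. The `GET /v1/models` route sketched below is assumed from the API's OpenAI-style conventions rather than documented above.

```bash
# Smoke test on the new port; the /v1/models route is an assumption
curl --request GET \
  --url http://localhost:3928/v1/models
```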
From ea84b2d1ebd5af6aa1f427d53193d48f4b855b2e Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Tue, 10 Sep 2024 11:34:10 +0700
Subject: [PATCH 18/21] CortexCPP -> Cortex.cpp
---
README.md | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/README.md b/README.md
index c34655bdd..be63f2a1d 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# CortexCPP
+# Cortex.cpp
@@ -8,18 +8,18 @@
- Changelog - Bug reports - Discord
-> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.**
+> ⚠️ **Cortex.cpp is currently in Development. This documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.**
## About
-CortexCPP is a Local AI engine that is used to run and customize LLMs. CortexCPP can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
+Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
-CortexCPP supports the following engines:
+Cortex supports the following engines:
- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
- [`cortex.onnx`](https://github.com/janhq/cortex.onnx)
- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
## Installation
-To install CortexCPP, download the installer for your operating system from the following options:
+To install Cortex, download the installer for your operating system from the following options:
- **Stable Version**
- [Windows]()
- [Mac]()
@@ -43,9 +43,9 @@ To install CortexCPP, download the installer for your operating system from the
### Build from Source
-To install CortexCPP from the source, follow the steps below:
+To install Cortex from the source, follow the steps below:
-1. Clone the CortexCPP repository [here](https://github.com/janhq/cortex.cpp).
+1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure the vpkg:
@@ -54,7 +54,7 @@ cd vcpkg
./bootstrap-vcpkg.bat
vcpkg install
```
-4. Build the CortexCPP inside the `build` folder:
+4. Build the Cortex inside the `build` folder:
```bash
mkdir build
@@ -64,16 +64,16 @@ cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcp
5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
## Quickstart
-To run and chat with a model in CortexCPP:
+To run and chat with a model in Cortex:
```bash
-# Start the CortexCPP server
+# Start the Cortex server
cortex
# Start a model
cortex run [model_id]
```
## Built-in Model Library
-CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
+Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
Here are examples of models that you can use based on each supported engine:
### `llama.cpp`
@@ -118,11 +118,11 @@ Here are example of models that you can use based on each supported engine:
> **Note**:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
-## CortexCPP CLI Commands
+## Cortex.cpp CLI Commands
| Command Description | Command Example |
|------------------------------------|---------------------------------------------------------------------|
-| **Start CortexCPP Server** | `cortex` |
+| **Start Cortex Server** | `cortex` |
| **Chat with a Model** | `cortex chat [options] [model_id] [message]` |
| **Embeddings** | `cortex embeddings [options] [model_id] [message]` |
| **Pull a Model**                    | `cortex pull [model_id]`                                              |
@@ -138,13 +138,13 @@ Here are example of models that you can use based on each supported engine:
| **List Engines** | `cortex engines list [options]` |
| **Uninstall an Engine**             | `cortex engines uninstall [options]`                                  |
| **Show Model Information** | `cortex ps` |
-| **Update CortexCPP** | `cortex update [options]` |
+| **Update Cortex** | `cortex update [options]` |
> **Note**:
> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli).
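Read together, the commands in the table support a simple end-to-end session. A sketch, with `mistral` as an illustrative model id:

```bash
# Pull a model, chat with it once, then inspect what is running
cortex pull mistral
cortex chat mistral "Hello, what can you do?"
cortex ps
```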
## REST API
-CortexCPP has a REST API that runs at `localhost:3928`.
+Cortex.cpp has a REST API that runs at `localhost:3928`.
### Pull a Model
```bash
From 01981b9ea2f6f65a1871c93411883ae3c09dbad1 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Tue, 10 Sep 2024 15:42:58 +0700
Subject: [PATCH 19/21] nits
---
README.md | 45 +++++++++++++++++++++++++--------------------
1 file changed, 25 insertions(+), 20 deletions(-)
diff --git a/README.md b/README.md
index be63f2a1d..94e2cbee5 100644
--- a/README.md
+++ b/README.md
@@ -13,28 +13,28 @@
## About
Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
-Cortex supports the following engines:
+Cortex.cpp supports the following engines:
- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
- [`cortex.onnx`](https://github.com/janhq/cortex.onnx)
- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
## Installation
-To install Cortex, download the installer for your operating system from the following options:
+To install Cortex.cpp, download the installer for your operating system from the following options:
- **Stable Version**
- - [Windows]()
- - [Mac]()
- - [Linux (Debian)]()
- - [Linux (Fedora)]()
+ - [Windows](https://github.com/janhq/cortex.cpp/releases)
+ - [Mac](https://github.com/janhq/cortex.cpp/releases)
+ - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+ - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
- **Beta Version**
- - [Windows]()
- - [Mac]()
- - [Linux (Debian)]()
- - [Linux (Fedora)]()
+ - [Windows](https://github.com/janhq/cortex.cpp/releases)
+ - [Mac](https://github.com/janhq/cortex.cpp/releases)
+ - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+ - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
- **Nightly Version**
- - [Windows]()
- - [Mac]()
- - [Linux (Debian)]()
- - [Linux (Fedora)]()
+ - [Windows](https://github.com/janhq/cortex.cpp/releases)
+ - [Mac](https://github.com/janhq/cortex.cpp/releases)
+ - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+ - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
### Libraries
@@ -43,7 +43,7 @@ To install Cortex, download the installer for your operating system from the fol
### Build from Source
-To install Cortex from the source, follow the steps below:
+To install Cortex.cpp from the source, follow the steps below:
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
@@ -54,7 +54,7 @@ cd vcpkg
./bootstrap-vcpkg.bat
vcpkg install
```
-4. Build the Cortex inside the `build` folder:
+4. Build the Cortex.cpp inside the `build` folder:
```bash
mkdir build
@@ -62,11 +62,16 @@ cd build
cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
```
5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
+6. Verify that Cortex.cpp is installed correctly by getting help information.
+```sh
+# Get the help information
+cortex -h
+```
## Quickstart
-To run and chat with a model in Cortex:
+To run and chat with a model in Cortex.cpp:
```bash
-# Start the Cortex server
+# Start the Cortex.cpp server
cortex
# Start a model
@@ -122,7 +127,7 @@ Here are example of models that you can use based on each supported engine:
| Command Description | Command Example |
|------------------------------------|---------------------------------------------------------------------|
-| **Start Cortex Server** | `cortex` |
+| **Start Cortex.cpp Server** | `cortex` |
| **Chat with a Model** | `cortex chat [options] [model_id] [message]` |
| **Embeddings** | `cortex embeddings [options] [model_id] [message]` |
| **Pull a Model**                    | `cortex pull [model_id]`                                              |
@@ -138,7 +143,7 @@ Here are example of models that you can use based on each supported engine:
| **List Engines** | `cortex engines list [options]` |
| **Uninstall an Engine**             | `cortex engines uninstall [options]`                                  |
| **Show Model Information** | `cortex ps` |
-| **Update Cortex** | `cortex update [options]` |
+| **Update Cortex.cpp** | `cortex update [options]` |
> **Note**:
> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli).
From 1d434cf096841197d20dc560834e7d7f899ae529 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Wed, 11 Sep 2024 16:32:57 +0700
Subject: [PATCH 20/21] change the engine names
---
README.md | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/README.md b/README.md
index 94e2cbee5..44bb3ff9d 100644
--- a/README.md
+++ b/README.md
@@ -14,9 +14,9 @@
Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
Cortex.cpp supports the following engines:
-- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
-- [`cortex.onnx`](https://github.com/janhq/cortex.onnx)
-- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
+- [`llamacpp`](https://github.com/janhq/cortex.llamacpp)
+- [`onnx`](https://github.com/janhq/cortex.onnx)
+- [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
## Installation
To install Cortex.cpp, download the installer for your operating system from the following options:
@@ -176,7 +176,7 @@ curl --request POST \
"flash_attn": true,
"cache_type": "f16",
"use_mmap": true,
- "engine": "cortex.llamacpp"
+ "engine": "llamacpp"
}'
```
From a7e1de336dd2dfca98bfb0ae1209ee4925411e73 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 12 Sep 2024 12:43:06 +0700
Subject: [PATCH 21/21] Updated per feedbacks, except the PORT
---
README.md | 247 ++++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 201 insertions(+), 46 deletions(-)
diff --git a/README.md b/README.md
index 44bb3ff9d..84d46d84b 100644
--- a/README.md
+++ b/README.md
@@ -3,6 +3,15 @@
+
+
+
+
+
+
+
+
+
Documentation - API Reference
- Changelog - Bug reports - Discord
@@ -13,61 +22,130 @@
## About
Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
-Cortex.cpp supports the following engines:
+Cortex.cpp is multi-engine: it uses `llama.cpp` by default but also supports the following (a quick check of installed engines is sketched after this list):
- [`llamacpp`](https://github.com/janhq/cortex.llamacpp)
- [`onnx`](https://github.com/janhq/cortex.onnx)
- [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
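A minimal way to check which engines are installed is the `engines list` command from the CLI table; the output format is not documented here, so treat it as an assumption.

```bash
# List installed engines; llamacpp is expected as the default
cortex engines list
```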
## Installation
To install Cortex.cpp, download the installer for your operating system from the following options:
-- **Stable Version**
- - [Windows](https://github.com/janhq/cortex.cpp/releases)
- - [Mac](https://github.com/janhq/cortex.cpp/releases)
- - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
- - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
-- **Beta Version**
- - [Windows](https://github.com/janhq/cortex.cpp/releases)
- - [Mac](https://github.com/janhq/cortex.cpp/releases)
- - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
- - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
-- **Nightly Version**
- - [Windows](https://github.com/janhq/cortex.cpp/releases)
- - [Mac](https://github.com/janhq/cortex.cpp/releases)
- - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
- - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
+
+
+
+> **Note**:
+> You can also build Cortex.cpp from source by following the steps [here](#build-from-source).
### Libraries
- [cortex.js](https://github.com/janhq/cortex.js)
- [cortex.py](https://github.com/janhq/cortex-python)
-### Build from Source
-
-To install Cortex.cpp from the source, follow the steps below:
-
-1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
-2. Navigate to the `engine > vcpkg` folder.
-3. Configure the vpkg:
-
-```bash
-cd vcpkg
-./bootstrap-vcpkg.bat
-vcpkg install
-```
-4. Build the Cortex.cpp inside the `build` folder:
-
-```bash
-mkdir build
-cd build
-cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
-```
-5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
-6. Verify that Cortex.cpp is installed correctly by getting help information.
-
-```sh
-# Get the help information
-cortex -h
-```
## Quickstart
To run and chat with a model in Cortex.cpp:
```bash
@@ -75,7 +153,7 @@ To run and chat with a model in Cortex.cpp:
cortex
# Start a model
-cortex run [model_id]
+cortex run [model_id]:[engine_name]
```
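For example, pairing a Cortex Hub model with an explicit engine tag might look like the line below; `mistral` and `llamacpp` are illustrative, and the exact tag syntax is assumed from the form above.

```bash
# Run the mistral model explicitly on the llamacpp engine (illustrative tag)
cortex run mistral:llamacpp
```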
## Built-in Model Library
Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
@@ -145,7 +223,7 @@ Here are example of models that you can use based on each supported engine:
| **Show Model Information** | `cortex ps` |
| **Update Cortex.cpp** | `cortex update [options]` |
-> **Note**:
+> **Note**
> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli).
## REST API
@@ -211,8 +289,85 @@ curl --request POST \
--url http://localhost:3928/v1/models/mistral/stop
```
+> **Note**
+> Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.
+
+## Build from Source
+
+### Windows
+1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:
+
+```bash
+cd vcpkg
+./bootstrap-vcpkg.bat
+vcpkg install
+```
+4. Build Cortex.cpp inside the `build` folder:
+
+```bash
+mkdir build
+cd build
+cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
+```
+5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
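+Alternatively, CMake can drive the generated Visual Studio solution without opening the IDE; a minimal sketch, assuming the configuration step above succeeded:
+
+```bash
+# Build the generated solution from the command line
+cmake --build . --config Release
+```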
+6. Verify that Cortex.cpp is installed correctly by getting help information.
+
+```sh
+# Get the help information
+cortex -h
+```
+### MacOS
+1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:
-> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.
+```bash
+cd vcpkg
+./bootstrap-vcpkg.sh
+vcpkg install
+```
+4. Build Cortex.cpp inside the `build` folder:
+
+```bash
+mkdir build
+cd build
+cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake
+make -j4
+```
+5. Verify that Cortex.cpp is installed correctly by getting help information.
+
+```sh
+# Get the help information
+cortex -h
+```
+### Linux
+1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:
+
+```bash
+cd vcpkg
+./bootstrap-vcpkg.sh
+vcpkg install
+```
+4. Build Cortex.cpp inside the `build` folder:
+
+```bash
+mkdir build
+cd build
+cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake
+make -j4
+```
+5. Verify that Cortex.cpp is installed correctly by getting help information.
+
+```sh
+# Get the help information
+cortex -h
+```
## Contact Support
- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose).