From ac428aae8a2755a4a1426ee15ea00c6a71a33d47 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 5 Sep 2024 14:04:52 +0700
Subject: [PATCH 01/21] Draft the Platform readme

---
 README.md          |   2 +-
 platform/README.md | 244 +++++++++++++++++++++++++--------------------
 2 files changed, 138 insertions(+), 108 deletions(-)

diff --git a/README.md b/README.md
index 6b21c4448..33a1c0e11 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@

Documentation - API Reference - - Changelog - Bug reports - Discord + - Changelog - Bug reports - Discord

> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!

diff --git a/platform/README.md b/platform/README.md
index 660664159..10a191e8e 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -1,138 +1,168 @@
 # Cortex
-

+

- Documentation - API Reference - - Changelog - Bug reports - Discord + Documentation - API Reference + - Changelog - Bug reports - Discord

-> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs! +> ⚠️ **Cortex Platform is Coming Soon!** ## About -Cortex is an OpenAI-compatible AI engine that developers can use to build LLM apps. It is packaged with a Docker-inspired command-line interface and client libraries. It can be used as a standalone server or imported as a library. +Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through an API. ## Cortex Engines -Cortex supports the following engines: +Cortex Platform supports the following engines: - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU. - [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs. - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference. -## Quicklinks - -- [Homepage](https://cortex.so/) -- [Docs](https://cortex.so/docs/) - -## Quickstart -### Prerequisites -- **OS**: - - MacOSX 13.6 or higher. - - Windows 10 or higher. - - Ubuntu 22.04 and later. -- **Dependencies**: - - **Node.js**: Version 18 and above is required to run the installation. - - **NPM**: Needed to manage packages. - - **CPU Instruction Sets**: Available for download from the [Cortex GitHub Releases](https://github.com/janhq/cortex/releases) page. - - **OpenMPI**: Required for Linux. Install by using the following command: - ```bash - sudo apt install openmpi-bin libopenmpi-dev - ``` - -> Visit [Quickstart](https://cortex.so/docs/quickstart) to get started. - -### NPM -``` bash -# Install using NPM -npm i -g cortexso -# Run model -cortex run mistral -# To uninstall globally using NPM -npm uninstall -g cortexso -``` - -### Homebrew -``` bash -# Install using Brew -brew install cortexso -# Run model -cortex run mistral -# To uninstall using Brew -brew uninstall cortexso -``` -> You can also install Cortex using the Cortex Installer available on [GitHub Releases](https://github.com/janhq/cortex/releases). - -## Cortex Server -```bash -cortex serve - -# Output -# Started server at http://localhost:1337 -# Swagger UI available at http://localhost:1337/api -``` +## Installation +### Docker +**Coming Soon!** -You can now access the Cortex API server at `http://localhost:1337`, -and the Swagger UI at `http://localhost:1337/api`. +### Helm +**Coming Soon!** -## Build from Source +### Yarn +**Coming Soon!** -To install Cortex from the source, follow the steps below: +### Libraries +**Coming Soon!** -1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev). -2. Navigate to the `cortex-js` folder. -3. Open the terminal and run the following command to build the Cortex project: +### Build from Source +**Coming Soon!** +## Quickstart +**Coming Soon!** + +## Model Library +Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). 
+
+Here are examples of models that you can use based on each supported engine:
+### `llama.cpp`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|------------------|-------------------|------------------------------------|
+| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` |
+| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` |
+| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` |
+| llama3 | gguf | 8B | `cortex run llama3:gguf` |
+| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` |
+| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` |
+| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` |
+| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf` |
+| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` |
+| phi3 | mini-gguf | 3.82B - 4k ctx len | `cortex run phi3:mini-gguf` |
+| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` |
+| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` |
+### `ONNX`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|------------------|-------------------|------------------------------------|
+| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` |
+| llama3 | onnx | 8B | `cortex run llama3:onnx` |
+| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` |
+| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx` |
+| phi3 | mini-onnx | 3.82B - 4k ctx len | `cortex run phi3:mini-onnx` |
+| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` |
+### `TensorRT-LLM`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|-------------------------------|-------------------|------------------------------------|
+| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` |
+| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` |
+| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada` |
+| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` |
+| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere` |
+| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` |
+| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada` |
+| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` |
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere` |
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada` |
+| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada` |
+
+> **Note**:
+> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
+
+## Cortex Platform API
+Cortex Platform has a stateful API that runs at `localhost:1337`.
+
+### Create Message
 ```bash
-npx nest build
+curl --request POST \
+  --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "role": "user",
+    "content": "Tell me a joke"
+}'
 ```
-4.
Make the `command.js` executable: - +### Create Assistant ```bash -chmod +x '[path-to]/cortex/cortex-js/dist/src/command.js' +curl --request POST \ + --url http://127.0.0.1:1337/v1/assistants \ + --header 'Content-Type: application/json' \ + --data '{ + "id": "jan", + "avatar": "", + "name": "Jan", + "description": "A default assistant that can use all downloaded models", + "model": "", + "instructions": "", + "tools": [], + "metadata": {}, + "top_p": "0.7", + "temperature": "0.7" +}' ``` -5. Link the package globally: - +### Create Thread ```bash -npm link +curl --request POST \ + --url http://127.0.0.1:1337/v1/threads \ + --header 'Content-Type: application/json' \ + --data '{ + "assistants": [ + { + "id": "thread_123", + "avatar": "https://example.com/avatar.png", + "name": "Virtual Helper", + "model": "mistral", + "instructions": "Assist with customer queries and provide information based on the company database.", + "tools": [ + { + "name": "Knowledge Retrieval", + "settings": { + "source": "internal", + "endpoint": "https://api.example.com/knowledge" + } + } + ], + "description": "This assistant helps with customer support by retrieving relevant information.", + "metadata": { + "department": "support", + "version": "1.0" + }, + "object": "assistant", + "temperature": 0.7, + "top_p": 0.9, + "created_at": 1622470423, + "response_format": { + "format": "json" + }, + "tool_resources": { + "resources": [ + "database1", + "database2" + ] + } + } + ] +}' ``` -## Cortex CLI Commands - -The following CLI commands are currently available. -See [CLI Reference Docs](https://cortex.so/docs/cli) for more information. - -```bash - - serve Providing API endpoint for Cortex backend. - chat Send a chat request to a model. - init|setup Init settings and download cortex's dependencies. - ps Show running models and their status. - kill Kill running cortex processes. - pull|download Download a model. Working with HuggingFace model id. - run [options] EXPERIMENTAL: Shortcut to start a model and chat. - models Subcommands for managing models. - models list List all available models. - models pull Download a specified model. - models remove Delete a specified model. - models get Retrieve the configuration of a specified model. - models start Start a specified model. - models stop Stop a specified model. - models update Update the configuration of a specified model. - benchmark Benchmark and analyze the performance of a specific AI model using your system. - presets Show all the available model presets within Cortex. - telemetry Retrieve telemetry logs for monitoring and analysis. - embeddings Creates an embedding vector representing the input text. - engines Subcommands for managing engines. - engines get Get an engine details. - engines list Get all the available Cortex engines. - engines init Setup and download the required dependencies to run cortex engines. - configs Subcommands for managing configurations. - configs get Get a configuration details. - configs list Get all the available configurations. - configs set Set a configuration. -``` +> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available stateful endpoints. ## Contact Support - For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose). 
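The Create Message, Create Assistant, and Create Thread calls in the patch above only write data; reading a conversation back would use the matching list endpoint. A minimal sketch, assuming the platform mirrors the OpenAI-style `GET /v1/threads/{thread_id}/messages` route — this endpoint is not shown in the patch, so treat it as an assumption and check the API reference for the authoritative list:

```bash
# List the messages stored in a thread (hypothetical endpoint, following
# the OpenAI Assistants convention the stateful API is modeled on).
curl --request GET \
  --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages
```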
From 3cca455528e8511d0d230b0136d52d0a3254d1d8 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Thu, 5 Sep 2024 14:16:39 +0700 Subject: [PATCH 02/21] Update the Model library table --- platform/README.md | 74 ++++++++++++++++++++++++---------------------- 1 file changed, 38 insertions(+), 36 deletions(-) diff --git a/platform/README.md b/platform/README.md index 10a191e8e..8c3c04d33 100644 --- a/platform/README.md +++ b/platform/README.md @@ -39,47 +39,49 @@ Cortex Platform supports the following engines: **Coming Soon!** ## Model Library -Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). +Cortex Platform supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). Here are example of models that you can use based on each supported engine: ### `llama.cpp` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|------------------|-------------------|------------------------------------| -| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` | -| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` | -| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` | -| llama3 | gguf | 8B | `cortex run llama3:gguf` | -| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` | -| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` | -| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` | -| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`| -| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` | -| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` | -| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` | -| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` | +| Model ID | Variant (Branch) | Model size | +|------------------|------------------|-------------------| +| codestral | 22b-gguf | 22B | +| command-r | 35b-gguf | 35B | +| gemma | 7b-gguf | 7B | +| llama3 | gguf | 8B | +| llama3.1 | gguf | 8B | +| mistral | 7b-gguf | 7B | +| mixtral | 7x8b-gguf | 46.7B | +| openhermes-2.5 | 7b-gguf | 7B | +| phi3 | medium-gguf | 14B - 4k ctx len | +| phi3 | mini-gguf | 3.82B - 4k ctx len| +| qwen2 | 7b-gguf | 7B | +| tinyllama | 1b-gguf | 1.1B | + ### `ONNX` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|------------------|-------------------|------------------------------------| -| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` | -| llama3 | onnx | 8B | `cortex run llama3:onnx` | -| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` | -| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`| -| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` | -| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` | +| Model ID | Variant (Branch) | Model size | +|------------------|------------------|-------------------| +| gemma | 7b-onnx | 7B | +| llama3 | onnx | 8B | +| mistral | 7b-onnx | 7B | +| openhermes-2.5 | 7b-onnx | 7B | +| phi3 | mini-onnx | 3.82B - 4k ctx len| +| phi3 | medium-onnx | 14B - 4k ctx len | + ### `TensorRT-LLM` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|-------------------------------|-------------------|------------------------------------| -| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` | -| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` | -| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run 
llama3:8b-tensorrt-llm-linux-ada`| -| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` | -| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`| -| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` | -| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`| -| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` | -| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`| -| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`| -| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`| +| Model ID | Variant (Branch) | Model size | +|------------------|-------------------------------|-------------------| +| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | +| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | +| llama3 | 8b-tensorrt-llm-linux-ada | 8B | +| llama3 | 8b-tensorrt-llm-windows-ada | 8B | +| mistral | 7b-tensorrt-llm-linux-ampere | 7B | +| mistral | 7b-tensorrt-llm-windows-ampere | 7B | +| mistral | 7b-tensorrt-llm-linux-ada | 7B | +| mistral | 7b-tensorrt-llm-windows-ada | 7B | +| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | +| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | +| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | > **Note**: > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. From 25527451fe10bac0718d7a85de0ea083ae273ed4 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Thu, 5 Sep 2024 14:18:23 +0700 Subject: [PATCH 03/21] nits --- platform/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform/README.md b/platform/README.md index 8c3c04d33..635dc3b18 100644 --- a/platform/README.md +++ b/platform/README.md @@ -11,7 +11,7 @@ > ⚠️ **Cortex Platform is Coming Soon!** ## About -Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through an API. +Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints. ## Cortex Engines Cortex Platform supports the following engines: From 0347b7023a9a5a9df1b54d35f7a1c53dc7eeea05 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Thu, 5 Sep 2024 15:08:51 +0700 Subject: [PATCH 04/21] nits --- platform/README.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/platform/README.md b/platform/README.md index 635dc3b18..37242f907 100644 --- a/platform/README.md +++ b/platform/README.md @@ -8,7 +8,7 @@ - Changelog - Bug reports - Discord

-> ⚠️ **Cortex Platform is Coming Soon!**
+> ⚠️ **Cortex Platform is under development**

 ## About
 Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints.

@@ -87,8 +87,13 @@ Here are examples of models that you can use based on each supported engine:
 > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.

 ## Cortex Platform API
-Cortex Platform has a stateful API that runs at `localhost:1337`.
+Cortex Platform only supports the following stateful API endpoints:
+
+- Messages
+- Threads
+- Assistants
+
+Here are some examples of the available stateful endpoints:

 ### Create Message
 ```bash
 curl --request POST \

From 2a6adbf71de0005b40cfbae02555af6d7ba51066 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Fri, 6 Sep 2024 08:07:54 +0700
Subject: [PATCH 05/21] Added the installation based on discord

---
 README.md          |  4 ++--
 platform/README.md | 20 ++++++++++++++++----
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 33a1c0e11..f2d80a258 100644
--- a/README.md
+++ b/README.md
@@ -36,8 +36,8 @@ sudo apt install cortex-engine
 **Coming Soon!**

 ### Libraries
-- [cortex.js](https://github.com/janhq/cortex.js)
-- [cortex.py](https://github.com/janhq/cortex-python)
+- [cortex.cpp.js](https://github.com/janhq/cortex.js)
+- [cortex.cpp.py](https://github.com/janhq/cortex-python)

 ### Build from Source

diff --git a/platform/README.md b/platform/README.md
index 37242f907..c0949d9dd 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -20,17 +20,29 @@ Cortex Platform supports the following engines:
 - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference.

 ## Installation
+
+> **Note**:
+> To install the Cortex Platform, clone our [repository](). It includes everything you need for installation using Docker and Helm.
+
 ### Docker
-**Coming Soon!**
+```bash
+docker compose up
+```

 ### Helm
-**Coming Soon!**
+```bash
+helm install cortex-platform
+```

 ### Yarn
-**Coming Soon!**
+```bash
+yarn global add cortex-platform
+```

 ### Libraries
-**Coming Soon!**
+- [cortex.js]()
+- [cortex.py]()
+
 ### Build from Source
 **Coming Soon!**

From af7035ffb4c2946cf83d06cfc76fdb4f01c8643c Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Fri, 6 Sep 2024 08:10:08 +0700
Subject: [PATCH 06/21] nits

---
 platform/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/platform/README.md b/platform/README.md
index c0949d9dd..3a7569f9b 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -8,7 +8,7 @@
     - Changelog - Bug reports - Discord

-> ⚠️ **Cortex Platform is under development** +> ⚠️ **Cortex Platform is under development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** ## About Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints. From 6be68492d9b99454721ee5a704553896d44b64a2 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Fri, 6 Sep 2024 09:05:12 +0700 Subject: [PATCH 07/21] cortex-engine->cortex.cpp --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index f2d80a258..f1c215842 100644 --- a/README.md +++ b/README.md @@ -22,15 +22,15 @@ Cortex supports the following engines: ## Installation ### MacOs ```bash -brew install cortex-engine +brew install cortex.cpp ``` ### Windows ```bash -winget install cortex-engine +winget install cortex.cpp ``` ### Linux ```bash -sudo apt install cortex-engine +sudo apt install cortex.cpp ``` ### Docker **Coming Soon!** From 3b35dd9b33d8d298c7aedb4e86a1f0f2b913790f Mon Sep 17 00:00:00 2001 From: irfanpena Date: Fri, 6 Sep 2024 10:30:43 +0700 Subject: [PATCH 08/21] use the current banner instead --- platform/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/platform/README.md b/platform/README.md index 3a7569f9b..e3c63dfa4 100644 --- a/platform/README.md +++ b/platform/README.md @@ -1,7 +1,7 @@ # Cortex - +

Documentation - API Reference From da195cc93205b8258e6d8e99db7308267a8edda7 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 13:52:40 +0700 Subject: [PATCH 09/21] Simplify and update the cortex.cpp readme --- README.md | 105 +++++------------ platform/README.md | 282 +++++++++++++++++++++++++-------------------- 2 files changed, 183 insertions(+), 204 deletions(-) diff --git a/README.md b/README.md index f1c215842..77b8605f1 100644 --- a/README.md +++ b/README.md @@ -8,16 +8,15 @@ - Changelog - Bug reports - Discord

-> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs! +> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** ## About -Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. It supports running AI models using `ONNX`, `TensorRT-LLM`, and `llama.cpp` engines. Cortex can function as a standalone server or be integrated as a library. +Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library. -## Cortex Engines Cortex supports the following engines: -- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU. -- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs. -- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference. +- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) +- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx) +- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation ### MacOs @@ -36,8 +35,8 @@ sudo apt install cortex.cpp **Coming Soon!** ### Libraries -- [cortex.cpp.js](https://github.com/janhq/cortex.js) -- [cortex.cpp.py](https://github.com/janhq/cortex-python) +- [cortex.js](https://github.com/janhq/cortex.js) +- [cortex.py](https://github.com/janhq/cortex-python) ### Build from Source @@ -72,9 +71,6 @@ cortex # Start a model cortex run [model_id] - -# Chat with a model -cortex chat [model_id] ``` ## Model Library Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). @@ -123,73 +119,30 @@ Here are example of models that you can use based on each supported engine: > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. 
## Cortex CLI Commands
+
+| Command Description | Command Example |
+|------------------------------------|---------------------------------------------------------------------|
+| **Start Cortex Server** | `cortex` |
+| **Chat with a Model** | `cortex chat [options] [model_id] [message]` |
+| **Embeddings** | `cortex embeddings [options] [model_id] [message]` |
+| **Pull a Model** | `cortex pull <model_id>` |
+| **Download and Start a Model** | `cortex run [options] [model_id]:[engine]` |
+| **Get Model Details** | `cortex models get <model_id>` |
+| **List Models** | `cortex models list [options]` |
+| **Delete a Model** | `cortex models delete <model_id>` |
+| **Start a Model** | `cortex models start [model_id]` |
+| **Stop a Model** | `cortex models stop <model_id>` |
+| **Update a Model** | `cortex models update [options] <model_id>` |
+| **Get Engine Details** | `cortex engines get <engine_name>` |
+| **Install an Engine** | `cortex engines install <engine_name> [options]` |
+| **List Engines** | `cortex engines list [options]` |
+| **Uninstall an Engine** | `cortex engines uninstall <engine_name> [options]` |
+| **Show Model Information** | `cortex ps` |
+| **Update Cortex** | `cortex update [options]` |
+
 > **Note**:
 > For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli).

-### Start Cortex Server
-```bash
-cortex
-```
-### Chat with a Model
-```bash
-cortex chat [options] [model_id] [message]
-```
-### Embeddings
-```bash
-cortex embeddings [options] [model_id] [message]
-```
-### Pull a Model
-```bash
-cortex pull <model_id>
-```
-> This command can also pull Hugging Face's models.
-### Download and Start a Model
-```bash
-cortex run [options] [model_id]:[engine]
-```
-### Get a Model Details
-```bash
-cortex models get <model_id>
-```
-### List Models
-```bash
-cortex models list [options]
-```
-### Remove a Model
-```bash
-cortex models remove <model_id>
-```
-### Start a Model
-```bash
-cortex models start [model_id]
-```
-### Stop a Model
-```bash
-cortex models stop <model_id>
-```
-### Update a Model Config
-```bash
-cortex models update [options] <model_id>
-```
-### Get an Engine Details
-```bash
-cortex engines get <engine_name>
-```
-### Install an Engine
-```bash
-cortex engines install <engine_name> [options]
-```
-### List Engines
-```bash
-cortex engines list [options]
-```
-### Set an Engine Config
-```bash
-cortex engines set
-```
-### Show Model Information
-```bash
-cortex ps
-```
+
 ## REST API
 Cortex has a REST API that runs at `localhost:1337`.

diff --git a/platform/README.md b/platform/README.md
index e3c63dfa4..77b8605f1 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -4,184 +4,210 @@

- Documentation - API Reference + Documentation - API Reference - Changelog - Bug reports - Discord

-> ⚠️ **Cortex Platform is under development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** +> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** ## About -Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints. +Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library. -## Cortex Engines -Cortex Platform supports the following engines: -- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU. -- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs. -- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference. +Cortex supports the following engines: +- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) +- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx) +- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation - -> **Note**: -> To install the Cortex Platform, clone our [repository](). It includes everything you need for installation using Docker and Helm. - -### Docker +### MacOs +```bash +brew install cortex.cpp +``` +### Windows ```bash -docker compose up +winget install cortex.cpp ``` +### Linux +```bash +sudo apt install cortex.cpp +``` +### Docker +**Coming Soon!** + +### Libraries +- [cortex.js](https://github.com/janhq/cortex.js) +- [cortex.py](https://github.com/janhq/cortex-python) + +### Build from Source + +To install Cortex from the source, follow the steps below: + +1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev). +2. Navigate to the `platform` folder. +3. Open the terminal and run the following command to build the Cortex project: -### Helm ```bash -helm install cortex-platform +npx nest build ``` -### Yarn +4. Make the `command.js` executable: + ```bash -yarn install cortex-platform +chmod +x '[path-to]/cortex/platform/dist/src/command.js' ``` -### Libraries -- [cortex.js]() -- [cortex.py]() +5. Link the package globally: +```bash +npm link +``` -### Build from Source -**Coming Soon!** ## Quickstart -**Coming Soon!** +To run and chat with a model in Cortex: +```bash +# Start the Cortex server +cortex +# Start a model +cortex run [model_id] +``` ## Model Library -Cortex Platform supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). +Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). 
Here are example of models that you can use based on each supported engine: ### `llama.cpp` -| Model ID | Variant (Branch) | Model size | -|------------------|------------------|-------------------| -| codestral | 22b-gguf | 22B | -| command-r | 35b-gguf | 35B | -| gemma | 7b-gguf | 7B | -| llama3 | gguf | 8B | -| llama3.1 | gguf | 8B | -| mistral | 7b-gguf | 7B | -| mixtral | 7x8b-gguf | 46.7B | -| openhermes-2.5 | 7b-gguf | 7B | -| phi3 | medium-gguf | 14B - 4k ctx len | -| phi3 | mini-gguf | 3.82B - 4k ctx len| -| qwen2 | 7b-gguf | 7B | -| tinyllama | 1b-gguf | 1.1B | - +| Model ID | Variant (Branch) | Model size | CLI command | +|------------------|------------------|-------------------|------------------------------------| +| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` | +| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` | +| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` | +| llama3 | gguf | 8B | `cortex run llama3:gguf` | +| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` | +| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` | +| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` | +| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`| +| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` | +| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` | +| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` | +| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` | ### `ONNX` -| Model ID | Variant (Branch) | Model size | -|------------------|------------------|-------------------| -| gemma | 7b-onnx | 7B | -| llama3 | onnx | 8B | -| mistral | 7b-onnx | 7B | -| openhermes-2.5 | 7b-onnx | 7B | -| phi3 | mini-onnx | 3.82B - 4k ctx len| -| phi3 | medium-onnx | 14B - 4k ctx len | - +| Model ID | Variant (Branch) | Model size | CLI command | +|------------------|------------------|-------------------|------------------------------------| +| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` | +| llama3 | onnx | 8B | `cortex run llama3:onnx` | +| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` | +| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`| +| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` | +| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` | ### `TensorRT-LLM` -| Model ID | Variant (Branch) | Model size | -|------------------|-------------------------------|-------------------| -| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | -| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | -| llama3 | 8b-tensorrt-llm-linux-ada | 8B | -| llama3 | 8b-tensorrt-llm-windows-ada | 8B | -| mistral | 7b-tensorrt-llm-linux-ampere | 7B | -| mistral | 7b-tensorrt-llm-windows-ampere | 7B | -| mistral | 7b-tensorrt-llm-linux-ada | 7B | -| mistral | 7b-tensorrt-llm-windows-ada | 7B | -| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | -| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | -| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | +| Model ID | Variant (Branch) | Model size | CLI command | +|------------------|-------------------------------|-------------------|------------------------------------| +| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` | +| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` | +| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada`| +| llama3 | 8b-tensorrt-llm-windows-ada | 8B 
| `cortex run llama3:8b-tensorrt-llm-windows-ada` | +| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`| +| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` | +| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`| +| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` | +| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`| +| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`| +| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`| > **Note**: > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. -## Cortex Platfrom API -Cortex Platform only support the following stateful API endpoints: +## Cortex CLI Commands + +| Command Description | Command Example | +|------------------------------------|---------------------------------------------------------------------| +| **Start Cortex Server** | `cortex` | +| **Chat with a Model** | `cortex chat [options] [model_id] [message]` | +| **Embeddings** | `cortex embeddings [options] [model_id] [message]` | +| **Pull a Model** | `cortex pull ` | +| **Download and Start a Model** | `cortex run [options] [model_id]:[engine]` | +| **Get Model Details** | `cortex models get ` | +| **List Models** | `cortex models list [options]` | +| **Delete a Model** | `cortex models delete ` | +| **Start a Model** | `cortex models start [model_id]` | +| **Stop a Model** | `cortex models stop ` | +| **Update a Model** | `cortex models update [options] ` | +| **Get Engine Details** | `cortex engines get ` | +| **Install an Engine** | `cortex engines install [options]` | +| **List Engines** | `cortex engines list [options]` | +| **Uninnstall an Engine** | `cortex engines uninstall [options]` | +| **Show Model Information** | `cortex ps` | +| **Update Cortex** | `cortex update [options]` | -- Messages -- Threads -- Assistants +> **Note**: +> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli). + +## REST API +Cortex has a REST API that runs at `localhost:1337`. 
-Here are some examples of the available stateful endpoints: -### Create Message +### Pull a Model ```bash curl --request POST \ - --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages \ - --header 'Content-Type: application/json' \ - --data '{ - "role": "user", - "content": "Tell me a joke" -}' + --url http://localhost:1337/v1/models/{model_id}/pull ``` -### Create Assistant +### Start a Model ```bash curl --request POST \ - --url http://127.0.0.1:1337/v1/assistants \ + --url http://localhost:1337/v1/models/{model_id}/start \ --header 'Content-Type: application/json' \ --data '{ - "id": "jan", - "avatar": "", - "name": "Jan", - "description": "A default assistant that can use all downloaded models", - "model": "", - "instructions": "", - "tools": [], - "metadata": {}, - "top_p": "0.7", - "temperature": "0.7" + "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant", + "stop": [], + "ngl": 4096, + "ctx_len": 4096, + "cpu_threads": 10, + "n_batch": 2048, + "caching_enabled": true, + "grp_attn_n": 1, + "grp_attn_w": 512, + "mlock": false, + "flash_attn": true, + "cache_type": "f16", + "use_mmap": true, + "engine": "cortex.llamacpp" }' ``` -### Create Thread +### Chat with a Model ```bash -curl --request POST \ - --url http://127.0.0.1:1337/v1/threads \ - --header 'Content-Type: application/json' \ - --data '{ - "assistants": [ +curl http://localhost:1337/v1/chat/completions \ +-H "Content-Type: application/json" \ +-d '{ + "model": "", + "messages": [ { - "id": "thread_123", - "avatar": "https://example.com/avatar.png", - "name": "Virtual Helper", - "model": "mistral", - "instructions": "Assist with customer queries and provide information based on the company database.", - "tools": [ - { - "name": "Knowledge Retrieval", - "settings": { - "source": "internal", - "endpoint": "https://api.example.com/knowledge" - } - } - ], - "description": "This assistant helps with customer support by retrieving relevant information.", - "metadata": { - "department": "support", - "version": "1.0" - }, - "object": "assistant", - "temperature": 0.7, - "top_p": 0.9, - "created_at": 1622470423, - "response_format": { - "format": "json" - }, - "tool_resources": { - "resources": [ - "database1", - "database2" - ] - } - } - ] + "role": "user", + "content": "Hello" + }, + ], + "model": "mistral", + "stream": true, + "max_tokens": 1, + "stop": [ + null + ], + "frequency_penalty": 1, + "presence_penalty": 1, + "temperature": 1, + "top_p": 1 }' ``` -> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available stateful endpoints. +### Stop a Model +```bash +curl --request POST \ + --url http://localhost:1337/v1/models/mistral/stop +``` + + +> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints. ## Contact Support - For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose). From c4863ff28463850d74141493cb0129327471cb4b Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 14:49:50 +0700 Subject: [PATCH 10/21] Update the CortexCPP readme --- README.md | 61 +++++++++++++++++++++++++------------------------------ 1 file changed, 28 insertions(+), 33 deletions(-) diff --git a/README.md b/README.md index 77b8605f1..279927750 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# Cortex +# CortexCPP

cortex-cpplogo

@@ -8,31 +8,22 @@ - Changelog - Bug reports - Discord

-> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.**
+> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.**

 ## About
-Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library.
+CortexCPP is a C++ AI engine featuring a Docker-like command-line interface and client libraries. It can run as a standalone server or be embedded as a library, allowing you to run AI locally on your computer.

-Cortex supports the following engines:
+CortexCPP supports the following engines:
 - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
 - [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx)
 - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)

 ## Installation
-### MacOs
-```bash
-brew install cortex.cpp
-```
-### Windows
-```bash
-winget install cortex.cpp
-```
-### Linux
-```bash
-sudo apt install cortex.cpp
-```
-### Docker
-**Coming Soon!**
+To install CortexCPP, download the installer for your operating system from the following options:
+- Stable Version
+- Beta Version
+- Nightly Version

 ### Libraries
 - [cortex.js](https://github.com/janhq/cortex.js)
 - [cortex.py](https://github.com/janhq/cortex-python)

 ### Build from Source

-To install Cortex from the source, follow the steps below:
+To install CortexCPP from the source, follow the steps below:

-1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev).
-2. Navigate to the `platform` folder.
-3. Open the terminal and run the following command to build the Cortex project:
+1. Clone the CortexCPP repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:

 ```bash
-npx nest build
+cd vcpkg
+./bootstrap-vcpkg.bat
+vcpkg install
 ```
-
-4. Make the `command.js` executable:
+4. Use Visual Studio with the C++ development kit to build the project using the files generated in the `vcpkg` folder.
+5. Build the CortexCPP inside the `engine` folder:

 ```bash
-chmod +x '[path-to]/cortex/platform/dist/src/command.js'
+mkdir build
+cd build
+cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
 ```

 5. Link the package globally:
@@ -64,16 +59,16 @@
 npm link
 ```

 ## Quickstart
-To run and chat with a model in Cortex:
+To run and chat with a model in CortexCPP:
 ```bash
-# Start the Cortex server
+# Start the CortexCPP server
 cortex

 # Start a model
 cortex run [model_id]
 ```
 ## Model Library
-Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
+CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).

 Here are examples of models that you can use based on each supported engine:
 ### `llama.cpp`
@@ -118,11 +118,11 @@ Here are examples of models that you can use based on each supported engine:
 > **Note**:
 > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
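The build-from-source steps in the patch above bootstrap vcpkg with the Windows batch script (`bootstrap-vcpkg.bat`). On Linux or macOS, vcpkg ships an equivalent shell script; a minimal sketch, assuming the same `engine/vcpkg` checkout layout:

```bash
# From the engine folder: bootstrap vcpkg, then install the project's dependencies
cd vcpkg
./bootstrap-vcpkg.sh   # POSIX counterpart of bootstrap-vcpkg.bat
./vcpkg install
```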
-## Cortex CLI Commands +## CortexCPP CLI Commands | Command Description | Command Example | |------------------------------------|---------------------------------------------------------------------| -| **Start Cortex Server** | `cortex` | +| **Start CortexCPP Server** | `cortex` | | **Chat with a Model** | `cortex chat [options] [model_id] [message]` | | **Embeddings** | `cortex embeddings [options] [model_id] [message]` | | **Pull a Model** | `cortex pull ` | @@ -138,13 +133,13 @@ Here are example of models that you can use based on each supported engine: | **List Engines** | `cortex engines list [options]` | | **Uninnstall an Engine** | `cortex engines uninstall [options]` | | **Show Model Information** | `cortex ps` | -| **Update Cortex** | `cortex update [options]` | +| **Update CortexCPP** | `cortex update [options]` | > **Note**: > For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli). ## REST API -Cortex has a REST API that runs at `localhost:1337`. +CortexCPP has a REST API that runs at `localhost:3928`. ### Pull a Model ```bash From 8671572ba5a8016304344fdcc39d8a77986daa6b Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 14:55:42 +0700 Subject: [PATCH 11/21] Update Overview and Remove Cortex Platform readme --- README.md | 4 +- platform/README.md | 217 --------------------------------------------- 2 files changed, 2 insertions(+), 219 deletions(-) delete mode 100644 platform/README.md diff --git a/README.md b/README.md index 279927750..995ad62ce 100644 --- a/README.md +++ b/README.md @@ -11,7 +11,7 @@ > ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.** ## About -CortexCPP is a C++ AI engine featuring a Docker-like command-line interface and client libraries. It can run as a standalone server or be embedded as a library, allowing you to run AI locally on your computer. +CortexCPP is a Local AI engine that is used to run and customize LLMs. CortexCPP can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/). CortexCPP supports the following engines: - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) @@ -67,7 +67,7 @@ cortex # Start a model cortex run [model_id] ``` -## Model Library +## Built-in Model Library CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). Here are example of models that you can use based on each supported engine: diff --git a/platform/README.md b/platform/README.md deleted file mode 100644 index 77b8605f1..000000000 --- a/platform/README.md +++ /dev/null @@ -1,217 +0,0 @@ -# Cortex -

- cortex-cpplogo -

- -

- Documentation - API Reference - - Changelog - Bug reports - Discord -

- -> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** - -## About -Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library. - -Cortex supports the following engines: -- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) -- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx) -- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) - -## Installation -### MacOs -```bash -brew install cortex.cpp -``` -### Windows -```bash -winget install cortex.cpp -``` -### Linux -```bash -sudo apt install cortex.cpp -``` -### Docker -**Coming Soon!** - -### Libraries -- [cortex.js](https://github.com/janhq/cortex.js) -- [cortex.py](https://github.com/janhq/cortex-python) - -### Build from Source - -To install Cortex from the source, follow the steps below: - -1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev). -2. Navigate to the `platform` folder. -3. Open the terminal and run the following command to build the Cortex project: - -```bash -npx nest build -``` - -4. Make the `command.js` executable: - -```bash -chmod +x '[path-to]/cortex/platform/dist/src/command.js' -``` - -5. Link the package globally: - -```bash -npm link -``` - - -## Quickstart -To run and chat with a model in Cortex: -```bash -# Start the Cortex server -cortex - -# Start a model -cortex run [model_id] -``` -## Model Library -Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). - -Here are example of models that you can use based on each supported engine: -### `llama.cpp` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|------------------|-------------------|------------------------------------| -| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` | -| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` | -| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` | -| llama3 | gguf | 8B | `cortex run llama3:gguf` | -| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` | -| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` | -| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` | -| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`| -| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` | -| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` | -| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` | -| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` | -### `ONNX` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|------------------|-------------------|------------------------------------| -| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` | -| llama3 | onnx | 8B | `cortex run llama3:onnx` | -| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` | -| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`| -| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` | -| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` | -### `TensorRT-LLM` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|-------------------------------|-------------------|------------------------------------| -| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run 
llama3:8b-tensorrt-llm-windows-ampere` | -| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` | -| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada`| -| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` | -| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`| -| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` | -| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`| -| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` | -| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`| -| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`| -| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`| - -> **Note**: -> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. - -## Cortex CLI Commands - -| Command Description | Command Example | -|------------------------------------|---------------------------------------------------------------------| -| **Start Cortex Server** | `cortex` | -| **Chat with a Model** | `cortex chat [options] [model_id] [message]` | -| **Embeddings** | `cortex embeddings [options] [model_id] [message]` | -| **Pull a Model** | `cortex pull ` | -| **Download and Start a Model** | `cortex run [options] [model_id]:[engine]` | -| **Get Model Details** | `cortex models get ` | -| **List Models** | `cortex models list [options]` | -| **Delete a Model** | `cortex models delete ` | -| **Start a Model** | `cortex models start [model_id]` | -| **Stop a Model** | `cortex models stop ` | -| **Update a Model** | `cortex models update [options] ` | -| **Get Engine Details** | `cortex engines get ` | -| **Install an Engine** | `cortex engines install [options]` | -| **List Engines** | `cortex engines list [options]` | -| **Uninnstall an Engine** | `cortex engines uninstall [options]` | -| **Show Model Information** | `cortex ps` | -| **Update Cortex** | `cortex update [options]` | - -> **Note**: -> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli). - -## REST API -Cortex has a REST API that runs at `localhost:1337`. 
- -### Pull a Model -```bash -curl --request POST \ - --url http://localhost:1337/v1/models/{model_id}/pull -``` - -### Start a Model -```bash -curl --request POST \ - --url http://localhost:1337/v1/models/{model_id}/start \ - --header 'Content-Type: application/json' \ - --data '{ - "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant", - "stop": [], - "ngl": 4096, - "ctx_len": 4096, - "cpu_threads": 10, - "n_batch": 2048, - "caching_enabled": true, - "grp_attn_n": 1, - "grp_attn_w": 512, - "mlock": false, - "flash_attn": true, - "cache_type": "f16", - "use_mmap": true, - "engine": "cortex.llamacpp" -}' -``` - -### Chat with a Model -```bash -curl http://localhost:1337/v1/chat/completions \ --H "Content-Type: application/json" \ --d '{ - "model": "", - "messages": [ - { - "role": "user", - "content": "Hello" - }, - ], - "model": "mistral", - "stream": true, - "max_tokens": 1, - "stop": [ - null - ], - "frequency_penalty": 1, - "presence_penalty": 1, - "temperature": 1, - "top_p": 1 -}' -``` - -### Stop a Model -```bash -curl --request POST \ - --url http://localhost:1337/v1/models/mistral/stop -``` - - -> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints. - -## Contact Support -- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose). -- For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH). -- For long-form inquiries, please email [hello@jan.ai](mailto:hello@jan.ai). - - From af180e485088099ca2649469cf961d10a42a8cc3 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 14:56:40 +0700 Subject: [PATCH 12/21] nits --- README.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/README.md b/README.md index 995ad62ce..1965ce377 100644 --- a/README.md +++ b/README.md @@ -51,12 +51,6 @@ cd build cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static ``` -5. Link the package globally: - -```bash -npm link -``` - ## Quickstart To run and chat with a model in CortexCPP: From f38f6780427394204f48f81666ac6a00f2cd7f4f Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 15:01:27 +0700 Subject: [PATCH 13/21] Separate installer for each version --- README.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/README.md b/README.md index 1965ce377..5c736716c 100644 --- a/README.md +++ b/README.md @@ -21,8 +21,20 @@ CortexCPP supports the following engines: ## Installation To install CortexCPP, download the installer for your operating system from the following options: - Stable Version + - Windows + - Mac + - Linux (Debian) + - Linux (Fedora) - Beta Version + - Windows + - Mac + - Linux (Debian) + - Linux (Fedora) - Nightly Version + - Windows + - Mac + - Linux (Debian) + - Linux (Fedora) ### Libraries From 425f641dddcb154cc5599a94ea59ca6e377a1136 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 15:03:47 +0700 Subject: [PATCH 14/21] Update the build from source steps --- README.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 5c736716c..e7a034cd8 100644 --- a/README.md +++ b/README.md @@ -54,15 +54,14 @@ cd vcpkg ./bootstrap-vcpkg.bat vcpkg install ``` -4. Use Visual Studio with the C++ development kit to build the project using the files generated in the `vcpkg` folder. -5. Build the CortexCPP inside the `engine` folder: +4. 
Build the CortexCPP inside the `build` folder: ```bash mkdir build cd build cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static ``` - +5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder. ## Quickstart To run and chat with a model in CortexCPP: From 7bcd47ba3b9a34d7ff902593ad43bc5e9207995c Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 15:07:20 +0700 Subject: [PATCH 15/21] nits --- README.md | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md index e7a034cd8..7ab60c9d0 100644 --- a/README.md +++ b/README.md @@ -20,21 +20,21 @@ CortexCPP supports the following engines: ## Installation To install CortexCPP, download the installer for your operating system from the following options: -- Stable Version - - Windows - - Mac - - Linux (Debian) - - Linux (Fedora) -- Beta Version - - Windows - - Mac - - Linux (Debian) - - Linux (Fedora) -- Nightly Version - - Windows - - Mac - - Linux (Debian) - - Linux (Fedora) +- **Stable Version** + - [Windows]() + - [Mac]() + - [Linux (Debian)]() + - [Linux (Fedora)]() +- **Beta Version** + - [Windows]() + - [Mac]() + - [Linux (Debian)]() + - [Linux (Fedora)]() +- **Nightly Version** + - [Windows]() + - [Mac]() + - [Linux (Debian)]() + - [Linux (Fedora)]() ### Libraries From ba71e2bcd94fad62169106ad945c271ebed3b17d Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 15:14:42 +0700 Subject: [PATCH 16/21] nits --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 7ab60c9d0..612824ef7 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ CortexCPP is a Local AI engine that is used to run and customize LLMs. CortexCPP CortexCPP supports the following engines: - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) -- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx) +- [`cortex.onnx`](https://github.com/janhq/cortex.onnx) - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation From fb145214116335e79c448d49595f58586b631acd Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 16:26:19 +0700 Subject: [PATCH 17/21] 1337 -> 3928 --- README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 612824ef7..c34655bdd 100644 --- a/README.md +++ b/README.md @@ -149,13 +149,13 @@ CortexCPP has a REST API that runs at `localhost:3928`. 
### Pull a Model ```bash curl --request POST \ - --url http://localhost:1337/v1/models/{model_id}/pull + --url http://localhost:3928/v1/models/{model_id}/pull ``` ### Start a Model ```bash curl --request POST \ - --url http://localhost:1337/v1/models/{model_id}/start \ + --url http://localhost:3928/v1/models/{model_id}/start \ --header 'Content-Type: application/json' \ --data '{ "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant", @@ -177,7 +177,7 @@ curl --request POST \ ### Chat with a Model ```bash -curl http://localhost:1337/v1/chat/completions \ +curl http://localhost:3928/v1/chat/completions \ -H "Content-Type: application/json" \ -d '{ "model": "", @@ -203,7 +203,7 @@ curl http://localhost:1337/v1/chat/completions \ ### Stop a Model ```bash curl --request POST \ - --url http://localhost:1337/v1/models/mistral/stop + --url http://localhost:3928/v1/models/mistral/stop ``` From ea84b2d1ebd5af6aa1f427d53193d48f4b855b2e Mon Sep 17 00:00:00 2001 From: irfanpena Date: Tue, 10 Sep 2024 11:34:10 +0700 Subject: [PATCH 18/21] CortexCPP -> Cortex.cpp --- README.md | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md index c34655bdd..be63f2a1d 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# CortexCPP +# Cortex.cpp

cortex-cpplogo

@@ -8,18 +8,18 @@ - Changelog - Bug reports - Discord

-> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.** +> ⚠️ **Cortex.cpp is currently in Development. This documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.** ## About -CortexCPP is a Local AI engine that is used to run and customize LLMs. CortexCPP can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/). +Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/). -CortexCPP supports the following engines: +Cortex supports the following engines: - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) - [`cortex.onnx`](https://github.com/janhq/cortex.onnx) - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation -To install CortexCPP, download the installer for your operating system from the following options: +To install Cortex, download the installer for your operating system from the following options: - **Stable Version** - [Windows]() - [Mac]() @@ -43,9 +43,9 @@ To install CortexCPP, download the installer for your operating system from the ### Build from Source -To install CortexCPP from the source, follow the steps below: +To install Cortex from the source, follow the steps below: -1. Clone the CortexCPP repository [here](https://github.com/janhq/cortex.cpp). +1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp). 2. Navigate to the `engine > vcpkg` folder. 3. Configure the vpkg: @@ -54,7 +54,7 @@ cd vcpkg ./bootstrap-vcpkg.bat vcpkg install ``` -4. Build the CortexCPP inside the `build` folder: +4. Build the Cortex inside the `build` folder: ```bash mkdir build @@ -64,16 +64,16 @@ cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcp 5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder. ## Quickstart -To run and chat with a model in CortexCPP: +To run and chat with a model in Cortex: ```bash -# Start the CortexCPP server +# Start the Cortex server cortex # Start a model cortex run [model_id] ``` ## Built-in Model Library -CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). +Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). Here are example of models that you can use based on each supported engine: ### `llama.cpp` @@ -118,11 +118,11 @@ Here are example of models that you can use based on each supported engine: > **Note**: > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. 
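Putting the Quickstart and Model Library sections together, a minimal end-to-end session might look like this; the model id `mistral` is only an illustration, and any model id from Cortex Hub should work the same way:

```bash
# Download a model from Cortex Hub, then start chatting with it
cortex pull mistral
cortex run mistral
```
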
-## CortexCPP CLI Commands
+## Cortex.cpp CLI Commands
 
 | Command Description                 | Command Example                                                       |
 |------------------------------------|---------------------------------------------------------------------|
-| **Start CortexCPP Server**              | `cortex`                                              |
+| **Start Cortex Server**              | `cortex`                                              |
 | **Chat with a Model**               | `cortex chat [options] [model_id] [message]`                          |
 | **Embeddings**                      | `cortex embeddings [options] [model_id] [message]`                    |
 | **Pull a Model**                    | `cortex pull <model_id>`                                               |
@@ -138,13 +138,13 @@ Here are example of models that you can use based on each supported engine:
 | **List Engines**                    | `cortex engines list [options]`                                       |
 | **Uninstall an Engine**             | `cortex engines uninstall <engine_name> [options]` |
 | **Show Model Information**          | `cortex ps`                                                           |
-| **Update CortexCPP**                    | `cortex update [options]`                   |
+| **Update Cortex.cpp**                    | `cortex update [options]`                   |
 
 > **Note**: 
 > For more detailed CLI reference documentation, please see [here](https://cortex.so/docs/cli).
 
 ## REST API
-CortexCPP has a REST API that runs at `localhost:3928`.
+Cortex.cpp has a REST API that runs at `localhost:3928`.

From 01981b9ea2f6f65a1871c93411883ae3c09dbad1 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Tue, 10 Sep 2024 15:42:58 +0700
Subject: [PATCH 19/21] nits

---
 README.md | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/README.md b/README.md
index be63f2a1d..94e2cbee5 100644
--- a/README.md
+++ b/README.md
@@ -13,28 +13,28 @@
 ## About
 Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
 
-Cortex supports the following engines:
+Cortex.cpp supports the following engines:
 - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
 - [`cortex.onnx`](https://github.com/janhq/cortex.onnx)
 - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
 
 ## Installation
-To install Cortex, download the installer for your operating system from the following options:
+To install Cortex.cpp, download the installer for your operating system from the following options:
 - **Stable Version**
-  - [Windows]()
-  - [Mac]()
-  - [Linux (Debian)]()
-  - [Linux (Fedora)]()
+  - [Windows](https://github.com/janhq/cortex.cpp/releases)
+  - [Mac](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
 - **Beta Version**
-  - [Windows]()
-  - [Mac]()
-  - [Linux (Debian)]()
-  - [Linux (Fedora)]()
+  - [Windows](https://github.com/janhq/cortex.cpp/releases)
+  - [Mac](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
 - **Nightly Version**
-  - [Windows]()
-  - [Mac]()
-  - [Linux (Debian)]()
-  - [Linux (Fedora)]()
+  - [Windows](https://github.com/janhq/cortex.cpp/releases)
+  - [Mac](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
 
 ### Libraries
 
### Build from Source

-To install Cortex from the source, follow the steps below:
+To install Cortex.cpp from the source, follow the steps below:

1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. 
Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:

```bash
cd vcpkg
./bootstrap-vcpkg.bat
vcpkg install
```
-4. Build the Cortex inside the `build` folder:
+4. Build Cortex.cpp inside the `build` folder:

```bash
mkdir build
cd build
cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
```
5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
+6. Verify that Cortex.cpp is installed correctly by getting help information.
+
+```sh
+# Get the help information
+cortex -h
+```
## Quickstart
-To run and chat with a model in Cortex:
+To run and chat with a model in Cortex.cpp:
```bash
-# Start the Cortex server
+# Start the Cortex.cpp server
cortex

# Start a model
cortex run [model_id]
```
## Built-in Model Library
Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
Here are examples of models that you can use based on each supported engine:
@@ -122,7 +127,7 @@ Here are example of models that you can use based on each supported engine:
 
 | Command Description                 | Command Example                                                       |
 |------------------------------------|---------------------------------------------------------------------|
-| **Start Cortex Server**              | `cortex`                                              |
+| **Start Cortex.cpp Server**              | `cortex`                                              |
 | **Chat with a Model**               | `cortex chat [options] [model_id] [message]`                          |
 | **Embeddings**                      | `cortex embeddings [options] [model_id] [message]`                    |
 | **Pull a Model**                    | `cortex pull <model_id>`                                               |
@@ -138,7 +143,7 @@ Here are example of models that you can use based on each supported engine:
 | **List Engines**                    | `cortex engines list [options]`                                       |
 | **Uninstall an Engine**             | `cortex engines uninstall <engine_name> [options]` |
 | **Show Model Information**          | `cortex ps`                                                           |
-| **Update Cortex**                    | `cortex update [options]`                   |
+| **Update Cortex.cpp**                    | `cortex update [options]`                   |

From 1d434cf096841197d20dc560834e7d7f899ae529 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Wed, 11 Sep 2024 16:32:57 +0700
Subject: [PATCH 20/21] change the engine names

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 94e2cbee5..44bb3ff9d 100644
--- a/README.md
+++ b/README.md
@@ -14,9 +14,9 @@
 Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
 
 Cortex.cpp supports the following engines:
-- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
-- [`cortex.onnx`](https://github.com/janhq/cortex.onnx)
-- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
+- [`llamacpp`](https://github.com/janhq/cortex.llamacpp)
+- [`onnx`](https://github.com/janhq/cortex.onnx)
+- [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
@@ -176,7 +176,7 @@ curl --request POST \
 "flash_attn": true,
 "cache_type": "f16",
 "use_mmap": true,
-  "engine": "cortex.llamacpp"
+  "engine": "llamacpp"
 }'
 ```

From a7e1de336dd2dfca98bfb0ae1209ee4925411e73 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 12 Sep 2024 12:43:06 +0700
Subject: [PATCH 21/21] Updated per feedbacks, except the PORT

---
 README.md | 247 ++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 201 insertions(+), 46 deletions(-)

diff --git a/README.md b/README.md
index 44bb3ff9d..84d46d84b 100644
--- a/README.md
+++ b/README.md
@@ -3,6 +3,15 @@
 cortex-cpplogo

+

+ + GitHub commit activity + Github Last Commit + Github Contributors + GitHub closed issues + Discord +

+

Documentation - API Reference - Changelog - Bug reports - Discord @@ -13,61 +22,130 @@ ## About Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/). -Cortex.cpp supports the following engines: +Cortex.cpp is a multi-engine that uses `llama.cpp` as the default engine but also supports the following: - [`llamacpp`](https://github.com/janhq/cortex.llamacpp) - [`onnx`](https://github.com/janhq/cortex.onnx) - [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation To install Cortex.cpp, download the installer for your operating system from the following options: -- **Stable Version** - - [Windows](https://github.com/janhq/cortex.cpp/releases) - - [Mac](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases) -- **Beta Version** - - [Windows](https://github.com/janhq/cortex.cpp/releases) - - [Mac](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases) -- **Nightly Version** - - [Windows](https://github.com/janhq/cortex.cpp/releases) - - [Mac](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Version Type         | Windows       | MacOS              | Linux                             |
+|----------------------|---------------|--------------------|-----------------------------------|
+| Stable (Recommended) | cortexcpp.exe | Intel, M1/M2/M3/M4 | cortexcpp.deb, cortexcpp.AppImage |
+| Beta Build           | cortexcpp.exe | Intel, M1/M2/M3/M4 | cortexcpp.deb, cortexcpp.AppImage |
+| Nightly Build        | cortexcpp.exe | Intel, M1/M2/M3/M4 | cortexcpp.deb, cortexcpp.AppImage |
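On Debian-based Linux, the downloaded package can then be installed from a terminal. This is a generic `apt` invocation rather than a command documented by this README; the file name comes from the table above:

```bash
# Install the downloaded Debian package; apt resolves its dependencies
sudo apt-get install ./cortexcpp.deb
```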

+
+> **Note**: 
+> You can also build Cortex.cpp from source by following the steps [here](#build-from-source).
 
 ### Libraries
 - [cortex.js](https://github.com/janhq/cortex.js)
 - [cortex.py](https://github.com/janhq/cortex-python)
 
-### Build from Source
-
-To install Cortex.cpp from the source, follow the steps below:
-
-1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
-2. Navigate to the `engine > vcpkg` folder.
-3. Configure the vpkg:
-
-```bash
-cd vcpkg
-./bootstrap-vcpkg.bat
-vcpkg install
-```
-4. Build the Cortex.cpp inside the `build` folder:
-
-```bash
-mkdir build
-cd build
-cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
-```
-5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
-6. Verify that Cortex.cpp is installed correctly by getting help information.
-
-```sh
-# Get the help information
-cortex -h
-```
 ## Quickstart
 To run and chat with a model in Cortex.cpp:
 ```bash
@@ -75,7 +153,7 @@ To run and chat with a model in Cortex.cpp:
 cortex
 
 # Start a model
-cortex run [model_id]
+cortex run <model_id>:[engine_name]
 ```
 ## Built-in Model Library
 Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
@@ -145,7 +223,7 @@ Here are example of models that you can use based on each supported engine:
 | **Show Model Information**          | `cortex ps`                                                           |
 | **Update Cortex.cpp**               | `cortex update [options]`                   |
 
-> **Note**: 
+> **Note**
 > For more detailed CLI reference documentation, please see [here](https://cortex.so/docs/cli).
 
 ## REST API
 Cortex.cpp has a REST API that runs at `localhost:3928`.
 
 ### Stop a Model
 ```bash
 curl --request POST \
   --url http://localhost:3928/v1/models/mistral/stop
 ```
 
-> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.
+> **Note**
+> Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.

## Build from Source

### Windows
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:

```bash
cd vcpkg
./bootstrap-vcpkg.bat
vcpkg install
```
4. Build Cortex.cpp inside the `build` folder:

```bash
mkdir build
cd build
cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
```
5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
6. Verify that Cortex.cpp is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```
### MacOS
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:

```bash
cd vcpkg
./bootstrap-vcpkg.sh
vcpkg install
```
4. Build Cortex.cpp inside the `build` folder; the `make` step performs the build on MacOS:

```bash
mkdir build
cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake
make -j4
```
5. Verify that Cortex.cpp is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```
### Linux
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. 
Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:

```bash
cd vcpkg
./bootstrap-vcpkg.sh
vcpkg install
```
4. Build Cortex.cpp inside the `build` folder; the `make` step performs the build on Linux:

```bash
mkdir build
cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake
make -j4
```
5. Verify that Cortex.cpp is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```
## Contact Support
- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose).