From ac428aae8a2755a4a1426ee15ea00c6a71a33d47 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 5 Sep 2024 14:04:52 +0700
Subject: [PATCH 01/21] Draft the Platform readme

---
 README.md          |   2 +-
 platform/README.md | 244 +++++++++++++++++++++++++--------------------
 2 files changed, 138 insertions(+), 108 deletions(-)

diff --git a/README.md b/README.md
index 6b21c4448..33a1c0e11 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@

Documentation - API Reference - - Changelog - Bug reports - Discord + - Changelog - Bug reports - Discord

> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!

diff --git a/platform/README.md b/platform/README.md
index 660664159..10a191e8e 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -1,138 +1,168 @@
 # Cortex
-

+

- Documentation - API Reference - - Changelog - Bug reports - Discord + Documentation - API Reference + - Changelog - Bug reports - Discord

-> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs! +> ⚠️ **Cortex Platform is Coming Soon!** ## About -Cortex is an OpenAI-compatible AI engine that developers can use to build LLM apps. It is packaged with a Docker-inspired command-line interface and client libraries. It can be used as a standalone server or imported as a library. +Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through an API. ## Cortex Engines -Cortex supports the following engines: +Cortex Platform supports the following engines: - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU. - [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs. - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference. -## Quicklinks - -- [Homepage](https://cortex.so/) -- [Docs](https://cortex.so/docs/) - -## Quickstart -### Prerequisites -- **OS**: - - MacOSX 13.6 or higher. - - Windows 10 or higher. - - Ubuntu 22.04 and later. -- **Dependencies**: - - **Node.js**: Version 18 and above is required to run the installation. - - **NPM**: Needed to manage packages. - - **CPU Instruction Sets**: Available for download from the [Cortex GitHub Releases](https://github.com/janhq/cortex/releases) page. - - **OpenMPI**: Required for Linux. Install by using the following command: - ```bash - sudo apt install openmpi-bin libopenmpi-dev - ``` - -> Visit [Quickstart](https://cortex.so/docs/quickstart) to get started. - -### NPM -``` bash -# Install using NPM -npm i -g cortexso -# Run model -cortex run mistral -# To uninstall globally using NPM -npm uninstall -g cortexso -``` - -### Homebrew -``` bash -# Install using Brew -brew install cortexso -# Run model -cortex run mistral -# To uninstall using Brew -brew uninstall cortexso -``` -> You can also install Cortex using the Cortex Installer available on [GitHub Releases](https://github.com/janhq/cortex/releases). - -## Cortex Server -```bash -cortex serve - -# Output -# Started server at http://localhost:1337 -# Swagger UI available at http://localhost:1337/api -``` +## Installation +### Docker +**Coming Soon!** -You can now access the Cortex API server at `http://localhost:1337`, -and the Swagger UI at `http://localhost:1337/api`. +### Helm +**Coming Soon!** -## Build from Source +### Yarn +**Coming Soon!** -To install Cortex from the source, follow the steps below: +### Libraries +**Coming Soon!** -1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev). -2. Navigate to the `cortex-js` folder. -3. Open the terminal and run the following command to build the Cortex project: +### Build from Source +**Coming Soon!** +## Quickstart +**Coming Soon!** + +## Model Library +Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). 
+
+Here are examples of models that you can use based on each supported engine:
+### `llama.cpp`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|------------------|-------------------|------------------------------------|
+| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` |
+| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` |
+| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` |
+| llama3 | gguf | 8B | `cortex run llama3:gguf` |
+| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` |
+| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` |
+| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` |
+| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf` |
+| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` |
+| phi3 | mini-gguf | 3.82B - 4k ctx len | `cortex run phi3:mini-gguf` |
+| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` |
+| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` |
+### `ONNX`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|------------------|-------------------|------------------------------------|
+| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` |
+| llama3 | onnx | 8B | `cortex run llama3:onnx` |
+| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` |
+| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx` |
+| phi3 | mini-onnx | 3.82B - 4k ctx len | `cortex run phi3:mini-onnx` |
+| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` |
+### `TensorRT-LLM`
+| Model ID | Variant (Branch) | Model size | CLI command |
+|------------------|-------------------------------|-------------------|------------------------------------|
+| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` |
+| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` |
+| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada` |
+| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` |
+| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere` |
+| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` |
+| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada` |
+| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` |
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere` |
+| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada` |
+| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada` |
+
+> **Note**:
+> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
+
+## Cortex Platform API
+Cortex Platform has a stateful API that runs at `localhost:1337`.
+
+### Create Message
 ```bash
-npx nest build
+curl --request POST \
+  --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "role": "user",
+    "content": "Tell me a joke"
+}'
 ```
-4.
Make the `command.js` executable: - +### Create Assistant ```bash -chmod +x '[path-to]/cortex/cortex-js/dist/src/command.js' +curl --request POST \ + --url http://127.0.0.1:1337/v1/assistants \ + --header 'Content-Type: application/json' \ + --data '{ + "id": "jan", + "avatar": "", + "name": "Jan", + "description": "A default assistant that can use all downloaded models", + "model": "", + "instructions": "", + "tools": [], + "metadata": {}, + "top_p": "0.7", + "temperature": "0.7" +}' ``` -5. Link the package globally: - +### Create Thread ```bash -npm link +curl --request POST \ + --url http://127.0.0.1:1337/v1/threads \ + --header 'Content-Type: application/json' \ + --data '{ + "assistants": [ + { + "id": "thread_123", + "avatar": "https://example.com/avatar.png", + "name": "Virtual Helper", + "model": "mistral", + "instructions": "Assist with customer queries and provide information based on the company database.", + "tools": [ + { + "name": "Knowledge Retrieval", + "settings": { + "source": "internal", + "endpoint": "https://api.example.com/knowledge" + } + } + ], + "description": "This assistant helps with customer support by retrieving relevant information.", + "metadata": { + "department": "support", + "version": "1.0" + }, + "object": "assistant", + "temperature": 0.7, + "top_p": 0.9, + "created_at": 1622470423, + "response_format": { + "format": "json" + }, + "tool_resources": { + "resources": [ + "database1", + "database2" + ] + } + } + ] +}' ``` -## Cortex CLI Commands - -The following CLI commands are currently available. -See [CLI Reference Docs](https://cortex.so/docs/cli) for more information. - -```bash - - serve Providing API endpoint for Cortex backend. - chat Send a chat request to a model. - init|setup Init settings and download cortex's dependencies. - ps Show running models and their status. - kill Kill running cortex processes. - pull|download Download a model. Working with HuggingFace model id. - run [options] EXPERIMENTAL: Shortcut to start a model and chat. - models Subcommands for managing models. - models list List all available models. - models pull Download a specified model. - models remove Delete a specified model. - models get Retrieve the configuration of a specified model. - models start Start a specified model. - models stop Stop a specified model. - models update Update the configuration of a specified model. - benchmark Benchmark and analyze the performance of a specific AI model using your system. - presets Show all the available model presets within Cortex. - telemetry Retrieve telemetry logs for monitoring and analysis. - embeddings Creates an embedding vector representing the input text. - engines Subcommands for managing engines. - engines get Get an engine details. - engines list Get all the available Cortex engines. - engines init Setup and download the required dependencies to run cortex engines. - configs Subcommands for managing configurations. - configs get Get a configuration details. - configs list Get all the available configurations. - configs set Set a configuration. -``` +> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available stateful endpoints. ## Contact Support - For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose). 
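The Create Message, Create Assistant, and Create Thread calls in the patch above only write data; reading a conversation back would use the matching list endpoint. A minimal sketch, assuming the platform mirrors the OpenAI-style `GET /v1/threads/{thread_id}/messages` route — this endpoint is not shown in the patch, so treat it as an assumption and check the API reference for the authoritative list:

```bash
# List the messages stored in a thread (hypothetical endpoint, following
# the OpenAI Assistants convention the stateful API is modeled on).
curl --request GET \
  --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages
```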
From 3cca455528e8511d0d230b0136d52d0a3254d1d8 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Thu, 5 Sep 2024 14:16:39 +0700 Subject: [PATCH 02/21] Update the Model library table --- platform/README.md | 74 ++++++++++++++++++++++++---------------------- 1 file changed, 38 insertions(+), 36 deletions(-) diff --git a/platform/README.md b/platform/README.md index 10a191e8e..8c3c04d33 100644 --- a/platform/README.md +++ b/platform/README.md @@ -39,47 +39,49 @@ Cortex Platform supports the following engines: **Coming Soon!** ## Model Library -Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). +Cortex Platform supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). Here are example of models that you can use based on each supported engine: ### `llama.cpp` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|------------------|-------------------|------------------------------------| -| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` | -| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` | -| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` | -| llama3 | gguf | 8B | `cortex run llama3:gguf` | -| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` | -| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` | -| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` | -| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`| -| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` | -| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` | -| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` | -| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` | +| Model ID | Variant (Branch) | Model size | +|------------------|------------------|-------------------| +| codestral | 22b-gguf | 22B | +| command-r | 35b-gguf | 35B | +| gemma | 7b-gguf | 7B | +| llama3 | gguf | 8B | +| llama3.1 | gguf | 8B | +| mistral | 7b-gguf | 7B | +| mixtral | 7x8b-gguf | 46.7B | +| openhermes-2.5 | 7b-gguf | 7B | +| phi3 | medium-gguf | 14B - 4k ctx len | +| phi3 | mini-gguf | 3.82B - 4k ctx len| +| qwen2 | 7b-gguf | 7B | +| tinyllama | 1b-gguf | 1.1B | + ### `ONNX` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|------------------|-------------------|------------------------------------| -| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` | -| llama3 | onnx | 8B | `cortex run llama3:onnx` | -| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` | -| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`| -| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` | -| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` | +| Model ID | Variant (Branch) | Model size | +|------------------|------------------|-------------------| +| gemma | 7b-onnx | 7B | +| llama3 | onnx | 8B | +| mistral | 7b-onnx | 7B | +| openhermes-2.5 | 7b-onnx | 7B | +| phi3 | mini-onnx | 3.82B - 4k ctx len| +| phi3 | medium-onnx | 14B - 4k ctx len | + ### `TensorRT-LLM` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|-------------------------------|-------------------|------------------------------------| -| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` | -| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` | -| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run 
llama3:8b-tensorrt-llm-linux-ada`| -| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` | -| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`| -| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` | -| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`| -| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` | -| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`| -| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`| -| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`| +| Model ID | Variant (Branch) | Model size | +|------------------|-------------------------------|-------------------| +| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | +| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | +| llama3 | 8b-tensorrt-llm-linux-ada | 8B | +| llama3 | 8b-tensorrt-llm-windows-ada | 8B | +| mistral | 7b-tensorrt-llm-linux-ampere | 7B | +| mistral | 7b-tensorrt-llm-windows-ampere | 7B | +| mistral | 7b-tensorrt-llm-linux-ada | 7B | +| mistral | 7b-tensorrt-llm-windows-ada | 7B | +| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | +| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | +| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | > **Note**: > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. From 25527451fe10bac0718d7a85de0ea083ae273ed4 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Thu, 5 Sep 2024 14:18:23 +0700 Subject: [PATCH 03/21] nits --- platform/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/platform/README.md b/platform/README.md index 8c3c04d33..635dc3b18 100644 --- a/platform/README.md +++ b/platform/README.md @@ -11,7 +11,7 @@ > ⚠️ **Cortex Platform is Coming Soon!** ## About -Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through an API. +Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints. ## Cortex Engines Cortex Platform supports the following engines: From 0347b7023a9a5a9df1b54d35f7a1c53dc7eeea05 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Thu, 5 Sep 2024 15:08:51 +0700 Subject: [PATCH 04/21] nits --- platform/README.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/platform/README.md b/platform/README.md index 635dc3b18..37242f907 100644 --- a/platform/README.md +++ b/platform/README.md @@ -8,7 +8,7 @@ - Changelog - Bug reports - Discord

-> ⚠️ **Cortex Platform is Coming Soon!**
+> ⚠️ **Cortex Platform is under development**

 ## About
 Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints.

@@ -87,8 +87,13 @@ Here are examples of models that you can use based on each supported engine:
 > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.

 ## Cortex Platform API
-Cortex Platform has a stateful API that runs at `localhost:1337`.
+Cortex Platform only supports the following stateful API endpoints:
+
+- Messages
+- Threads
+- Assistants
+
+Here are some examples of the available stateful endpoints:

 ### Create Message
 ```bash
 curl --request POST \

From 2a6adbf71de0005b40cfbae02555af6d7ba51066 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Fri, 6 Sep 2024 08:07:54 +0700
Subject: [PATCH 05/21] Added the installation based on discord

---
 README.md          |  4 ++--
 platform/README.md | 20 ++++++++++++++++----
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 33a1c0e11..f2d80a258 100644
--- a/README.md
+++ b/README.md
@@ -36,8 +36,8 @@ sudo apt install cortex-engine
 **Coming Soon!**

 ### Libraries
-- [cortex.js](https://github.com/janhq/cortex.js)
-- [cortex.py](https://github.com/janhq/cortex-python)
+- [cortex.cpp.js](https://github.com/janhq/cortex.js)
+- [cortex.cpp.py](https://github.com/janhq/cortex-python)

 ### Build from Source

diff --git a/platform/README.md b/platform/README.md
index 37242f907..c0949d9dd 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -20,17 +20,29 @@ Cortex Platform supports the following engines:
 - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference.

 ## Installation
+
+> **Note**:
+> To install the Cortex Platform, clone our [repository](). It includes everything you need for installation using Docker and Helm.
+
 ### Docker
-**Coming Soon!**
+```bash
+docker compose up
+```

 ### Helm
-**Coming Soon!**
+```bash
+helm install cortex-platform
+```

 ### Yarn
-**Coming Soon!**
+```bash
+yarn global add cortex-platform
+```

 ### Libraries
-**Coming Soon!**
+- [cortex.js]()
+- [cortex.py]()
+
 ### Build from Source
 **Coming Soon!**

From af7035ffb4c2946cf83d06cfc76fdb4f01c8643c Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Fri, 6 Sep 2024 08:10:08 +0700
Subject: [PATCH 06/21] nits

---
 platform/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/platform/README.md b/platform/README.md
index c0949d9dd..3a7569f9b 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -8,7 +8,7 @@
     - Changelog - Bug reports - Discord

-> ⚠️ **Cortex Platform is under development** +> ⚠️ **Cortex Platform is under development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** ## About Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints. From 6be68492d9b99454721ee5a704553896d44b64a2 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Fri, 6 Sep 2024 09:05:12 +0700 Subject: [PATCH 07/21] cortex-engine->cortex.cpp --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index f2d80a258..f1c215842 100644 --- a/README.md +++ b/README.md @@ -22,15 +22,15 @@ Cortex supports the following engines: ## Installation ### MacOs ```bash -brew install cortex-engine +brew install cortex.cpp ``` ### Windows ```bash -winget install cortex-engine +winget install cortex.cpp ``` ### Linux ```bash -sudo apt install cortex-engine +sudo apt install cortex.cpp ``` ### Docker **Coming Soon!** From 3b35dd9b33d8d298c7aedb4e86a1f0f2b913790f Mon Sep 17 00:00:00 2001 From: irfanpena Date: Fri, 6 Sep 2024 10:30:43 +0700 Subject: [PATCH 08/21] use the current banner instead --- platform/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/platform/README.md b/platform/README.md index 3a7569f9b..e3c63dfa4 100644 --- a/platform/README.md +++ b/platform/README.md @@ -1,7 +1,7 @@ # Cortex - +

Documentation - API Reference From da195cc93205b8258e6d8e99db7308267a8edda7 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 13:52:40 +0700 Subject: [PATCH 09/21] Simplify and update the cortex.cpp readme --- README.md | 105 +++++------------ platform/README.md | 282 +++++++++++++++++++++++++-------------------- 2 files changed, 183 insertions(+), 204 deletions(-) diff --git a/README.md b/README.md index f1c215842..77b8605f1 100644 --- a/README.md +++ b/README.md @@ -8,16 +8,15 @@ - Changelog - Bug reports - Discord

-> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs! +> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** ## About -Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. It supports running AI models using `ONNX`, `TensorRT-LLM`, and `llama.cpp` engines. Cortex can function as a standalone server or be integrated as a library. +Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library. -## Cortex Engines Cortex supports the following engines: -- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU. -- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs. -- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference. +- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) +- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx) +- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation ### MacOs @@ -36,8 +35,8 @@ sudo apt install cortex.cpp **Coming Soon!** ### Libraries -- [cortex.cpp.js](https://github.com/janhq/cortex.js) -- [cortex.cpp.py](https://github.com/janhq/cortex-python) +- [cortex.js](https://github.com/janhq/cortex.js) +- [cortex.py](https://github.com/janhq/cortex-python) ### Build from Source @@ -72,9 +71,6 @@ cortex # Start a model cortex run [model_id] - -# Chat with a model -cortex chat [model_id] ``` ## Model Library Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). @@ -123,73 +119,30 @@ Here are example of models that you can use based on each supported engine: > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. 
## Cortex CLI Commands
+
+| Command Description | Command Example |
+|------------------------------------|---------------------------------------------------------------------|
+| **Start Cortex Server** | `cortex` |
+| **Chat with a Model** | `cortex chat [options] [model_id] [message]` |
+| **Embeddings** | `cortex embeddings [options] [model_id] [message]` |
+| **Pull a Model** | `cortex pull <model_id>` |
+| **Download and Start a Model** | `cortex run [options] [model_id]:[engine]` |
+| **Get Model Details** | `cortex models get <model_id>` |
+| **List Models** | `cortex models list [options]` |
+| **Delete a Model** | `cortex models delete <model_id>` |
+| **Start a Model** | `cortex models start [model_id]` |
+| **Stop a Model** | `cortex models stop <model_id>` |
+| **Update a Model** | `cortex models update [options] <model_id>` |
+| **Get Engine Details** | `cortex engines get <engine_name>` |
+| **Install an Engine** | `cortex engines install <engine_name> [options]` |
+| **List Engines** | `cortex engines list [options]` |
+| **Uninstall an Engine** | `cortex engines uninstall <engine_name> [options]` |
+| **Show Model Information** | `cortex ps` |
+| **Update Cortex** | `cortex update [options]` |
+
 > **Note**:
 > For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli).

-### Start Cortex Server
-```bash
-cortex
-```
-### Chat with a Model
-```bash
-cortex chat [options] [model_id] [message]
-```
-### Embeddings
-```bash
-cortex embeddings [options] [model_id] [message]
-```
-### Pull a Model
-```bash
-cortex pull <model_id>
-```
-> This command can also pull Hugging Face's models.
-### Download and Start a Model
-```bash
-cortex run [options] [model_id]:[engine]
-```
-### Get a Model Details
-```bash
-cortex models get <model_id>
-```
-### List Models
-```bash
-cortex models list [options]
-```
-### Remove a Model
-```bash
-cortex models remove <model_id>
-```
-### Start a Model
-```bash
-cortex models start [model_id]
-```
-### Stop a Model
-```bash
-cortex models stop <model_id>
-```
-### Update a Model Config
-```bash
-cortex models update [options] <model_id>
-```
-### Get an Engine Details
-```bash
-cortex engines get <engine_name>
-```
-### Install an Engine
-```bash
-cortex engines install <engine_name> [options]
-```
-### List Engines
-```bash
-cortex engines list [options]
-```
-### Set an Engine Config
-```bash
-cortex engines set
-```
-### Show Model Information
-```bash
-cortex ps
-```
+
 ## REST API
 Cortex has a REST API that runs at `localhost:1337`.

diff --git a/platform/README.md b/platform/README.md
index e3c63dfa4..77b8605f1 100644
--- a/platform/README.md
+++ b/platform/README.md
@@ -4,184 +4,210 @@

- Documentation - API Reference + Documentation - API Reference - Changelog - Bug reports - Discord

-> ⚠️ **Cortex Platform is under development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** +> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** ## About -Cortex Platform is a fully database-driven application built on top of [`cortex.cpp`](https://github.com/janhq/cortex.cpp), designed as an OpenAI API equivalent. It supports multiple engines, multi-user functionality, and operates entirely through stateful API endpoints. +Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library. -## Cortex Engines -Cortex Platform supports the following engines: -- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU. -- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs. -- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference. +Cortex supports the following engines: +- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) +- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx) +- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation - -> **Note**: -> To install the Cortex Platform, clone our [repository](). It includes everything you need for installation using Docker and Helm. - -### Docker +### MacOs +```bash +brew install cortex.cpp +``` +### Windows ```bash -docker compose up +winget install cortex.cpp ``` +### Linux +```bash +sudo apt install cortex.cpp +``` +### Docker +**Coming Soon!** + +### Libraries +- [cortex.js](https://github.com/janhq/cortex.js) +- [cortex.py](https://github.com/janhq/cortex-python) + +### Build from Source + +To install Cortex from the source, follow the steps below: + +1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev). +2. Navigate to the `platform` folder. +3. Open the terminal and run the following command to build the Cortex project: -### Helm ```bash -helm install cortex-platform +npx nest build ``` -### Yarn +4. Make the `command.js` executable: + ```bash -yarn install cortex-platform +chmod +x '[path-to]/cortex/platform/dist/src/command.js' ``` -### Libraries -- [cortex.js]() -- [cortex.py]() +5. Link the package globally: +```bash +npm link +``` -### Build from Source -**Coming Soon!** ## Quickstart -**Coming Soon!** +To run and chat with a model in Cortex: +```bash +# Start the Cortex server +cortex +# Start a model +cortex run [model_id] +``` ## Model Library -Cortex Platform supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). +Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). 
Here are example of models that you can use based on each supported engine: ### `llama.cpp` -| Model ID | Variant (Branch) | Model size | -|------------------|------------------|-------------------| -| codestral | 22b-gguf | 22B | -| command-r | 35b-gguf | 35B | -| gemma | 7b-gguf | 7B | -| llama3 | gguf | 8B | -| llama3.1 | gguf | 8B | -| mistral | 7b-gguf | 7B | -| mixtral | 7x8b-gguf | 46.7B | -| openhermes-2.5 | 7b-gguf | 7B | -| phi3 | medium-gguf | 14B - 4k ctx len | -| phi3 | mini-gguf | 3.82B - 4k ctx len| -| qwen2 | 7b-gguf | 7B | -| tinyllama | 1b-gguf | 1.1B | - +| Model ID | Variant (Branch) | Model size | CLI command | +|------------------|------------------|-------------------|------------------------------------| +| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` | +| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` | +| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` | +| llama3 | gguf | 8B | `cortex run llama3:gguf` | +| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` | +| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` | +| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` | +| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`| +| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` | +| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` | +| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` | +| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` | ### `ONNX` -| Model ID | Variant (Branch) | Model size | -|------------------|------------------|-------------------| -| gemma | 7b-onnx | 7B | -| llama3 | onnx | 8B | -| mistral | 7b-onnx | 7B | -| openhermes-2.5 | 7b-onnx | 7B | -| phi3 | mini-onnx | 3.82B - 4k ctx len| -| phi3 | medium-onnx | 14B - 4k ctx len | - +| Model ID | Variant (Branch) | Model size | CLI command | +|------------------|------------------|-------------------|------------------------------------| +| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` | +| llama3 | onnx | 8B | `cortex run llama3:onnx` | +| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` | +| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`| +| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` | +| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` | ### `TensorRT-LLM` -| Model ID | Variant (Branch) | Model size | -|------------------|-------------------------------|-------------------| -| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | -| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | -| llama3 | 8b-tensorrt-llm-linux-ada | 8B | -| llama3 | 8b-tensorrt-llm-windows-ada | 8B | -| mistral | 7b-tensorrt-llm-linux-ampere | 7B | -| mistral | 7b-tensorrt-llm-windows-ampere | 7B | -| mistral | 7b-tensorrt-llm-linux-ada | 7B | -| mistral | 7b-tensorrt-llm-windows-ada | 7B | -| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | -| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | -| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | +| Model ID | Variant (Branch) | Model size | CLI command | +|------------------|-------------------------------|-------------------|------------------------------------| +| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ampere` | +| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` | +| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada`| +| llama3 | 8b-tensorrt-llm-windows-ada | 8B 
| `cortex run llama3:8b-tensorrt-llm-windows-ada` | +| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`| +| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` | +| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`| +| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` | +| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`| +| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`| +| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`| > **Note**: > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. -## Cortex Platfrom API -Cortex Platform only support the following stateful API endpoints: +## Cortex CLI Commands + +| Command Description | Command Example | +|------------------------------------|---------------------------------------------------------------------| +| **Start Cortex Server** | `cortex` | +| **Chat with a Model** | `cortex chat [options] [model_id] [message]` | +| **Embeddings** | `cortex embeddings [options] [model_id] [message]` | +| **Pull a Model** | `cortex pull ` | +| **Download and Start a Model** | `cortex run [options] [model_id]:[engine]` | +| **Get Model Details** | `cortex models get ` | +| **List Models** | `cortex models list [options]` | +| **Delete a Model** | `cortex models delete ` | +| **Start a Model** | `cortex models start [model_id]` | +| **Stop a Model** | `cortex models stop ` | +| **Update a Model** | `cortex models update [options] ` | +| **Get Engine Details** | `cortex engines get ` | +| **Install an Engine** | `cortex engines install [options]` | +| **List Engines** | `cortex engines list [options]` | +| **Uninnstall an Engine** | `cortex engines uninstall [options]` | +| **Show Model Information** | `cortex ps` | +| **Update Cortex** | `cortex update [options]` | -- Messages -- Threads -- Assistants +> **Note**: +> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli). + +## REST API +Cortex has a REST API that runs at `localhost:1337`. 
-Here are some examples of the available stateful endpoints: -### Create Message +### Pull a Model ```bash curl --request POST \ - --url http://127.0.0.1:1337/v1/threads/__THREAD_ID__/messages \ - --header 'Content-Type: application/json' \ - --data '{ - "role": "user", - "content": "Tell me a joke" -}' + --url http://localhost:1337/v1/models/{model_id}/pull ``` -### Create Assistant +### Start a Model ```bash curl --request POST \ - --url http://127.0.0.1:1337/v1/assistants \ + --url http://localhost:1337/v1/models/{model_id}/start \ --header 'Content-Type: application/json' \ --data '{ - "id": "jan", - "avatar": "", - "name": "Jan", - "description": "A default assistant that can use all downloaded models", - "model": "", - "instructions": "", - "tools": [], - "metadata": {}, - "top_p": "0.7", - "temperature": "0.7" + "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant", + "stop": [], + "ngl": 4096, + "ctx_len": 4096, + "cpu_threads": 10, + "n_batch": 2048, + "caching_enabled": true, + "grp_attn_n": 1, + "grp_attn_w": 512, + "mlock": false, + "flash_attn": true, + "cache_type": "f16", + "use_mmap": true, + "engine": "cortex.llamacpp" }' ``` -### Create Thread +### Chat with a Model ```bash -curl --request POST \ - --url http://127.0.0.1:1337/v1/threads \ - --header 'Content-Type: application/json' \ - --data '{ - "assistants": [ +curl http://localhost:1337/v1/chat/completions \ +-H "Content-Type: application/json" \ +-d '{ + "model": "", + "messages": [ { - "id": "thread_123", - "avatar": "https://example.com/avatar.png", - "name": "Virtual Helper", - "model": "mistral", - "instructions": "Assist with customer queries and provide information based on the company database.", - "tools": [ - { - "name": "Knowledge Retrieval", - "settings": { - "source": "internal", - "endpoint": "https://api.example.com/knowledge" - } - } - ], - "description": "This assistant helps with customer support by retrieving relevant information.", - "metadata": { - "department": "support", - "version": "1.0" - }, - "object": "assistant", - "temperature": 0.7, - "top_p": 0.9, - "created_at": 1622470423, - "response_format": { - "format": "json" - }, - "tool_resources": { - "resources": [ - "database1", - "database2" - ] - } - } - ] + "role": "user", + "content": "Hello" + }, + ], + "model": "mistral", + "stream": true, + "max_tokens": 1, + "stop": [ + null + ], + "frequency_penalty": 1, + "presence_penalty": 1, + "temperature": 1, + "top_p": 1 }' ``` -> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available stateful endpoints. +### Stop a Model +```bash +curl --request POST \ + --url http://localhost:1337/v1/models/mistral/stop +``` + + +> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints. ## Contact Support - For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose). From c4863ff28463850d74141493cb0129327471cb4b Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 14:49:50 +0700 Subject: [PATCH 10/21] Update the CortexCPP readme --- README.md | 61 +++++++++++++++++++++++++------------------------------ 1 file changed, 28 insertions(+), 33 deletions(-) diff --git a/README.md b/README.md index 77b8605f1..279927750 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# Cortex +# CortexCPP

cortex-cpplogo

@@ -8,31 +8,22 @@ - Changelog - Bug reports - Discord

-> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.**
+> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.**

 ## About
-Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library.
+CortexCPP is a C++ AI engine featuring a Docker-like command-line interface and client libraries. It can run as a standalone server or be embedded as a library, allowing you to run AI locally on your computer.

-Cortex supports the following engines:
+CortexCPP supports the following engines:
 - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
 - [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx)
 - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)

 ## Installation
-### MacOs
-```bash
-brew install cortex.cpp
-```
-### Windows
-```bash
-winget install cortex.cpp
-```
-### Linux
-```bash
-sudo apt install cortex.cpp
-```
-### Docker
-**Coming Soon!**
+To install CortexCPP, download the installer for your operating system from the following options:
+- Stable Version
+- Beta Version
+- Nightly Version

 ### Libraries
 - [cortex.js](https://github.com/janhq/cortex.js)
 - [cortex.py](https://github.com/janhq/cortex-python)

 ### Build from Source

-To install Cortex from the source, follow the steps below:
+To install CortexCPP from the source, follow the steps below:

-1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev).
-2. Navigate to the `platform` folder.
-3. Open the terminal and run the following command to build the Cortex project:
+1. Clone the CortexCPP repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:

 ```bash
-npx nest build
+cd vcpkg
+./bootstrap-vcpkg.bat
+vcpkg install
 ```
-
-4. Make the `command.js` executable:
+4. Use Visual Studio with the C++ development kit to build the project using the files generated in the `vcpkg` folder.
+5. Build the CortexCPP inside the `engine` folder:

 ```bash
-chmod +x '[path-to]/cortex/platform/dist/src/command.js'
+mkdir build
+cd build
+cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
 ```

 5. Link the package globally:
@@ -64,16 +59,16 @@
 npm link
 ```

 ## Quickstart
-To run and chat with a model in Cortex:
+To run and chat with a model in CortexCPP:
 ```bash
-# Start the Cortex server
+# Start the CortexCPP server
 cortex

 # Start a model
 cortex run [model_id]
 ```
 ## Model Library
-Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
+CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).

 Here are examples of models that you can use based on each supported engine:
 ### `llama.cpp`
@@ -118,11 +118,11 @@ Here are examples of models that you can use based on each supported engine:
 > **Note**:
 > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
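The build-from-source steps in the patch above bootstrap vcpkg with the Windows batch script (`bootstrap-vcpkg.bat`). On Linux or macOS, vcpkg ships an equivalent shell script; a minimal sketch, assuming the same `engine/vcpkg` checkout layout:

```bash
# From the engine folder: bootstrap vcpkg, then install the project's dependencies
cd vcpkg
./bootstrap-vcpkg.sh   # POSIX counterpart of bootstrap-vcpkg.bat
./vcpkg install
```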
-## Cortex CLI Commands +## CortexCPP CLI Commands | Command Description | Command Example | |------------------------------------|---------------------------------------------------------------------| -| **Start Cortex Server** | `cortex` | +| **Start CortexCPP Server** | `cortex` | | **Chat with a Model** | `cortex chat [options] [model_id] [message]` | | **Embeddings** | `cortex embeddings [options] [model_id] [message]` | | **Pull a Model** | `cortex pull ` | @@ -138,13 +133,13 @@ Here are example of models that you can use based on each supported engine: | **List Engines** | `cortex engines list [options]` | | **Uninnstall an Engine** | `cortex engines uninstall [options]` | | **Show Model Information** | `cortex ps` | -| **Update Cortex** | `cortex update [options]` | +| **Update CortexCPP** | `cortex update [options]` | > **Note**: > For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli). ## REST API -Cortex has a REST API that runs at `localhost:1337`. +CortexCPP has a REST API that runs at `localhost:3928`. ### Pull a Model ```bash From 8671572ba5a8016304344fdcc39d8a77986daa6b Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 14:55:42 +0700 Subject: [PATCH 11/21] Update Overview and Remove Cortex Platform readme --- README.md | 4 +- platform/README.md | 217 --------------------------------------------- 2 files changed, 2 insertions(+), 219 deletions(-) delete mode 100644 platform/README.md diff --git a/README.md b/README.md index 279927750..995ad62ce 100644 --- a/README.md +++ b/README.md @@ -11,7 +11,7 @@ > ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.** ## About -CortexCPP is a C++ AI engine featuring a Docker-like command-line interface and client libraries. It can run as a standalone server or be embedded as a library, allowing you to run AI locally on your computer. +CortexCPP is a Local AI engine that is used to run and customize LLMs. CortexCPP can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/). CortexCPP supports the following engines: - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) @@ -67,7 +67,7 @@ cortex # Start a model cortex run [model_id] ``` -## Model Library +## Built-in Model Library CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). Here are example of models that you can use based on each supported engine: diff --git a/platform/README.md b/platform/README.md deleted file mode 100644 index 77b8605f1..000000000 --- a/platform/README.md +++ /dev/null @@ -1,217 +0,0 @@ -# Cortex -

- cortex-cpplogo -

- -

- Documentation - API Reference - - Changelog - Bug reports - Discord -

- -> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of Cortex Platform, which may not yet be fully implemented in the codebase.** - -## About -Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. Cortex can function as a standalone server or be integrated as a library. - -Cortex supports the following engines: -- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) -- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx) -- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) - -## Installation -### MacOs -```bash -brew install cortex.cpp -``` -### Windows -```bash -winget install cortex.cpp -``` -### Linux -```bash -sudo apt install cortex.cpp -``` -### Docker -**Coming Soon!** - -### Libraries -- [cortex.js](https://github.com/janhq/cortex.js) -- [cortex.py](https://github.com/janhq/cortex-python) - -### Build from Source - -To install Cortex from the source, follow the steps below: - -1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev). -2. Navigate to the `platform` folder. -3. Open the terminal and run the following command to build the Cortex project: - -```bash -npx nest build -``` - -4. Make the `command.js` executable: - -```bash -chmod +x '[path-to]/cortex/platform/dist/src/command.js' -``` - -5. Link the package globally: - -```bash -npm link -``` - - -## Quickstart -To run and chat with a model in Cortex: -```bash -# Start the Cortex server -cortex - -# Start a model -cortex run [model_id] -``` -## Model Library -Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). - -Here are example of models that you can use based on each supported engine: -### `llama.cpp` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|------------------|-------------------|------------------------------------| -| codestral | 22b-gguf | 22B | `cortex run codestral:22b-gguf` | -| command-r | 35b-gguf | 35B | `cortex run command-r:35b-gguf` | -| gemma | 7b-gguf | 7B | `cortex run gemma:7b-gguf` | -| llama3 | gguf | 8B | `cortex run llama3:gguf` | -| llama3.1 | gguf | 8B | `cortex run llama3.1:gguf` | -| mistral | 7b-gguf | 7B | `cortex run mistral:7b-gguf` | -| mixtral | 7x8b-gguf | 46.7B | `cortex run mixtral:7x8b-gguf` | -| openhermes-2.5 | 7b-gguf | 7B | `cortex run openhermes-2.5:7b-gguf`| -| phi3 | medium-gguf | 14B - 4k ctx len | `cortex run phi3:medium-gguf` | -| phi3 | mini-gguf | 3.82B - 4k ctx len| `cortex run phi3:mini-gguf` | -| qwen2 | 7b-gguf | 7B | `cortex run qwen2:7b-gguf` | -| tinyllama | 1b-gguf | 1.1B | `cortex run tinyllama:1b-gguf` | -### `ONNX` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|------------------|-------------------|------------------------------------| -| gemma | 7b-onnx | 7B | `cortex run gemma:7b-onnx` | -| llama3 | onnx | 8B | `cortex run llama3:onnx` | -| mistral | 7b-onnx | 7B | `cortex run mistral:7b-onnx` | -| openhermes-2.5 | 7b-onnx | 7B | `cortex run openhermes-2.5:7b-onnx`| -| phi3 | mini-onnx | 3.82B - 4k ctx len| `cortex run phi3:mini-onnx` | -| phi3 | medium-onnx | 14B - 4k ctx len | `cortex run phi3:medium-onnx` | -### `TensorRT-LLM` -| Model ID | Variant (Branch) | Model size | CLI command | -|------------------|-------------------------------|-------------------|------------------------------------| -| llama3 | 8b-tensorrt-llm-windows-ampere | 8B | `cortex run 
llama3:8b-tensorrt-llm-windows-ampere` | -| llama3 | 8b-tensorrt-llm-linux-ampere | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ampere` | -| llama3 | 8b-tensorrt-llm-linux-ada | 8B | `cortex run llama3:8b-tensorrt-llm-linux-ada`| -| llama3 | 8b-tensorrt-llm-windows-ada | 8B | `cortex run llama3:8b-tensorrt-llm-windows-ada` | -| mistral | 7b-tensorrt-llm-linux-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ampere`| -| mistral | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ampere` | -| mistral | 7b-tensorrt-llm-linux-ada | 7B | `cortex run mistral:7b-tensorrt-llm-linux-ada`| -| mistral | 7b-tensorrt-llm-windows-ada | 7B | `cortex run mistral:7b-tensorrt-llm-windows-ada` | -| openhermes-2.5 | 7b-tensorrt-llm-windows-ampere | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ampere`| -| openhermes-2.5 | 7b-tensorrt-llm-windows-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-windows-ada`| -| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`| - -> **Note**: -> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. - -## Cortex CLI Commands - -| Command Description | Command Example | -|------------------------------------|---------------------------------------------------------------------| -| **Start Cortex Server** | `cortex` | -| **Chat with a Model** | `cortex chat [options] [model_id] [message]` | -| **Embeddings** | `cortex embeddings [options] [model_id] [message]` | -| **Pull a Model** | `cortex pull ` | -| **Download and Start a Model** | `cortex run [options] [model_id]:[engine]` | -| **Get Model Details** | `cortex models get ` | -| **List Models** | `cortex models list [options]` | -| **Delete a Model** | `cortex models delete ` | -| **Start a Model** | `cortex models start [model_id]` | -| **Stop a Model** | `cortex models stop ` | -| **Update a Model** | `cortex models update [options] ` | -| **Get Engine Details** | `cortex engines get ` | -| **Install an Engine** | `cortex engines install [options]` | -| **List Engines** | `cortex engines list [options]` | -| **Uninnstall an Engine** | `cortex engines uninstall [options]` | -| **Show Model Information** | `cortex ps` | -| **Update Cortex** | `cortex update [options]` | - -> **Note**: -> For a more detailed CLI Reference documentation, please see [here](https://cortex.so/docs/cli). - -## REST API -Cortex has a REST API that runs at `localhost:1337`. 
- -### Pull a Model -```bash -curl --request POST \ - --url http://localhost:1337/v1/models/{model_id}/pull -``` - -### Start a Model -```bash -curl --request POST \ - --url http://localhost:1337/v1/models/{model_id}/start \ - --header 'Content-Type: application/json' \ - --data '{ - "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant", - "stop": [], - "ngl": 4096, - "ctx_len": 4096, - "cpu_threads": 10, - "n_batch": 2048, - "caching_enabled": true, - "grp_attn_n": 1, - "grp_attn_w": 512, - "mlock": false, - "flash_attn": true, - "cache_type": "f16", - "use_mmap": true, - "engine": "cortex.llamacpp" -}' -``` - -### Chat with a Model -```bash -curl http://localhost:1337/v1/chat/completions \ --H "Content-Type: application/json" \ --d '{ - "model": "", - "messages": [ - { - "role": "user", - "content": "Hello" - }, - ], - "model": "mistral", - "stream": true, - "max_tokens": 1, - "stop": [ - null - ], - "frequency_penalty": 1, - "presence_penalty": 1, - "temperature": 1, - "top_p": 1 -}' -``` - -### Stop a Model -```bash -curl --request POST \ - --url http://localhost:1337/v1/models/mistral/stop -``` - - -> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints. - -## Contact Support -- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose). -- For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH). -- For long-form inquiries, please email [hello@jan.ai](mailto:hello@jan.ai). - - From af180e485088099ca2649469cf961d10a42a8cc3 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 14:56:40 +0700 Subject: [PATCH 12/21] nits --- README.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/README.md b/README.md index 995ad62ce..1965ce377 100644 --- a/README.md +++ b/README.md @@ -51,12 +51,6 @@ cd build cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static ``` -5. Link the package globally: - -```bash -npm link -``` - ## Quickstart To run and chat with a model in CortexCPP: From f38f6780427394204f48f81666ac6a00f2cd7f4f Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 15:01:27 +0700 Subject: [PATCH 13/21] Separate installer for each version --- README.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/README.md b/README.md index 1965ce377..5c736716c 100644 --- a/README.md +++ b/README.md @@ -21,8 +21,20 @@ CortexCPP supports the following engines: ## Installation To install CortexCPP, download the installer for your operating system from the following options: - Stable Version + - Windows + - Mac + - Linux (Debian) + - Linux (Fedora) - Beta Version + - Windows + - Mac + - Linux (Debian) + - Linux (Fedora) - Nightly Version + - Windows + - Mac + - Linux (Debian) + - Linux (Fedora) ### Libraries From 425f641dddcb154cc5599a94ea59ca6e377a1136 Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 15:03:47 +0700 Subject: [PATCH 14/21] Update the build from source steps --- README.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 5c736716c..e7a034cd8 100644 --- a/README.md +++ b/README.md @@ -54,15 +54,14 @@ cd vcpkg ./bootstrap-vcpkg.bat vcpkg install ``` -4. Use Visual Studio with the C++ development kit to build the project using the files generated in the `vcpkg` folder. -5. Build the CortexCPP inside the `engine` folder: +4. 
Build the CortexCPP inside the `build` folder: ```bash mkdir build cd build cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static ``` - +5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder. ## Quickstart To run and chat with a model in CortexCPP: From 7bcd47ba3b9a34d7ff902593ad43bc5e9207995c Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 15:07:20 +0700 Subject: [PATCH 15/21] nits --- README.md | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md index e7a034cd8..7ab60c9d0 100644 --- a/README.md +++ b/README.md @@ -20,21 +20,21 @@ CortexCPP supports the following engines: ## Installation To install CortexCPP, download the installer for your operating system from the following options: -- Stable Version - - Windows - - Mac - - Linux (Debian) - - Linux (Fedora) -- Beta Version - - Windows - - Mac - - Linux (Debian) - - Linux (Fedora) -- Nightly Version - - Windows - - Mac - - Linux (Debian) - - Linux (Fedora) +- **Stable Version** + - [Windows]() + - [Mac]() + - [Linux (Debian)]() + - [Linux (Fedora)]() +- **Beta Version** + - [Windows]() + - [Mac]() + - [Linux (Debian)]() + - [Linux (Fedora)]() +- **Nightly Version** + - [Windows]() + - [Mac]() + - [Linux (Debian)]() + - [Linux (Fedora)]() ### Libraries From ba71e2bcd94fad62169106ad945c271ebed3b17d Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 15:14:42 +0700 Subject: [PATCH 16/21] nits --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 7ab60c9d0..612824ef7 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ CortexCPP is a Local AI engine that is used to run and customize LLMs. CortexCPP CortexCPP supports the following engines: - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) -- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx) +- [`cortex.onnx`](https://github.com/janhq/cortex.onnx) - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation From fb145214116335e79c448d49595f58586b631acd Mon Sep 17 00:00:00 2001 From: irfanpena Date: Mon, 9 Sep 2024 16:26:19 +0700 Subject: [PATCH 17/21] 1337 -> 3928 --- README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 612824ef7..c34655bdd 100644 --- a/README.md +++ b/README.md @@ -149,13 +149,13 @@ CortexCPP has a REST API that runs at `localhost:3928`. 
### Pull a Model ```bash curl --request POST \ - --url http://localhost:1337/v1/models/{model_id}/pull + --url http://localhost:3928/v1/models/{model_id}/pull ``` ### Start a Model ```bash curl --request POST \ - --url http://localhost:1337/v1/models/{model_id}/start \ + --url http://localhost:3928/v1/models/{model_id}/start \ --header 'Content-Type: application/json' \ --data '{ "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant", @@ -177,7 +177,7 @@ curl --request POST \ ### Chat with a Model ```bash -curl http://localhost:1337/v1/chat/completions \ +curl http://localhost:3928/v1/chat/completions \ -H "Content-Type: application/json" \ -d '{ "model": "", @@ -203,7 +203,7 @@ curl http://localhost:1337/v1/chat/completions \ ### Stop a Model ```bash curl --request POST \ - --url http://localhost:1337/v1/models/mistral/stop + --url http://localhost:3928/v1/models/mistral/stop ``` From ea84b2d1ebd5af6aa1f427d53193d48f4b855b2e Mon Sep 17 00:00:00 2001 From: irfanpena Date: Tue, 10 Sep 2024 11:34:10 +0700 Subject: [PATCH 18/21] CortexCPP -> Cortex.cpp --- README.md | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md index c34655bdd..be63f2a1d 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# CortexCPP +# Cortex.cpp

cortex-cpplogo

@@ -8,18 +8,18 @@ - Changelog - Bug reports - Discord

-> ⚠️ **CortexCPP is currently in Development. This documentation outlines the intended behavior of CortexCPP, which may not yet be fully implemented in the codebase.** +> ⚠️ **Cortex.cpp is currently in Development. This documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.** ## About -CortexCPP is a Local AI engine that is used to run and customize LLMs. CortexCPP can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/). +Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/). -CortexCPP supports the following engines: +Cortex supports the following engines: - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp) - [`cortex.onnx`](https://github.com/janhq/cortex.onnx) - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation -To install CortexCPP, download the installer for your operating system from the following options: +To install Cortex, download the installer for your operating system from the following options: - **Stable Version** - [Windows]() - [Mac]() @@ -43,9 +43,9 @@ To install CortexCPP, download the installer for your operating system from the ### Build from Source -To install CortexCPP from the source, follow the steps below: +To install Cortex from the source, follow the steps below: -1. Clone the CortexCPP repository [here](https://github.com/janhq/cortex.cpp). +1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp). 2. Navigate to the `engine > vcpkg` folder. 3. Configure the vpkg: @@ -54,7 +54,7 @@ cd vcpkg ./bootstrap-vcpkg.bat vcpkg install ``` -4. Build the CortexCPP inside the `build` folder: +4. Build the Cortex inside the `build` folder: ```bash mkdir build @@ -64,16 +64,16 @@ cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcp 5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder. ## Quickstart -To run and chat with a model in CortexCPP: +To run and chat with a model in Cortex: ```bash -# Start the CortexCPP server +# Start the Cortex server cortex # Start a model cortex run [model_id] ``` ## Built-in Model Library -CortexCPP supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). +Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso). Here are example of models that you can use based on each supported engine: ### `llama.cpp` @@ -118,11 +118,11 @@ Here are example of models that you can use based on each supported engine: > **Note**: > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models. 
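Putting the Quickstart and Model Library sections together, a minimal end-to-end session might look like this; the model id `mistral` is only an illustration, and any model id from Cortex Hub should work the same way:

```bash
# Download a model from Cortex Hub, then start chatting with it
cortex pull mistral
cortex run mistral
```
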
-## CortexCPP CLI Commands
+## Cortex.cpp CLI Commands
 
 | Command Description                 | Command Example                                                       |
 |------------------------------------|---------------------------------------------------------------------|
-| **Start CortexCPP Server**              | `cortex`                                              |
+| **Start Cortex Server**              | `cortex`                                              |
 | **Chat with a Model**               | `cortex chat [options] [model_id] [message]`                          |
 | **Embeddings**                      | `cortex embeddings [options] [model_id] [message]`                    |
 | **Pull a Model**                    | `cortex pull <model_id>`                                               |
@@ -138,13 +138,13 @@ Here are example of models that you can use based on each supported engine:
 | **List Engines**                    | `cortex engines list [options]`                                       |
 | **Uninstall an Engine**             | `cortex engines uninstall <engine_name> [options]` |
 | **Show Model Information**          | `cortex ps`                                                           |
-| **Update CortexCPP**                    | `cortex update [options]`                   |
+| **Update Cortex.cpp**                    | `cortex update [options]`                   |
 
 > **Note**: 
 > For more detailed CLI reference documentation, please see [here](https://cortex.so/docs/cli).
 
 ## REST API
-CortexCPP has a REST API that runs at `localhost:3928`.
+Cortex.cpp has a REST API that runs at `localhost:3928`.

From 01981b9ea2f6f65a1871c93411883ae3c09dbad1 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Tue, 10 Sep 2024 15:42:58 +0700
Subject: [PATCH 19/21] nits

---
 README.md | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/README.md b/README.md
index be63f2a1d..94e2cbee5 100644
--- a/README.md
+++ b/README.md
@@ -13,28 +13,28 @@
 ## About
 Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
 
-Cortex supports the following engines:
+Cortex.cpp supports the following engines:
 - [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
 - [`cortex.onnx`](https://github.com/janhq/cortex.onnx)
 - [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
 
 ## Installation
-To install Cortex, download the installer for your operating system from the following options:
+To install Cortex.cpp, download the installer for your operating system from the following options:
 - **Stable Version**
-  - [Windows]()
-  - [Mac]()
-  - [Linux (Debian)]()
-  - [Linux (Fedora)]()
+  - [Windows](https://github.com/janhq/cortex.cpp/releases)
+  - [Mac](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
 - **Beta Version**
-  - [Windows]()
-  - [Mac]()
-  - [Linux (Debian)]()
-  - [Linux (Fedora)]()
+  - [Windows](https://github.com/janhq/cortex.cpp/releases)
+  - [Mac](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
 - **Nightly Version**
-  - [Windows]()
-  - [Mac]()
-  - [Linux (Debian)]()
-  - [Linux (Fedora)]()
+  - [Windows](https://github.com/janhq/cortex.cpp/releases)
+  - [Mac](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases)
+  - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases)
 
 ### Libraries
 
### Build from Source

-To install Cortex from the source, follow the steps below:
+To install Cortex.cpp from the source, follow the steps below:

1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. 
Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:

```bash
cd vcpkg
./bootstrap-vcpkg.bat
vcpkg install
```
-4. Build the Cortex inside the `build` folder:
+4. Build Cortex.cpp inside the `build` folder:

```bash
mkdir build
cd build
cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
```
5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
+6. Verify that Cortex.cpp is installed correctly by getting help information.
+
+```sh
+# Get the help information
+cortex -h
+```
## Quickstart
-To run and chat with a model in Cortex:
+To run and chat with a model in Cortex.cpp:
```bash
-# Start the Cortex server
+# Start the Cortex.cpp server
cortex

# Start a model
cortex run [model_id]
```
## Built-in Model Library
Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
Here are examples of models that you can use based on each supported engine:
@@ -122,7 +127,7 @@ Here are example of models that you can use based on each supported engine:
 
 | Command Description                 | Command Example                                                       |
 |------------------------------------|---------------------------------------------------------------------|
-| **Start Cortex Server**              | `cortex`                                              |
+| **Start Cortex.cpp Server**              | `cortex`                                              |
 | **Chat with a Model**               | `cortex chat [options] [model_id] [message]`                          |
 | **Embeddings**                      | `cortex embeddings [options] [model_id] [message]`                    |
 | **Pull a Model**                    | `cortex pull <model_id>`                                               |
@@ -138,7 +143,7 @@ Here are example of models that you can use based on each supported engine:
 | **List Engines**                    | `cortex engines list [options]`                                       |
 | **Uninstall an Engine**             | `cortex engines uninstall <engine_name> [options]` |
 | **Show Model Information**          | `cortex ps`                                                           |
-| **Update Cortex**                    | `cortex update [options]`                   |
+| **Update Cortex.cpp**                    | `cortex update [options]`                   |

From 1d434cf096841197d20dc560834e7d7f899ae529 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Wed, 11 Sep 2024 16:32:57 +0700
Subject: [PATCH 20/21] change the engine names

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 94e2cbee5..44bb3ff9d 100644
--- a/README.md
+++ b/README.md
@@ -14,9 +14,9 @@
 Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/).
 
 Cortex.cpp supports the following engines:
-- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp)
-- [`cortex.onnx`](https://github.com/janhq/cortex.onnx)
-- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
+- [`llamacpp`](https://github.com/janhq/cortex.llamacpp)
+- [`onnx`](https://github.com/janhq/cortex.onnx)
+- [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
@@ -176,7 +176,7 @@ curl --request POST \
 "flash_attn": true,
 "cache_type": "f16",
 "use_mmap": true,
-  "engine": "cortex.llamacpp"
+  "engine": "llamacpp"
 }'
 ```

From a7e1de336dd2dfca98bfb0ae1209ee4925411e73 Mon Sep 17 00:00:00 2001
From: irfanpena
Date: Thu, 12 Sep 2024 12:43:06 +0700
Subject: [PATCH 21/21] Updated per feedbacks, except the PORT

---
 README.md | 247 ++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 201 insertions(+), 46 deletions(-)

diff --git a/README.md b/README.md
index 44bb3ff9d..84d46d84b 100644
--- a/README.md
+++ b/README.md
@@ -3,6 +3,15 @@
 cortex-cpplogo

+

+ + GitHub commit activity + Github Last Commit + Github Contributors + GitHub closed issues + Discord +

+

Documentation - API Reference - Changelog - Bug reports - Discord @@ -13,61 +22,130 @@ ## About Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/). -Cortex.cpp supports the following engines: +Cortex.cpp is a multi-engine that uses `llama.cpp` as the default engine but also supports the following: - [`llamacpp`](https://github.com/janhq/cortex.llamacpp) - [`onnx`](https://github.com/janhq/cortex.onnx) - [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm) ## Installation To install Cortex.cpp, download the installer for your operating system from the following options: -- **Stable Version** - - [Windows](https://github.com/janhq/cortex.cpp/releases) - - [Mac](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases) -- **Beta Version** - - [Windows](https://github.com/janhq/cortex.cpp/releases) - - [Mac](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases) -- **Nightly Version** - - [Windows](https://github.com/janhq/cortex.cpp/releases) - - [Mac](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Debian)](https://github.com/janhq/cortex.cpp/releases) - - [Linux (Fedora)](https://github.com/janhq/cortex.cpp/releases) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Version Type         | Windows       | MacOS              | Linux                             |
+|----------------------|---------------|--------------------|-----------------------------------|
+| Stable (Recommended) | cortexcpp.exe | Intel, M1/M2/M3/M4 | cortexcpp.deb, cortexcpp.AppImage |
+| Beta Build           | cortexcpp.exe | Intel, M1/M2/M3/M4 | cortexcpp.deb, cortexcpp.AppImage |
+| Nightly Build        | cortexcpp.exe | Intel, M1/M2/M3/M4 | cortexcpp.deb, cortexcpp.AppImage |
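On Debian-based Linux, the downloaded package can then be installed from a terminal. This is a generic `apt` invocation rather than a command documented by this README; the file name comes from the table above:

```bash
# Install the downloaded Debian package; apt resolves its dependencies
sudo apt-get install ./cortexcpp.deb
```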

+
+> **Note**: 
+> You can also build Cortex.cpp from source by following the steps [here](#build-from-source).
 
 ### Libraries
 - [cortex.js](https://github.com/janhq/cortex.js)
 - [cortex.py](https://github.com/janhq/cortex-python)
 
-### Build from Source
-
-To install Cortex.cpp from the source, follow the steps below:
-
-1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
-2. Navigate to the `engine > vcpkg` folder.
-3. Configure the vpkg:
-
-```bash
-cd vcpkg
-./bootstrap-vcpkg.bat
-vcpkg install
-```
-4. Build the Cortex.cpp inside the `build` folder:
-
-```bash
-mkdir build
-cd build
-cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
-```
-5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
-6. Verify that Cortex.cpp is installed correctly by getting help information.
-
-```sh
-# Get the help information
-cortex -h
-```
 ## Quickstart
 To run and chat with a model in Cortex.cpp:
 ```bash
@@ -75,7 +153,7 @@ To run and chat with a model in Cortex.cpp:
 cortex
 
 # Start a model
-cortex run [model_id]
+cortex run <model_id>:[engine_name]
 ```
 ## Built-in Model Library
 Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
@@ -145,7 +223,7 @@ Here are example of models that you can use based on each supported engine:
 | **Show Model Information**          | `cortex ps`                                                           |
 | **Update Cortex.cpp**               | `cortex update [options]`                   |
 
-> **Note**: 
+> **Note**
 > For more detailed CLI reference documentation, please see [here](https://cortex.so/docs/cli).
 
 ## REST API
 Cortex.cpp has a REST API that runs at `localhost:3928`.
 
 ### Stop a Model
 ```bash
 curl --request POST \
   --url http://localhost:3928/v1/models/mistral/stop
 ```
 
-> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.
+> **Note**
+> Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.

## Build from Source

### Windows
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:

```bash
cd vcpkg
./bootstrap-vcpkg.bat
vcpkg install
```
4. Build Cortex.cpp inside the `build` folder:

```bash
mkdir build
cd build
cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
```
5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
6. Verify that Cortex.cpp is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```
### MacOS
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:

```bash
cd vcpkg
./bootstrap-vcpkg.sh
vcpkg install
```
4. Build Cortex.cpp inside the `build` folder; the `make` step performs the build on MacOS:

```bash
mkdir build
cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake
make -j4
```
5. Verify that Cortex.cpp is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```
### Linux
1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
2. 
Navigate to the `engine > vcpkg` folder.
3. Configure vcpkg:

```bash
cd vcpkg
./bootstrap-vcpkg.sh
vcpkg install
```
4. Build Cortex.cpp inside the `build` folder; the `make` step performs the build on Linux:

```bash
mkdir build
cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake
make -j4
```
5. Verify that Cortex.cpp is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```
## Contact Support
- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose).