diff --git a/README.md b/README.md
index 6b21c4448..84d46d84b 100644
--- a/README.md
+++ b/README.md
@@ -1,83 +1,162 @@
-# Cortex
+# Cortex.cpp

cortex-cpplogo

+
+GitHub commit activity · GitHub Last Commit · GitHub Contributors · GitHub closed issues · Discord
+

Documentation - API Reference - Changelog - Bug reports - Discord

-> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!
+> ⚠️ **Cortex.cpp is currently in development**: This documentation describes the intended behavior of Cortex, which may not yet be fully implemented in the codebase.

## About
-Cortex is a C++ AI engine that comes with a Docker-like command-line interface and client libraries. It supports running AI models using `ONNX`, `TensorRT-LLM`, and `llama.cpp` engines. Cortex can function as a standalone server or be integrated as a library.
+Cortex.cpp is a local AI engine for running and customizing LLMs. Cortex can be deployed as a standalone server or integrated into apps like [Jan.ai](https://jan.ai/).

-## Cortex Engines
-Cortex supports the following engines:
-- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU.
-- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs.
-- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA’s TensorRT-LLM for GPU-accelerated inference.
+Cortex.cpp is multi-engine: it uses `llama.cpp` as the default engine and also supports the following:
+- [`llamacpp`](https://github.com/janhq/cortex.llamacpp)
+- [`onnx`](https://github.com/janhq/cortex.onnx)
+- [`tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm)
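For instance, you can list the installed engines and pick one explicitly when starting a model. This is a minimal sketch built from the CLI commands documented below; the model id `tinyllama` is an assumption about what is available on Cortex Hub:

```bash
# List the engines this installation knows about (llamacpp, onnx, tensorrt-llm)
cortex engines list

# Run a model on the default llama.cpp engine (assumed model id)
cortex run tinyllama

# Run the same model on an explicitly selected engine
cortex run tinyllama:onnx
```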
## Installation
-### MacOs
-```bash
-brew install cortex-engine
-```
-### Windows
-```bash
-winget install cortex-engine
-```
-### Linux
-```bash
-sudo apt install cortex-engine
-```
-### Docker
-**Coming Soon!**
+To install Cortex.cpp, download the installer for your operating system from the following options:

-### Libraries
-- [cortex.js](https://github.com/janhq/cortex.js)
-- [cortex.py](https://github.com/janhq/cortex-python)
-
-### Build from Source
-
-To install Cortex from the source, follow the steps below:
-
-1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev).
-2. Navigate to the `platform` folder.
-3. Open the terminal and run the following command to build the Cortex project:
-
-```bash
-npx nest build
-```
-
-4. Make the `command.js` executable:

+| Version Type         | Windows       | MacOS               | Linux                              |
+|----------------------|---------------|---------------------|------------------------------------|
+| Stable (Recommended) | cortexcpp.exe | Intel, M1/M2/M3/M4  | cortexcpp.deb, cortexcpp.AppImage  |
+| Beta Build           | cortexcpp.exe | Intel, M1/M2/M3/M4  | cortexcpp.deb, cortexcpp.AppImage  |
+| Nightly Build        | cortexcpp.exe | Intel, M1/M2/M3/M4  | cortexcpp.deb, cortexcpp.AppImage  |
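As an illustration, you can download a model from the built-in library before running it. This is a sketch; the model id `mistral` is an assumption about what is published on Cortex Hub:

```bash
# Download model weights from Cortex Hub (assumed model id)
cortex pull mistral

# Confirm the model is now available locally
cortex models list
```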
-```bash
-chmod +x '[path-to]/cortex/platform/dist/src/command.js'
-```
+> **Note**:
+> You can also build Cortex.cpp from source by following the steps [here](#build-from-source).

-5. Link the package globally:
-
-```bash
-npm link
-```
+### Libraries
+- [cortex.js](https://github.com/janhq/cortex.js)
+- [cortex.py](https://github.com/janhq/cortex-python)

## Quickstart
-To run and chat with a model in Cortex:
+To run and chat with a model in Cortex.cpp:

```bash
-# Start the Cortex server
+# Start the Cortex.cpp server
cortex

# Start a model
-cortex run [model_id]
-
-# Chat with a model
-cortex chat [model_id]
+cortex run <model_id>:[engine_name]
```

-## Model Library
-Cortex supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).
+## Built-in Model Library
+Cortex.cpp supports a list of models available on [Cortex Hub](https://huggingface.co/cortexso).

Here are examples of models that you can use based on each supported engine:

### `llama.cpp`
@@ -122,87 +201,44 @@ Here are example of models that you can use based on each supported engine:

> **Note**:
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
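To tie the table together, a typical model lifecycle from the CLI might look like the following sketch. It uses only commands from the table above; the model id `mistral` is an assumption:

```bash
cortex pull mistral          # download the model
cortex models start mistral  # load it into memory
cortex ps                    # show information about running models
cortex models stop mistral   # unload it again
```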
-## Cortex CLI Commands
-> **Note**:
+## Cortex.cpp CLI Commands
+
+| Command Description            | Command Example                                    |
+|--------------------------------|----------------------------------------------------|
+| **Start Cortex.cpp Server**    | `cortex`                                           |
+| **Chat with a Model**          | `cortex chat [options] [model_id] [message]`       |
+| **Embeddings**                 | `cortex embeddings [options] [model_id] [message]` |
+| **Pull a Model**               | `cortex pull <model_id>`                           |
+| **Download and Start a Model** | `cortex run [options] [model_id]:[engine]`         |
+| **Get Model Details**          | `cortex models get <model_id>`                     |
+| **List Models**                | `cortex models list [options]`                     |
+| **Delete a Model**             | `cortex models delete <model_id>`                  |
+| **Start a Model**              | `cortex models start [model_id]`                   |
+| **Stop a Model**               | `cortex models stop <model_id>`                    |
+| **Update a Model**             | `cortex models update [options] <model_id>`        |
+| **Get Engine Details**         | `cortex engines get <engine_name>`                 |
+| **Install an Engine**          | `cortex engines install [options]`                 |
+| **List Engines**               | `cortex engines list [options]`                    |
+| **Uninstall an Engine**        | `cortex engines uninstall [options]`               |
+| **Show Model Information**     | `cortex ps`                                        |
+| **Update Cortex.cpp**          | `cortex update [options]`                          |
+
+> **Note**
+> For more detailed CLI reference documentation, see [here](https://cortex.so/docs/cli).
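-### Start Cortex Server
-```bash
-cortex
-```
-### Chat with a Model
-```bash
-cortex chat [options] [model_id] [message]
-```
-### Embeddings
-```bash
-cortex embeddings [options] [model_id] [message]
-```
-### Pull a Model
-```bash
-cortex pull <model_id>
-```
-> This command can also pulls Hugging Face's models.
-### Download and Start a Model
-```bash
-cortex run [options] [model_id]:[engine]
-```
-### Get a Model Details
-```bash
-cortex models get <model_id>
-```
-### List Models
-```bash
-cortex models list [options]
-```
-### Remove a Model
-```bash
-cortex models remove <model_id>
-```
-### Start a Model
-```bash
-cortex models start [model_id]
-```
-### Stop a Model
-```bash
-cortex models stop <model_id>
-```
-### Update a Model Config
-```bash
-cortex models update [options] <model_id>
-```
-### Get an Engine Details
-```bash
-cortex engines get <engine_name>
-```
-### Install an Engine
-```bash
-cortex engines install [options]
-```
-### List Engines
-```bash
-cortex engines list [options]
-```
-### Set an Engine Config
-```bash
-cortex engines set
-```
-### Show Model Information
-```bash
-cortex ps
-```
+

## REST API
-Cortex has a REST API that runs at `localhost:1337`.
+Cortex.cpp has a REST API that runs at `localhost:3928`.

### Pull a Model
```bash
curl --request POST \
-  --url http://localhost:1337/v1/models/{model_id}/pull
+  --url http://localhost:3928/v1/models/{model_id}/pull
```

### Start a Model
```bash
curl --request POST \
-  --url http://localhost:1337/v1/models/{model_id}/start \
+  --url http://localhost:3928/v1/models/{model_id}/start \
  --header 'Content-Type: application/json' \
  --data '{
  "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant",
@@ -218,13 +254,13 @@ curl --request POST \
  "flash_attn": true,
  "cache_type": "f16",
  "use_mmap": true,
-  "engine": "cortex.llamacpp"
+  "engine": "llamacpp"
}'
```

### Chat with a Model
```bash
-curl http://localhost:1337/v1/chat/completions \
+curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "",
@@ -250,11 +286,88 @@ curl http://localhost:1337/v1/chat/completions \

### Stop a Model
```bash
curl --request POST \
-  --url http://localhost:1337/v1/models/mistral/stop
+  --url http://localhost:3928/v1/models/mistral/stop
```

+> **Note**
+> Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.
+

Putting the endpoints above together, a minimal scripted session against a locally running server might look like this. This is a sketch: the model id `mistral` follows the stop example above, starting a model with only the `engine` field (rather than the full option set shown earlier) is an assumption, and the `messages` payload assumes the OpenAI-style chat format:

```bash
BASE=http://localhost:3928/v1

# Download, start, query, and then stop a model
curl --request POST --url "$BASE/models/mistral/pull"

curl --request POST --url "$BASE/models/mistral/start" \
  --header 'Content-Type: application/json' \
  --data '{"engine": "llamacpp"}'

curl --request POST --url "$BASE/chat/completions" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "mistral",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

curl --request POST --url "$BASE/models/mistral/stop"
```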
+## Build from Source
+
+### Windows
+1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:

-> **Note**: Check our [API documentation](https://cortex.so/api-reference) for a full list of available endpoints.

+```bash
+cd vcpkg
+./bootstrap-vcpkg.bat
+vcpkg install
+```
+4. Build Cortex.cpp inside the `build` folder:
+
+```bash
+mkdir build
+cd build
+cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
+```
+5. Use Visual Studio with the C++ development kit to build the project using the files generated in the `build` folder.
+6. Verify that Cortex.cpp is installed correctly by getting help information.
+
+```sh
+# Get the help information
+cortex -h
+```
+
+### MacOS
+1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:
+
+```bash
+cd vcpkg
+./bootstrap-vcpkg.sh
+vcpkg install
+```
+4. Build Cortex.cpp inside the `build` folder:
+
+```bash
+mkdir build
+cd build
+cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake
+make -j4
+```
+5. Verify that Cortex.cpp is installed correctly by getting help information.
+
+```sh
+# Get the help information
+cortex -h
+```
+
+### Linux
+1. Clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp).
+2. Navigate to the `engine > vcpkg` folder.
+3. Configure vcpkg:
+
+```bash
+cd vcpkg
+./bootstrap-vcpkg.sh
+vcpkg install
+```
+4. Build Cortex.cpp inside the `build` folder:
+
+```bash
+mkdir build
+cd build
+cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder/vcpkg/scripts/buildsystems/vcpkg.cmake
+make -j4
+```
+5. Verify that Cortex.cpp is installed correctly by getting help information.
+
+```sh
+# Get the help information
+cortex -h
+```

## Contact Support
- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose).

diff --git a/platform/README.md b/platform/README.md
deleted file mode 100644
index 660664159..000000000
--- a/platform/README.md
+++ /dev/null
@@ -1,142 +0,0 @@
-# Cortex
-

-cortex-cpplogo
-
-Documentation - API Reference - Changelog - Bug reports - Discord

-
-> ⚠️ **Cortex is currently in Development**: Expect breaking changes and bugs!
-
-## About
-Cortex is an OpenAI-compatible AI engine that developers can use to build LLM apps. It is packaged with a Docker-inspired command-line interface and client libraries. It can be used as a standalone server or imported as a library.
-
-## Cortex Engines
-Cortex supports the following engines:
-- [`cortex.llamacpp`](https://github.com/janhq/cortex.llamacpp): `cortex.llamacpp` library is a C++ inference tool that can be dynamically loaded by any server at runtime. We use this engine to support GGUF inference with GGUF models. The `llama.cpp` is optimized for performance on both CPU and GPU.
-- [`cortex.onnx` Repository](https://github.com/janhq/cortex.onnx): `cortex.onnx` is a C++ inference library for Windows that leverages `onnxruntime-genai` and uses DirectML to provide GPU acceleration across a wide range of hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs.
-- [`cortex.tensorrt-llm`](https://github.com/janhq/cortex.tensorrt-llm): `cortex.tensorrt-llm` is a C++ inference library designed for NVIDIA GPUs. It incorporates NVIDIA's TensorRT-LLM for GPU-accelerated inference.
-
-## Quicklinks
-
-- [Homepage](https://cortex.so/)
-- [Docs](https://cortex.so/docs/)
-
-## Quickstart
-### Prerequisites
-- **OS**:
-  - MacOSX 13.6 or higher.
-  - Windows 10 or higher.
-  - Ubuntu 22.04 and later.
-- **Dependencies**:
-  - **Node.js**: Version 18 and above is required to run the installation.
-  - **NPM**: Needed to manage packages.
-  - **CPU Instruction Sets**: Available for download from the [Cortex GitHub Releases](https://github.com/janhq/cortex/releases) page.
-  - **OpenMPI**: Required for Linux. Install by using the following command:
-    ```bash
-    sudo apt install openmpi-bin libopenmpi-dev
-    ```
-
-> Visit [Quickstart](https://cortex.so/docs/quickstart) to get started.
-
-### NPM
-``` bash
-# Install using NPM
-npm i -g cortexso
-# Run model
-cortex run mistral
-# To uninstall globally using NPM
-npm uninstall -g cortexso
-```
-
-### Homebrew
-``` bash
-# Install using Brew
-brew install cortexso
-# Run model
-cortex run mistral
-# To uninstall using Brew
-brew uninstall cortexso
-```
-> You can also install Cortex using the Cortex Installer available on [GitHub Releases](https://github.com/janhq/cortex/releases).
-
-## Cortex Server
-```bash
-cortex serve
-
-# Output
-# Started server at http://localhost:1337
-# Swagger UI available at http://localhost:1337/api
-```
-
-You can now access the Cortex API server at `http://localhost:1337`,
-and the Swagger UI at `http://localhost:1337/api`.
-
-## Build from Source
-
-To install Cortex from the source, follow the steps below:
-
-1. Clone the Cortex repository [here](https://github.com/janhq/cortex/tree/dev).
-2. Navigate to the `cortex-js` folder.
-3. Open the terminal and run the following command to build the Cortex project:
-
-```bash
-npx nest build
-```
-
-4. Make the `command.js` executable:
-
-```bash
-chmod +x '[path-to]/cortex/cortex-js/dist/src/command.js'
-```
-
-5. Link the package globally:
-
-```bash
-npm link
-```
-
-## Cortex CLI Commands
-
-The following CLI commands are currently available.
-See [CLI Reference Docs](https://cortex.so/docs/cli) for more information.
-
-```bash
-
-  serve              Providing API endpoint for Cortex backend.
-  chat               Send a chat request to a model.
-  init|setup         Init settings and download cortex's dependencies.
-  ps                 Show running models and their status.
-  kill               Kill running cortex processes.
-  pull|download      Download a model. Working with HuggingFace model id.
-  run [options]      EXPERIMENTAL: Shortcut to start a model and chat.
-  models             Subcommands for managing models.
-  models list        List all available models.
-  models pull        Download a specified model.
-  models remove      Delete a specified model.
-  models get         Retrieve the configuration of a specified model.
-  models start       Start a specified model.
-  models stop        Stop a specified model.
-  models update      Update the configuration of a specified model.
-  benchmark          Benchmark and analyze the performance of a specific AI model using your system.
-  presets            Show all the available model presets within Cortex.
-  telemetry          Retrieve telemetry logs for monitoring and analysis.
-  embeddings         Creates an embedding vector representing the input text.
-  engines            Subcommands for managing engines.
-  engines get        Get an engine details.
-  engines list       Get all the available Cortex engines.
-  engines init       Setup and download the required dependencies to run cortex engines.
-  configs            Subcommands for managing configurations.
-  configs get        Get a configuration details.
-  configs list       Get all the available configurations.
-  configs set        Set a configuration.
-```
-
-## Contact Support
-- For support, please file a [GitHub ticket](https://github.com/janhq/cortex/issues/new/choose).
-- For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH).
-- For long-form inquiries, please email [hello@jan.ai](mailto:hello@jan.ai).
-