feat(grammar): add llama3.1 schema #3015

mudler · 2024-07-26T15:14:35Z

Description

This PR allows LocalAI to generate BNF rules that constrain the LLM output to valid JSON in the llama3.1 format. It also includes a series of refactorings that make it easier to extend the grammar generator to other schemas, each producing its own syntax.

The new schema forces the LLM output into this format: <function=example_function_name>{{"example_name": "example_value"}}</function>, which is the common format for Llama 3.1 function calling.

How it works: to enable this behavior, set schema_type to llama3.1, for instance:

function:
  grammar:
    schema_type: llama3.1 # or JSON is supported too (json)

This forces the LLM to always return function (tool) calls in the llama3.1 format.

The behavior is preserved with mixed grammars as well. I tested with the following configuration:
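To exercise a model configured this way, a request with a tool definition can be sent to the OpenAI-compatible chat endpoint. The following is a minimal sketch, not part of this PR: the `get_weather` tool, the base URL `http://localhost:8080`, and the helper names are assumptions for illustration, and the model name should match the `name` field of your own model config.

```python
import json
import urllib.request


def build_tool_call_request(model: str, question: str) -> dict:
    """Build an OpenAI-style chat completion payload carrying one tool."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [
            {
                "type": "function",
                "function": {
                    # Hypothetical tool, used only for illustration.
                    "name": "get_weather",
                    "description": "Get the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }


def send(payload: dict, base_url: str = "http://localhost:8080") -> dict:
    """POST the payload to the chat completions endpoint (assumed base URL)."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Building the payload needs no running server; send() does.
payload = build_tool_call_request(
    "meta-llama-3.1-8b-instruct", "What is the weather in Rome?"
)
```

Calling `send(payload)` against a running instance should return a regular chat completion response; with the grammar enabled, any function call in it is expected to originate from output in the llama3.1 format.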

context_size: 8192
f16: true
function:
  disable_no_action: true
  grammar:
    #disable: true
    no_mixed_free_string: true
    mixed_mode: true
    schema_type: llama3.1 # or JSON is supported too (json)
  response_regex:
  - <function=(?P<name>\w+)>(?P<arguments>.*)</function>
mmap: true
name: meta-llama-3.1-8b-instruct
parameters:
  model: Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
stopwords:
- <|im_end|>
- <dummy32000>
- <|eot_id|>
- <|end_of_text|>
template:
  chat: |
    <|begin_of_text|>{{.Input }}
    <|start_header_id|>assistant<|end_header_id|>
  chat_message: |
    <|start_header_id|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}<|end_header_id|>

    {{ if .FunctionCall -}}
    Function call:
    {{ else if eq .RoleName "tool" -}}
    Function response:
    {{ end -}}
    {{ if .Content -}}
    {{.Content -}}
    {{ else if .FunctionCall -}}
    {{ toJson .FunctionCall -}}
    {{ end -}}
    <|eot_id|>
  completion: |
    {{.Input}}
  function: |
    <|start_header_id|>system<|end_header_id|>

    You have access to the following functions:

    {{range .Functions}}
    Use the function '{{.Name}}' to '{{.Description}}'
    {{toJson .Parameters}}
    {{end}}

    Think very carefully before calling functions.
    If a you choose to call a function ONLY reply in the following format with no prefix or suffix:

    <function=example_function_name>{{`{{"example_name": "example_value"}}`}}</function>

    Reminder:
    - If looking for real time information use relevant functions before falling back to searching on internet
    - Function calls MUST follow the specified format, start with <function= and end with </function>
    - Required parameters MUST be specified
    - Only call one function at a time
    - Put the entire function call reply on one line
    <|eot_id|>
    {{.Input }}
    <|start_header_id|>assistant<|end_header_id|>
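To illustrate what the `response_regex` entry in the config above captures, here is a small sketch that applies the same pattern to a sample reply. It only demonstrates the pattern itself; how LocalAI applies it internally is not shown here. The sample string and the `get_weather` function name are made up, and the sample assumes the arguments are a plain JSON object.

```python
import json
import re

# Same pattern as the response_regex entry in the config.
RESPONSE_REGEX = re.compile(r"<function=(?P<name>\w+)>(?P<arguments>.*)</function>")


def parse_function_call(text: str):
    """Return (name, arguments) for a llama3.1-style call, or None if absent."""
    match = RESPONSE_REGEX.search(text)
    if match is None:
        return None
    # json.loads raises ValueError if the captured arguments are not valid JSON.
    return match.group("name"), json.loads(match.group("arguments"))


# Hypothetical model reply, for illustration only.
sample = '<function=get_weather>{"city": "Rome"}</function>'
name, arguments = parse_function_call(sample)
print(name, arguments)
```

Because `.` does not match newlines by default, the pattern only matches when the whole call sits on one line, which is consistent with the "Put the entire function call reply on one line" reminder in the prompt.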

Notes for Reviewers

Signed commits

Yes, I signed my commits.

netlify · 2024-07-26T15:14:50Z

✅ Deploy Preview for localai ready!

Name	Link
🔨 Latest commit	`a46d533`
🔍 Latest deploy log	https://app.netlify.com/sites/localai/deploys/66a3c7ea469a69000879b506
😎 Deploy Preview	https://deploy-preview-3015--localai.netlify.app

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

dave-gray101

I think the merging of grammars down makes sense. llama 3.1 compatibility enhancements are a boon as well!

…9.3 by renovate (#24494)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2-aio-cpu` -> `v2.19.3-aio-cpu` |
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2-aio-gpu-nvidia-cuda-11` -> `v2.19.3-aio-gpu-nvidia-cuda-11` |
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2-aio-gpu-nvidia-cuda-12` -> `v2.19.3-aio-gpu-nvidia-cuda-12` |
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2-cublas-cuda11-ffmpeg-core` -> `v2.19.3-cublas-cuda11-ffmpeg-core` |
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2-cublas-cuda11-core` -> `v2.19.3-cublas-cuda11-core` |
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2-cublas-cuda12-ffmpeg-core` -> `v2.19.3-cublas-cuda12-ffmpeg-core` |
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2-cublas-cuda12-core` -> `v2.19.3-cublas-cuda12-core` |
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2-ffmpeg-core` -> `v2.19.3-ffmpeg-core` |
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | patch | `v2.19.2` -> `v2.19.3` |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

---

### Release Notes

<details>
<summary>mudler/LocalAI (docker.io/localai/localai)</summary>

### [`v2.19.3`](https://togithub.com/mudler/LocalAI/releases/tag/v2.19.3)

[Compare Source](https://togithub.com/mudler/LocalAI/compare/v2.19.2...v2.19.3)

##### What's Changed

##### Bug fixes 🐛

- fix(gallery): do not attempt to delete duplicate files by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3031
- fix(gallery): do clear out errors once displayed by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3033

##### Exciting New Features 🎉

- feat(grammar): add llama3.1 schema by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3015

##### 🧠 Models

- models(gallery): add llama3.1-claude by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3005
- models(gallery): add darkidol llama3.1 by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3008
- models(gallery): add gemmoy by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3009
- chore: add function calling template for llama 3.1 models by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3010
- chore: models(gallery): ⬆️ update checksum by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3013
- models(gallery): add mistral-nemo by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3019
- models(gallery): add llama3.1-8b-fireplace2 by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3018
- models(gallery): add lumimaid-v0.2-12b by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3020
- models(gallery): add darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq… by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3021
- models(gallery): add meta-llama-3.1-8b-instruct-abliterated by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3022
- models(gallery): add llama-3.1-70b-japanese-instruct-2407 by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3023
- models(gallery): add llama-3.1-8b-instruct-fei-v1-uncensored by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3024
- models(gallery): add openbuddy-llama3.1-8b-v22.1-131k by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3025
- models(gallery): add lumimaid-8b by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3026
- models(gallery): add llama3 with enforced functioncall with grammars by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3027
- chore(model-gallery): ⬆️ update checksum by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3036

##### 👒 Dependencies

- chore: ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3003
- chore: ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3012
- chore: ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3016
- chore: ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3030
- chore: ⬆️ Update ggerganov/whisper.cpp by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3029
- chore: ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3034

##### Other Changes

- docs: ⬆️ update docs version mudler/LocalAI by [@localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#3002
- refactor: break down json grammar parser in different files by [@mudler](https://togithub.com/mudler) in mudler/LocalAI#3004
- fix: PR title tag for checksum checker script workflow by [@dave-gray101](https://togithub.com/dave-gray101) in mudler/LocalAI#3014

**Full Changelog**: mudler/LocalAI@v2.19.2...v2.19.3

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).

mudler added the enhancement New feature or request label Jul 26, 2024

mudler added 5 commits July 26, 2024 17:15

wip

7408ff0

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

get rid of panics

b5e4589

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

expose it properly from the config

0b2b2f2

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Simplify

c6e15bf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

forgot to commit

b4f57a8

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

mudler force-pushed the llama3.1-grammar branch from 07d38aa to 7050246 Compare July 26, 2024 15:15

Remove focus on test

201adb2

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

mudler force-pushed the llama3.1-grammar branch from 7050246 to 201adb2 Compare July 26, 2024 15:16

Small fixups

a46d533

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

dave-gray101 approved these changes Jul 26, 2024

mudler merged commit 2169c34 into master Jul 26, 2024
31 checks passed

mudler deleted the llama3.1-grammar branch July 26, 2024 18:11

mudler mentioned this pull request Jul 27, 2024

models(gallery): add llama3 with enforced functioncall with grammars #3027

Merged
