
[FEAT]: Add "keep_alive" Parameter Configuration in UI for Ollama Models #1902

Closed · MrSimonC opened this issue Jul 19, 2024 · 3 comments · Fixed by #1920
Labels: enhancement (New feature or request), feature request

Comments

@MrSimonC (Contributor)

What would you like to see?

Dear Maintainers,

I hope this message finds you well. First and foremost, I would like to express my gratitude for your continuous efforts in maintaining and improving Anything LLM. Your work is highly appreciated by the community.

I am writing to request the addition of a feature that would greatly enhance the usability and flexibility of the application, specifically when using Ollama models. Currently, Ollama models support a "keep_alive" parameter, as documented in the Ollama API docs and in LangChain. (I've seen in the source code that you use LangChain as a wrapper around the call to Ollama, which is why I mention LangChain here.)

The "keep_alive" parameter is useful for keeping ollama models in memory longer - and as of now, there is no option to configure this parameter directly from the UI.

Problem:
The model always defaults to staying loaded in memory for 5 minutes only. This behavior can be observed with the `ollama ps` command once you start a chat in Anything LLM. This default leads to the model unloading from memory after 5 minutes, which is not ideal for scenarios requiring persistent sessions. (Additionally, I've tested that the OLLAMA_KEEP_ALIVE environment variable implemented by Ollama is not respected when using the API/chat endpoint from Anything LLM.) I've also tried `curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": -1}'` (docs), but a new call from AnythingLLM seems to override this setting and default back to 5 minutes.
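For anyone reproducing this, here is a minimal check, assuming a local Ollama with the llama3 model pulled (the expiry readout in `ollama ps` is how current Ollama builds report the keep_alive state):

```sh
# Ask Ollama to keep llama3 resident indefinitely (no prompt = just load it)
curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": -1}'

# The UNTIL column should now show the model staying loaded forever
ollama ps

# A later request that omits keep_alive falls back to the server default
# (5 minutes, unless OLLAMA_KEEP_ALIVE is set); per Ollama's docs a
# per-request keep_alive takes precedence over that environment variable,
# consistent with the override behavior described above.
curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "hi"}'
ollama ps
```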

Feature Request:

  1. Add "keep_alive" Configuration in UI:
    • Introduce an input field in the Ollama-specific UI that allows users to set the "keep_alive" parameter when configuring Ollama models (see the sketch of accepted values after this list).
    • Optionally, provide a brief description or tooltip explaining the purpose and impact of the "keep_alive" parameter.
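
For illustration, these are the "keep_alive" values Ollama documents for its API, which such a field would need to accept (a sketch, assuming the UI passes the value through to the API unchanged):

```sh
# A duration string:
curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": "10m"}'
# A plain number, interpreted as seconds:
curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": 3600}'
# 0 unloads the model immediately:
curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": 0}'
# Any negative value keeps the model loaded indefinitely:
curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": -1}'
```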

Benefits:

  • Improved Performance: Allowing users to configure the "keep_alive" parameter can help keep the model loaded in memory, reducing latency and improving overall performance when requests arrive with more than a 5-minute gap between them.
  • Enhanced Flexibility: Users can tailor the behavior of the Ollama models to better suit their specific use cases.
  • Resolve Current Limitation: Address the current limitation where the model defaults to a 5-minute memory load.

I believe this feature will be a valuable addition to Anything LLM and will enhance the user experience for many. Thank you for considering this request. If you need any further information or assistance, please do not hesitate to contact me.

Best regards,
Simon

MrSimonC added the enhancement (New feature or request) and feature request labels on Jul 19, 2024
@MrSimonC (Contributor, Author)

Also, if it helps: I'm currently running Anything LLM in a Docker container in WSL2, alongside a native install of Ollama in WSL2 on Windows (due to my company's restrictions on installing software).
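For anyone with a similar setup, this is roughly how the container reaches the native Ollama instance (a sketch using standard Docker options; the image name and port follow the project's Docker instructions, so adjust to your own invocation):

```sh
# Map host.docker.internal to the Docker host so the container can reach
# the natively installed Ollama server listening on the host:
docker run --add-host=host.docker.internal:host-gateway \
  -p 3001:3001 mintplexlabs/anythingllm

# Then set Anything LLM's Ollama base URL to:
#   http://host.docker.internal:11434
```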

@MrSimonC (Contributor, Author)

Update: I've been searching and found this is a duplicate of #1588, which has been moved to #1585.

@timothycarambat (Member)

Very well written, thank you. Will move the conversation to this issue since it's more action-oriented.
