
Prompt playground #355

Open · wants to merge 4 commits into main
Conversation

@zilto (Collaborator) commented Sep 6, 2024

Allows loading traces from a tracked Burr application instrumented with OpenLLMetry.

You can load previous chat interactions and try new prompts with multiple LLM providers via LiteLLM.

[screenshot of the playground UI]

To launch the app:

OPENAI_API_KEY=sk-... ANTHROPIC_API_KEY=sk-ant-... streamlit run burr/integrations/playground/app.py

Features:

  • query 3 providers at once
  • load interactions directly from local Burr tracking
  • load a specific interaction via URL params (see the example below this list)
  • the playground interactions themselves are logged with Burr under `burr-playground`
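As a hypothetical illustration of the URL-param flow (the actual parameter name isn't spelled out in this PR; `app_id` is assumed): opening `http://localhost:8501/?app_id=...` would preload that interaction, since Streamlit exposes query parameters via `st.query_params`.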

Limitations:

  • Streamlit makes it very inconvenient to call LLMs asynchronously
  • there is currently no streamlined way to support generic API key input

@elijahbenizzy (Contributor) commented Sep 25, 2024

OK, looking good. A few minor UI points after playing around:

  1. A model that isn't selected doesn't run -- maybe we should be able to choose an arbitrary number of them? Or just grey it out if it's not selected?
  2. The launch button is in a weird place; I think it should be below the provider/model selection
  3. Selecting a model resets everything -- it shouldn't reset the other ones -- we should have a reset button
  4. We should have a progress indicator while it's running
  5. The third one never seems to work for me...
  6. Tab to link to the Burr UI (iframe?) -- we should be able to see the trace it came from + the trace it led to
     See what I'm working with: [screenshot]

@elijahbenizzy (Contributor) left a comment

Some thoughts here -- a few nits on the structure then feedback on the UI



@st.cache_data
def instrument(provider: str):

You can replace this all with init_instruments(), right?
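A minimal sketch of that suggestion, assuming `init_instruments` is the helper in `burr.integrations.opentelemetry` and that it accepts provider names to instrument (both the import path and signature are assumed, not confirmed by this diff):

```python
# Hedged sketch: init_instruments location and signature assumed.
from burr.integrations.opentelemetry import init_instruments

# Instrument the selected providers' SDKs so OpenLLMetry spans are captured.
init_instruments("openai", "anthropic")
```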

msg = f"Couldn't instrument {provider}. Try installing `opentelemetry-instrumentation-{provider}`"

if msg:
    print(msg)

Use `logger.exception` here instead of `print`.
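For reference, a minimal sketch of that pattern (`instrumentor` is a stand-in for whatever call can fail here):

```python
import logging

logger = logging.getLogger(__name__)

try:
    instrumentor.instrument()  # stand-in for the call that can fail
except Exception:
    # logger.exception logs at ERROR level and includes the traceback,
    # which print(msg) silently drops.
    logger.exception(
        "Couldn't instrument %s. Try installing opentelemetry-instrumentation-%s",
        provider,
        provider,
    )
```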



@action(reads=["history"], writes=["history"])
def generate_answer(

I'm not sure of the value of a single-node Burr app; I think it might confuse people.

The standard pattern is to break this into two actions -- one that processes the input, and one that outputs the result of querying the LLM (see the sketch below).

We could also have one per model we're evaluating, but that's a bit more complex.
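A minimal sketch of that two-action pattern, assuming Burr's function-based action API and LiteLLM for the model call; the action names and exact state shape are illustrative, not taken from this PR:

```python
import litellm
from burr.core import ApplicationBuilder, State, action


@action(reads=[], writes=["history"])
def process_input(state: State, user_input: str) -> State:
    # Append the user's message to the chat history.
    return state.append(history={"role": "user", "content": user_input})


@action(reads=["history"], writes=["history"])
def generate_answer(state: State, model: str) -> State:
    # Query the selected model via LiteLLM with the accumulated history.
    response = litellm.completion(model=model, messages=state["history"])
    content = response.choices[0].message.content
    return state.append(history={"role": "assistant", "content": content})


app = (
    ApplicationBuilder()
    .with_actions(process_input, generate_answer)
    .with_transitions(
        ("process_input", "generate_answer"),
        ("generate_answer", "process_input"),
    )
    .with_state(history=[])
    .with_entrypoint("process_input")
    .build()
)

# Drive one turn, e.g.:
# app.run(halt_after=["generate_answer"],
#         inputs={"user_input": "hello", "model": "gpt-4o-mini"})
```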

@@ -0,0 +1,303 @@
import litellm

Add a README saying this is experimental + a bit of instructions. It could also be a tab on the app?

@@ -0,0 +1,303 @@
import litellm

I'm not sure this is an integration -- it's more of a "tool"? Maybe it should live somewhere else?
