Validator Hub Telemetry #559

Merged · 13 commits · Feb 13, 2024
Conversation

@thekaranacharya (Contributor) commented Jan 26, 2024

What is this about?

Adds manual instrumentation using the OpenTelemetry Python SDK to capture anonymous usage metadata and send it to a private OpenSearch sink for analysis; a rough sketch of the approach follows the list below.

The following is everything we capture:

  • User ID
  • The unique ID of the Guard object
  • The LLM provider API name
  • Boolean flags indicating whether custom reask prompts and instructions were used (not the actual content of the prompts or instructions)
  • The names of the validators used, both in-house and custom (names only)
  • The on_fail action configured for each validator, e.g. fix, reask, refrain, or noop
  • The result of each validator: pass/fail (again, just the outcome, not the content)
  • The number of times reask was performed in each guard call
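As a rough illustration (not the exact code in this PR), manual instrumentation with the OpenTelemetry Python SDK could look like the sketch below. The span name, attribute keys, and OTLP endpoint are assumptions for illustration; only anonymous metadata is set as span attributes, never prompt or response content.

```python
# A minimal sketch, assuming an OTLP/HTTP collector sits in front of the
# private OpenSearch sink. Names and values here are illustrative only.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        # Hypothetical collector endpoint, not the one used by this PR.
        OTLPSpanExporter(endpoint="https://telemetry.example.com/v1/traces")
    )
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("guardrails")

with tracer.start_as_current_span("guard-call") as span:
    # Anonymous usage metadata only -- no prompt/response content.
    span.set_attribute("user.id", "anon-1234")
    span.set_attribute("guard.id", "guard-5678")
    span.set_attribute("llm.provider", "openai")
    span.set_attribute("custom_reask_prompt.used", True)
    span.set_attribute("custom_reask_instructions.used", False)
    span.set_attribute("validator.names", ["ValidLength", "MyCustomValidator"])
    span.set_attribute("validator.on_fail_actions", ["fix", "reask"])
    span.set_attribute("validator.results", ["pass", "fail"])
    span.set_attribute("reask.count", 1)
```

With a BatchSpanProcessor, spans are buffered and exported in the background, so instrumentation adds minimal overhead to the guard call itself.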

The following are some of the insights we can derive from the raw trace data:

  • Number of uses per validator, guard, user, LLM provider, outcome, and on_fail action (see the aggregation sketch after this list)
  • Number of times a custom reask prompt or instruction was provided
  • Most popular validator, on_fail action, and LLM provider
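For instance, a count of uses per validator could be computed with a terms aggregation over the exported span attributes. The sketch below uses opensearch-py; the host, index name, and field path are assumptions and depend on how the collector maps spans to OpenSearch documents.

```python
# A minimal sketch using opensearch-py; index and field names are
# hypothetical, not the actual schema used by this PR.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=["https://opensearch.example.com:9200"])  # hypothetical host

response = client.search(
    index="guardrails-traces",  # hypothetical index
    body={
        "size": 0,
        "aggs": {
            "uses_per_validator": {
                "terms": {"field": "attributes.validator.names.keyword"}
            }
        },
    },
)
for bucket in response["aggregations"]["uses_per_validator"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
```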

Why do we need this?

  • Observability is crucial for understanding and maintaining complex distributed systems. As modern applications adopt microservices and other distributed architectures, it becomes hard to trace the flow of requests and identify issues that arise across services. Observability tools like OpenTelemetry provide a unified way to collect and analyze this data, allowing teams to monitor performance, troubleshoot problems, and optimize system behavior, which helps maintain reliability and performance in dynamic, distributed environments.
  • This manual instrumentation lets us observe usage metadata, gain valuable insights, surface bugs, and improve the library iteratively.

@thekaranacharya marked this pull request as ready for review January 29, 2024 17:48