Releases: timoklimmer/powerproxy-aoai

v0.10.0

17 May 22:47

What's Changed

  • Load balancing across deployments (in addition to load balancing across endpoints)
  • Option to restrict access to certain deployments/models per client
  • Deployment/model-dependent TPM limits per client
  • Built-in config file validation incl. script to validate a configuration file
  • Troubleshooting improvements
  • Test and API version updates, plus a script to run all tests at once against an arbitrary PowerProxy endpoint
  • Switch to ruff, with automatic formatting and import sorting on save
  • Improved linter configuration
  • Several bugfixes
  • Several dependency version bumps (fastapi, tiktoken, azure-identity, redis)
  • Documentation updates

Breaking Changes

  • The fixed_client option in the configuration has been replaced by the uses_entra_id_auth option. See the example configuration in config.example.yaml for more details.
  • The schema of the custom table used by the LogUsageToLogAnalytics plugin has changed: it now includes the virtual deployment and the real deployment in addition to the endpoint.

Full Changelog: v0.9.2...v0.10.0

v0.9.2

21 Mar 07:32

What's Changed

  • Several bugfixes and version bumps

Full Changelog: v0.9.1...v0.9.2

v0.9.1

29 Feb 17:06
Compare
Choose a tag to compare

What's Changed

  • Bump fastapi from 0.109.2 to 0.110.0 by @dependabot in #51
  • Bump redis[hiredis] from 5.0.1 to 5.0.2 by @dependabot in #55
  • fix(deployment): added collection endpoint id to collection rule by @cponfick in #52
  • fix: completion_token are not always part of the response by @cponfick in #56
  • fix: remove content-length if transfer-encoding is present by @sergey124 in #54
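The last fix above touches HTTP message framing: under RFC 9112, a sender must not send Content-Length in a message that also carries Transfer-Encoding. The rule can be sketched as follows. This is an illustration only, not PowerProxy's actual code, and the function name drop_conflicting_content_length is hypothetical.

```python
from __future__ import annotations


def drop_conflicting_content_length(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` without Content-Length if Transfer-Encoding is set.

    Hypothetical helper for illustration; not taken from PowerProxy.
    """
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered_names = {name.lower() for name in headers}
    if "transfer-encoding" not in lowered_names:
        return dict(headers)
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "content-length"
    }


# Example: headers as they might come back from an upstream streaming response.
upstream_headers = {
    "Content-Type": "text/event-stream",
    "Transfer-Encoding": "chunked",
    "Content-Length": "512",
}
cleaned = drop_conflicting_content_length(upstream_headers)
```

Headers without Transfer-Encoding pass through unchanged, so the helper is safe to apply to every response.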

Full Changelog: v0.9.0...v0.9.1

v0.9.0

23 Feb 13:27

What's Changed

  • Handling of HTTP 500 responses from AOAI endpoints (treated similarly to 429s)
  • Optimized handling of non_streaming_fraction parameter
  • Several version bumps
  • Test updates
  • Documentation updates
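To illustrate what treating 500 responses like 429s could mean for a proxy in front of several endpoints, here is a minimal failover sketch. It assumes the proxy simply moves on to the next endpoint when it sees either status code. The names RETRYABLE_STATUS_CODES and first_successful_response, the example URLs, and the fallback status 503 are all hypothetical and do not come from PowerProxy.

```python
from __future__ import annotations

from typing import Callable, Optional, Sequence

# Assumption: both status codes cause the proxy to try the next endpoint.
RETRYABLE_STATUS_CODES = {429, 500}


def first_successful_response(
    endpoints: Sequence[str],
    send: Callable[[str], int],
) -> tuple[Optional[str], int]:
    """Try endpoints in order and skip any that answer with 429 or 500.

    Returns the endpoint that answered together with its status code, or
    (None, last_status) if every endpoint returned a retryable status.
    """
    last_status = 503  # hypothetical fallback when no endpoint is configured
    for endpoint in endpoints:
        last_status = send(endpoint)
        if last_status not in RETRYABLE_STATUS_CODES:
            return endpoint, last_status
    return None, last_status


# Example with canned status codes instead of real HTTP calls.
fake_statuses = {
    "https://a.example": 500,
    "https://b.example": 429,
    "https://c.example": 200,
}
chosen, status = first_successful_response(
    list(fake_statuses), fake_statuses.__getitem__
)
```

Passing the send function in as a parameter keeps the sketch independent of any particular HTTP client.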

Full Changelog: v0.8.1...v0.9.0

v0.8.1

12 Dec 16:35
  • Fixes a problem with streaming responses where chunks are not returned immediately.
  • Increased read timeout to give Azure OpenAI more time to generate the next chunk

Full Changelog: v0.8.0...v0.8.1

v0.8.0

11 Dec 21:35

What's Changed

  • Usage limiting now uses Redis Cache to support multiple workers serving requests
  • Waiting time improvements for endpoints at capacity
  • Dockerfile: fix "Permission denied: ../logs" from log plugin by @Johann-Foerster in #35
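Keeping usage counters in a shared store lets several workers enforce one limit together. One common way to do this is a fixed-window counter, sketched below. This is not PowerProxy's actual implementation: the in-memory class only stands in for a shared store such as Redis, mimicking the incrby and expire calls, and every name in the sketch is hypothetical.

```python
import time


class InMemoryCounterStore:
    """Stand-in for a shared store such as Redis; mimics only incrby/expire."""

    def __init__(self):
        self._values = {}

    def incrby(self, key, amount):
        # Add `amount` to the counter and return the new total.
        self._values[key] = self._values.get(key, 0) + amount
        return self._values[key]

    def expire(self, key, seconds):
        # A real shared store would drop the key after `seconds`; omitted here.
        pass


def try_consume_tokens(store, client, tokens, limit_per_minute, now=None):
    """Count `tokens` against the client's current one-minute window.

    Returns True while the window total stays within `limit_per_minute`.
    Rejected requests still count toward the window in this simple sketch.
    """
    now = time.time() if now is None else now
    window = int(now // 60)  # all workers derive the same window number
    key = f"tokens:{client}:{window}"
    used = store.incrby(key, tokens)
    store.expire(key, 120)  # keep the key a little longer than the window
    return used <= limit_per_minute


store = InMemoryCounterStore()
first = try_consume_tokens(store, "team-a", 600, 1000, now=0)
second = try_consume_tokens(store, "team-a", 600, 1000, now=10)
next_minute = try_consume_tokens(store, "team-a", 600, 1000, now=60)
```

Because every worker computes the same key for the same minute, the counter is shared without any coordination beyond the store itself.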

Full Changelog: v0.7.0...v0.8.0

v0.7.0

08 Dec 12:23

What's Changed

  • Smart load balancing feature, support for multiple endpoints
  • Schema changes for Log Analytics usage table
  • Deployment improvements
  • Documentation updates
  • Version bumps

Full Changelog: v0.6.1...v0.7.0

v0.6.1

28 Nov 17:31

Added Configuration Updates section to readme file.

Full Changelog: v0.6.0...v0.6.1

v0.6.0

28 Nov 17:26

What's Changed

  • Export/import PowerProxy configuration from/to Container App deployment to enable config changes without re-deployment and integration into external permission management systems
  • Output of PowerProxy URL at end of deployment script
  • Renamed max_tokens_per_minute_k to max_tokens_per_minute_in_k
  • No extra containerapp extension for the Azure CLI is needed any more, as it is now part of the default Azure CLI
  • Azure deployment uses a YAML-formatted configuration instead of a mix of YAML and JSON; the --config-string parameter has been removed
  • Updated samples to use openai Python package version 1.3.5
  • Several other version bumps
  • Bugfixes
  • Documentation updates
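Because max_tokens_per_minute_k became max_tokens_per_minute_in_k in this release, configurations written for earlier versions need that key renamed. Below is a minimal sketch of such a rename on an already-parsed configuration. The helper name rename_key and the sample structure (a clients list with name entries) are assumptions for illustration only; the real layout may differ.

```python
def rename_key(node, old, new):
    """Return a copy of a nested dict/list structure with key `old` renamed to `new`."""
    if isinstance(node, dict):
        return {
            (new if key == old else key): rename_key(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [rename_key(item, old, new) for item in node]
    return node  # scalars are returned unchanged


# Hypothetical configuration as it might look after parsing an older YAML file.
old_config = {
    "clients": [
        {"name": "Team 1", "max_tokens_per_minute_k": 20},
        {"name": "Team 2"},
    ]
}
new_config = rename_key(
    old_config, "max_tokens_per_minute_k", "max_tokens_per_minute_in_k"
)
```

The function walks the whole structure, so it does not depend on where in the configuration the key appears.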

Full Changelog: v0.5.0...v0.6.0

v0.5.0

08 Nov 16:18

What's Changed

  • Deployment improvements, no more manual steps required for deployment (except adjusting config files)
  • Bump langchain from 0.0.310 to 0.0.311 by @dependabot in #24
  • Bump azure-identity from 1.14.0 to 1.14.1 by @dependabot in #26
  • Bump fastapi from 0.103.2 to 0.104.0 by @dependabot in #27
  • Bump azure-identity from 1.14.1 to 1.15.0 by @dependabot in #28
  • Bump httpx from 0.25.0 to 0.25.1 by @dependabot in #30
  • Bump fastapi from 0.104.0 to 0.104.1 by @dependabot in #29
  • Bump uvicorn[standard] from 0.23.2 to 0.24.0.post1 by @dependabot in #31

Full Changelog: v0.4.0...v0.5.0