Releases: timoklimmer/powerproxy-aoai
v0.10.0
What's Changed
- Load balancing across deployments (in addition to load balancing across endpoints)
- Option to restrict access to certain deployments/models per client
- Deployment/model-dependent TPM limits per client
- Built-in config file validation incl. script to validate a configuration file
- Troubleshooting improvements
- Updated tests and API version updates, script to run all tests at once against an arbitrary PowerProxy endpoint
- Switch to ruff, automatic formatting and imports sorting on save
- Improved linter configuration
- Several bugfixes
- Several dependency version bumps (fastapi, tiktoken, azure-identity, redis)
- Documentation updates
Breaking Changes
- The fixed_client option in the configuration has been replaced by the uses_entra_id_auth option. See the example configuration in config.example.yaml for more details.
- The schema of the custom table used by LogUsageToLogAnalytics has changed: it now includes the virtual deployment and the real deployment in addition to the endpoint.
New Contributors
- @mashastroganova made their first contribution in #61
Full Changelog: v0.9.2...v0.10.0
v0.9.2
v0.9.1
What's Changed
- Bump fastapi from 0.109.2 to 0.110.0 by @dependabot in #51
- Bump redis[hiredis] from 5.0.1 to 5.0.2 by @dependabot in #55
- fix(deployment): added collection endpoint id to collection rule by @cponfick in #52
- fix: completion_token are not always part of the response by @cponfick in #56
- fix: remove content-length if transfer-encoding is present by @sergey124 in #54
New Contributors
- @cponfick made their first contribution in #52
- @sergey124 made their first contribution in #54
Full Changelog: v0.9.0...v0.9.1
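The header fix from #54 above can be illustrated with a minimal sketch. HTTP/1.1 does not allow a sender to include Content-Length in a message that also carries Transfer-Encoding, so a proxy that forwards a chunked response should drop the former. The function name `drop_conflicting_content_length` is hypothetical and is not taken from the PowerProxy code base; the actual fix may look different.

```python
def drop_conflicting_content_length(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` without Content-Length if Transfer-Encoding is set.

    Hypothetical helper for illustration only, not PowerProxy's implementation.
    Header names are compared case-insensitively, as HTTP requires.
    """
    lowered_names = {name.lower() for name in headers}
    if "transfer-encoding" not in lowered_names:
        # Nothing conflicts, so keep all headers unchanged.
        return dict(headers)
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "content-length"
    }
```

A proxy would apply such a filter to the upstream response headers before passing them on to the client.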
v0.9.0
What's Changed
- Handling of HTTP 500 responses from AOAI endpoints (treated similarly to 429s)
- Optimized handling of non_streaming_fraction parameter
- Several version bumps
- Test updates
- Documentation updates
Full Changelog: v0.8.1...v0.9.0
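As a rough illustration of what "treated similarly to 429s" could mean in v0.9.0, the sketch below moves on to the next endpoint whenever a response has status 429 or 500. This is an assumption about the behavior, not a description of PowerProxy's actual logic; `first_successful_response`, `RETRYABLE_STATUS_CODES`, and the `send` callback are hypothetical names.

```python
from typing import Callable, Optional, Sequence, Tuple

# Assumption: both status codes mean "try another endpoint".
RETRYABLE_STATUS_CODES = {429, 500}


def first_successful_response(
    endpoints: Sequence[str],
    send: Callable[[str], int],
) -> Tuple[Optional[str], Optional[int]]:
    """Try endpoints in order and return the first non-retryable result.

    `send` is a hypothetical callback that sends the request to one endpoint
    and returns the HTTP status code. If every endpoint answers with a
    retryable status, (None, last_status) is returned.
    """
    last_status: Optional[int] = None
    for endpoint in endpoints:
        status = send(endpoint)
        if status not in RETRYABLE_STATUS_CODES:
            return endpoint, status
        last_status = status
    return None, last_status
```

The `send` callback keeps the sketch independent of any particular HTTP client.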
v0.8.1
- Fixes a problem with streaming responses where chunks are not returned immediately.
- Increased read timeout to give Azure OpenAI more time to generate the next chunk
Full Changelog: v0.8.0...v0.8.1
v0.8.0
What's Changed
- Usage limiting now uses Redis Cache to support multiple workers serving requests
- Waiting time improvements for endpoints at capacity
- Dockerfile: fix "Permission denied: ../logs" from log plugin by @Johann-Foerster in #35
New Contributors
- @Johann-Foerster made their first contribution in #35
Full Changelog: v0.7.0...v0.8.0
v0.7.0
What's Changed
- Smart load balancing feature, support for multiple endpoints
- Schema changes for Log Analytics usage table
- Deployment improvements
- Documentation updates
- Version bumps
Full Changelog: v0.6.1...v0.7.0
v0.6.1
Added Configuration Updates section to readme file.
Full Changelog: v0.6.0...v0.6.1
v0.6.0
What's Changed
- Export/import PowerProxy configuration from/to Container App deployment to enable config changes without re-deployment and integration into external permission management systems
- Output of PowerProxy URL at end of deployment script
- Renamed max_tokens_per_minute_k to max_tokens_per_minute_in_k
- The separate containerapp extension for Azure CLI is no longer needed, as it is now part of the default Azure CLI
- Azure deployment uses YAML-formatted configuration instead of mix of YAML and JSON, removed --config-string parameter
- Updated samples to use openai Python package version 1.3.5
- Several other version bumps
- Bugfixes
- Documentation updates
Full Changelog: v0.5.0...v0.6.0
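For configurations written before v0.6.0, the rename of max_tokens_per_minute_k to max_tokens_per_minute_in_k could be applied with a small helper like the one below. This is only a sketch: it works on an already parsed configuration (nested dicts and lists), `rename_key` is a hypothetical name, and the `clients` layout in the usage example is an assumed structure, not taken from the project's example configuration.

```python
from typing import Any


def rename_key(node: Any, old: str, new: str) -> Any:
    """Return a copy of a nested dict/list structure with `old` keys renamed to `new`.

    Hypothetical migration helper; values other than dicts and lists are
    returned unchanged.
    """
    if isinstance(node, dict):
        return {
            (new if key == old else key): rename_key(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [rename_key(item, old, new) for item in node]
    return node


# Usage with an assumed configuration layout.
old_config = {"clients": [{"name": "Team 1", "max_tokens_per_minute_k": 20}]}
new_config = rename_key(
    old_config, "max_tokens_per_minute_k", "max_tokens_per_minute_in_k"
)
```

The helper returns a new structure and leaves the original configuration untouched, which makes it easy to compare both versions before saving.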
v0.5.0
What's Changed
- Deployment improvements, no more manual steps required for deployment (except adjusting config files)
- Bump langchain from 0.0.310 to 0.0.311 by @dependabot in #24
- Bump azure-identity from 1.14.0 to 1.14.1 by @dependabot in #26
- Bump fastapi from 0.103.2 to 0.104.0 by @dependabot in #27
- Bump azure-identity from 1.14.1 to 1.15.0 by @dependabot in #28
- Bump httpx from 0.25.0 to 0.25.1 by @dependabot in #30
- Bump fastapi from 0.104.0 to 0.104.1 by @dependabot in #29
- Bump uvicorn[standard] from 0.23.2 to 0.24.0.post1 by @dependabot in #31
Full Changelog: v0.4.0...v0.5.0