[Feature Request] Cross Region Inference #535

Open
1 of 2 tasks
statefb opened this issue Sep 17, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@statefb
Contributor

statefb commented Sep 17, 2024

Describe the solution you'd like

Support cross-region inference, since some regions do not support some models.

Why the solution is needed

To enable inference from regions where a model is not supported.

Additional context

Implementation guideline

  • Only a limited set of regions support inference profiles (currently US and EU only).
    • We can check whether a region supports them with `aws bedrock list-inference-profiles --profile <profile-name>`.
  • Because requests are routed across regions, additional latency is incurred (see doc).
    • Add an `enableBedrockCrossRegionInference` option to `cdk.json` with the default value `false`?
  • Even if the value is set to `true`, if `bedrockRegion` is not among the supported regions, the model ID without cross-region inference should be chosen. A warning log should also be added.
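
The fallback rule in the guideline above could be sketched roughly as follows. This is only an illustration, not the project's code: the function name `resolve_model_id`, the `SUPPORTED_GEO_PREFIXES` tuple, and the way the geography prefix is derived from the region name are all assumptions made for the example.

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: geographies that currently offer inference profiles (per the issue: us and eu).
SUPPORTED_GEO_PREFIXES = ("us", "eu")


def resolve_model_id(base_model_id: str, bedrock_region: str, enable_cross_region: bool) -> str:
    """Return the model ID to invoke (hypothetical helper).

    Uses a geography-prefixed inference profile ID when cross-region
    inference is enabled and the region's geography is supported;
    otherwise falls back to the plain model ID.
    """
    if not enable_cross_region:
        return base_model_id

    # Assumption: the geography is the first segment of the region name,
    # e.g. "us-east-1" -> "us", "ap-northeast-1" -> "ap".
    geo = bedrock_region.split("-", 1)[0]
    if geo not in SUPPORTED_GEO_PREFIXES:
        logger.warning(
            "Cross-region inference is enabled but region %s is not supported; "
            "falling back to %s",
            bedrock_region,
            base_model_id,
        )
        return base_model_id

    return f"{geo}.{base_model_id}"
```

With the flag off, or with a region outside the supported geographies, the caller gets the unmodified model ID back, so existing deployments would behave as before.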

Related PR: #531 (reverted)
Related Issues: #508, #527

Implementation feasibility

Are you willing to discuss the solution with us, decide on the approach, and assist with the implementation?

  • Yes
  • No

statefb added the enhancement label Sep 17, 2024
chm10 added a commit to chm10/bedrock-claude-chat that referenced this issue Sep 17, 2024
…les#535)

This commit introduces the initial setup for supporting cross-region inference in the Bedrock Chat application. The changes include:

- Added `is_region_supported_for_inference` function in `utils.py` to check if a region supports inference.
- Modified `get_bedrock_client` function in `utils.py` to use cross-region inference if enabled and the region is supported.
- Updated `get_model_id` function in `bedrock.py` to use the base model ID for cross-region inference if enabled, the region is supported, and the model is included in `CROSS_REGION_INFERENCE_MODELS`. If any of these conditions are not met, it falls back to using the local model ID and logs a warning.
- Added `enableBedrockCrossRegionInference` option to `cdk.json` with the default value set to `false`.

These changes lay the foundation for enabling cross-region inference in the Bedrock Chat application. The feature can be enabled or disabled using the `enableBedrockCrossRegionInference` configuration option in `cdk.json`.
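
For readers following along, a region check of the kind the commit message describes might look something like the sketch below. It is not the code from the referenced commit: the function name `region_has_inference_profiles` is hypothetical, and treating any API error as "not supported" is an assumption made for brevity. The client is passed in so the function can be exercised without AWS credentials.

```python
from typing import Any


def region_has_inference_profiles(bedrock_client: Any) -> bool:
    """Return True if the client's region lists at least one inference profile.

    `bedrock_client` is expected to behave like a boto3 "bedrock" client,
    i.e. expose `list_inference_profiles()` returning a dict with an
    `inferenceProfileSummaries` list.
    """
    try:
        response = bedrock_client.list_inference_profiles()
    except Exception:
        # Assumption: any failure (unsupported region, missing permission)
        # is treated as "cross-region inference not available".
        return False
    return bool(response.get("inferenceProfileSummaries"))
```

In a real deployment the argument would presumably be created with something like `boto3.client("bedrock", region_name=bedrock_region)`, and the result cached, since the answer does not change per request.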
chm10 added a commit to chm10/bedrock-claude-chat that referenced this issue Sep 17, 2024
chm10 added a commit to chm10/bedrock-claude-chat that referenced this issue Sep 17, 2024
@juan-abia

If I do cross region inference I get this error:
Provider eu model does not support chat.

Any idea how to solve it?

@axelpina

> If I do cross region inference I get this error: Provider eu model does not support chat.
>
> Any idea how to solve it?

There's a PR with a fix in progress but it hasn't been merged: #536
