Connection pooling support for AWSSigV4 clients/connection in data source #6114

Closed
Tracked by #5838
bandinib-amzn opened this issue Mar 11, 2024 · 0 comments
Labels
enhancement New feature or request

Overview

Some use cases require multiple instances of the client. You can get them by calling new Client() as many times as you need, but you then lose the benefits of a single client, such as long-lived connections and connection pool handling. Connection pooling matters when working with OpenSearch because it lets network connections be managed and reused efficiently.

Multi data source client management provides an efficient way to manage multiple clients/connections without consuming all available memory.

  • For data sources with different endpoints, use client pooling (e.g. an LRU cache)
  • For data sources with the same endpoint but different users, use the connection pooling strategy (child client) provided by opensearch-js
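
The two strategies above can be combined in one small structure. The following is a minimal sketch, not the data source plugin's actual implementation: PooledClient, ChildClient, createRootClient and ClientPool are hypothetical names introduced for illustration, and the stand-in client only models the idea that a child shares its parent's connection pool. Root clients are cached per endpoint with LRU eviction, and each user gets a child of the cached root.

```typescript
// Hypothetical stand-in for a client that owns a connection pool.
interface PooledClient {
  readonly endpoint: string;
  // Returns a lightweight client that shares the parent's connections.
  child(auth: { username: string }): ChildClient;
}

interface ChildClient {
  readonly endpoint: string;
  readonly username: string;
  readonly parent: PooledClient;
}

// Hypothetical factory; a real one would construct an actual client here.
function createRootClient(endpoint: string): PooledClient {
  const root: PooledClient = {
    endpoint,
    child: (auth) => ({ endpoint, username: auth.username, parent: root }),
  };
  return root;
}

class ClientPool {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private readonly roots = new Map<string, PooledClient>();

  constructor(private readonly maxSize: number) {}

  getClient(endpoint: string, username: string): ChildClient {
    let root = this.roots.get(endpoint);
    if (root) {
      // Remove the entry so that re-inserting it below marks it most recently used.
      this.roots.delete(endpoint);
    } else {
      root = createRootClient(endpoint);
      if (this.roots.size >= this.maxSize) {
        // Evict the least recently used root; a real pool would also close it.
        const oldest = this.roots.keys().next().value;
        if (oldest !== undefined) this.roots.delete(oldest);
      }
    }
    this.roots.set(endpoint, root);
    // Same endpoint, different user: reuse the root's connections via a child.
    return root.child({ username });
  }

  has(endpoint: string): boolean {
    return this.roots.has(endpoint);
  }

  get size(): number {
    return this.roots.size;
  }
}
```

With this shape, two users of the same endpoint receive different child clients backed by one root, and the number of root clients held in memory never exceeds maxSize.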

Problem Statement

Currently, we use opensearch-js version 2.3.1 to sign requests with SigV4. For the legacy data source client, we use the legacy client package elasticsearch.js. Because elasticsearch.js does not provide an option to sign requests with SigV4, we used a third-party connection handler for Amazon ES, HttpAmazonESConnector. HttpAmazonESConnector is no longer maintained. Therefore, in 2.8.0, while adding support for serverless, we used a drop-in replacement package for http-aws-es to add support for a configurable service name.

While both libraries provide AWS SigV4 support, the current implementation lacks an efficient connection pooling mechanism for the SigV4 (Signature Version 4) authentication method.

SigV4 requires clients to include a cryptographic signature in each HTTP request to authenticate and authorize the request. This signature is computed from a combination of the client's AWS access key, secret key, region, and other request-specific information. The current architecture does not allow updating the credentials of an existing client, in either the opensearch-js client or the legacy client. As a result, every HTTP request to AWS services entails the costly process of creating a new client. This leads to resource inefficiency and performance bottlenecks, particularly in scenarios with high request rates.
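
To make the dependency on credentials concrete, here is a minimal sketch of the standard SigV4 signing-key derivation using Node's built-in crypto module. It is illustrative only and is not code from opensearch-js or http-aws-es; the function names hmac, deriveSigningKey and sign are hypothetical, and building the canonical request and the string to sign is omitted.

```typescript
import { createHmac } from "crypto";

// HMAC-SHA256 helper returning raw bytes.
function hmac(key: string | Buffer, data: string): Buffer {
  return createHmac("sha256", key).update(data, "utf8").digest();
}

// Derives the SigV4 signing key from the secret key, date (YYYYMMDD),
// region, and service name through a chain of HMAC-SHA256 operations.
function deriveSigningKey(
  secretAccessKey: string,
  dateStamp: string,
  region: string,
  service: string
): Buffer {
  const kDate = hmac("AWS4" + secretAccessKey, dateStamp);
  const kRegion = hmac(kDate, region);
  const kService = hmac(kRegion, service);
  return hmac(kService, "aws4_request");
}

// Signs an already-built string to sign; returns the hex signature.
function sign(signingKey: Buffer, stringToSign: string): string {
  return createHmac("sha256", signingKey).update(stringToSign, "utf8").digest("hex");
}
```

Because the secret key, region, and service name all feed into the signing key, two users of the same endpoint produce different signatures for identical requests. This is why a client whose credentials are fixed at construction time cannot simply be shared between users.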

Proposed Solution

For opensearch-js:

We have recently released the opensearch-js 2.6.0 client to npm. The 2.6.0 client supports AwsSigV4 in .child. I propose upgrading to opensearch-js 2.6.0 and refactoring the data source plugin to create a child client when the endpoint/node is the same.

Tasks:
[ ] Upgrade to @opensearch-project/opensearch@2.6.0, which supports AwsSigV4 in .child
[ ] Add support to create child client for AWS sigv4 auth type, similar to basic auth
[ ] Modify client caching mechanism
[ ] For 2.x backports, update dependency version for alias @opensearch-project/opensearch-next

For Legacy client:

We are using HttpAmazonESConnector from http-aws-es. As it is no longer maintained upstream, I propose importing the HttpAmazonESConnector class and its unit tests into the OpenSearch-Dashboards repo and maintaining them ourselves.

[ ] Copy HttpAmazonESConnector
[ ] Add child support in HttpAmazonESConnector
[ ] Add support to create child client for AWS sigv4 auth type, similar to basic auth
[ ] Modify client caching mechanism
