Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Remote Store] Add netty dependencies in repository-s3 plugin #7216

Merged

Conversation

raghuvanshraj
Copy link
Contributor

@raghuvanshraj raghuvanshraj commented Apr 18, 2023

Credits: @vikasvb90 and @itiyamas for the design and core implementations of the feature.

Description

Adding netty and AWS v2 SDK dependencies to use for parallel multipart async uploads

Issues Resolved

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@raghuvanshraj raghuvanshraj changed the title Add AWS v2 SDK and netty dependencies [Remote Store] Added AWS v2 SDK and netty dependencies in repository-s3 plugin Apr 19, 2023
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@dblock
Copy link
Member

dblock commented Apr 20, 2023

Are both AWS SDK v1 and v2 required now? I would have expected an upgrade.

@vikasvb90
Copy link
Contributor

vikasvb90 commented Apr 20, 2023

Both are not required but the plan was to first integrate it with remote store, let it bake for some time and then migrate other users like snapshot and deprecate v1. Intention was to reduce the blast radius. Remote store will only use v2 sdk though.
@raghuvanshraj lets plan to migrate other remote store paths to v2 as well.

@dblock
Copy link
Member

dblock commented Apr 21, 2023

Both are not required but the plan was to first integrate it with remote store, let it bake for some time and then migrate other users like snapshot and deprecate v1. Intention was to reduce the blast radius. Remote store will only use v2 sdk though. @raghuvanshraj lets plan to migrate other remote store paths to v2 as well.

If we need both in the source code let's rename aws to aws_v1 for consistency? But really maybe we can do an upgrade project-wide before this task? How do others feel about carrying both versions of the dependency? @nknize @reta

@vikasvb90
Copy link
Contributor

Both are not required but the plan was to first integrate it with remote store, let it bake for some time and then migrate other users like snapshot and deprecate v1. Intention was to reduce the blast radius. Remote store will only use v2 sdk though. @raghuvanshraj lets plan to migrate other remote store paths to v2 as well.

If we need both in the source code let's rename aws to aws_v1 for consistency? But really maybe we can do an upgrade project-wide before this task? How do others feel about carrying both versions of the dependency? @nknize @reta

@dblock I agree with you that we should migrate to aws_v2 as soon as possible. Our perf runs also resulted in significant improvement in throughput (~15%) and we were able to reap the benefits of multi-part parallel upload with the help of async IO.
But since It's a critical change and there could be repository plugins outside of OpenSearch project which might be using synchronous abstractions as well as aws v1 SDK, it will come as a surprise to them during upgrade if we get rid of aws v1 sdk. So, if we plan to remove v1 sdk first, we will probably have to follow the deprecation route where we need to deprecate v1 sdk as well as synchronous abstractions and let it bake for some time. And later during the next major version bump we get rid of this. This might take a long time and that's why we thought of keeping both sdks for now. One thing we can do right away is to mark synchronous abstractions Deprecated.
Regarding migration of existing remote paths in OpenSearch like snapshot, I am open to suggestions but I am still inclined to take that as a separate task because it is currently being used in production and using v2 in remote store feature will give us a sign off to proceed further.
cc: @shwetathareja @gauravruhela @sachinpkale

@reta
Copy link
Collaborator

reta commented Apr 22, 2023

How do others feel about carrying both versions of the dependency

I lean towards supporting AWS SDK v2 only. The option to support both SDKs seems to add quite a burden for both users and maintainers:

  • we have to make sure that all SDKv2 dependencies are optional so no to break the existing users (but we have to bundle everything in ZIP distribution anyway)
  • we have to provide a way to configure which SDK to use
  • we have to extend the security policy in a way to support both SDKs (even if only one of them could be used at a time)

But since It's a critical change and there could be repository plugins outside of OpenSearch project which might be using synchronous abstractions as well as aws v1 SDK, it will come as a surprise to them during upgrade if we get rid of aws v1 sdk.

That's a valid point, we do publish the repository-s3 artifacts. What if we branch this plugin to let say repository-s3-v2 (or alike)? We do not install these plugins by default (only bundle them) so anyone could choose.

@vikasvb90
Copy link
Contributor

@reta

we have to make sure that all SDKv2 dependencies are optional so no to break the existing users (but we have to bundle everything in ZIP distribution anyway)

How do you think making addition of SDKv2 dependencies mandatory will break existing users?
Please look at plugin PRs in meta issue #6703 . The change of dependencies is in repository-s3 plugin and we have verified that both SDKs v1 and v2 can work in conjunction. Also, v2 is used in async flows and v1 in synchronous flows. And it is in OpenSearch that we are taking the decision of whether to use synchronous or async flows based on the support present in plugin.

we have to provide a way to configure which SDK to use

It won't be just configuration based control as even OpenSearch needs to handle async flow differently. Therefore, there are new abstractions defined for async flow which uses v2 sdk. Meaning, there is a clear separation of v1 flows and v2 flows and both are defined, implemented and can be used independently.

we have to extend the security policy in a way to support both SDKs (even if only one of them could be used at a time)

Again, this is already done and PRs for the same are raised and tagged against Meta issue for reference.

That's a valid point, we do publish the repository-s3 artifacts. What if we branch this plugin to let say repository-s3-v2 (or alike)? We do not install these plugins by default (only bundle them) so anyone could choose.

If I am not wrong, it is modules which are installed by default and not plugins. So, even repository-s3 is an optional package.

We have added new optional abstractions in OpenSearch and respective implementations in repository-s3. By adding new implementations of SDKv2 in repository-s3, I don't think that we are breaking anything for existing users. Please correct me if I am missing something here. We have carried out perf as well for both v1 and v2 SDKs working together in different flows.
Bundling repository-s3 with SDKv2 should ideally have no impact on existing users.

@reta
Copy link
Collaborator

reta commented Apr 22, 2023

thanks @raghuvanshraj

How do you think making addition of SDKv2 dependencies mandatory will break existing users?

The artifacts we publish do bring transitively a lot of dependencies. Now, we used to depend on AWS v1 but would depend on AWS v2 as well. We do not know how the plugin dependencies are being used in the wild, I have seen many variations, but here are a few:

  • one could exclude all transitive deps and use manual dependency - it will break for all v2 flows
  • one could rely on AWS SDK usage from the classpath resolution - it will use v2 instead of v1 to surprise

This is all speculation, I cannot prove or disprove that, but it is worth to think about.

The change of dependencies is in repository-s3 plugin and we have verified that both SDKs v1 and v2 can work in conjunction

I don't think we need to use different SDKs for different flows, the v2 should easily cover both sync and async flows [1], could you may be hint why AWS v1 is still needed and cannot be replaced with AWS v2?

[1] https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/migration-whats-different.html

@vikasvb90
Copy link
Contributor

vikasvb90 commented Apr 22, 2023

@reta ,
@vikasvb90 here :)

This is all speculation, I cannot prove or disprove that, but it is worth to think about.

AWS SDK doc says that we can use both side by side. Example in this doc uses two different services but services are only implemented as modules of the global AWS SDKs and not as separate SDKs. https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/migration-side-by-side.html
And as I said in previous comment that both versions can be used in the same run or in fact as part of the same request. So far, I haven't seen any conflict in the dependencies on using the two together. From what I have seen, both SDKs are based on different fundamentals. One uses Apache http client and the other one uses NettyNioAsyncHttpClient and accordingly all other dependencies are laid out in v1 and v2. Therefore, on the basis of these points so far it doesn't look like we should have a problem.
When you say one could exclude and one could rely, where exactly are you referring to. Just to understand, are you referring to a different plugin alongside repository-s3 plugin where a user does these experiments? Even then I am not sure how one can end up creating conflicts as both SDKs have different underlying platform dependencies.

I don't think we need to use different SDKs for different flows, the v2 should easily cover both sync and async flows [1], could you may be hint why AWS v1 is still needed and cannot be replaced with AWS v2?

Yes, we don't need to use sync flows at all. The point was that sync flows are old flows which use SDKv1 and are used in snapshot path currently. We can integrate remote store flows with sdk v2 first and then in a separate task migrate snapshot code and get rid of SDKv1 from repository-s3 plugin. But we still need to continue to support sdk v1 as a version in OpenSearch and slowly deprecate it as other implementation of repository plugin or some other plugin might still be using it.

@reta
Copy link
Collaborator

reta commented Apr 22, 2023

When you say one could exclude and one could rely, where exactly are you referring to

As I mentioned - I do not have prove of that, otherwise I would point to the projects directly. I could easily craft the examples but that would not be fair.

Yes, we don't need to use sync flows at all.

I think this basically answers it all - we could remove v1 altogether, afaik the fact that S3 repository implementation uses SDK v1 is implementation detail , it does not leak through public API (the only single class AmazonS3Reference that could be made package-private as well).

The point was that sync flows are old flows which use SDKv1 and are used in snapshot path currently

This is what we have to tackle first I believe, prepare the foundation before building on top.

@raghuvanshraj raghuvanshraj changed the title [Remote Store] Added AWS v2 SDK and netty dependencies in repository-s3 plugin [Remote Store] Add AWS v2 SDK and netty dependencies in repository-s3 plugin Apr 25, 2023
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@raghuvanshraj raghuvanshraj marked this pull request as ready for review April 25, 2023 07:18
@raghuvanshraj raghuvanshraj changed the title [Remote Store] Add AWS v2 SDK and netty dependencies in repository-s3 plugin [Remote Store] Add netty dependencies in repository-s3 plugin Jun 14, 2023
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.backpressure.SearchBackpressureIT.testSearchShardTaskCancellationWithHighCpu
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testPitCreatedOnReplica
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testDropPrimaryDuringReplication

@gbbafna gbbafna merged commit 1f4d0df into opensearch-project:main Jun 15, 2023
@gbbafna gbbafna added the backport 2.x Backport to 2.x branch label Jun 15, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 15, 2023
Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
(cherry picked from commit 1f4d0df)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
kartg pushed a commit that referenced this pull request Jun 15, 2023
…#8072)

(cherry picked from commit 1f4d0df)

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gaiksaya pushed a commit to gaiksaya/OpenSearch that referenced this pull request Jun 26, 2023
…arch-project#7216) (opensearch-project#8072)

(cherry picked from commit 1f4d0df)

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
imRishN pushed a commit to imRishN/OpenSearch that referenced this pull request Jun 27, 2023
…arch-project#7216)

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…arch-project#7216)

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants