
[Remote Store] Update index with Remote Store based settings once all shard copies have moved over to remote store enabled nodes #13252

Closed
shourya035 opened this issue Apr 17, 2024 · 0 comments · Fixed by #13316
Labels
enhancement (Enhancement or improvement to existing feature or request), Storage:Remote

shourya035 commented Apr 17, 2024

Is your feature request related to a problem? Please describe

Today, during docrep to remote store migration, we rely on node attributes to perform dual-mode replication when the shard copies of an index are scattered across both docrep and remote nodes (dual replication details are here: #12413).

As soon as all shard copies of an index have moved over to remote store enabled nodes, we will ensure that the remote store based index settings are applied to the index, marking the shard migration phase as complete. We would apply the following settings to the index:

  • index.remote_store.enabled to true
  • index.replication.type to SEGMENT
  • index.remote_store.segment.repository with the segment repository name
  • index.remote_store.translog.repository with the translog repository name

In addition to this, we will also set the correct remote store path based settings (ref: #13155) in the index metadata, along with the settings mentioned above.
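Put together, the settings update would look something like the sketch below (the repository names `remote-segment-repo` and `remote-translog-repo` are placeholders; the real values come from the remote store node attributes):

```json
{
  "index.remote_store.enabled": true,
  "index.replication.type": "SEGMENT",
  "index.remote_store.segment.repository": "remote-segment-repo",
  "index.remote_store.translog.repository": "remote-translog-repo"
}
```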

Describe the solution you'd like

To achieve this, we will utilize the existing logic of IndexMetadataUpdater, which executes after AllocationService has made its decisions on moving, failing, or starting shard copies. This logic would:

  • Fetch the name of each index with a shard copy that has been marked as STARTED
  • Refer to the RoutingTable from the incoming cluster state and check that all shards of that index are in the STARTED state and that all those shard copies are hosted on remote nodes
  • If so, mutate the index metadata in place and add it back to the ClusterState, which the cluster manager then persists and publishes to the data nodes
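The per-index check in the steps above can be sketched as a standalone predicate. This is illustrative only: `RemoteMigrationCheck` and `ShardCopy` are hypothetical names, not the actual OpenSearch classes; the real implementation walks the RoutingTable and resolves nodes via DiscoveryNodes.

```java
import java.util.List;

public class RemoteMigrationCheck {

    enum ShardState { INITIALIZING, STARTED, RELOCATING, UNASSIGNED }

    // A shard copy: its routing state and whether its host node is remote store enabled
    record ShardCopy(ShardState state, boolean onRemoteNode) {}

    /**
     * Returns true when every shard copy of the index is STARTED and hosted on a
     * remote store enabled node -- the condition under which the cluster manager
     * would flip the index over to the remote store based settings.
     */
    static boolean readyForRemoteSettings(List<ShardCopy> shardCopies) {
        return !shardCopies.isEmpty()
            && shardCopies.stream()
                .allMatch(s -> s.state() == ShardState.STARTED && s.onRemoteNode());
    }

    public static void main(String[] args) {
        List<ShardCopy> mixed = List.of(
            new ShardCopy(ShardState.STARTED, true),
            new ShardCopy(ShardState.STARTED, false)); // still on a docrep node
        List<ShardCopy> done = List.of(
            new ShardCopy(ShardState.STARTED, true),
            new ShardCopy(ShardState.STARTED, true));
        System.out.println(readyForRemoteSettings(mixed)); // false
        System.out.println(readyForRemoteSettings(done));  // true
    }
}
```

Only when the predicate holds for an index does the updater mutate that index's metadata before the cluster state is published.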

Related component

Storage:Remote

Describe alternatives you've considered

We considered the following alternatives to achieve this:

  • Creating a custom ClusterStateListener. This would execute on a routing table change in the cluster state and would end up doing the same thing we currently do in the ShardStartedClusterStateTaskExecutor logic. The downside is that if we miss the specific cluster-state-changed event (because of high pending tasks on the cluster manager, or a cluster manager reboot), we lose the chance to update the index settings. We would then have to design a separate API to reconcile the metadata of all indices, re-running the checks to ensure all shard copies are on remote enabled nodes and updating any indices that were missed.
  • Creating a custom IndexEventListener. The downsides of this approach are the same as those of the ClusterStateListener approach: a separate reconciliation API would be needed to handle indices that were missed due to cluster manager unavailability.

Additional context

No response
