storage controller: after shard split, some shards end up with heatmap uploads disabled. #8189

Closed
jcsp opened this issue Jun 27, 2024 · 0 comments · Fixed by #8197
Labels: c/storage/controller (Component: Storage Controller), p/high (High priority: use for bugs that need prompt attention, such as crashes or possible corruptions), t/bug (Issue Type: Bug)

jcsp commented Jun 27, 2024

At the start of `do_tenant_shard_split`, we drop any secondary locations for the parent shards. The reconciler uses the presence of secondary locations as the condition for enabling heatmaps.

On the pageserver, child shards inherit their configuration from their parents, but the storage controller assumes each child's `ObservedState` is the same as the parent's config from the prepare phase. As a result, some child shards end up with an inaccurate `ObservedState`, and until the next migration or restart, those tenant shards do not upload heatmaps. Their secondary locations therefore keep downloading everything that was resident at the moment of the split, including ancestor layers, which are often cleaned up shortly after the split.
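
To make the mismatch concrete, here is a minimal toy model in Python. It is not the storage controller's code: every name in it (`Intent`, `LocationConf`, `heatmaps_wanted`, `needs_reconcile`, `simulate_split`) is hypothetical, and the ordering it uses (config snapshot taken in the prepare phase before the parent is reconciled without its secondary) is an assumed interleaving chosen for illustration. It only shows how a stale observed state can hide a disabled heatmap from a reconciler that compares intent against observed state.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Intent:
    """Hypothetical stand-in for a shard's intended placement."""
    attached: str
    secondaries: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class LocationConf:
    """Hypothetical stand-in for a location config; only the heatmap flag is modelled."""
    heatmap_upload: bool


def heatmaps_wanted(intent: Intent) -> bool:
    # Condition described in the issue: heatmaps are enabled
    # when the shard has at least one secondary location.
    return len(intent.secondaries) > 0


def needs_reconcile(intent: Intent, observed: LocationConf) -> bool:
    # Toy reconciler check: act only if what we believe is on the
    # pageserver differs from what the intent calls for.
    return observed != LocationConf(heatmap_upload=heatmaps_wanted(intent))


def simulate_split() -> Tuple[LocationConf, LocationConf, bool]:
    parent = Intent(attached="ps-1", secondaries=["ps-2"])

    # Assumption: the prepare phase records the parent's config
    # while it still has a secondary (heatmaps enabled).
    prepared_conf = LocationConf(heatmap_upload=heatmaps_wanted(parent))

    # Start of the split: secondary locations of the parent are dropped.
    parent.secondaries.clear()

    # Assumption: the parent is then configured on the pageserver
    # without secondaries, so heatmap upload is switched off there.
    pageserver_parent = LocationConf(heatmap_upload=heatmaps_wanted(parent))

    # Pageserver side: the child inherits the parent's actual config.
    pageserver_child = pageserver_parent

    # Controller side: the child's observed state is assumed to equal
    # the config recorded in the prepare phase.
    controller_observed = prepared_conf

    # After the split the child is given a secondary again.
    child = Intent(attached="ps-1", secondaries=["ps-2"])
    return pageserver_child, controller_observed, needs_reconcile(child, controller_observed)


if __name__ == "__main__":
    actual, observed, reconcile = simulate_split()
    print("pageserver heatmap_upload:", actual.heatmap_upload)
    print("controller believes heatmap_upload:", observed.heatmap_upload)
    print("reconcile scheduled:", reconcile)
```

In this model the pageserver's child has heatmap upload off while the controller believes it is on, so the toy reconciler sees nothing to do. That matches the symptom above, where the state stays wrong until a migration or restart forces a fresh config to be sent.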

jcsp added the t/bug, c/storage/controller, and p/high labels Jun 27, 2024
jcsp self-assigned this Jun 28, 2024
jcsp added 4 commits that referenced this issue Jun 28, 2024
jcsp closed this as completed in b8bbaaf Jun 28, 2024