Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance SegmentPartitionMetadataManager to handle new segment #11585

Conversation

Jackie-Jiang
Copy link
Contributor

Similar to #10466, we need to be more graceful for new segment (newly pushed or new consuming segment) because it is common that they don't have all replicas available in external view.
This PR adds the new segment into the partition info only when all the replicas are available

@Jackie-Jiang Jackie-Jiang added enhancement multi-stage Related to the multi-stage query engine labels Sep 13, 2023
@codecov-commenter
Copy link

codecov-commenter commented Sep 13, 2023

Codecov Report

Merging #11585 (dee7410) into master (011c072) will decrease coverage by 0.01%.
The diff coverage is 63.33%.

@@             Coverage Diff              @@
##             master   #11585      +/-   ##
============================================
- Coverage     63.04%   63.04%   -0.01%     
  Complexity     1108     1108              
============================================
  Files          2326     2326              
  Lines        124882   124926      +44     
  Branches      19065    19075      +10     
============================================
+ Hits          78731    78757      +26     
- Misses        40544    40564      +20     
+ Partials       5607     5605       -2     
Flag Coverage Δ
integration <0.01% <0.00%> (ø)
integration1 <0.01% <0.00%> (ø)
integration2 0.00% <0.00%> (ø)
java-11 63.01% <63.33%> (-0.01%) ⬇️
java-17 62.89% <63.33%> (+<0.01%) ⬆️
java-20 62.91% <63.33%> (+<0.01%) ⬆️
temurin 63.04% <63.33%> (-0.01%) ⬇️
unittests 63.03% <63.33%> (-0.01%) ⬇️
unittests1 67.44% <100.00%> (ø)
unittests2 14.50% <61.66%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...mentpartition/SegmentPartitionMetadataManager.java 73.61% <62.71%> (-2.39%) ⬇️
.../apache/pinot/core/routing/TablePartitionInfo.java 88.88% <100.00%> (ø)

... and 8 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines +200 to +204
// If the new segment is not the first segment of a partition, add it only if it won't reduce the fully
// replicated servers. It is common that a new segment (newly pushed, or a new consuming segment) doesn't have
// all the replicas available yet, and we want to exclude it from the partition info until all the replicas
// are available.
//noinspection SlowListContainsAll
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this means any request with partitioned computation will miss a segment until it is fully replicated? what if a commit time of a consuming segment is on average 1hr and in that case:

  1. we wont be able to access consuming segment data until 1hr later when it commits + fully replicated to all replicas
  2. we will be able to access consuming segment as long as all replica servers starts consuming (but results will be inconsistent depending which partition replica we dispatch the request to

which of the above is true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Segment is count as new for up to 5 minutes after pushed/created, so committing segment won't be counted as new segment. New consuming segment is created after the previous consuming segment is committed.

if (partitionId == INVALID_PARTITION_ID) {
segmentsWithInvalidPartition.add(segment);
continue;
}
// Process new segments in the end
if (InstanceSelector.isNewSegment(segmentInfo._pushTimeMs, currentTimeMs)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible that a new segment is with invalid partitionID? should we move this above the invalid id check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New segment should have valid partition id. Segment ZK metadata won't be changed for new segment within the life cycle

@Jackie-Jiang Jackie-Jiang merged commit ae77667 into apache:master Sep 13, 2023
21 checks passed
@Jackie-Jiang Jackie-Jiang deleted the segment_partition_metadata_new_segment branch September 13, 2023 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement multi-stage Related to the multi-stage query engine
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants