-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make deep store upload retry async with configurable parallelism #12017
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #12017 +/- ##
============================================
- Coverage 61.61% 61.49% -0.13%
- Complexity 207 1153 +946
============================================
Files 2385 2389 +4
Lines 129214 129845 +631
Branches 20003 20086 +83
============================================
+ Hits 79613 79845 +232
- Misses 43801 44175 +374
- Partials 5800 5825 +25
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@@ -914,9 +915,9 @@ public void testCommitSegmentMetadata() { | |||
/** | |||
* Test cases for fixing LLC segment by uploading to segment store if missing | |||
*/ | |||
@Test | |||
@Test(timeOut = 30_000L) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not use timeout. We usually use TestUtils.waitForCondition()
...java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManagerTest.java
Outdated
Show resolved
Hide resolved
...ain/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM otherwise
...java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManagerTest.java
Outdated
Show resolved
Hide resolved
...java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManagerTest.java
Outdated
Show resolved
Hide resolved
...ain/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
Outdated
Show resolved
Hide resolved
@Jackie-Jiang Thanks for the reviews! addressed the remaining comments |
Relates to: #11874
This PR moves the deep store upload retry to be an asynchronous operation. By default, the thread pool that performs the uploads is created with a single thread.
A new config is introduced to allow for increasing the size of the thread pool used for retrying segment uploads:
controller.realtime.segment.deepStoreUploadRetry.parallelism
(default is 1).A new metric is emitted to track the upload retry queue size, which can be used to ensure we can have visibility into scenarios where segment upload retry cannot keep up with new segments that are missing their deepstore copy:
LLC_SEGMENTS_DEEP_STORE_UPLOAD_RETRY_QUEUE_SIZE
Tested via cluster deployment, edited ZK to remove segment urls to trigger uploads and verified segments were uploaded.
tags:
enhancement