-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release all segments of a table in releaseAndRemoveAllSegments method #12297
Release all segments of a table in releaseAndRemoveAllSegments method #12297
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #12297 +/- ##
============================================
+ Coverage 61.69% 61.73% +0.03%
Complexity 207 207
============================================
Files 2424 2424
Lines 132340 132520 +180
Branches 20436 20483 +47
============================================
+ Hits 81651 81808 +157
- Misses 44693 44715 +22
- Partials 5996 5997 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
cb5a820
to
3ed9d68
Compare
|
The
No we didn't take a thread dump but agree that we should have taken one. Basically, it timed out while waiting for the consumer thread to exit. The consumer thread may still be in |
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java
Show resolved
Hide resolved
3ed9d68
to
2abd892
Compare
Another qq - Will the same release method be called if the table is dropped / deleted ? I ask because we start the executor service with threads = numSegments When the call is being made during deployment / shutdown, it is fine to spawn these many threads which could be a very high multiple of numCores. But during steady state, if this code is invoked then spawning so many threads can affect query performance of other tables on the host. |
LGTM overall. @mcvsubbu / @vvivekiyer / @somandal - in case one of you wants to take a look |
Also, it will be good to update the PR description with a bit more details on how we noticed the problem. |
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest using a feature flag to guard this enhancement for the following reasons:
- The change will make the segment release async, so not sure if it has side effect
- It can consume extra CPU resources thus impacting query performance
If the intention is to accelerate the server shutdown, we can consider parallel the table data manager shutdown in HelixInstanceDataManager.shutDown()
, at which moment we know server is not serving queries
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java
Outdated
Show resolved
Hide resolved
Yes, the intention is to accelerate the server shutdown. Paralleling table data manager shutdown can help in one way, whereas there is another scenario that there is some certain table with too many segments which require to be shut down one by one (and unfortunately this is the scenario we met). I can add the parallelism to the table level shutdown as well as changing the thread pool to a fixed sized thread pool. |
2abd892
to
4f05fd7
Compare
Comments have been addressed in the latest push and the description is updated. Please take a look. Thanks! |
4f05fd7
to
a796baa
Compare
This PR attempts to asynchronously release all the segments of a table in
releaseAndRemoveAllSegments
method. The reason of doing that by submitting multiple threads is that the table could have too many open segments, which will take time to release them one by one (especially for consuming segments). Thus, introduce the fixed thread pool to release them in order to speed up the stopping behavior.The
releaseAndRemoveAllSegments
method can be invoked by two scenarios:This PR also adds the fixed thread pool to shut down table managers asynchronously. This can also help speed up the shutdown process for the server.