Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

need to synchronize replacing upsert segment #12105

Merged
merged 2 commits into from
Dec 6, 2023

Conversation

klsince
Copy link
Contributor

@klsince klsince commented Dec 6, 2023

For upsert table, replacing segment could happen in two threads concurrently, but there are multiple steps to replace upsert segments and not thread safe. This PR changes to do them with segmentLock to be thread safe. In fact, the partitionUpsertMetadataManager.addSegment/replaceSegment takes segmentLock inside anyway.

For upsert table, when consuming thread commits a newly built immutable segment, it needs to replace the old mutable segment, in order to update the upsert states to point to the new immutable segment. But committing segment can take longer than expected. When committing takes long to complete, server decides to download the segment as a way to help catch up ingestion quicker. The HelixTaskExecutor thread, to download and replace segment, tries to stop the consuming thread by setting a _shouldStop flag, but consuming thread doesn't check the flag once in the middle of replacing the segment. The HelixTaskExecutor thread waits up to 10min and then continues to replace the segment. Now we would get two threads replacing the same upsert segment concurrently.

In comparison, for non-upsert RT tables, it's thread safe to add or replace segments as the operation is made atomic; for OFFLINE tables, the segmentLock is taken to add or replace segment.

@codecov-commenter
Copy link

codecov-commenter commented Dec 6, 2023

Codecov Report

Attention: 22 lines in your changes are missing coverage. Please review.

Comparison is base (49804a4) 61.68% compared to head (7b4a3ad) 61.67%.
Report is 1 commits behind head on master.

Files Patch % Lines
...ata/manager/realtime/RealtimeTableDataManager.java 0.00% 17 Missing ⚠️
...cal/upsert/BasePartitionUpsertMetadataManager.java 54.54% 3 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #12105      +/-   ##
============================================
- Coverage     61.68%   61.67%   -0.02%     
+ Complexity     1152     1146       -6     
============================================
  Files          2391     2391              
  Lines        130211   130218       +7     
  Branches      20141    20142       +1     
============================================
- Hits          80320    80307      -13     
- Misses        44048    44061      +13     
- Partials       5843     5850       +7     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (ø)
integration <0.01% <0.00%> (ø)
integration1 <0.01% <0.00%> (ø)
integration2 ?
java-11 61.63% <21.42%> (+26.56%) ⬆️
java-21 61.55% <21.42%> (-0.02%) ⬇️
skip-bytebuffers-false 61.66% <21.42%> (-0.02%) ⬇️
skip-bytebuffers-true 61.53% <21.42%> (+<0.01%) ⬆️
temurin 61.67% <21.42%> (-0.02%) ⬇️
unittests 61.66% <21.42%> (-0.02%) ⬇️
unittests1 46.94% <0.00%> (-0.02%) ⬇️
unittests2 27.59% <21.42%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@klsince klsince merged commit 3811240 into apache:master Dec 6, 2023
19 checks passed
@klsince klsince deleted the sync_upsert_replace_segment branch December 6, 2023 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants