Optimize snapshot flow to only snapshot segments which have updates #13285

Merged
merged 7 commits into from
Jun 11, 2024


tibrewalpratik17
Collaborator

@tibrewalpratik17 tibrewalpratik17 commented May 31, 2024

label:
optimization
enhancement
upsert

Change 1

Inspired by @klsince's work in #12976.
This patch changes the doTakeSnapshot flow to snapshot only the segments that have been updated since the last snapshot, rather than every segment in the partition. This helps most when the number of segments per partition is high. doTakeSnapshot runs before a new consuming segment starts consumption, so the time it takes directly adds ingestion lag.
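
As a rough illustration of Change 1, the sketch below tracks which segments have been touched since the last snapshot and persists only those. It is not the code in this PR: `DirtySegmentSnapshotter`, its methods, and the use of plain `Set<Integer>` in place of validDocIds bitmaps are all hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a per-partition upsert metadata manager (not Pinot's actual class).
class DirtySegmentSnapshotter {
  // Current valid doc ids per segment (a plain set here instead of a bitmap).
  private final Map<String, Set<Integer>> validDocIds = new HashMap<>();
  // Last persisted snapshot per segment.
  private final Map<String, Set<Integer>> snapshots = new HashMap<>();
  // Segments whose valid doc ids changed since the last snapshot.
  private final Set<String> updatedSegments = new HashSet<>();

  void addSegment(String segment, Set<Integer> docIds) {
    validDocIds.put(segment, new HashSet<>(docIds));
    updatedSegments.add(segment);
  }

  // Marks a doc as invalid (e.g. replaced by a newer record) and flags the segment as updated.
  void invalidateDoc(String segment, int docId) {
    Set<Integer> docs = validDocIds.get(segment);
    if (docs != null && docs.remove(docId)) {
      updatedSegments.add(segment);
    }
  }

  // Persists only the updated segments and returns their names in sorted order.
  List<String> takeSnapshot() {
    List<String> written = new ArrayList<>();
    for (String segment : updatedSegments) {
      snapshots.put(segment, new HashSet<>(validDocIds.get(segment)));
      written.add(segment);
    }
    updatedSegments.clear();
    Collections.sort(written);
    return written;
  }

  Set<Integer> snapshotOf(String segment) {
    return snapshots.get(segment);
  }
}
```

With this shape, the cost of a snapshot scales with the number of segments that changed in the commit cycle rather than with the total number of segments in the partition.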

Change 2

This patch also reorders the takeSnapshot and removeDeletedPrimaryKeys flows: when deletedKeysTTL is set, removeDeletedPrimaryKeys now runs before takeSnapshot. This way, all the keys and validDocIDs removed by removeDeletedPrimaryKeys are snapshotted immediately rather than one commit cycle later.
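
The effect of the reordering in Change 2 can be sketched as follows. This is a simplified model under stated assumptions, not the PR's implementation: `CommitCycleSketch`, `onSegmentCommit`, and the map-based bookkeeping are hypothetical, and only the method names `removeDeletedPrimaryKeys` and `takeSnapshot` come from the description above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of one upsert partition; names are illustrative only.
class CommitCycleSketch {
  // Deleted primary key -> doc id of its delete marker.
  private final Map<String, Integer> deletedKeyDocId = new HashMap<>();
  // Deleted primary key -> time the delete was recorded, in milliseconds.
  private final Map<String, Long> deletedKeyTime = new HashMap<>();
  private final Set<Integer> validDocIds = new HashSet<>();
  private Set<Integer> snapshot = new HashSet<>();

  void addValidDoc(int docId) {
    validDocIds.add(docId);
  }

  void addDeletedKey(String key, int docId, long deleteTimeMs) {
    deletedKeyDocId.put(key, docId);
    deletedKeyTime.put(key, deleteTimeMs);
    validDocIds.add(docId);
  }

  // Drops delete markers older than the TTL and invalidates their docs.
  void removeDeletedPrimaryKeys(long nowMs, long deletedKeysTtlMs) {
    Iterator<Map.Entry<String, Long>> it = deletedKeyTime.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> entry = it.next();
      if (nowMs - entry.getValue() > deletedKeysTtlMs) {
        validDocIds.remove(deletedKeyDocId.remove(entry.getKey()));
        it.remove();
      }
    }
  }

  void takeSnapshot() {
    snapshot = new HashSet<>(validDocIds);
  }

  // Order after the change: expire deleted keys first, then snapshot.
  void onSegmentCommit(long nowMs, long deletedKeysTtlMs) {
    if (deletedKeysTtlMs > 0) {
      removeDeletedPrimaryKeys(nowMs, deletedKeysTtlMs);
    }
    takeSnapshot();
  }

  Set<Integer> lastSnapshot() {
    return snapshot;
  }
}
```

If `takeSnapshot()` ran first, the persisted state would still contain the expired delete marker until the next commit cycle, which is the delay this change removes.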

We were seeing scenarios where the snapshot flow took up to 30s for some tables.
[Screenshot 2024-06-06 at 6:21:45 PM]

Change 3

We enable snapshotting during server restart for partial-upsert tables before the first consuming segment. Previously this was skipped on the assumption that not all segments are loaded yet, but for partial-upsert tables we don't start consumption until all data is loaded. This saves one segment commit cycle of snapshotting when snapshots are newly enabled for a table or after a server restart.
The screenshot below shows a dip after a server restart; it takes one commit cycle for snapshots to recover.
[Screenshot 2024-06-07 at 7:48:58 PM]
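
The decision described in Change 3 could be expressed roughly as below. `RestartSnapshotPolicy`, `UpsertMode`, and `canTakeSnapshot` are hypothetical names for illustration, and the rule encodes only what the description states: before the first consuming segment after a restart, a snapshot is safe for partial-upsert tables because consumption waits for all data to load.

```java
// Hypothetical decision helper; not Pinot's actual API.
class RestartSnapshotPolicy {
  enum UpsertMode { FULL, PARTIAL }

  // Returns true if a snapshot may be taken before the given consuming segment starts.
  static boolean canTakeSnapshot(boolean snapshotEnabled, UpsertMode mode,
      boolean isFirstConsumingSegmentAfterRestart) {
    if (!snapshotEnabled) {
      return false;
    }
    if (!isFirstConsumingSegmentAfterRestart) {
      // Later commit cycles: all segments are assumed to be loaded by now.
      return true;
    }
    // First consuming segment after restart: only safe when consumption is known
    // to wait for all segments to load, which is the partial-upsert case.
    return mode == UpsertMode.PARTIAL;
  }
}
```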

@codecov-commenter

codecov-commenter commented May 31, 2024

Codecov Report

Attention: Patch coverage is 28.57143% with 20 lines in your changes missing coverage. Please review.

Project coverage is 62.11%. Comparing base (59551e4) to head (c0c2785).
Report is 608 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| ...cal/upsert/BasePartitionUpsertMetadataManager.java | 36.36% | 14 Missing ⚠️ |
| ...a/manager/realtime/RealtimeSegmentDataManager.java | 0.00% | 5 Missing ⚠️ |
| ...org/apache/pinot/spi/config/table/TableConfig.java | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13285      +/-   ##
============================================
+ Coverage     61.75%   62.11%   +0.35%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2548     +112     
  Lines        133233   139979    +6746     
  Branches      20636    21735    +1099     
============================================
+ Hits          82274    86941    +4667     
- Misses        44911    46447    +1536     
- Partials       6048     6591     +543     
| Flag | Coverage Δ |
|---|---|
| custom-integration1 | <0.01% <0.00%> (-0.01%) ⬇️ |
| integration | <0.01% <0.00%> (-0.01%) ⬇️ |
| integration1 | <0.01% <0.00%> (-0.01%) ⬇️ |
| integration2 | 0.00% <0.00%> (ø) |
| java-11 | 35.28% <0.00%> (-26.43%) ⬇️ |
| java-21 | 62.00% <28.57%> (+0.37%) ⬆️ |
| skip-bytebuffers-false | 62.09% <28.57%> (+0.34%) ⬆️ |
| skip-bytebuffers-true | 61.97% <28.57%> (+34.24%) ⬆️ |
| temurin | 62.11% <28.57%> (+0.35%) ⬆️ |
| unittests | 62.10% <28.57%> (+0.35%) ⬆️ |
| unittests1 | 46.67% <0.00%> (-0.22%) ⬇️ |
| unittests2 | 27.72% <28.57%> (-0.01%) ⬇️ |


@tibrewalpratik17 tibrewalpratik17 marked this pull request as ready for review June 6, 2024 18:03
@tibrewalpratik17
Collaborator Author

cc @klsince @Jackie-Jiang

@klsince klsince merged commit d91ad73 into apache:master Jun 11, 2024
20 checks passed
@tibrewalpratik17 tibrewalpratik17 deleted the optimize_snapshotting branch June 14, 2024 21:23
4 participants