Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cover the race condition for upsert compaction #12346

Merged
merged 5 commits into from
Feb 2, 2024

Conversation

snleee
Copy link
Contributor

@snleee snleee commented Jan 31, 2024

The current code can face the race condition when the following condition is met:

  1. Segment upload is complete for segment replacement and segment zk metadata is changed to point the new segment.
  2. Server is still loading the segment (e.g. if offheap upsert is enabled, segment reload can take a while)
  3. If minion picks up the compaction task at the above step, minion will download the new segment (replaced) while it will fetch the validDocId bitmap for old segment.

To avoid this scenario, we added the extra crc check on the minion side. Minion will check if the crc from downloaded segment and crc from server api for valid doc id bitmap are the same.

Changes from the PR:

  1. Add the new API for fetching validDocId bitmap. the response is changed to JSON to include extra information like crc.
  2. Wired the new API & added tests
  3. Added the crc check on the minion task executor
  4. Added the crc check on the task generator

Release Note on the backward incompatibility

This PR is introducing backward incompatibility for UpsertCompactionTask. Previously, we allowed to configure the compaction task without the snapshot enabled. We found that using in-memory based validDocIds is a bit dangerous as it will not give us the consistency (e.g. fetching validDocIds bitmap while the server is restarting & updating validDocIds).

We now enforce the enableSnapshot=true for UpsertCompactionTask if the advanced customer wants to run the compaction with the in-memory validDocId bitmap.

{
  "upsertConfig": {
    "mode": "FULL",
    "enableSnapshot": true
  }
}
...
"task": {
  "taskTypeConfigsMap": {
    "UpsertCompactionTask": {
      "schedule": "0 */5 * ? * *",
      "bufferTimePeriod": "7d",
      "invalidRecordsThresholdPercent": "30",
      "invalidRecordsThresholdCount": "100000",
      "invalidDocIdsType": "SNAPSHOT/IN_MEMORY/IN_MEMORY_WITH_DELETE"
    }
  }
}

Also, we allow to configure invalidDocIdsType to UpsertCompactionTask for advanced user.

  1. snapshot: Default validDocIds type. This indicates that the validDocIds bitmap is loaded from the snapshot from the Pinot segment. UpsertConfig's enableSnapshot must be enabled for this type.
  2. onHeap: the validDocIds bitmap will be fetched from the server.
  3. onHeapWithDelete: the validDocIds bitmap will be fetched from the server. This will also take account into the deleted documents. UpsertConfig's deleteRecordColumn must be provided for this type.

@snleee snleee force-pushed the fix-race-condition-upsert branch 5 times, most recently from 90c5517 to 35a1ed1 Compare January 31, 2024 21:34
@codecov-commenter
Copy link

codecov-commenter commented Jan 31, 2024

Codecov Report

Attention: 105 lines in your changes are missing coverage. Please review.

Comparison is base (3ab2834) 61.69% compared to head (cac50bd) 61.69%.
Report is 10 commits behind head on master.

Files Patch % Lines
...t/controller/util/ServerSegmentMetadataReader.java 0.00% 25 Missing ⚠️
...che/pinot/server/api/resources/TablesResource.java 62.26% 14 Missing and 6 partials ⚠️
...upsertcompaction/UpsertCompactionTaskExecutor.java 0.00% 16 Missing ⚠️
...psertcompaction/UpsertCompactionTaskGenerator.java 31.81% 15 Missing ⚠️
...n/restlet/resources/ValidDocIdsBitmapResponse.java 0.00% 10 Missing ⚠️
...he/pinot/segment/local/utils/TableConfigUtils.java 11.11% 6 Missing and 2 partials ⚠️
...mmon/restlet/resources/ValidDocIdMetadataInfo.java 0.00% 5 Missing ⚠️
...inot/common/restlet/resources/ValidDocIdsType.java 0.00% 4 Missing ⚠️
...oller/api/resources/PinotTableRestletResource.java 0.00% 1 Missing ⚠️
...che/pinot/plugin/minion/tasks/MinionTaskUtils.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #12346      +/-   ##
============================================
- Coverage     61.69%   61.69%   -0.01%     
  Complexity      207      207              
============================================
  Files          2422     2426       +4     
  Lines        132217   132667     +450     
  Branches      20408    20499      +91     
============================================
+ Hits          81573    81848     +275     
- Misses        44656    44820     +164     
- Partials       5988     5999      +11     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 34.94% <0.00%> (-26.71%) ⬇️
java-21 61.57% <28.08%> (+<0.01%) ⬆️
skip-bytebuffers-false 61.66% <28.08%> (-0.01%) ⬇️
skip-bytebuffers-true 61.56% <28.08%> (+<0.01%) ⬆️
temurin 61.69% <28.08%> (-0.01%) ⬇️
unittests 61.68% <28.08%> (-0.01%) ⬇️
unittests1 46.91% <0.00%> (+0.02%) ⬆️
unittests2 27.69% <28.08%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

The current code can face the race condition when the
following condition is met:

1. Segment upload is complete for segment replacement
   and segment zk metadata is changed to point the new
   segment.
2. Server is still loading the segment (e.g. if offheap
   upsert is enabled, segment reload can take a while)
3. If minion picks up the compaction task at the above
   step, minion will download the new segment (replaced)
   while it will fetch the validDocId bitmap for old
   segment.

To avoid this scenario, we added the extra crc check
on the minion side. Minion will check if the crc from
downloaded segment and crc from server api for valid
doc id bitmap are the same.

Changes from the PR:
1. Add the new API for fetching validDocId bitmap. the
   response is changed to JSON to include extra
   information like crc.
2. Wired the new API & added tests
3. Added the crc check on the minion task executor
4. Added the crc check on the task generator
Response.Status.BAD_REQUEST);
}

// Adopt the same logic as the query execution to get the valid doc ids. 'FilterPlanNode.run()'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not introduced in this PR, but this logic is incorrect. We cannot read the valid doc ids from memory because it might be modified during ingestion, and if the server re-consume the data (e.g. restart), some valid docs might be missing. We should use the valid doc ids from the snapshot. Same for the valid doc ids metadata api

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added validDocIdType. Can you review the code?

@@ -160,6 +170,20 @@ public List<PinotTaskConfig> generateTasks(List<TableConfig> tableConfigs) {
return pinotTaskConfigs;
}

private String getDefaultValidDocId(UpsertConfig upsertConfig) {
Copy link
Contributor Author

@snleee snleee Feb 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jackie-Jiang I found that the snapshot is only available if enableSnapshot=true.

Should we perform upsert at the best effort or do we want to enforce enableSnapshot=true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

  1. throw error if enableSnapshot is not set to true.
  2. throw error if queryableDocId type is set while upsert delete column is not enabled.

@snleee snleee force-pushed the fix-race-condition-upsert branch 2 times, most recently from 981a4fd to 2ec8c54 Compare February 1, 2024 19:45
@@ -52,6 +54,7 @@ public class UpsertCompactionTaskGenerator extends BaseTaskGenerator {
private static final String DEFAULT_BUFFER_PERIOD = "7d";
private static final double DEFAULT_INVALID_RECORDS_THRESHOLD_PERCENT = 0.0;
private static final long DEFAULT_INVALID_RECORDS_THRESHOLD_COUNT = 0;
private static final String DEFAULT_VALID_DOC_ID_TYPE = "validDocIdsSnapshot";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the default be just validDocIds to keep it backward-compatible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tibrewalpratik17 We found that using in-memory based validDocIds is a bit dangerous as it will not give us the consistency (e.g. fetching validDocIds bitmap while the server is restarting & updating validDocIds)

  1. by default, we would want to enforce the customer to use segment compaction + snapshot enabled in the upsert config
  2. We would still want to allow the previous approach (im-memory bitmap based). In this case, we will ask the customers to pick the appropriate validDocIdType.

How do you think?

cc: @Jackie-Jiang

Copy link
Contributor

@tibrewalpratik17 tibrewalpratik17 Feb 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, can we tag this PR as backward-incompatible as existing compaction tasks will break without snapshot?

I agree of this inconsistent scenario:

 in-memory based validDocIds is a bit dangerous as it will not give us the consistency (e.g. fetching validDocIds bitmap while the server is restarting & updating validDocIds)

@snleee snleee added the backward-incompat Referenced by PRs that introduce or fix backward compat issues label Feb 1, 2024
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please update the PR description with what is backward incompatible?

@snleee snleee added the release-notes Referenced by PRs that need attention when compiling the next release notes label Feb 2, 2024
@snleee
Copy link
Contributor Author

snleee commented Feb 2, 2024

@Jackie-Jiang @tibrewalpratik17 Please take a final look on this PR. I think that this is now close to be merged.

@snleee snleee merged commit ab525ee into apache:master Feb 2, 2024
19 checks passed
@snleee snleee deleted the fix-race-condition-upsert branch February 2, 2024 22:03
suyashpatel98 pushed a commit to suyashpatel98/pinot that referenced this pull request Feb 28, 2024
* Cover the race condition for upsert compaction

The current code can face the race condition when the
following condition is met:

1. Segment upload is complete for segment replacement
   and segment zk metadata is changed to point the new
   segment.
2. Server is still loading the segment (e.g. if offheap
   upsert is enabled, segment reload can take a while)
3. If minion picks up the compaction task at the above
   step, minion will download the new segment (replaced)
   while it will fetch the validDocId bitmap for old
   segment.

To avoid this scenario, we added the extra crc check
on the minion side. Minion will check if the crc from
downloaded segment and crc from server api for valid
doc id bitmap are the same.

Changes from the PR:
1. Add the new API for fetching validDocId bitmap. the
   response is changed to JSON to include extra
   information like crc.
2. Wired the new API & added tests
3. Added the crc check on the minion task executor
4. Added the crc check on the task generator

* Add the 'validDocIdType' across all APIs

* Changed 'validDocIdType' to 'validDocIdsType'

* Add validDocIdsType enum

* Addressing the comments to use enum everywhere
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backward-incompat Referenced by PRs that introduce or fix backward compat issues bugfix ingestion release-notes Referenced by PRs that need attention when compiling the next release notes upsert
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants