feat(storage): Introduces a split implementation based on split_key to replace the previous move implementation. #18594

Open · wants to merge 16 commits into main

Conversation

@Li0k (Contributor) commented Sep 18, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Previously, we introduced the concept of compaction groups in hummock to achieve write/configuration isolation, and new compaction groups were created via the table-level move function.

Since the SST key ranges produced by the move function are not contiguous, it is difficult to re-combine the resulting independent compaction groups (merging them back is currently not supported). Because of this concern, we set the move threshold very high, fearing that a split-off compaction group could never be merged again and that a large number of compaction groups would accumulate.

This caution runs counter to our expectation: in scenarios with high write throughput (backfill, etc.), we would like to split compaction groups to achieve parallel compaction, increase the overall compaction throughput, and make full use of the compactor nodes' scaling capability.

Here are the changes introduced in this PR for review:

  1. Introduce a table_id-level split function, compatible with the original split interface.
  2. Fix key_range / table_id / sst_size after an sst is split.
  3. Use (table_id, vnode = 0, epoch = 0) as the split key (sketched below).
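
For reference, a minimal sketch of how such a split key can be built. It mirrors the FullKey snippet quoted later in this review; the module paths and the helper name are assumptions, not the PR's exact code:

use bytes::Bytes;
use risingwave_common::catalog::TableId;
use risingwave_common::hash::VirtualNode;
use risingwave_hummock_sdk::key::{FullKey, TableKey};
use risingwave_hummock_sdk::HummockEpoch;

// Build the (table_id, vnode = 0, epoch = 0) split key as an encoded full key.
fn build_split_key(table_id: u32) -> Bytes {
    let vnode = VirtualNode::ZERO;
    FullKey::new(
        TableId::from(table_id),
        TableKey(vnode.to_be_bytes().to_vec()),
        HummockEpoch::MIN, // epoch = 0; see the review thread below on MIN vs. MAX
    )
    .encode()
    .into()
}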

I'll summarize the changes file by file for your code review:

  1. src/meta/src/hummock/manager/compaction/compaction_group_schedule.rs: introduces helper functions that facilitate splitting and refactors the split interface based on the new implementation.
  2. src/storage/hummock_sdk/src/compaction_group/hummock_version_ext.rs: reimplements the split logic based on split_key.
  3. src/storage/hummock_sdk/src/compaction_group/mod.rs: provides a wrapper for the split_key-based sst/level split implementation (see the sketch after this list).
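
A rough, simplified sketch of what the split_key-based sst split has to maintain (key_range, table_ids, and sst_size). The struct and function below are hypothetical stand-ins, not the actual hummock_sdk types:

struct SstInfo {
    key_range: (Vec<u8>, Vec<u8>), // (left, right) encoded full keys
    table_ids: Vec<u32>,           // ascending table ids covered by this sst
    sst_size: u64,
}

// Split one sst at `split_key`; `split_table_id` is the table id encoded in the split key.
// Tables below it stay in the left half, the rest go to the right half. Whether each
// boundary is inclusive or exclusive is handled by the real implementation.
fn split_sst(sst: &SstInfo, split_key: &[u8], split_table_id: u32) -> (SstInfo, SstInfo) {
    let left = SstInfo {
        key_range: (sst.key_range.0.clone(), split_key.to_vec()),
        table_ids: sst.table_ids.iter().copied().filter(|t| *t < split_table_id).collect(),
        sst_size: sst.sst_size / 2, // no file statistics here, so estimate half
    };
    let right = SstInfo {
        key_range: (split_key.to_vec(), sst.key_range.1.clone()),
        table_ids: sst.table_ids.iter().copied().filter(|t| *t >= split_table_id).collect(),
        sst_size: sst.sst_size - left.sst_size, // keep left + right == original size
    };
    (left, right)
}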

This PR changes quite a bit of code, but the logic is now more uniform. Thank you very much for your review!

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@hzxa21 (Collaborator) left a comment:

Thanks for the efforts!


// `partition_vnode_count` only works inside a table, to avoid a lot of slicing sst, we only enable it in groups with high throughput and only one table.
// The target `table_ids` might be split to an existing group, so we need to try to update its config
if table_ids.len() == 1 {
Collaborator:

Do we need this logic for all CGs with a single table? If yes, only checking the provided table_ids is not enough. Consider the following example:

original cg table_ids = [1, 2, 3]
split table_ids = [2]

resulting cg1 table_ids = [1]
resulting cg2 table_ids = [2]
resulting cg3 table_ids = [3]

All the resulting CGs need this logic. I wonder whether we should do this inside split_compaction_group_impl.

Collaborator:

Another issue if we do it outside of split_compaction_group_impl: the compaction group txn will not be committed within the same transaction as the version delta, which may cause inconsistency when an error happens.

Contributor Author:

It is not required that all single-table CGs be partitioned; we just need to enable it for high-throughput CGs. We could even consider deprecating this feature, but for the current PR we choose to keep it. Let me think about how to implement it atomically.

// split 2
// See the example in L603 and the split rule in `split_compaction_group_impl`.
let result_vec = self
.split_compaction_group_impl(
Collaborator:

Do we need to make sure split 1 and split 2 happen atomically in the same txn? If not, what happens if split 1 succeeds but split 2 fails?

Contributor Author:

Actually, from my point of view, it doesn't need to be atomic.

If split 1 succeeds and split 2 fails (split condition/network), we just wait for the next split to be triggered, if necessary.
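
For illustration only, a tiny self-contained sketch (plain Rust with hypothetical names, not the meta-service code) of why carving table 2 out of a group holding [1, 2, 3] takes two split operations, matching the split 1 / split 2 naming above:

// Split a group of table ids at a boundary table: ids below the boundary stay in the
// left group, ids at or above it move to the new (right) group.
fn split_at(group: Vec<u32>, boundary_table: u32) -> (Vec<u32>, Vec<u32>) {
    group.into_iter().partition(|t| *t < boundary_table)
}

fn main() {
    let cg = vec![1, 2, 3];
    let (cg1, rest) = split_at(cg, 2); // split 1: [1] | [2, 3]
    let (cg2, cg3) = split_at(rest, 3); // split 2: [2] | [3]
    assert_eq!((cg1, cg2, cg3), (vec![1], vec![2], vec![3]));
}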

let split_key = if group_construct.split_key.is_some() {
Some(Bytes::from(group_construct.split_key.clone().unwrap()))
} else {
None
Collaborator:

Is this branch reachable?

Contributor Author:

It is reachable when constructing a new compaction group.

level.table_infos = level.table_infos[0..=pos].to_vec();
}
SstSplitType::Right => {
insert_table_infos.extend_from_slice(&level.table_infos[pos..]); // the sst at pos has been split to the right
Collaborator:

If we reach here, pos must be 0? Can we add an assertion for this?

Contributor Author @Li0k commented Sep 19, 2024:

I don't fully understand what you mean. The SstSplitType at L409 corresponds to the type of the sst at pos (it is not the same as splitting all table_infos to the right).

The value of pos depends on the values of table_infos and split_key at L397. Maybe I misunderstood you?

Collaborator:

We use get_split_pos to find the last SST with key_range_left < split_key if any, or return pos 0 otherwise. If need_to_split returns SstSplitType::Right, that means split_key >= key_range_left. The only possibility here is pos == 0.
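
A minimal sketch of the behavior described above, with a hypothetical signature (the real get_split_pos works on SstableInfo key ranges, not plain byte vectors):

// Return the index of the last sst whose key_range left bound is < split_key, or 0 if none.
fn get_split_pos(sst_left_bounds: &[Vec<u8>], split_key: &[u8]) -> usize {
    // number of leading ssts (sorted by key range) whose left bound is < split_key
    let cnt = sst_left_bounds.partition_point(|left| left.as_slice() < split_key);
    cnt.saturating_sub(1)
}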

FullKey::new(
TableId::from(table_id),
TableKey(vnode.to_be_bytes().to_vec()),
HummockEpoch::MIN,
Contributor:

Is it HummockEpoch::MAX?
