Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(frontend): support execute insert in local mode #8208

Merged
merged 2 commits into from
Mar 3, 2023
Merged

Conversation

ZENOTME
Copy link
Contributor

@ZENOTME ZENOTME commented Feb 27, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

related issue: #7684

support to execute insert without using select query in local mode, such as: insert into t values (1)

For other dml like: insert-select, delete, update, their plan has more than two stage so that we can't execute them in local mode.

Checklist For Contributors

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • I have demonstrated that backward compatibility is not broken by breaking changes and created issues to track deprecated features to be removed in the future. (Please refer to the issue)
  • All checks passed in ./risedev check (or alias, ./risedev c)

Checklist For Reviewers

  • I have requested macro/micro-benchmarks as this PR can affect performance substantially, and the results are shown.

Documentation

Click here for Documentation

Types of user-facing changes

  • SQL commands, functions, and operators

Release note

In past the insert will be executed in distributed mode, but now the insert without using select query will executed in local mode.

@github-actions github-actions bot added type/feature user-facing-changes Contains changes that are visible to users labels Feb 27, 2023
@ZENOTME ZENOTME changed the title feat(frontend): support execute insert in local mode (draft)feat(frontend): support execute insert in local mode Feb 28, 2023
@ZENOTME ZENOTME changed the title (draft)feat(frontend): support execute insert in local mode feat(frontend): support execute insert in local mode Feb 28, 2023
@ZENOTME ZENOTME force-pushed the zj/local_insert branch 2 times, most recently from b128755 to 2c9da12 Compare March 1, 2023 09:05
@codecov
Copy link

codecov bot commented Mar 1, 2023

Codecov Report

Merging #8208 (45b25a0) into main (1ad23ba) will decrease coverage by 0.02%.
The diff coverage is 62.28%.

@@            Coverage Diff             @@
##             main    #8208      +/-   ##
==========================================
- Coverage   71.65%   71.63%   -0.02%     
==========================================
  Files        1131     1131              
  Lines      184150   184230      +80     
==========================================
+ Hits       131948   131978      +30     
- Misses      52202    52252      +50     
Flag Coverage Δ
rust 71.63% <62.28%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/frontend/src/scheduler/local.rs 0.00% <0.00%> (ø)
src/frontend/src/handler/query.rs 17.74% <45.00%> (+1.78%) ⬆️
src/frontend/src/optimizer/mod.rs 90.88% <100.00%> (+0.80%) ⬆️
src/utils/pgwire/src/pg_response.rs 67.18% <0.00%> (-3.13%) ⬇️
...torage/src/hummock/local_version/pinned_version.rs 88.75% <0.00%> (-0.63%) ⬇️
src/meta/src/hummock/mock_hummock_meta_client.rs 65.46% <0.00%> (-0.52%) ⬇️
src/object_store/src/object/mem.rs 86.87% <0.00%> (-0.39%) ⬇️
src/storage/src/hummock/compactor/iterator.rs 96.40% <0.00%> (-0.28%) ⬇️
src/storage/src/hummock/sstable_store.rs 64.77% <0.00%> (-0.16%) ⬇️
src/stream/src/executor/aggregation/minput.rs 96.25% <0.00%> (-0.11%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ZENOTME ZENOTME requested review from kwannoel, BugenZhao and liurenjie1024 and removed request for kwannoel March 1, 2023 09:34
@lmatz
Copy link
Contributor

lmatz commented Mar 1, 2023

Do you want to test insert in both local mode and distributed mode in the nightly performance test?
https://buildkite.com/risingwave-test/sysbench

@ZENOTME
Copy link
Contributor Author

ZENOTME commented Mar 2, 2023

Do you want to test insert in both local mode and distributed mode in the nightly performance test? https://buildkite.com/risingwave-test/sysbench

Yes, how I test that

@ZENOTME ZENOTME closed this Mar 2, 2023
@ZENOTME ZENOTME reopened this Mar 2, 2023
@lmatz
Copy link
Contributor

lmatz commented Mar 2, 2023

Do you want to test insert in both local mode and distributed mode in the nightly performance test? https://buildkite.com/risingwave-test/sysbench

Yes, how I test that

QA team will add local mode into the existing performance test pipeline

Ok(vec![self.front_env.worker_node_manager().next_random()?])
}
} else {
Ok(self.front_env.worker_node_manager().list_worker_nodes())
Copy link
Contributor

@kwannoel kwannoel Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm should we list all worker nodes here?
What if parallelism = 3, and worker nodes = 5, does that mean we choose to schedule 5 workers?
Should it match exactly? i.e. choose N workers for N parallelism.

Copy link
Contributor

@liurenjie1024 liurenjie1024 Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be workers with target table only.

Copy link
Contributor Author

@ZENOTME ZENOTME Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have processed the case that the stage have scan node before, so in this case the stage don't have the scan node, I think we just need to randomlly select N parallelism worker?🤔

(Previous process is list worker node)

self.front_env.worker_node_manager().list_worker_nodes()

Comment on lines +477 to +483
let worker_node = {
let parallel_unit_ids = vnode_mapping.iter_unique().collect_vec();
let candidates = self.front_env
.worker_node_manager()
.get_workers_by_parallel_unit_ids(&parallel_unit_ids)?;
candidates.choose(&mut rand::thread_rng()).unwrap().clone()
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need get_workers_by_parallel_unit_ids here? Didn't use this before, so not too familiar.
In the case of Insert why is this needed?

Copy link
Contributor Author

@ZENOTME ZENOTME Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The insert executor will send the insert data to the reader registered by the dml executor.

.write_chunk(self.table_id, self.table_version_id, stream_chunk)

let batch_reader = batch_reader.stream_reader().into_stream();

Hence to access the reader, the insert executor need to schedule the same worker node of the dml executor. So get_workers_by_parallel_unit_ids is to get the worker node where dml executor stay in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the clear explanation!

Maybe this can be documented, since the logic is split in various places, it does not seem very clear to me at first glance.
(Unless some documentation already exists, in that case feel free to ignore).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with you. (Seems don't have a related doc)

Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, just require some refinement.

src/frontend/src/handler/query.rs Show resolved Hide resolved
Ok(vec![self.front_env.worker_node_manager().next_random()?])
}
} else {
Ok(self.front_env.worker_node_manager().list_worker_nodes())
Copy link
Contributor

@liurenjie1024 liurenjie1024 Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be workers with target table only.

Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ZENOTME ZENOTME added this pull request to the merge queue Mar 3, 2023
Merged via the queue into main with commit 64810b2 Mar 3, 2023
@ZENOTME ZENOTME deleted the zj/local_insert branch March 3, 2023 06:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature user-facing-changes Contains changes that are visible to users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants