feat(batch): support BATCH_PARALLELISM #7558

Closed
fuyufjh opened this issue Jan 28, 2023 · 1 comment
fuyufjh commented Jan 28, 2023

Should batch query jobs, i.e. SELECT, follow this parallelism?

Do you mean default parallelism of the non-scan stages? Good catch. 🤔

Yes, exactly.

Maybe we need another BATCH_PARALLELISM for this. 😄

Originally posted by @BugenZhao in #7370 (comment)

Currently, the parallelism of each stage in a distributed batch query is determined as follows:

  1. A scan stage's parallelism equals the number of partitions of the table/mv.
  2. The root stage's parallelism is always one.
  3. An intermediate stage's parallelism equals the number of worker nodes.

Sometimes users may want to increase a stage's parallelism, and we can have a session variable batch_parallelism for this.
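
To make the three rules concrete, here is a minimal Python sketch (not the actual implementation). The names `StageKind` and `stage_parallelism` are hypothetical, and the assumption that `batch_parallelism` would override only intermediate stages is a guess at the proposal's intent, not something this issue settles.

```python
from enum import Enum
from typing import Optional


class StageKind(Enum):
    """Hypothetical classification of stages in a distributed batch query."""
    SCAN = "scan"
    INTERMEDIATE = "intermediate"
    ROOT = "root"


def stage_parallelism(
    kind: StageKind,
    num_partitions: int,
    num_workers: int,
    batch_parallelism: Optional[int] = None,
) -> int:
    """Return the parallelism of one stage under the rules listed above.

    `batch_parallelism` models the proposed session variable. Assumption:
    it only overrides intermediate stages; None means "not set".
    """
    if kind is StageKind.ROOT:
        # Rule 2: the root stage always runs with parallelism one.
        return 1
    if kind is StageKind.SCAN:
        # Rule 1: one task per partition of the scanned table/mv.
        return num_partitions
    # Rule 3: intermediate stages default to the number of worker nodes,
    # unless the (assumed) session variable overrides that default.
    if batch_parallelism is not None:
        if batch_parallelism < 1:
            raise ValueError("batch_parallelism must be at least 1")
        return batch_parallelism
    return num_workers
```

With 16 partitions and 4 worker nodes, an intermediate stage gets parallelism 4 by default; under the assumed override, setting `batch_parallelism=8` would raise it to 8 while leaving the scan and root stages unchanged.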

@github-actions github-actions bot added this to the release-0.1.16 milestone Jan 28, 2023
@fuyufjh fuyufjh removed this from the release-0.1.17 milestone Feb 6, 2023
@liurenjie1024 liurenjie1024 changed the title support BATCH_PARALLELISM feat(batch): support BATCH_PARALLELISM Feb 16, 2023
ZENOTME commented Mar 16, 2023

closed by #8552
