SQL: Plan non-equijoin conditions as cross join followed by filter. #14978

Merged (5 commits) on Sep 19, 2023

@gianm gianm commented Sep 13, 2023

Druid has previously refused to execute joins with non-equality-based conditions. This was well-intentioned: the idea was to push people to write their queries in a different, hopefully more performant way.

But as we're moving towards fuller SQL support, it makes more sense to allow these conditions to go through with the best plan we can come up with right now: a cross join followed by a filter. In some cases this will allow the query to run, and people will be happy with that. In other cases, it will run into resource limits during execution. But we should at least give the query a chance.

This patch also updates the documentation to explain how people can tell whether their queries are being planned this way.
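
To illustrate the plan shape described above, here is a minimal Python sketch. It is not Druid code: the tables, column names, and helper functions are all hypothetical, and it only models inner-join semantics. It shows a non-equality condition evaluated as a filter over a cross join, and why the intermediate result can get large.

```python
from itertools import product

# Hypothetical in-memory tables for illustration; not taken from the patch.
orders = [{"id": 1, "amount": 50}, {"id": 2, "amount": 150}]
tiers = [{"tier": "low", "min_amount": 0}, {"tier": "high", "min_amount": 100}]


def cross_join(left, right):
    """Pair every left row with every right row: len(left) * len(right) rows."""
    return [{**l, **r} for l, r in product(left, right)]


def inner_join_via_cross_join(left, right, condition):
    """Evaluate an arbitrary inner-join condition as a filter over the cross join."""
    return [row for row in cross_join(left, right) if condition(row)]


# Non-equality condition: orders.amount >= tiers.min_amount
joined = inner_join_via_cross_join(
    orders, tiers, lambda row: row["amount"] >= row["min_amount"]
)
```

The cross join materializes every pairing before the filter runs, so its size is the product of the two input sizes. That growth is why a query planned this way may hit resource limits during execution.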

@LakshSingla LakshSingla removed the MSQ label Sep 15, 2023
Co-authored-by: Benedict Jin <asdf2014@apache.org>
@github-actions github-actions bot added the Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 label Sep 15, 2023

@LakshSingla LakshSingla left a comment

LGTM. We should also ship #14976 in the same release, since the doc changes in this PR reference the IS_NOT_DISTINCT_FROM function added in that patch.

gianm commented Sep 19, 2023

> LGTM. We should also ship #14976 in the same release, since the doc changes in this PR reference the IS_NOT_DISTINCT_FROM function added in that patch.

Good point. I'll merge this, and then let's make sure to get #14976 in prior to the next release.

@gianm gianm merged commit 4f498e6 into apache:master Sep 19, 2023
74 checks passed
@gianm gianm deleted the sql-join-extract-condition branch September 19, 2023 17:23
somu-imply added a commit to somu-imply/druid that referenced this pull request Sep 22, 2023
soumyava pushed a commit that referenced this pull request Sep 25, 2023
@LakshSingla LakshSingla added this to the 28.0 milestone Oct 12, 2023
abhishekagarwal87 added a commit that referenced this pull request Nov 29, 2023
…15302)

This PR revives #14978 with a few more bells and whistles. Instead of an unconditional cross join, we now split the join condition so that some sub-conditions are evaluated post-join. To decide which sub-condition goes where, I have refactored the DruidJoinRule class to extract unsupported sub-conditions. We build a postJoinFilter out of these unsupported sub-conditions and push it to the join.
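
The splitting idea can be sketched in Python as follows. This is an illustration only, not the DruidJoinRule implementation: `Cond`, `split_condition`, and `join` are hypothetical names, the condition is assumed to be a plain AND of column-to-column comparisons, only inner-join semantics are modeled, and SQL NULL handling is ignored.

```python
import operator
from collections import defaultdict
from dataclasses import dataclass

# Comparison operators this sketch understands (an assumption, not Druid's list).
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


@dataclass(frozen=True)
class Cond:
    left: str   # column on the left input
    op: str     # one of the keys in OPS
    right: str  # column on the right input


def split_condition(conjuncts):
    """Split AND-ed sub-conditions into equalities (kept in the join)
    and everything else (evaluated afterwards as a post-join filter)."""
    equi = [c for c in conjuncts if c.op == "="]
    post_join_filter = [c for c in conjuncts if c.op != "="]
    return equi, post_join_filter


def join(left_rows, right_rows, conjuncts):
    equi, post_join_filter = split_condition(conjuncts)
    # Index the right side on the equality columns.
    index = defaultdict(list)
    for r in right_rows:
        index[tuple(r[c.right] for c in equi)].append(r)
    out = []
    for l in left_rows:
        key = tuple(l[c.left] for c in equi)
        for r in index.get(key, []):
            # Apply the remaining sub-conditions after the equality match.
            if all(OPS[c.op](l[c.left], r[c.right]) for c in post_join_filter):
                out.append({**l, **r})
    return out


left = [{"l_key": 1, "l_ts": 10}, {"l_key": 1, "l_ts": 30}, {"l_key": 2, "l_ts": 5}]
right = [{"r_key": 1, "r_ts": 20}, {"r_key": 2, "r_ts": 1}]

# l_key = r_key AND l_ts < r_ts
mixed = join(left, right, [Cond("l_key", "=", "r_key"), Cond("l_ts", "<", "r_ts")])
# l_ts < r_ts only: no equality part, so every pair is considered
non_equi_only = join(left, right, [Cond("l_ts", "<", "r_ts")])
```

When the condition has no equality part, every row maps to the same empty key, so the sketch degenerates to a cross join followed by a filter. With an equality part present, only matching keys are paired before the filter runs.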
yashdeep97 pushed a commit to yashdeep97/druid that referenced this pull request Dec 1, 2023
Labels: Area - Batch Ingestion, Area - Documentation, Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262), Area - Querying, Area - SQL