[SEDONA-189] Prepare geometries in broadcast join #708

Merged: 1 commit, Nov 5, 2022

Conversation

tanelk (Contributor) commented Nov 4, 2022

Did you read the Contributor Guide?

Is this PR related to a JIRA ticket?

What changes were proposed in this PR?

Use prepared geometries in BroadcastIndexJoinExec to speed up queries. The JIRA ticket includes a simple dataset of polygons and points on which the speedup is around 4x.

Preparing a geometry has some overhead, but that cost is amortized when the prepared geometry is used several times. Using it in a broadcast join seems like a safe bet: the broadcast dataset is meant to be small.
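To illustrate the trade-off described above, here is a minimal, self-contained sketch of the prepare-once, query-many pattern. It is a toy analogue, not the JTS or Sedona implementation: `PreparedPolygon`, `contains_unprepared`, and the sample polygons and points are all hypothetical names and data made up for this example. The "preparation" here is only a cached bounding box and precomputed edge slopes.

```python
def contains_unprepared(ring, p):
    """Ray-casting point-in-polygon test that rescans the raw vertices on every call."""
    x, y = p
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * ((x2 - x1) / (y2 - y1)):
                inside = not inside
    return inside


class PreparedPolygon:
    """Hypothetical stand-in for a prepared geometry: pays a one-time setup cost."""

    def __init__(self, ring):
        xs = [x for x, _ in ring]
        ys = [y for _, y in ring]
        # One-time cost: cache the bounding box for cheap rejection.
        self.bbox = (min(xs), min(ys), max(xs), max(ys))
        # One-time cost: precompute the inverse slope of every non-horizontal edge.
        self.edges = []
        n = len(ring)
        for i in range(n):
            (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
            if y1 != y2:
                self.edges.append((x1, y1, y2, (x2 - x1) / (y2 - y1)))

    def contains(self, p):
        x, y = p
        min_x, min_y, max_x, max_y = self.bbox
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            return False  # rejected without touching any edge
        inside = False
        for x1, y1, y2, inv_slope in self.edges:
            if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * inv_slope:
                inside = not inside
        return inside


# Small "broadcast" side: a few polygons, each prepared exactly once.
polygons = {
    "square": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "triangle": [(5, 0), (9, 0), (7, 4)],
}
prepared = {name: PreparedPolygon(ring) for name, ring in polygons.items()}

# Larger "streamed" side: every point reuses the already-prepared polygons.
points = [(1, 1), (3.5, 2), (7, 1), (10, 10), (4.5, 2)]
matches = [
    (name, p)
    for p in points
    for name, poly in prepared.items()
    if poly.contains(p)
]
```

The setup work in `PreparedPolygon.__init__` is only worth paying when `contains` is called many times per polygon, which is the situation a small broadcast side joined against a large streamed side tends to produce. The prepared and unprepared paths return the same answers; only the cost per query differs.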

How was this patch tested?

By running the existing unit tests in the sql submodule.

Did this PR include necessary documentation updates?

  • No, this PR does not affect any public API so no need to change the docs.

jiayuasu (Member) commented Nov 5, 2022

@umartin Hi Martin, is this PreparedGeometry related to your https://issues.apache.org/jira/browse/SEDONA-178 ?

tanelk (Contributor Author) commented Nov 5, 2022

> @umartin Hi Martin, is this PreparedGeometry related to your https://issues.apache.org/jira/browse/SEDONA-178 ?

That seems like an unrelated issue.

PreparedGeometry is a tool in JTS that can speed up spatial predicates (contains, crosses, etc.). It does not directly affect distance queries and does not affect correctness - performance only.

umartin (Contributor) commented Nov 5, 2022

> > @umartin Hi Martin, is this PreparedGeometry related to your https://issues.apache.org/jira/browse/SEDONA-178 ?
>
> That seems like an unrelated issue.
>
> PreparedGeometry is a tool in JTS that can speed up spatial predicates (contains, crosses, etc.). It does not directly affect distance queries and does not affect correctness - performance only.

I agree. It only impacts performance. I was just going off on a tangent when discussing refactoring the spatial partitioning code. That might have caused some confusion.

@jiayuasu jiayuasu merged commit 0004580 into apache:master Nov 5, 2022
@tanelk tanelk deleted the SEDONA-189_prepared_broadcast branch November 7, 2022 12:32