[SEDONA-286] Support optimized distance join on ST_DistanceSphere and ST_DistanceSpheroid #845

Merged on May 31, 2023 (4 commits)

Member

@jiayuasu jiayuasu commented May 29, 2023

Did you read the Contributor Guide?

Is this PR related to a JIRA ticket?

What changes were proposed in this PR?

  1. Added join patterns for ST_DistanceSpheroid- and ST_DistanceSphere-based joins.
  2. Expanded the buffer area using the formula provided by Kristin:

$$ \begin{align*} \Delta \textrm{lat} &= \frac{D}{R} \\ \Delta \textrm{lon} &= \max (\frac{D}{R \cos(\textrm{lat} + \Delta \textrm{lat})}, \frac{D}{R \cos(\textrm{lat} - \Delta \textrm{lat})}) \end{align*} $$

where $\textrm{lat}$ and $\textrm{lon}$ are the coordinates of the envelope's center and $R$ is the radius of the earth. The center point is expanded by $\Delta \textrm{lat}$ and $\Delta \textrm{lon}$ to obtain the envelope for coarse-grained filtering. The envelope is then expanded by a constant factor of 1.1 to get rid of false negatives.

For spheroid and sphere distance, we use the same formula with the same $R$ value. We take the polar radius of the spheroid as $R$ (approximately 6357 km). Because this is the smallest radius of the spheroid, it yields the largest angular expansion, so the resulting envelope should be inclusive.
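As a rough illustration of the expansion described above, here is a minimal Python sketch. It is not the code in this PR: the function name `expansion_degrees`, the constant `POLAR_RADIUS_M`, and the choice to apply the 1.1 factor directly to the two deltas are assumptions made for the example. Note that $D / R$ is an angle in radians, so the sketch converts the result to degrees.

```python
import math

# Approximate polar radius quoted above, in meters (assumed constant name).
POLAR_RADIUS_M = 6_357_000.0


def expansion_degrees(lat_deg, distance_m, radius_m=POLAR_RADIUS_M, factor=1.1):
    """Return (dlat, dlon) in degrees for a center at latitude lat_deg.

    Hypothetical helper: applies the delta-lat / delta-lon formula and then
    scales both deltas by `factor` (an assumption about how the 1.1 is used).
    """
    lat = math.radians(lat_deg)
    dlat = distance_m / radius_m  # angular distance in radians
    # cos() reaches zero at the poles, so the formula is not usable there.
    if abs(lat) + dlat >= math.pi / 2:
        raise ValueError("expanded latitude reaches a pole")
    dlon = max(
        distance_m / (radius_m * math.cos(lat + dlat)),
        distance_m / (radius_m * math.cos(lat - dlat)),
    )
    return math.degrees(dlat) * factor, math.degrees(dlon) * factor


# Example: expansion for a 100 m threshold at the equator and at 60 degrees north.
equator = expansion_degrees(0.0, 100.0)
north = expansion_degrees(60.0, 100.0)
```

At 60 degrees latitude the longitude delta comes out at roughly twice the latitude delta, since $\cos(60^\circ) = 0.5$.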

Users can now trigger our advanced spatial join algorithms (distance join and broadcast index join) with an ST_DistanceSpheroid or ST_DistanceSphere predicate.

SELECT *
FROM df1, df2
WHERE ST_DistanceSpheroid(df1.geom, df2.geom) < 100

Here, 100 means 100 meters.

How was this patch tested?

Passed new unit tests

Did this PR include necessary documentation updates?

  • Yes, I have updated the documentation.

@Kontinuation
Member

The transformation from sphere/spheroid distance to radius distance may not be an inclusive transformation, which means that $\forall (p_0, p_1, d) [ \textrm{DistanceSphere}(p_0, p_1) < d \implies \textrm{Distance}(p_0, p_1) < \frac{d}{111000} ]$ does not hold. Applying the optimization may result in fewer results being produced. If this is the case, we have to document that such queries may not produce accurate results, especially for geometries near the poles.
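To make the concern concrete, the following Python sketch builds one pair of points at latitude 80 degrees that is within 100 meters on the sphere but farther apart than $\frac{d}{111000}$ in planar degrees. It uses a plain haversine formula as a stand-in for ST_DistanceSphere and an assumed mean earth radius; the helper names and the sample coordinates are made up for the example.

```python
import math

# Assumed mean earth radius in meters, used only for this illustration.
EARTH_MEAN_RADIUS_M = 6_371_008.8


def haversine_m(lon0, lat0, lon1, lat1, radius_m=EARTH_MEAN_RADIUS_M):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi0, phi1 = math.radians(lat0), math.radians(lat1)
    dphi = phi1 - phi0
    dlam = math.radians(lon1 - lon0)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi0) * math.cos(phi1) * math.sin(dlam / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def planar_degrees(lon0, lat0, lon1, lat1):
    """Euclidean distance in degrees, treating lon/lat as planar coordinates."""
    return math.hypot(lon1 - lon0, lat1 - lat0)


d = 100.0          # distance threshold in meters
p0 = (0.0, 80.0)   # (lon, lat), hypothetical point near the pole
p1 = (0.004, 80.0)

sphere_dist = haversine_m(*p0, *p1)
degree_dist = planar_degrees(*p0, *p1)

within_sphere = sphere_dist < d             # the pair matches the predicate
within_degrees = degree_dist < d / 111_000  # but the degree filter rejects it
print(within_sphere, within_degrees)
```

A coarse filter based on `d / 111000` degrees would drop this pair before the exact predicate is evaluated, which is the false negative described above.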

@jiayuasu
Member Author

@Kontinuation Thanks. Is there a better solution than this? More accurate but also easy to implement?

@Kontinuation
Member

Kontinuation commented May 31, 2023

We can derive a more accurate formula for the lat-lon envelope of a haversine distance $D$:

$$ \begin{align*} \Delta \textrm{lat} &= \frac{D}{R} \\ \Delta \textrm{lon} &= \max (\frac{D}{R \cos(\textrm{lat} + \Delta \textrm{lat})}, \frac{D}{R \cos(\textrm{lat} - \Delta \textrm{lat})}) \end{align*} $$

where $\textrm{lat}$ and $\textrm{lon}$ are the coordinates of the envelope's center and $R$ is the mean radius of the earth. The center point is expanded by $\Delta \textrm{lat}$ and $\Delta \textrm{lon}$ to obtain the envelope for coarse-grained filtering. The envelope can also be expanded by a constant factor slightly greater than 1 to get rid of false negatives.

For spheroid distance, we can use the same formula for sphere distance, but with a different $R$ value. We can take the polar radius of the spheroid as $R$ (approximately 6357 km) and the resulting envelope should be inclusive.
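One way to sanity-check the sphere formula numerically is to walk around the circle of points at angular distance $D / R$ from a center and confirm that each one falls inside the envelope. The Python sketch below does this with the standard destination-point formula on a sphere. All function names are hypothetical, the tolerance only absorbs floating-point rounding, and the check covers the sphere case only, not the spheroid.

```python
import math


def envelope_deltas(lat, delta):
    """Return (dlat, dlon) in radians for center latitude `lat` (radians)
    and angular distance `delta` = D / R (radians)."""
    dlon = max(delta / math.cos(lat + delta), delta / math.cos(lat - delta))
    return delta, dlon


def destination(lat, lon, delta, bearing):
    """Point reached from (lat, lon) after angular distance `delta` along
    `bearing` on a sphere; all values in radians."""
    lat2 = math.asin(math.sin(lat) * math.cos(delta)
                     + math.cos(lat) * math.sin(delta) * math.cos(bearing))
    lon2 = lon + math.atan2(
        math.sin(bearing) * math.sin(delta) * math.cos(lat),
        math.cos(delta) - math.sin(lat) * math.sin(lat2),
    )
    return lat2, lon2


def envelope_contains_circle(lat_deg, distance_m, radius_m, steps=360, tol=1e-9):
    """True if every sampled point at `distance_m` from the center lies
    inside the expanded envelope (up to the rounding tolerance `tol`)."""
    lat = math.radians(lat_deg)
    delta = distance_m / radius_m
    dlat, dlon = envelope_deltas(lat, delta)
    for i in range(steps):
        bearing = 2 * math.pi * i / steps
        lat2, lon2 = destination(lat, 0.0, delta, bearing)
        if abs(lat2 - lat) > dlat + tol or abs(lon2) > dlon + tol:
            return False
    return True


# Example: a 100 km radius around centers at a few latitudes.
results = [envelope_contains_circle(lat, 100_000.0, 6_371_008.8)
           for lat in (0.0, 60.0, -70.0)]
```

This only samples a finite number of bearings and a few latitudes, so it is a spot check rather than a proof of inclusiveness.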

I'm not sure I'm doing the math correctly; however, I found similar equations people are using here, so I guess it should be correct.

@jiayuasu jiayuasu changed the title [SEDONA-286] Support optimized distance join on ST_DistanceSpheroid [SEDONA-286] Support optimized distance join on ST_DistanceSphere and ST_DistanceSpheroid May 31, 2023
@jiayuasu jiayuasu merged commit dc3c563 into master May 31, 2023
39 checks passed
@jiayuasu jiayuasu deleted the geography-join branch May 31, 2023 08:01