After upgrading Spark to 3.5.1, I tried using LATERAL to calculate 1-NN (1-nearest-neighbour).
I want to get each point's 1-NN within the same table, data_points(id, longitude, latitude), using Sedona.
Actual behavior
Spark does not support this type of LATERAL subquery.
Steps to reproduce the problem
with t_data as (
  select id, st_point(longitude, latitude) as point
  from data_points
  order by 1
  limit 1000
)
select *
from t_data t1, lateral (
  select t2.id, ST_DistanceSpheroid(t1.point, t2.point) as distance
  from t_data t2
  where t1.id != t2.id
  order by 2
  limit 1
)
Spark throws:
"org.apache.spark.sql.catalyst.ExtendedAnalysisException: [UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.ACCESSING_OUTER_QUERY_COLUMN_IS_NOT_ALLOWED] Unsupported subquery expression: Accessing outer query column is not allowed in this locationProject"
I just want to know how to optimize 1-NN on a large dataset, instead of filtering on row_number(order by distance) = 1.
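For context, the row_number baseline mentioned above might look roughly like the sketch below. It reuses the table and functions from the reproduction query; the CTE names pairs and ranked and the column names neighbor_id and rn are made up for illustration. It compares every point with every other point, so it is shown only to make the question concrete, not as a recommended solution.

```sql
-- Sketch of the row_number baseline (illustrative names: pairs, ranked, neighbor_id, rn).
with t_data as (
  select id, st_point(longitude, latitude) as point
  from data_points
  order by 1
  limit 1000
),
pairs as (
  -- every ordered pair of distinct points, with its distance
  select t1.id as id,
         t2.id as neighbor_id,
         ST_DistanceSpheroid(t1.point, t2.point) as distance
  from t_data t1
  join t_data t2 on t1.id != t2.id
),
ranked as (
  -- rank the candidate neighbours of each point by distance
  select id, neighbor_id, distance,
         row_number() over (partition by id order by distance) as rn
  from pairs
)
select id, neighbor_id, distance
from ranked
where rn = 1
```

The inequality join produces on the order of n² pairs before the window function runs, which is why this form becomes expensive on a large dataset.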
Settings
Sedona version = 1.5.1
Apache Spark version = 3.5.1
API type = Scala
Scala version = 2.12
JRE version = 1.8
Environment = Standalone
Expected behavior
Reference: https://postgis.net/workshops/postgis-intro/knn.html
https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-lateral-subquery.html