[SEDONA-238] Implement OGC GeometryType #873

Merged
merged 13 commits into from
Jun 27, 2023
@@ -594,6 +594,14 @@ public static String geometryType(Geometry geometry) {
return "ST_" + geometry.getGeometryType();
}

public static String geometryTypeWithMeasured(Geometry geometry) {
String geometryType = geometry.getGeometryType().toUpperCase();
if (GeomUtils.isMeasuredGeometry(geometry)) {
geometryType += "M";
}
return geometryType;
}

public static Geometry startPoint(Geometry geometry) {
if (geometry instanceof LineString) {
LineString line = (LineString) geometry;
@@ -489,4 +489,11 @@ public static Double getHausdorffDistance(Geometry g1, Geometry g2, double densi
}
return hausdorffDistanceObj.distance();
}

public static Boolean isMeasuredGeometry(Geometry geom) {
Coordinate[] coordinates = geom.getCoordinates();
GeometryFactory geometryFactory = new GeometryFactory();
Contributor

Sedona will not support creating GeometryCollections with hybrid dimensions. Hence, to check whether a measure is present, it is better to poll one coordinate using getCoordinate and check for M with !Double.isNaN(coordinate.getM()). This avoids having to create a GeometryFactory and a coordinate sequence every time.

CoordinateSequence sequence = geometryFactory.getCoordinateSequenceFactory().create(coordinates);
return sequence.getDimension() > 2 && sequence.getMeasures() > 0;
}
}
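
The reviewer's suggestion in the comment above (poll one coordinate and test its M value instead of building a GeometryFactory and a CoordinateSequence) can be sketched roughly as follows. This is a minimal illustration, not the code that was merged: `MeasuredCheckSketch` and its nested `Coord` class are hypothetical stand-ins so the snippet compiles without JTS on the classpath. It assumes the JTS convention that `getM()` returns `Double.NaN` when a coordinate carries no measure, and it relies on the reviewer's premise that a geometry never mixes dimensions.

```java
// Hypothetical sketch: none of these names exist in Sedona or JTS.
final class MeasuredCheckSketch {

    // Stand-in for a JTS coordinate. As in JTS, a missing Z or M is stored as NaN.
    static final class Coord {
        final double x, y, z, m;

        private Coord(double x, double y, double z, double m) {
            this.x = x;
            this.y = y;
            this.z = z;
            this.m = m;
        }

        // XY coordinate: no Z, no M.
        static Coord xy(double x, double y) {
            return new Coord(x, y, Double.NaN, Double.NaN);
        }

        // XYM coordinate: no Z, explicit measure.
        static Coord xym(double x, double y, double m) {
            return new Coord(x, y, Double.NaN, m);
        }

        double getM() {
            return m;
        }
    }

    // Polls only the first coordinate. This is sufficient only under the
    // assumption that all coordinates of a geometry share one dimensionality.
    static boolean isMeasured(Coord... coordinates) {
        if (coordinates.length == 0) {
            return false; // an empty geometry has no coordinate to poll
        }
        return !Double.isNaN(coordinates[0].getM());
    }
}
```

In Sedona itself the same idea would presumably call `geom.getCoordinate()` and check `getM()` on the result, with a guard for the `null` that JTS returns for an empty geometry.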
@@ -1078,6 +1078,34 @@ public void affine2DHybridGeomCollection() {
assertEquals(expectedPolygon2.toText(), actualGeomCollection.getGeometryN(0).getGeometryN(1).getGeometryN(1).toText());
}

@Test
public void geometryTypeWithMeasured() {
Contributor

There is already a static GEOMETRY_FACTORY object for creating geometries, so there's no need to create new GeometryFactory() objects in the tests.
Please use GEOMETRY_FACTORY here.

String expected = "POINT";
String actual = Functions.geometryTypeWithMeasured(GEOMETRY_FACTORY.createPoint(new Coordinate(10, 5)));
assertEquals(expected, actual);

// Create a point with measure value
CoordinateXYM coords = new CoordinateXYM(2, 3, 4);
Point measuredPoint = new GeometryFactory().createPoint(coords);
String expected2 = "POINTM";
String actual2 = Functions.geometryTypeWithMeasured(measuredPoint);
assertEquals(expected2, actual2);

// Create a linestring with measure value
CoordinateXYM[] coordsLineString = new CoordinateXYM[] {new CoordinateXYM(1, 2, 3), new CoordinateXYM(4, 5, 6)};
LineString measuredLineString = new GeometryFactory().createLineString(coordsLineString);
String expected3 = "LINESTRINGM";
Contributor

Please add a test case covering geometry collections.

String actual3 = Functions.geometryTypeWithMeasured(measuredLineString);
assertEquals(expected3, actual3);

Contributor

Have you checked XYZM coordinate geometries? I see that PostGIS does not implement that properly even though its documentation says otherwise.
Does our function return, say, POINTM for a POINT with XYZM coordinates? Please add test cases for that.

Contributor Author

@yyy1000 yyy1000 Jun 27, 2023

Our function can return POINTM for a POINT with XYZM coordinates, but it seems that PostGIS can't:

postgres=# SELECT GeometryType(ST_GeomFromText('POINTM(0 0 1 0)'));
ERROR:  can not mix dimensionality in a geometry
HINT:  "POINTM(0 0 1 0)" <-- parse error at position 20 within geometry

Should I add a test for this in this file?

Contributor

Try POINT (0 0 1 0) or POINT ZM (0 0 1 0) on PostGIS; they don't throw an error, but return POINT.

In our case, yes, please add test cases for XYZM.

// Create a polygon with measure value
CoordinateXYM[] coordsPolygon = new CoordinateXYM[] {new CoordinateXYM(0, 0, 0), new CoordinateXYM(1, 1, 0), new CoordinateXYM(0, 1, 0), new CoordinateXYM(0, 0, 0)};
Polygon measuredPolygon = new GeometryFactory().createPolygon(coordsPolygon);
String expected4 = "POLYGONM";
String actual4 = Functions.geometryTypeWithMeasured(measuredPolygon);
assertEquals(expected4, actual4);
}
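
The naming behavior discussed in this thread can be summarized in a small sketch: the type name is upper-cased, Z never changes it, and any non-NaN measure (including 0, as in the polygon test above) appends `M`. `GeometryTypeNameSketch` and its `typeName` method are hypothetical and only mirror the logic of `geometryTypeWithMeasured`. The XYZM result follows the author's statement in the thread rather than a test in this diff. Unlike the diff, the sketch passes `Locale.ROOT` to `toUpperCase` so the result does not depend on the default locale; that is a choice made here, not something the PR does.

```java
import java.util.Locale;

// Hypothetical sketch: this class does not exist in Sedona.
final class GeometryTypeNameSketch {

    // jtsTypeName: the value Geometry.getGeometryType() would return, e.g. "Point".
    // z, m: Z and M of a representative coordinate, NaN when absent (JTS convention).
    static String typeName(String jtsTypeName, double z, double m) {
        String name = jtsTypeName.toUpperCase(Locale.ROOT);
        // z is deliberately ignored: only the presence of M changes the name.
        return Double.isNaN(m) ? name : name + "M";
    }
}
```

Under these assumptions, an XY or XYZ point yields `POINT`, while an XYM or XYZM point yields `POINTM`, and a measured geometry collection yields `GEOMETRYCOLLECTIONM`.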

@Test
public void hausdorffDistanceDefaultGeom2D() throws Exception {
Polygon polygon1 = GEOMETRY_FACTORY.createPolygon(coordArray3d(1, 0, 1, 1, 1, 2, 2, 1, 5, 2, 0, 1, 1, 0, 1));
22 changes: 22 additions & 0 deletions docs/api/flink/Function.md
@@ -1,3 +1,25 @@
## GeometryType

Introduction: Returns the type of the geometry as a string. Eg: 'LINESTRING', 'POLYGON', 'MULTIPOINT', etc.
Contributor

Please add an explanation of measured geometries to the documentation, along with examples.


Format: `GeometryType (A:geometry)`

Since: `v1.5.0`

Example:

```sql
SELECT GeometryType(ST_GeomFromText('LINESTRING(77.29 29.07,77.42 29.26,77.27 29.31,77.29 29.07)'));
```

Result:

```
geometrytype
--------------
LINESTRING
```

## ST_3DDistance

Introduction: Return the 3-dimensional minimum cartesian distance between A and B
22 changes: 22 additions & 0 deletions docs/api/sql/Function.md
@@ -1,3 +1,25 @@
## GeometryType

Contributor

Please add an explanation of measured geometries to the documentation, along with examples.

Introduction: Returns the type of the geometry as a string. Eg: 'LINESTRING', 'POLYGON', 'MULTIPOINT', etc.

Format: `GeometryType (A:geometry)`

Since: `v1.5.0`

Example:

```sql
SELECT GeometryType(ST_GeomFromText('LINESTRING(77.29 29.07,77.42 29.26,77.27 29.31,77.29 29.07)'));
```

Result:

```
geometrytype
--------------
LINESTRING
```

## ST_3DDistance

Introduction: Return the 3-dimensional minimum cartesian distance between A and B
1 change: 1 addition & 0 deletions flink/src/main/java/org/apache/sedona/flink/Catalog.java
@@ -36,6 +36,7 @@ public static UserDefinedFunction[] getFuncs() {
new Constructors.ST_GeomFromKML(),
new Constructors.ST_MPolyFromText(),
new Constructors.ST_MLineFromText(),
new Functions.GeometryType(),
new Functions.ST_Area(),
new Functions.ST_AreaSpheroid(),
new Functions.ST_Azimuth(),
@@ -21,6 +21,14 @@
import org.opengis.referencing.operation.TransformException;

public class Functions {
public static class GeometryType extends ScalarFunction {
@DataTypeHint("String")
public String eval(@DataTypeHint(value = "RAW", bridgedTo = org.locationtech.jts.geom.Geometry.class) Object o) {
Geometry geom = (Geometry) o;
return org.apache.sedona.common.Functions.geometryTypeWithMeasured(geom);
}
}

public static class ST_Area extends ScalarFunction {
@DataTypeHint("Double")
public Double eval(@DataTypeHint(value = "RAW", bridgedTo = org.locationtech.jts.geom.Geometry.class) Object o) {
13 changes: 12 additions & 1 deletion flink/src/test/java/org/apache/sedona/flink/FunctionTest.java
@@ -153,7 +153,6 @@ public void testTransformWKT() throws FactoryException {

}


@Test
public void testDimension(){
Table pointTable = tableEnv.sqlQuery(
@@ -164,6 +163,7 @@ public void testDimension(){
"SELECT ST_Dimension(ST_GeomFromWKT('GEOMETRYCOLLECTION(MULTIPOLYGON(((0 0, 0 1, 1 1, 1 0, 0 0)), ((2 2, 2 3, 3 3, 3 2, 2 2))), MULTIPOINT(6 6, 7 7, 8 8))'))");
assertEquals(2, first(pointTable).getField(0));
}

@Test
public void testDistance() {
Table pointTable = createPointTable(testDataSize);
@@ -250,6 +250,17 @@ public void testGeomToGeoHash() {
assertEquals(first(pointTable).getField(0), "s0000");
}

@Test
public void testGeometryType() {
Table pointTable = tableEnv.sqlQuery(
"SELECT GeometryType(ST_GeomFromText('LINESTRING(77.29 29.07,77.42 29.26,77.27 29.31,77.29 29.07)'))");
assertEquals("LINESTRING", first(pointTable).getField(0));

pointTable = tableEnv.sqlQuery(
"SELECT GeometryType(ST_GeomFromText('POINTM(2.0 3.5 10.2)'))");
assertEquals("POINTM", first(pointTable).getField(0));
}

@Test
public void testPointOnSurface() {
Table pointTable = createPointTable_real(testDataSize);
11 changes: 11 additions & 0 deletions python/sedona/sql/st_functions.py
@@ -24,6 +24,7 @@


__all__ = [
"GeometryType",
"ST_3DDistance",
"ST_AddPoint",
"ST_Area",
@@ -120,6 +121,16 @@

_call_st_function = partial(call_sedona_function, "st_functions")

@validate_argument_types
def GeometryType(geometry: ColumnOrName):
"""Return the type of the geometry as a string.
Contributor

Please add an explanation of measured geometries to the documentation.


:param geometry: Geometry column to determine the type of.
:type geometry: ColumnOrName
:return: Type of geometry as a string column.
:rtype: Column
"""
return _call_st_function("GeometryType", geometry)

@validate_argument_types
def ST_3DDistance(a: ColumnOrName, b: ColumnOrName) -> Column:
1 change: 1 addition & 0 deletions python/tests/sql/test_dataframe_api.py
@@ -49,6 +49,7 @@
(stc.ST_PolygonFromText, ("multiple_point", lambda: f.lit(',')), "constructor", "", "POLYGON ((0 0, 1 0, 1 1, 0 0))"),

# functions
(stf.GeometryType, ("line",), "linestring_geom", "", "LINESTRING"),
(stf.ST_3DDistance, ("a", "b"), "two_points", "", 5.0),
(stf.ST_Affine, ("geom", 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0), "square_geom", "", "POLYGON ((2 3, 4 5, 5 6, 3 4, 2 3))"),
(stf.ST_AddPoint, ("line", lambda: f.expr("ST_Point(1.0, 1.0)")), "linestring_geom", "", "LINESTRING (0 0, 1 0, 2 0, 3 0, 4 0, 5 0, 1 1)"),
@@ -36,6 +36,7 @@ object Catalog {

val expressions: Seq[FunctionDescription] = Seq(
// Expression for vectors
function[GeometryType](),
function[ST_PointFromText](),
function[ST_PolygonFromText](),
function[ST_LineStringFromText](),
@@ -1018,6 +1018,7 @@ case class ST_Dimension(inputExpressions: Seq[Expression])
copy(inputExpressions = newChildren)
}
}

case class ST_BoundingDiagonal(inputExpressions: Seq[Expression])
extends InferredExpression(Functions.boundingDiagonal _) {
protected def withNewChildrenInternal(newChildren: IndexedSeq[Expression]) = {
@@ -1031,3 +1032,10 @@ case class ST_HausdorffDistance(inputExpressions: Seq[Expression])
copy(inputExpressions = newChildren)
}
}

case class GeometryType(inputExpressions: Seq[Expression])
extends InferredExpression(Functions.geometryTypeWithMeasured _) with FoldableExpression {
protected def withNewChildrenInternal(newChildren: IndexedSeq[Expression]) = {
copy(inputExpressions = newChildren)
}
}
@@ -23,6 +23,9 @@ import org.apache.spark.sql.sedona_sql.expressions.collect.{ST_Collect}
import org.locationtech.jts.operation.buffer.BufferParameters

object st_functions extends DataFrameAPI {
def GeometryType(geometry: Column): Column = wrapExpression[GeometryType](geometry)
def GeometryType(geometry: String): Column = wrapExpression[GeometryType](geometry)

def ST_3DDistance(a: Column, b: Column): Column = wrapExpression[ST_3DDistance](a, b)
def ST_3DDistance(a: String, b: String): Column = wrapExpression[ST_3DDistance](a, b)

@@ -1038,5 +1038,13 @@ class dataFrameAPITestScala extends TestBaseScala {
assert(expected == actual)
assert(expected == actualDefaultValue)
}

it("Passed GeometryType") {
val polyDf = sparkSession.sql("SELECT ST_GeomFromWKT('POLYGON ((1 2, 2 1, 2 0, 4 1, 1 2))') AS geom")
val df = polyDf.select(GeometryType("geom"))
val expected = "POLYGON"
val actual = df.take(1)(0).get(0).asInstanceOf[String]
assert(expected == actual)
}
}
}
@@ -2062,4 +2062,23 @@ class functionTestScala extends TestBaseScala with Matchers with GeometrySample
}
}

it ("should pass GeometryType") {
val geomTestCases = Map (
("'POINT (51.3168 -0.56)'") -> "POINT",
("'POINT (0 0 1)'") -> "POINT",
("'LINESTRING (0 0, 0 90)'") -> "LINESTRING",
("'POLYGON ((0 0,0 5,5 0,0 0))'") -> "POLYGON",
("'POINTM (1 2 3)'") -> "POINTM",
("'LINESTRINGM (0 0 1, 0 90 1)'") -> "LINESTRINGM",
("'POLYGONM ((0 0 1,0 5 1,5 0 1,0 0 1))'") -> "POLYGONM"
)
for ((geom, expectedResult) <- geomTestCases) {
// Quote the expected value so SQL reads it as a string literal, not a column name
val df = sparkSession.sql(s"SELECT GeometryType(ST_GeomFromWKT($geom)), " +
s"'$expectedResult'")
val actual = df.take(1)(0).get(0).asInstanceOf[String]
val expected = df.take(1)(0).get(1).asInstanceOf[String]
assertEquals(expected, actual)
}
}

}