[SEDONA-461] Add ST_IsValidReason #1181

Merged
merged 1 commit into from
Jan 5, 2024
25 changes: 25 additions & 0 deletions common/src/main/java/org/apache/sedona/common/Functions.java
Expand Up @@ -50,6 +50,7 @@
import org.locationtech.jts.operation.linemerge.LineMerger;
import org.locationtech.jts.operation.valid.IsSimpleOp;
import org.locationtech.jts.operation.valid.IsValidOp;
import org.locationtech.jts.operation.valid.TopologyValidationError;
import org.locationtech.jts.precision.GeometryPrecisionReducer;
import org.locationtech.jts.simplify.TopologyPreservingSimplifier;
import org.wololo.jts2geojson.GeoJSONWriter;
Expand All @@ -71,6 +72,8 @@ public class Functions {
private static GeometryCollection EMPTY_GEOMETRY_COLLECTION = GEOMETRY_FACTORY.createGeometryCollection(null);
private static final double DEFAULT_TOLERANCE = 1e-6;
private static final int DEFAULT_MAX_ITER = 1000;
private static final int OGC_SFS_VALIDITY = 0; // Use usual OGC SFS validity semantics
private static final int ESRI_VALIDITY = 1; // ESRI validity model

public static double area(Geometry geometry) {
return geometry.getArea();
Expand Down Expand Up @@ -1272,4 +1275,26 @@ public static Double hausdorffDistance(Geometry g1, Geometry g2, double densityF
public static Double hausdorffDistance(Geometry g1, Geometry g2) throws Exception{
return GeomUtils.getHausdorffDistance(g1, g2, -1);
}

public static String isValidReason(Geometry geom) {
return isValidReason(geom, OGC_SFS_VALIDITY);
}

public static String isValidReason(Geometry geom, int flags) {
IsValidOp isValidOp = new IsValidOp(geom);

// Set the validity model based on flags
if (flags == ESRI_VALIDITY) {
isValidOp.setSelfTouchingRingFormingHoleValid(true);
} else {
isValidOp.setSelfTouchingRingFormingHoleValid(false);
}

if (isValidOp.isValid()) {
return "Valid Geometry";
} else {
TopologyValidationError error = isValidOp.getValidationError();
return error.toString();
}
}
}
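A note on the flag handling in `isValidReason(Geometry, int)`: the method compares `flags` to `ESRI_VALIDITY` with `==`, so only the exact value 1 switches on the ESRI model. The sketch below restates that branch in Python without JTS. The function name `allows_self_touching_ring_hole` is hypothetical and exists only for illustration.

```python
OGC_SFS_VALIDITY = 0  # mirrors the Java constant in the diff
ESRI_VALIDITY = 1     # mirrors the Java constant in the diff


def allows_self_touching_ring_hole(flags: int) -> bool:
    """Hypothetical stand-in for the if/else that configures IsValidOp.

    Returns the boolean that the Java code would pass to
    setSelfTouchingRingFormingHoleValid for the given flags value.
    """
    # Equality check, as in the Java code: this is not a bit test.
    return flags == ESRI_VALIDITY


# Summarize the behavior for a few flag values.
behavior = {flags: allows_self_touching_ring_hole(flags) for flags in (0, 1, 2, 3)}
```

Under this equality check, a value such as 3 behaves like 0 even though its lowest bit is set, which may matter if the parameter is ever treated as a true bitfield.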
Expand Up @@ -1884,4 +1884,32 @@ public void lineLocatePoint() {
assertEquals(expectedResult2, actual2, FP_TOLERANCE);
assertEquals(expectedResult3, actual3, FP_TOLERANCE);
}

@Test
public void isValidReason() {
// Valid geometry
Geometry validGeom = GEOMETRY_FACTORY.createPolygon(coordArray(30, 10, 40, 40, 20, 40, 10, 20, 30, 10));
String validReasonDefault = Functions.isValidReason(validGeom);
assertEquals("Valid Geometry", validReasonDefault);

Integer OGC_SFS_VALIDITY = 0;
Integer ESRI_VALIDITY = 1;

String validReasonOGC = Functions.isValidReason(validGeom, OGC_SFS_VALIDITY);
assertEquals("Valid Geometry", validReasonOGC);

String validReasonESRI = Functions.isValidReason(validGeom, ESRI_VALIDITY);
assertEquals("Valid Geometry", validReasonESRI);

// Invalid geometry (self-intersection)
Geometry invalidGeom = GEOMETRY_FACTORY.createPolygon(coordArray(30, 10, 40, 40, 20, 40, 30, 10, 10, 20, 30, 10));
String invalidReasonDefault = Functions.isValidReason(invalidGeom);
assertEquals("Ring Self-intersection at or near point (30.0, 10.0, NaN)", invalidReasonDefault);

String invalidReasonOGC = Functions.isValidReason(invalidGeom, OGC_SFS_VALIDITY);
assertEquals("Ring Self-intersection at or near point (30.0, 10.0, NaN)", invalidReasonOGC);

String invalidReasonESRI = Functions.isValidReason(invalidGeom, ESRI_VALIDITY);
assertEquals("Self-intersection at or near point (10.0, 20.0, NaN)", invalidReasonESRI);
}
}
50 changes: 50 additions & 0 deletions docs/api/flink/Function.md
Expand Up @@ -1610,6 +1610,56 @@ Output:
false
```

## ST_IsValidReason

Introduction: Returns text stating whether the geometry is valid; if it is invalid, the text gives the reason. The function can be called with the geometry alone or with an additional integer flag that selects the validity model. The flag takes one of the following values:

- 0 (default): Use the usual OGC SFS (Simple Features Specification) validity semantics.
- 1: "ESRI flag". Accepts certain self-touching rings as valid that are considered invalid under OGC standards.

Formats:
```
ST_IsValidReason (A: Geometry)
```
```
ST_IsValidReason (A: Geometry, flag: Integer)
```

Since: `v1.5.1`

SQL Example for valid geometry:

```sql
SELECT ST_IsValidReason(ST_GeomFromWKT('POLYGON ((100 100, 100 300, 300 300, 300 100, 100 100))')) as validity_info
```

Output:

```
Valid Geometry
```

SQL Example for invalid geometries:

```sql
SELECT gid, ST_IsValidReason(geom) as validity_info
FROM Geometry_table
WHERE ST_IsValid(geom) = false
ORDER BY gid
```

Output:

```
gid | validity_info
-----+----------------------------------------------------
5330 | Self-intersection at or near point (32.0, 5.0, NaN)
5340 | Self-intersection at or near point (42.0, 5.0, NaN)
5350 | Self-intersection at or near point (52.0, 5.0, NaN)

```


## ST_Length

Introduction: Return the perimeter of A
49 changes: 49 additions & 0 deletions docs/api/sql/Function.md
Expand Up @@ -1620,6 +1620,55 @@ Output:
false
```

## ST_IsValidReason

Introduction: Returns text stating whether the geometry is valid; if it is invalid, the text gives the reason. The function can be called with the geometry alone or with an additional integer flag that selects the validity model. The flag takes one of the following values:

- 0 (default): Use the usual OGC SFS (Simple Features Specification) validity semantics.
- 1: "ESRI flag". Accepts certain self-touching rings as valid that are considered invalid under OGC standards.

Formats:
```
ST_IsValidReason (A: Geometry)
```
```
ST_IsValidReason (A: Geometry, flag: Integer)
```

Since: `v1.5.1`

SQL Example for valid geometry:

```sql
SELECT ST_IsValidReason(ST_GeomFromWKT('POLYGON ((100 100, 100 300, 300 300, 300 100, 100 100))')) as validity_info
```

Output:

```
Valid Geometry
```

SQL Example for invalid geometries:

```sql
SELECT gid, ST_IsValidReason(geom) as validity_info
FROM Geometry_table
WHERE ST_IsValid(geom) = false
ORDER BY gid
```

Output:

```
gid | validity_info
-----+----------------------------------------------------
5330 | Self-intersection at or near point (32.0, 5.0, NaN)
5340 | Self-intersection at or near point (42.0, 5.0, NaN)
5350 | Self-intersection at or near point (52.0, 5.0, NaN)

```
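The strings shown in the outputs above follow a recognizable shape: either `Valid Geometry`, or a message followed by `at or near point (x, y, z)`. A client that wants the location as numbers could split the text along those lines. The following is a minimal sketch, not part of Sedona; `ValidityInfo` and `parse_validity_info` are hypothetical names, and the pattern is inferred only from the example strings in this pull request.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Pattern inferred from the example outputs in this PR (an assumption,
# not a documented format): "<message> at or near point (<coords>)".
_REASON_RE = re.compile(r"^(?P<message>.+?) at or near point \((?P<coords>[^)]*)\)$")


class ValidityInfo(NamedTuple):
    is_valid: bool
    message: str
    point: Optional[Tuple[float, ...]]


def parse_validity_info(text: str) -> ValidityInfo:
    """Split an ST_IsValidReason result into a message and an optional point."""
    if text == "Valid Geometry":
        return ValidityInfo(True, text, None)
    match = _REASON_RE.match(text)
    if match is None:
        # Unrecognized shape: keep the raw text rather than guessing.
        return ValidityInfo(False, text, None)
    # float() accepts surrounding whitespace and the literal "NaN".
    coords = tuple(float(part) for part in match.group("coords").split(","))
    return ValidityInfo(False, match.group("message"), coords)
```

For example, `parse_validity_info("Self-intersection at or near point (32.0, 5.0, NaN)")` yields the message `Self-intersection` and a three-element point whose last component is NaN.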

## ST_Length

Introduction: Return the perimeter of A
1 change: 1 addition & 0 deletions flink/src/main/java/org/apache/sedona/flink/Catalog.java
Expand Up @@ -145,6 +145,7 @@ public static UserDefinedFunction[] getFuncs() {
new Functions.ST_HausdorffDistance(),
new Functions.ST_IsCollection(),
new Functions.ST_CoordDim(),
new Functions.ST_IsValidReason()
};
}

Expand Up @@ -1086,4 +1086,17 @@ public Double eval(@DataTypeHint("Double") Double angleInRadian) {

}
}

public static class ST_IsValidReason extends ScalarFunction {
@DataTypeHint("String")
public String eval(@DataTypeHint(value = "RAW", bridgedTo = Geometry.class) Object geomObject) {
Geometry geom = (Geometry) geomObject;
return org.apache.sedona.common.Functions.isValidReason(geom);
}
@DataTypeHint("String")
public String eval(@DataTypeHint(value = "RAW", bridgedTo = Geometry.class) Object geomObject, @DataTypeHint("Integer") Integer flags) {
Geometry geom = (Geometry) geomObject;
return org.apache.sedona.common.Functions.isValidReason(geom, flags);
}
}
}
35 changes: 35 additions & 0 deletions flink/src/test/java/org/apache/sedona/flink/FunctionTest.java
Expand Up @@ -1204,4 +1204,39 @@ public void testCoordDimFor3D() {
assertEquals(3, result, 0);
}

@Test
public void testIsValidReason() {
// Test with an invalid geometry (bow-tie polygon)
String bowTieWKT = "POLYGON ((100 200, 100 100, 200 200, 200 100, 100 200))";
Table bowTieTable = tableEnv.sqlQuery("SELECT ST_GeomFromText('" + bowTieWKT + "') AS geom");
Table bowTieValidityTable = bowTieTable.select(call("ST_IsValidReason", $("geom")));
String bowTieValidityReason = (String) first(bowTieValidityTable).getField(0);
System.out.println(bowTieValidityReason);
assertTrue(bowTieValidityReason.contains("Self-intersection"));

// Test with a valid geometry (simple linestring)
String lineWKT = "LINESTRING (220227 150406, 2220227 150407, 222020 150410)";
Table lineTable = tableEnv.sqlQuery("SELECT ST_GeomFromText('" + lineWKT + "') AS geom");
Table lineValidityTable = lineTable.select(call("ST_IsValidReason", $("geom")));
String lineValidityReason = (String) first(lineValidityTable).getField(0);
assertEquals("Valid Geometry", lineValidityReason);

final int OGC_SFS_VALIDITY = 0;
final int ESRI_VALIDITY = 1;

// Geometry that is invalid under both OGC and ESRI standards, but with different reasons
String selfTouchingWKT = "POLYGON ((0 0, 2 0, 1 1, 2 2, 0 2, 1 1, 0 0))";
Table specialCaseTable = tableEnv.sqlQuery("SELECT ST_GeomFromText('" + selfTouchingWKT + "') AS geom");

// Test with OGC flag
Table ogcValidityTable = specialCaseTable.select(call("ST_IsValidReason", $("geom"), OGC_SFS_VALIDITY));
String ogcValidityReason = (String) first(ogcValidityTable).getField(0);
assertEquals("Ring Self-intersection at or near point (1.0, 1.0, NaN)", ogcValidityReason); // Expecting a self-intersection error as per OGC standards

// Test with ESRI flag
Table esriValidityTable = specialCaseTable.select(call("ST_IsValidReason", $("geom"), ESRI_VALIDITY));
String esriValidityReason = (String) first(esriValidityTable).getField(0);
assertEquals("Interior is disconnected at or near point (1.0, 1.0, NaN)", esriValidityReason); // Expecting an error related to interior disconnection as per ESRI standards
}

}
19 changes: 18 additions & 1 deletion python/sedona/sql/st_functions.py
Expand Up @@ -129,7 +129,8 @@
"ST_CoordDim",
"ST_IsCollection",
"ST_Affine",
"ST_BoundingDiagonal"
"ST_BoundingDiagonal",
"ST_IsValidReason"
]


Expand Down Expand Up @@ -1582,3 +1583,19 @@ def ST_IsCollection(geometry: ColumnOrName) -> Column:
:rtype: Column
"""
return _call_st_function("ST_IsCollection", geometry)

@validate_argument_types
def ST_IsValidReason(geometry: ColumnOrName, flag: Optional[Union[ColumnOrName, int]] = None) -> Column:
"""
Returns a text description of why a geometry is invalid, or "Valid Geometry" if it is valid.
An optional flag selects the validity model: 0 (default, OGC SFS semantics) or 1 (ESRI model).

:param geometry: Geometry column to validate.
:type geometry: ColumnOrName
:param flag: Optional flag to modify behavior of the validity check.
:type flag: Optional[Union[ColumnOrName, int]]
:return: Description of validity as a string column.
:rtype: Column
"""
args = (geometry,) if flag is None else (geometry, flag)
return _call_st_function("ST_IsValidReason", args)
2 changes: 2 additions & 0 deletions python/tests/sql/test_dataframe_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -159,6 +159,8 @@
(stf.ST_YMax, ("geom",), "triangle_geom", "", 1.0),
(stf.ST_YMin, ("geom",), "triangle_geom", "", 0.0),
(stf.ST_Z, ("b",), "two_points", "", 4.0),
(stf.ST_IsValidReason, ("geom",), "triangle_geom", "", "Valid Geometry"),
(stf.ST_IsValidReason, ("geom", 1), "triangle_geom", "", "Valid Geometry"),

# predicates
(stp.ST_Contains, ("geom", lambda: f.expr("ST_Point(0.5, 0.25)")), "triangle_geom", "", True),
Expand Up @@ -175,6 +175,7 @@ object Catalog {
function[ST_Degrees](),
function[ST_HausdorffDistance](-1),
function[ST_DWithin](),
function[ST_IsValidReason](),
// Expression for rasters
function[RS_NormalizedDifference](),
function[RS_Mean](),
Expand Up @@ -1163,3 +1163,17 @@ case class ST_IsCollection(inputExpressions: Seq[Expression])
copy(inputExpressions = newChildren)
}
}

/**
* Returns a text description of the validity of the geometry considering the specified flags.
* If the flag is not specified, it defaults to OGC SFS validity semantics.
*
* @param geom The geometry to validate.
* @param flag The validation flags.
* @return A string describing the validity of the geometry.
*/
case class ST_IsValidReason(inputExpressions: Seq[Expression])
extends InferredExpression(inferrableFunction2(Functions.isValidReason), inferrableFunction1(Functions.isValidReason)) {

protected def withNewChildrenInternal(newChildren: IndexedSeq[Expression]) = copy(inputExpressions = newChildren)
}
Expand Up @@ -431,4 +431,12 @@ object st_functions extends DataFrameAPI {

def ST_IsCollection(geometry: String): Column = wrapExpression[ST_IsCollection](geometry)

def ST_IsValidReason(geometry: Column): Column = wrapExpression[ST_IsValidReason](geometry)

def ST_IsValidReason(geometry: Column, flag: Column): Column = wrapExpression[ST_IsValidReason](geometry, flag)

def ST_IsValidReason(geometry: String): Column = wrapExpression[ST_IsValidReason](geometry)

def ST_IsValidReason(geometry: String, flag: Integer): Column = wrapExpression[ST_IsValidReason](geometry, flag)

}
Expand Up @@ -1258,5 +1258,28 @@ class dataFrameAPITestScala extends TestBaseScala {
val actual = df.head()(0).asInstanceOf[Boolean]
assertTrue(actual)
}

it("Passed ST_IsValidReason") {
// Valid Geometry
val validPolygonWKT = "POLYGON ((0 0, 2 0, 2 2, 0 2, 1 1, 0 0))"
val validTable = Seq(validPolygonWKT).toDF("wkt").select(ST_GeomFromWKT($"wkt").as("geom"))
val validityTable = validTable.select(ST_IsValidReason($"geom"))
val validityReason = validityTable.take(1)(0).getString(0)
assertEquals("Valid Geometry", validityReason)

// Geometry that is invalid under both OGC and ESRI standards, but with different reasons
val selfTouchingWKT = "POLYGON ((0 0, 2 0, 1 1, 2 2, 0 2, 1 1, 0 0))"
val specialCaseTable = Seq(selfTouchingWKT).toDF("wkt").select(ST_GeomFromWKT($"wkt").as("geom"))

// Test with OGC flag (OGC_SFS_VALIDITY = 0)
val ogcValidityTable = specialCaseTable.select(ST_IsValidReason($"geom", lit(0)))
val ogcValidityReason = ogcValidityTable.take(1)(0).getString(0)
assertEquals("Ring Self-intersection at or near point (1.0, 1.0, NaN)", ogcValidityReason)

// Test with ESRI flag (ESRI_VALIDITY = 1)
val esriValidityTable = specialCaseTable.select(ST_IsValidReason($"geom", lit(1)))
val esriValidityReason = esriValidityTable.take(1)(0).getString(0)
assertEquals("Interior is disconnected at or near point (1.0, 1.0, NaN)", esriValidityReason)
}
}
}
Expand Up @@ -2278,4 +2278,36 @@ class functionTestScala extends TestBaseScala with Matchers with GeometrySample

}

it ("ST_IsValidReason should provide reasons for invalid geometries") {
val testData = Seq(
(5330, "POLYGON ((0 0, 3 3, 0 3, 3 0, 0 0))"),
(5340, "POLYGON ((100 100, 300 300, 100 300, 300 100, 100 100))"),
(5350, "POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0), (20 20, 20 30, 30 30, 30 20, 20 20))")
)

val df = sparkSession.createDataFrame(testData).toDF("gid", "wkt")
.select($"gid", expr("ST_GeomFromWKT(wkt) as geom"))
df.createOrReplaceTempView("geometry_table")

val result = sparkSession.sql(
"""
|SELECT gid, ST_IsValidReason(geom) as validity_info
|FROM geometry_table
|WHERE ST_IsValid(geom) = false
|ORDER BY gid
""".stripMargin)

val expectedResults = Map(
5330 -> "Self-intersection at or near point (1.5, 1.5, NaN)",
5340 -> "Self-intersection at or near point (200.0, 200.0, NaN)",
5350 -> "Hole lies outside shell at or near point (20.0, 20.0)"
)

result.collect().foreach { row =>
val gid = row.getAs[Int]("gid")
val validityInfo = row.getAs[String]("validity_info")
assert(validityInfo == expectedResults(gid))
}
}

}
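The two bow-tie polygons in the test above (gid 5330 and 5340) are reported as self-intersecting near the point where two non-adjacent edges cross. As a rough illustration of where those coordinates come from, the sketch below finds the first proper crossing between segments of a closed ring in plain Python. It is a simplification under stated assumptions: it is not how JTS `IsValidOp` works internally, it ignores collinear overlaps, and it does not detect rings that merely touch themselves at a vertex. `first_self_crossing` is a hypothetical helper.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def _crossing(p1: Point, p2: Point, p3: Point, p4: Point) -> Optional[Point]:
    """Return the point where segments p1-p2 and p3-p4 properly cross, if any."""
    d1x, d1y = p2[0] - p1[0], p2[1] - p1[1]
    d2x, d2y = p4[0] - p3[0], p4[1] - p3[1]
    denom = d1x * d2y - d1y * d2x  # 2D cross product of the two directions
    if denom == 0:
        return None  # parallel or collinear: ignored in this sketch
    t = ((p3[0] - p1[0]) * d2y - (p3[1] - p1[1]) * d2x) / denom
    u = ((p3[0] - p1[0]) * d1y - (p3[1] - p1[1]) * d1x) / denom
    # Strict inequalities exclude shared endpoints of neighboring segments.
    if 0 < t < 1 and 0 < u < 1:
        return (p1[0] + t * d1x, p1[1] + t * d1y)
    return None


def first_self_crossing(ring: List[Point]) -> Optional[Point]:
    """Scan a closed ring (first point repeated last) for a proper crossing."""
    segments = list(zip(ring, ring[1:]))
    for i in range(len(segments)):
        for j in range(i + 2, len(segments)):
            hit = _crossing(*segments[i], *segments[j])
            if hit is not None:
                return hit
    return None
```

Applied to the ring `(0 0, 3 3, 0 3, 3 0, 0 0)` this returns `(1.5, 1.5)`, and for `(100 100, 300 300, 100 300, 300 100, 100 100)` it returns `(200.0, 200.0)`, the same locations that appear in `expectedResults`.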