Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use PROJJSON instead of WKT2:2019 #96

Merged
merged 11 commits into from
May 25, 2022
154 changes: 116 additions & 38 deletions format-specs/geoparquet.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,55 +50,133 @@ Version of the geoparquet spec used, currently 0.3.0

Each geometry column in the dataset must be included in the columns field above with the following content, keyed by the column name:

| Field Name | Type | Description |
| ---------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| encoding | string | **REQUIRED** Name of the geometry encoding format. Currently only 'WKB' is supported. |
| geometry_type | string or \[string] | **REQUIRED** The geometry type(s) of all geometries, or 'Unknown' if they are not known. |
| crs | string | **OPTIONAL** [WKT2](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html) string representing the Coordinate Reference System (CRS) of the geometry. If the crs field is not included then the data in this column must be stored in longitude, latitude. In the case where a crs is not provided, CRS-aware implementations should assume a default value of [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84) (longitude-latitude coordinates). |
| orientation | string | **OPTIONAL** Winding order of exterior ring of polygons. If present must be 'counterclockwise'; interior rings are wound in opposite order. If absent, no assertions are made regarding the winding order.
| edges | string | **OPTIONAL** Name of the coordinate system for the edges. Must be one of 'planar' or 'spherical'. The default value is 'planar'. |
| bbox | \[number] | **OPTIONAL** Bounding Box of the geometries in the file, formatted according to [RFC 7946, section 5](https://tools.ietf.org/html/rfc7946#section-5). |
| epoch | double | **OPTIONAL** Coordinate epoch in case of a dynamic CRS, expressed as a decimal year. |

| Field Name | Type | Description |
| ------------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| encoding | string | **REQUIRED** Name of the geometry encoding format. Currently only 'WKB' is supported. |
brendan-ward marked this conversation as resolved.
Show resolved Hide resolved
| geometry_type | string or \[string] | **REQUIRED** The geometry type(s) of all geometries, or 'Unknown' if they are not known. |
| crs | JSON object | **OPTIONAL** [PROJJSON](https://proj.org/specifications/projjson.html) JSON object representing the Coordinate Reference System (CRS) of the geometry. If the crs field is not included then the data in this column must be stored in longitude, latitude, and CRS-aware implementations may assume a default value of [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84) (longitude-latitude coordinates). |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note change with respect to default value from "should" to "may", but the specific term should likely follow RFC 2119 with respect to these terms as per #26

Based on discussions around this default, it may instead be appropriate to use "must" in order to prevent ambiguity regarding an unset value.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason exactly to change this?

(I am not necessarily opposed to be clear, just don't directly remember that we discussed this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intent was: "may" indicates reader may do so at their own peril (i.e., if you really don't care about ellipsoid); "should" indicates we recommend that the reader do so - which seems a bit strong when the ellipsoid of the long,lat coords is not defined.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"should" indicates we recommend that the reader do so - which seems a bit strong when the ellipsoid of the long,lat coords is not defined.

It seems the last discussion seems to lean towards assuming WGS84? (which is still not a well defined ellipsoid without an epoch, but OK ;))

| orientation | string | **OPTIONAL** Winding order of exterior ring of polygons. If present must be 'counterclockwise'; interior rings are wound in opposite order. If absent, no assertions are made regarding the winding order. |
| edges | string | **OPTIONAL** Name of the coordinate system for the edges. Must be one of 'planar' or 'spherical'. The default value is 'planar'. |
| bbox | \[number] | **OPTIONAL** Bounding Box of the geometries in the file, formatted according to [RFC 7946, section 5](https://tools.ietf.org/html/rfc7946#section-5). |
| epoch | double | **OPTIONAL** Coordinate epoch in case of a dynamic CRS, expressed as a decimal year.

#### crs

The Coordinate Reference System (CRS) is an optional parameter for each geometry column defined in geoparquet format.

The CRS must be provided in [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_coordinate_reference_systems) version 2, also known as **WKT2**. WKT2 has several revisions, this specification only supports [WKT2_2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html).
The CRS must be provided in [PROJJSON](https://proj.org/specifications/projjson.html) format, which is a JSON encoding of [WKT2:2019 / ISO-19162:2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html), which itself implements the model of [OGC Topic 2: Referencing by coordinates abstract specification / ISO-19111:2019](http://docs.opengeospatial.org/as/18-005r4/18-005r4.html). Apart from the difference of encodings, the semantics is intended to be exactly the same as WKT2:2019, and PROJJSON can be morphed losslessly from/into WKT2:2019.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note there was a comment on this in #97 - #97 (comment) That's the 'require' PR, but this one should merge first so we can cut 0.4.0 release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either one could theoretically go out as 0.4.0; per the discussion I thought the intent was to compare the optional PROJJSON with OGC:CRS84 default vs required PROJJSON, rather than them being additive. But I can certainly update this PR to address comments in #97 if we'd like to see this particular PR go out sooner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I think the plan was to aim for 0.4.0 without it - I organized the 0.4 milestone around that - and then to be sure to solve this for 0.5. Just to be sure to make iterative progress as it seemed like that one could take longer.

I think we targeted next week for the 0.4.0 release. I'm also not opposed to get this into 0.4 release with everyone happy, and just put up a discussion in #98 to help push it forward. So maybe see where that discussion is at on monday of next week, and if it's not fully resolved then it'd be great if you could clean up this PR and we push out 0.4


If CRS is not provided, then all coordinates in the geometry must use longitude, latitude to store their data.
If an implementation is CRS-aware and needs a CRS representation of the data it should assume a default value is [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84). It's equivalent to the well-known [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) but changes the axis from latitude-longitude to longitude-latitude. The WKT2:2019 string for OGC:CRS84 is:
If an implementation is CRS-aware and needs a CRS representation of the data it should assume a default value is [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84), which is equivalent to the well-known [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) but changes the axis from latitude-longitude to longitude-latitude.

Note: EPSG:4326 and OGC:CRS84 are equivalent with respect to this specification because this specification specifically overrides the coordinate axis order in the `crs` to be longitude-latitude.

The current PROJJSON JSON object for OGC:CRS84 is:

```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: apply JSON syntax highlighting to the block

Suggested change
```
```json

GEOGCRS["WGS 84 (CRS84)",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic longitude (Lon)",east,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic latitude (Lat)",north,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
USAGE[
SCOPE["Not known."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["OGC","CRS84"]]
{
"$schema": "https://proj.org/schemas/v0.4/projjson.schema.json",
"type": "GeographicCRS",
"name": "WGS 84 (CRS84)",
"datum_ensemble": {
"name": "World Geodetic System 1984 ensemble",
"members": [
{
"name": "World Geodetic System 1984 (Transit)",
"id": {
"authority": "EPSG",
"code": 1166
}
},
{
"name": "World Geodetic System 1984 (G730)",
"id": {
"authority": "EPSG",
"code": 1152
}
},
{
"name": "World Geodetic System 1984 (G873)",
"id": {
"authority": "EPSG",
"code": 1153
}
},
{
"name": "World Geodetic System 1984 (G1150)",
"id": {
"authority": "EPSG",
"code": 1154
}
},
{
"name": "World Geodetic System 1984 (G1674)",
"id": {
"authority": "EPSG",
"code": 1155
}
},
{
"name": "World Geodetic System 1984 (G1762)",
"id": {
"authority": "EPSG",
"code": 1156
}
},
{
"name": "World Geodetic System 1984 (G2139)",
"id": {
"authority": "EPSG",
"code": 1309
}
}
],
"ellipsoid": {
"name": "WGS 84",
"semi_major_axis": 6378137,
"inverse_flattening": 298.257223563
},
"accuracy": "2.0",
"id": {
"authority": "EPSG",
"code": 6326
}
},
"coordinate_system": {
"subtype": "ellipsoidal",
"axis": [
{
"name": "Geodetic longitude",
"abbreviation": "Lon",
"direction": "east",
"unit": "degree"
},
{
"name": "Geodetic latitude",
"abbreviation": "Lat",
"direction": "north",
"unit": "degree"
}
]
},
"scope": "Not known.",
"area": "World.",
"bbox": {
"south_latitude": -90,
"west_longitude": -180,
"north_latitude": 90,
"east_longitude": 180
},
"id": {
"authority": "OGC",
"code": "CRS84"
}
}
```

Note: earlier versions of OGC:CRS84 used a single datum instead of a datum ensemble. The number of elements in the datum ensemble may vary over time when new realizations of WGS 84 are added to the ensemble
brendan-ward marked this conversation as resolved.
Show resolved Hide resolved


Due to the large number of CRSes available and the difficulty of implementing all of them, we expect that a number of implementations will start without support for the optional `crs` field.
Users are recommended to store their data in longitude, latitude (OGC:CRS84 or not including the `crs` field) for it to work with the widest number of tools. But data that is better served in particular projections can choose to use an alternate coordinate reference system. We expect many tools will support alternate CRSes, but encourage users to check to ensure their chosen tool supports their chosen crs.

Expand Down