Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use PROJJSON instead of WKT2:2019 #96

Merged
merged 11 commits into from
May 25, 2022

Conversation

brendan-ward
Copy link
Contributor

Resolves #95 based on associated discussion in #90.

This assumes #94 is added separately, allowing crs to be explicitly set to null.

This preserves the existing optional behavior and default if unspecified of OGC:CRS84. The default value and optional vs required status will be taken up in another PR.

This includes a minor formatting change of the table including crs but only the crs row changed content.

| ------------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| encoding | string | **REQUIRED** Name of the geometry encoding format. Currently only 'WKB' is supported. |
| geometry_type | string or \[string] | **REQUIRED** The geometry type(s) of all geometries, or 'Unknown' if they are not known. |
| crs | JSON object | **OPTIONAL** [PROJJSON](https://proj.org/specifications/projjson.html) JSON object representing the Coordinate Reference System (CRS) of the geometry. If the crs field is not included then the data in this column must be stored in longitude, latitude, and CRS-aware implementations may assume a default value of [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84) (longitude-latitude coordinates). |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note change with respect to default value from "should" to "may", but the specific term should likely follow RFC 2119 with respect to these terms as per #26

Based on discussions around this default, it may instead be appropriate to use "must" in order to prevent ambiguity regarding an unset value.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason exactly to change this?

(I am not necessarily opposed to be clear, just don't directly remember that we discussed this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intent was: "may" indicates reader may do so at their own peril (i.e., if you really don't care about ellipsoid); "should" indicates we recommend that the reader do so - which seems a bit strong when the ellipsoid of the long,lat coords is not defined.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"should" indicates we recommend that the reader do so - which seems a bit strong when the ellipsoid of the long,lat coords is not defined.

It seems the last discussion seems to lean towards assuming WGS84? (which is still not a well defined ellipsoid without an epoch, but OK ;))

@kylebarron
Copy link
Collaborator

kylebarron commented May 11, 2022

Pending #93 (and #64, a discussion of whether to make the json schema definition a core part of the geoparquet metadata spec), we should decide whether we want the json schema to be updated along with or after associated spec PRs. In this case, we'd want the json schema to be updated so that the crs refers to the projjson json schema to allow for validating the structure of the crs json.


Note: EPSG:4326 and OGC:CRS84 are equivalent with respect to this specification because this specification specifically overrides the coordinate axis order in the `crs` to be longitude-latitude.

The current PROJJSON JSON object for OGC:CRS84 is:

```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: apply JSON syntax highlighting to the block

Suggested change
```
```json

@brendan-ward
Copy link
Contributor Author

we should decide whether we want the json schema to be updated along with or after associated spec PRs

I can update the schema here once #93 is merged; I was thinking that ideally both the spec and required changes to the schema would get updated within the same PR. It may be good for us to come up with a checklist of required / recommended updates to files here as part of making a PR to update the spec.

I'm assuming that we'd want to update the script for building an example parquet file after such PRs are in place, since that corresponds more directly to a specific version of the spec rather than intermediate changes within a PR, right?

@kylebarron
Copy link
Collaborator

It may be good for us to come up with a checklist of required / recommended updates to files here as part of making a PR to update the spec

👍 would be a great thing for a PR template.

I'm assuming that we'd want to update the script for building an example parquet file after such PRs are in place, since that corresponds more directly to a specific version of the spec rather than intermediate changes within a PR, right?

That's true, there are really three parts:

  • Spec description in markdown
  • Metadata description in JSON Schema
  • Script to build parquet file

I don't have a strong opinion on when to update the Parquet script, but that could also be related to #40. For example, I think STAC spec developed entirely on a dev branch and only merged that into the main branch on each release? So when a member of the public visited the repo, they'd see the state of the last release by default instead of the current WIP state between releases.


If an implementation is CRS-aware and needs a CRS representation of the data it may assume a default value is [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84), which is equivalent to the well-known [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) but changes the axis from latitude-longitude to longitude-latitude. However, there is no guarantee that coordinates
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: The "However..." can be removed if we instead state above that

If CRS is not provided, then all coordinates in the geometry must use longitude, latitude based on the OGC:CRS84 CRS to store their data.

But without that, lacking a crs means that coordinates are in an unspecified CRS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm for removing the however and stating what it should be. Though perhaps we just say 'must use longitude, latitude with the WGS84 datum' to cover both OGC:CRS84 and EPSG:4326 (with axis order flipped).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on removing the "however ..". So that if CRS is not available, readers can assume WGS84

@brendan-ward brendan-ward marked this pull request as ready for review May 11, 2022 16:41
format-specs/geoparquet.md Outdated Show resolved Hide resolved
format-specs/geoparquet.md Outdated Show resolved Hide resolved
format-specs/geoparquet.md Outdated Show resolved Hide resolved
@brendan-ward
Copy link
Contributor Author

OGC:CRS84 official has probably frozen for > 10 years

Which form should we include here? The EPSG:4326 with lon,lat ordering as we have here as a proxy for what OGC:CRS84 would be if it wasn't frozen, since that is what users get if they use PROJ to get the PROJJSON, or the form that would represent the single-datum version? Or rather, were you just suggesting that we rephrase the note?

I think one of our goals in including it here is that at some point we can also make a recommendation for non-CRS aware tools that can safely assume they are producing OGC:CRS84 referenced data to copy / paste this json to their crs field (if we require crs or at least strongly recommend it in order to make geographic coordinates less ambiguous than now).

@rouault
Copy link
Contributor

rouault commented May 11, 2022

For non-CRS aware tools, perhaps we should recommend the simplest minimal reasonable form of OGC:CRS84 definition

{
    "type": "GeographicCRS",
    "name": "WGS 84 longitude-latitude",
    "datum": {
        "type": "GeodeticReferenceFrame",
        "name": "World Geodetic System 1984",
        "ellipsoid": {
            "name": "WGS 84",
            "semi_major_axis": 6378137,
            "inverse_flattening": 298.257223563
        }
    },
    "coordinate_system": {
        "subtype": "ellipsoidal",
        "axis": [
        {
            "name": "Geodetic longitude",
            "abbreviation": "Lon",
            "direction": "east",
            "unit": "degree"
        },
        {
            "name": "Geodetic latitude",
            "abbreviation": "Lat",
            "direction": "north",
            "unit": "degree"
        }
        ]
    },
    "id": {
        "authority": "OGC",
        "code": "CRS84"
    }
}


#### crs

The Coordinate Reference System (CRS) is an optional parameter for each geometry column defined in geoparquet format.

The CRS must be provided in [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_coordinate_reference_systems) version 2, also known as **WKT2**. WKT2 has several revisions, this specification only supports [WKT2_2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html).
The CRS must be provided in [PROJJSON](https://proj.org/specifications/projjson.html) format, which is a JSON encoding of [WKT2:2019 / ISO-19162:2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html), which itself implements the model of [OGC Topic 2: Referencing by coordinates abstract specification / ISO-19111:2019](http://docs.opengeospatial.org/as/18-005r4/18-005r4.html). Apart from the difference of encodings, the semantics is intended to be exactly the same as WKT2:2019, and PROJJSON can be morphed losslessly from/into WKT2:2019.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note there was a comment on this in #97 - #97 (comment) That's the 'require' PR, but this one should merge first so we can cut 0.4.0 release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either one could theoretically go out as 0.4.0; per the discussion I thought the intent was to compare the optional PROJJSON with OGC:CRS84 default vs required PROJJSON, rather than them being additive. But I can certainly update this PR to address comments in #97 if we'd like to see this particular PR go out sooner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I think the plan was to aim for 0.4.0 without it - I organized the 0.4 milestone around that - and then to be sure to solve this for 0.5. Just to be sure to make iterative progress as it seemed like that one could take longer.

I think we targeted next week for the 0.4.0 release. I'm also not opposed to get this into 0.4 release with everyone happy, and just put up a discussion in #98 to help push it forward. So maybe see where that discussion is at on monday of next week, and if it's not fully resolved then it'd be great if you could clean up this PR and we push out 0.4

the OGC:CRS84 CRS based on elements of the `crs` field. For simplicity, Javascript
object dot notation is used to refer to nested elements.

The CRS is likely equivalent to OGC:CRS84 for a GeoParquet file if the `id` element is present:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not that important for this PR (can be done in a follow-up as well), but what do people think about moving the details of what the CRS repr of OGC:CRS84 and how to recognize this to a separate section towards the bottom of this file? So here we focus on the basic explanation of this metadata fields

* `id.authority` = `"OGC"` and `id.code` = `"CRS84"`
* `id.authority` = `"EPSG"` and `id.code` = `4326` (due to longitude, latitude ordering in this specification)

The CRS is likely equivalent to OGR:CRS84 if all of the following are true:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would somehow stress here that what follows is really optional, as IMO it is perfectly fine for a non-CRS aware implementation to only handle files that have a "proper" CRS with EPSG:4326 or OGC:CRS84 id. Being able to detect other equivalent json CRS objects without this id seems rather like a corner case (and they can give a clear error that the CRS of the data is not recognized, as they will do for files with non-WGS84 crs anyway)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 - I think it's ok to even just remove this section. Like perhaps add it to some 'best practices' type document, but not have it in the main spec.

@cholmes cholmes added this to the 0.4 milestone May 17, 2022
Copy link
Member

@cholmes cholmes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I put in one small comment, but I'm also happy to merge this and then tackle that in a separate PR.


If an implementation is CRS-aware and needs a CRS representation of the data it may assume a default value is [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84), which is equivalent to the well-known [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) but changes the axis from latitude-longitude to longitude-latitude. However, there is no guarantee that coordinates
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm for removing the however and stating what it should be. Though perhaps we just say 'must use longitude, latitude with the WGS84 datum' to cover both OGC:CRS84 and EPSG:4326 (with axis order flipped).

@cholmes
Copy link
Member

cholmes commented May 24, 2022

Is this ready to merge? It's looking good to me. Thanks for all your awesome work on this @brendan-ward (and everyone else reviewing / suggesting).

@brendan-ward
Copy link
Contributor Author

The only thing outstanding is what part (if any) of the "OGC:CRS84 details" section to cut out or move to a different file (if so, where?).

@cholmes
Copy link
Member

cholmes commented May 24, 2022

Ah, right. I think the whole OGC:CRS84 details section could move out.

I'd probably just do something simple like just make an 'additional-info.md' file that sits right next the spec, in the same folder, and puts the crs details in it, maybe also the example file above. It could expand out in time, could also evolve to live somewhere else, like become more of a guide to geospatial for non-geo people, help explain various aspects of the spec.

I think it's also ok to just merge this and then we can do that one as a separate PR. I'd be happy to work on it.

@jorisvandenbossche
Copy link
Collaborator

jorisvandenbossche commented May 25, 2022

Ah, right. I think the whole OGC:CRS84 details section could move out.

Given that it was moved to an additional section at the bottom, for me it is also fine to keep it there for now (we can move it another file later, if we prefer that), in light of getting this PR merged.

EDIT: ah, so what you said in your last sentence ;)

I think it's also ok to just merge this and then we can do that one as a separate PR. I'd be happy to work on it.

@jorisvandenbossche jorisvandenbossche merged commit fe87513 into opengeospatial:main May 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use projjson instead of wkt2
5 participants