Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature metadata extension (EXT_feature_metadata) #3

Merged
merged 17 commits into from
Feb 24, 2021
Merged

Conversation

lilleyse
Copy link

@lilleyse lilleyse commented Oct 2, 2020

EDIT by @ptrgags: A lot has changed since this PR was first created. I'm rewriting this description.

This is an update to the first feature metadata draft in #1.

Direct link to the extension: EXT_feature_metadata

This extension is now quite a bit slimmed down, as the metadata encoding format is nearly identical between this glTF extension and a related 3D Tiles extension, 3DTILES_metadata. This common specification is the Cesium 3D Metadata Specification. This glTF extension simply references the common spec to explain the core concepts.

Copy link

@ptrgags ptrgags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lilleyse went through and pointed out a bunch of things that either need fixing or could use further discussion.

@ptrgags
Copy link

ptrgags commented Oct 2, 2020

@lilleyse I tweaked the metadata to the point that my NDVI/land cover example (see microcosm folder) is valid to the current JSON schema so I could include it.

I also tweaked the existing examples to make them compliant with the schema.

Other changes to the schema will take more thought, I'll get to work on those on Tuesday

@lilleyse
Copy link
Author

lilleyse commented Oct 7, 2020

@ptrgags I plan on checking out your comments tomorrow

@ptrgags
Copy link

ptrgags commented Oct 7, 2020

I created a point cloud vertex metadata example with several per-vertex properties. There's a Node.js script for generating the glTF.

I can verify that the point cloud renders in Cesium, but that doesn't tell me much about whether the buffer is set up correctly without feature-metadata specific tools.

@lilleyse
Copy link
Author

lilleyse commented Oct 7, 2020

@ptrgags aside from bug fixes are these the main schema changes?

  • featureCollections is now a dictionary and feature collections are now referred to by a string ID rather than an index
  • Added name to featureCollection
  • Removed semantic everywhere

@ptrgags
Copy link

ptrgags commented Oct 7, 2020

@lilleyse and I also discussed "should we support arrays of strings" -- we're leaning towards not doing this, because it might lead to allowing complicated cases like arrays of binary, arrays of arrays...

@ptrgags
Copy link

ptrgags commented Oct 7, 2020

Another discussion point @lilleyse and I had both in Slack and in a call: How should we align our bufferViews?

  • 8 bytes (which is what B3DM uses) -- this is needed at minimum because we support float32, uint64, etc.
  • 64 bytes (which is what Apache Arrow recommends to optimize for SIMD) See here -- seems overkill now, but could be useful for future-proofing.

@baothientran would you have an opinion on this?

@lilleyse lilleyse changed the base branch from master to batch-table October 7, 2020 22:35
@ptrgags
Copy link

ptrgags commented Oct 8, 2020

Concerning padding: I had a Slack conversation with @baothientran today where we discussed this.

We came to the conclusion that we should word the alignment requirements that "bufferViews must be aligned to a multiple of 8 bytes" This accomplishes a few things:

  • It satisfies our goal of supporting 64-bit datatypes
  • Since these glTFs are going to be streamed over the network, optimizing for storage is helpful. Where this is important, choosing 8-byte alignment is good
  • In the rare/future case where the user wants SIMD optimization, this wording allows for 64-byte alignment like Apache Arrow recommends.

To reach this decision, we also talked about the following:

  • If in the future we wanted to pull this metadata into a WebGL 2 uniform buffer, that has different alignment requirements. https://learnopengl.com/Advanced-OpenGL/Advanced-GLSL. But this affects padding between array elements. Since we want to optimize for storage, we decided not to worry about this.
  • We weren't entirely sure how much C++ compilers optimize for SIMD instructions. But in JavaScript, it's even less available. Chromium decided against SIMD back in 2017 -- not sure if there have been any more recent thoughts on this

@lilleyse
Copy link
Author

lilleyse commented Oct 8, 2020

bufferViews must be aligned to a multiple of 8 bytes

We should be explicit that this alignment is relative to the start of the glb, not the start of the buffer. Otherwise the bufferView might be misaligned by 4 bytes since GLB only requires 4-byte alignment for chunks.

EDIT: this only applies if the buffer is embedded in the glb

@ptrgags
Copy link

ptrgags commented Oct 8, 2020

I talked with @lilleyse on Slack about one detail about the featureClass.normalized property: if the implementation uses a relatively small sized float value (e.g. float32 or float16) for storing the normalized value, but the input data was larger (e.g. an UINT32 or UINT64), then there will be some loss of precision.

We decided it's best to put this as an implementation note in the spec.

@lilleyse
Copy link
Author

lilleyse commented Oct 8, 2020

A couple longer term things to think about:

Bitfields

...or are parallel arrays of BOOLEAN (1-bit) type sufficient? Are there use cases for using bitfields?

Enums

  • Would string values be allowed? Or only integers?
  • Would signed integers be allowed? I think yes. I've seen examples where negative values indicated "No information"
"LAS_Point": {
  "properties": {
    "Classification": {
      "componentType": "UINT8",
      "enum": {
        "0": "Created, never classified",
        "1": "Unclassified",
        "2": "Ground",
        "255": "User defined"
      }
    }
  }
}

@lilleyse
Copy link
Author

lilleyse commented Oct 8, 2020

A client may also want to know the number of occurrences of each enum. But I think that would have to go in the feature collection property since it's specific to the dataset and doesn't belong in the class definition.

This is helpful for knowing what enums are available for styling purposes.

Same goes with min/max. The current schema defines these as the allowable min and max for a property. But a particular dataset might have its own actual min/max and that would belong on the feature collection property, not the class.

@lilleyse lilleyse force-pushed the feature-metadata branch 2 times, most recently from 09247af to a8a5ed9 Compare October 8, 2020 23:31
@lilleyse
Copy link
Author

lilleyse commented Oct 8, 2020

@lilleyse
Copy link
Author

lilleyse commented Oct 9, 2020

@ptrgags I pushed some changes in a8a5ed9:

  • elementType defaults to SCALAR. Because of that, removed elementType being required.
  • Renamed offsetsElementType to offsetsComponentType since it looks like a property's componentType. I could go either way here.
  • Clarified that offsetsBufferView is byte lengths for string and binary blobs and just lengths for arrays (not byte lengths)
  • Small wording tweaks here and there

@lilleyse
Copy link
Author

lilleyse commented Oct 9, 2020

Another longer term consideration, maybe for @IanLilleyT:

Time

@ptrgags
Copy link

ptrgags commented Oct 9, 2020

Should offsetsBufferView also have the same alignment rules?

Ah yes, I'll adjust that in the schema.

In microcosm.gltf: "LULC Name Offsets" and "LULC Color" byteOffset are not divisible by 8. Also check weather.gltf (I didn't check that yet).

Ah whoops I had updated the schema and JSON but forgot to double check the buffer contents.

And yeah I'll be focusing more on weather today

@lilleyse lilleyse force-pushed the feature-metadata branch 13 times, most recently from 922eda1 to f767fb2 Compare February 24, 2021 00:36
Copy link

@ptrgags ptrgags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lilleyse just a few things

@lilleyse
Copy link
Author

@ptrgags updated

Copy link

@ptrgags ptrgags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spotted a couple missing commas

lilleyse and others added 2 commits February 24, 2021 16:51
…eTable.property.schema.json

Co-authored-by: Peter Gagliardi <ptrgags@gmail.com>
@ptrgags
Copy link

ptrgags commented Feb 24, 2021

Looks good, thanks @lilleyse!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants