Best Practices: Catalog Layout -- Catalog vs Collection as root document #925
Comments
I guess the authors meant door 3, but I'm more inclined to door 2 tbh. @cholmes
Here's an attempt to simplify and clarify the best practices section for catalog layout to address radiantearth#925. In practice, we've found that every Catalog is named catalog.json and every Collection is named collection.json, which is enforced by tools such as PySTAC.
Great! I opened #932 with an initial attempt at a wording change to this section. Feel free to take it or leave it, this type of change is subject to interpretation!
I mean, I may have entered it on GitHub. But I think it was a copy and paste from something Seth wrote up.
I guess the idea was to force the more specific name collection.json when possible... Every Collection is a Catalog, so simply stating to use catalog.json or collection.json would be equivalent to: "Animals should be named animals.json: Cats should be named cats.json", which is ambiguous since it gives two possible names for cats... Maybe use this: "Catalogs which are not collections should be named catalog.json. Collections should be named collection.json"
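To make the proposed wording concrete, here is a minimal sketch of the rule as a function. It assumes each object carries a `type` field set to "Catalog" or "Collection", as in STAC 1.0.0 (older versions lack that field), and `expected_filename` is a hypothetical helper, not part of PySTAC or any other tool.

```python
def expected_filename(stac_object):
    """Return the file name suggested by the proposed wording.

    Assumption: the object has a "type" field ("Catalog" or "Collection"),
    as in STAC 1.0.0. Older STAC versions do not have this field.
    """
    stac_type = stac_object.get("type")
    # Check Collection first: every Collection is also a Catalog,
    # so the more specific name has to win.
    if stac_type == "Collection":
        return "collection.json"
    if stac_type == "Catalog":
        return "catalog.json"
    raise ValueError(f"not a Catalog or Collection: {stac_type!r}")
```

With this rule each object gets exactly one name, so the "two possible names for cats" ambiguity goes away.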
Based on feedback in radiantearth#925
I like that. Updated #932
I found a use case for the specific collection.json name in STAC Index. Knowing it's a collection from just the name helps to prioritize collections over catalogs in the crawling process. So +1 for collection.json.
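As a rough illustration of that use case (not STAC Index's actual code), a crawler could order its queue by file name alone. `crawl_priority` and `order_for_crawling` are hypothetical names, and the URLs in the usage note are placeholders.

```python
import posixpath
from urllib.parse import urlsplit


def crawl_priority(url):
    """Lower value = crawl earlier. Only the file name is inspected."""
    name = posixpath.basename(urlsplit(url).path)
    return 0 if name == "collection.json" else 1


def order_for_crawling(urls):
    """Put collection.json URLs first; sorted() is stable, so the
    original order is kept within each group."""
    return sorted(urls, key=crawl_priority)
```

For example, `order_for_crawling(["https://example.com/catalog.json", "https://example.com/a/collection.json"])` puts the collection.json URL first without fetching either document.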
I'm re-opening this because I've figured out why it was the way it was before, and I think we should at least discuss it again. Over the last few days I looked into how STAC Browser handles breadcrumb and URL generation. In fact, I think the best practice was originally written for STAC Browser. Having a catalog.json in each folder makes it possible to generate a structure and navigate to the parent without any further details. If there's a distinction between catalog.json and collection.json, this is no longer true. This is why we had door #3 before, and I think @cholmes was actually right to add it to the spec, though the idea came from Seth as the former STAC Browser maintainer. With the recent change to the best practice, generating nice slugs for STAC Browser is nearly impossible without reading all parent catalogs, as I can't rely on a given best practice. That said, not many catalogs actually follow the old best practice, and neither does PySTAC. Thus solving the issue in STAC Browser will get complicated anyway.
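A minimal sketch of why the old convention helps, assuming every folder contains a catalog.json: the parent URL and the whole breadcrumb trail can be derived from the URL alone, without reading any document. This is not STAC Browser's actual implementation; `parent_catalog_url` and `breadcrumbs` are hypothetical helpers.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def parent_catalog_url(url):
    """Go up one folder and assume a catalog.json lives there."""
    parts = urlsplit(url)
    folder = posixpath.dirname(parts.path)
    parent_folder = posixpath.dirname(folder)
    parent_path = posixpath.join(parent_folder, "catalog.json")
    return urlunsplit(parts._replace(path=parent_path))


def breadcrumbs(url, root_url):
    """Return the catalog.json URLs from root_url down to url."""
    trail = [url]
    while trail[-1] != root_url:
        parent = parent_catalog_url(trail[-1])
        if parent == trail[-1]:
            # Reached the top of the path without meeting root_url.
            raise ValueError("root_url is not an ancestor of url")
        trail.append(parent)
    return trail[::-1]
```

Once some folders hold collection.json instead, the file name of the parent can no longer be guessed from the path, which is the problem described above.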
I'm reading Best Practices - Catalog Layout for some guidance on how to structure my catalog, which is a Collection containing Items as direct children.
Of particular interest:
When writing a static catalog for this use case, based on the best practices quoted above, I see a few options:
I'm inclined to choose door 2 because it doesn't duplicate objects or require unnecessary structure. After all, a Collection is a Catalog, and best practice 1 implies a collection can be a root document. Door 2 is also what I get if, in PySTAC, I create a Collection, add some Items to it, and export it. The door 2 structure also validates in PySTAC. However, it looks like other published catalogs such as https://landsat-stac.s3.amazonaws.com/catalog.json use door 1. Door 3 came about because best practice 2 references the established convention of index.html, which makes me think that root objects should always be named catalog.json regardless of whether they contain a Catalog or Collection.
Is there already a generally accepted best practice for this situation?