Document how to generate vector embeddings #98

Merged: 4 commits, Dec 22, 2023

weiji14 (Contributor) commented on Dec 22, 2023

What I am changing

(image not preserved)

How I did it

TODO:

How you can test it

  • Run jupyter-book build docs/ locally

Related Issues

Part of #90

Step-by-step instructions on how to produce embeddings from the pretrained model: checking that one has permission to access the GeoTIFF files, downloading the model checkpoint, and running the model prediction to get the GeoParquet output. Also gives a tip on what a suitable VM instance would look like.
weiji14 added the "documentation" label (Improvements or additions to documentation) on Dec 22, 2023
weiji14 self-assigned this on Dec 22, 2023
weiji14 changed the title from "📝 Document how to generate vector embeddings" to "Document how to generate vector embeddings" on Dec 22, 2023
Extra technical details on how the raw (B, 1538, 768) embeddings are turned into (B, 768) shaped embeddings by taking the mean along the spatial patches.
Useful details about the filename convention and table schema of the embeddings stored in GeoParquet format, plus some sample GeoPandas code showing how to read a *.gpq file. Also links to some guides and resources from the Cloud Native Geospatial Foundation.
Never sure whether it's singular or plural.
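
For readers following along, the mean-pooling step mentioned above (raw (B, 1538, 768) embeddings reduced to (B, 768)) could look roughly like the sketch below. This is an illustration only, not the code from this PR: the `mean_pool` helper is a hypothetical name, NumPy stands in for whatever tensor library the model uses, and the sketch averages over all 1538 entries of the second axis, since the description here does not say whether any entries are excluded first.

```python
import numpy as np


def mean_pool(raw_embeddings: np.ndarray) -> np.ndarray:
    """Average over axis 1 (the patch axis): (B, N, D) -> (B, D).

    Hypothetical helper for illustration; not taken from the PR.
    """
    if raw_embeddings.ndim != 3:
        raise ValueError(f"expected a 3-D array, got shape {raw_embeddings.shape}")
    return raw_embeddings.mean(axis=1)


# Random stand-in data with the shapes quoted in the commit message (B = 2).
rng = np.random.default_rng(seed=0)
raw = rng.standard_normal(size=(2, 1538, 768)).astype(np.float32)

pooled = mean_pool(raw)
print(pooled.shape)  # (2, 768)
```

Each row of `pooled` is then a single 768-dimensional vector per input image, which is the shape described for the stored embeddings.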
weiji14 marked this pull request as ready for review on Dec 22, 2023, 05:29
weiji14 added this to the v0 Release milestone on Dec 22, 2023
weiji14 (Contributor, Author) commented on Dec 22, 2023

Gonna merge this straight and preview it live on the main branch!

weiji14 merged commit 73bc58c into main on Dec 22, 2023
2 checks passed
weiji14 deleted the docs/model_embeddings branch on Dec 22, 2023, 05:33
brunosan pushed a commit that referenced this pull request Dec 27, 2023
* 📝 Document how to generate vector embeddings

Step-by-step instructions on how to produce embeddings from the pretrained model: checking that one has permission to access the GeoTIFF files, downloading the model checkpoint, and running the model prediction to get the GeoParquet output. Also gives a tip on what a suitable VM instance would look like.

* 📝 Document details of how the mean embeddings were computed

Extra technical details on how the raw (B, 1538, 768) embeddings are turned into (B, 768) shaped embeddings by taking the mean along the spatial patches.

* 📝 Document format of the GeoParquet table and how to read it

Useful details about the filename convention and table schema of the embeddings stored in GeoParquet format, plus some sample GeoPandas code showing how to read a *.gpq file. Also links to some guides and resources from the Cloud Native Geospatial Foundation.

* ✏️ Typo embedding -> embeddings

Never sure whether it's singular or plural.
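
As a rough companion to the commit notes above about reading a *.gpq file with GeoPandas, a minimal sketch might look like the following. It is not the sample from the documentation itself: the helper names and the `data/embeddings` folder are hypothetical, and no column names are assumed because the table schema is not reproduced in this conversation. `geopandas.read_parquet` is the standard GeoPandas call for GeoParquet files and needs `pyarrow` installed.

```python
from pathlib import Path


def find_gpq_files(folder):
    """Return the *.gpq files directly inside `folder`, sorted by name."""
    return sorted(Path(folder).glob("*.gpq"))


def read_embeddings(path):
    """Load one GeoParquet file into a geopandas.GeoDataFrame."""
    # Imported lazily so the file-listing helper works even where
    # the geospatial stack (geopandas + pyarrow) is not installed.
    import geopandas as gpd

    return gpd.read_parquet(path)


if __name__ == "__main__":
    # "data/embeddings" is a hypothetical folder; point it at wherever
    # the model prediction step wrote its output.
    for gpq_path in find_gpq_files("data/embeddings"):
        gdf = read_embeddings(gpq_path)
        print(gpq_path.name, gdf.shape, gdf.crs)
        print(gdf.head())
```

Printing `gdf.head()` and `gdf.crs` is a quick way to check the table schema and coordinate reference system against what the documentation describes.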