zarrdata

A convenient way to store a publicly accessible Zarr dataset that is versioned and optionally tied to a Zenodo DOI:

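import xarray as xr
import fsspec  # not called directly; must be installed so the HTTP URL can be read

uri = 'https://scottyhq.github.io/zarrdata/air_temperature.zarr'
# Open the remote store using its consolidated metadata (.zmetadata)
ds = xr.open_dataset(uri, engine="zarr", consolidated=True)
# Plot the second time step of the air temperature variable
ds.air.isel(time=1).plot(x="lon")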
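
With consolidated=True, the metadata for every array in the store is read from a single .zmetadata JSON file instead of from separate per-array files. The following is a minimal sketch of what that file holds, assuming the Zarr v2 consolidated layout (a top-level "metadata" mapping keyed by paths such as "air/.zarray"). The helper name summarize_consolidated and the sample values are hypothetical and are not taken from this repository:

```python
import json


def summarize_consolidated(meta):
    """Map each array name to its shape and chunk shape.

    `meta` is the parsed content of a Zarr v2 `.zmetadata` file.
    """
    arrays = {}
    for key, value in meta.get("metadata", {}).items():
        # Array metadata lives under keys ending in ".zarray"
        if key.endswith(".zarray"):
            name = key[: -len(".zarray")].rstrip("/")
            arrays[name] = {"shape": value["shape"], "chunks": value["chunks"]}
    return arrays


# Hand-written sample in the consolidated layout (values are illustrative only)
sample = json.loads("""
{
  "zarr_consolidated_format": 1,
  "metadata": {
    ".zgroup": {"zarr_format": 2},
    "air/.zarray": {"shape": [10, 5, 8], "chunks": [10, 5, 8], "dtype": "<f4"},
    "air/.zattrs": {"units": "degK"},
    "lat/.zarray": {"shape": [5], "chunks": [5], "dtype": "<f4"}
  }
}
""")

print(summarize_consolidated(sample))
```

For a store served over HTTP, this file would sit at the store URL with /.zmetadata appended, so it can be inspected in a browser before opening the dataset.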

The basic idea is to host a smallish citable record (<1 GB) on a static GitHub Pages website so that your tutorial, research code, benchmarking suite, etc. can run against a citable dataset.

The key limitation of this approach is that each Zarr chunk must be smaller than 100 MB (the GitHub per-file limit), and the total size of the repo, including the Zarr store, should stay under 1 GB (the GitHub Pages limit). If you're dealing with more than 1 GB of data or want high performance, you probably want to store the data files on AWS S3, GCS, etc.
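
Before committing a store, it can help to check it against these limits. The sketch below walks a local Zarr directory and flags any file at or above the per-file limit, plus the overall total. The function names scan_store and fits_limits are hypothetical, and the limits are taken as decimal megabytes and gigabytes, which is a conservative assumption:

```python
import os

# Limits quoted in the text above (decimal units assumed)
MAX_FILE_BYTES = 100 * 1000**2   # per-file limit for a GitHub repository
MAX_TOTAL_BYTES = 1000**3        # overall size guideline for GitHub Pages


def scan_store(path, max_file=MAX_FILE_BYTES):
    """Return (total_bytes, oversized) for a directory store.

    `oversized` is a sorted list of (relative_path, size_in_bytes) for
    every file whose size is at or above `max_file`.
    """
    total = 0
    oversized = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            size = os.path.getsize(full)
            total += size
            if size >= max_file:
                oversized.append((os.path.relpath(full, path), size))
    return total, sorted(oversized)


def fits_limits(path, max_file=MAX_FILE_BYTES, max_total=MAX_TOTAL_BYTES):
    """True when no file reaches max_file and the total stays below max_total."""
    total, oversized = scan_store(path, max_file)
    return total < max_total and not oversized
```

Calling fits_limits("air_temperature.zarr") on a local copy of the store gives a quick yes/no answer, and scan_store lists which chunk files would need to be rechunked to something smaller.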

Configuration steps

1. Add Zarr data: The create_zarr.py script creates a Zarr store from the Xarray tutorial dataset, but if you already have a data.zarr store, just add it to your repo.
2. Add a Jekyll configuration file: GitHub Pages automatically deploys your repository and serves it as a static site via Jekyll. Because Jekyll ignores hidden files (.zattrs, .zmetadata, etc.) by default, you need a _config.yml to ensure those files are included.
3. Enable GitHub Pages: To publish the site, enable GitHub Pages for the repository under Settings -> Pages -> Source (select the 'main' branch and click 'Save'). Then you'll have a live HTTP website with the repo README.md rendered. For this repo, https://github.com/scottyhq/zarrdata, the website is https://scottyhq.github.io/zarrdata.
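
The Jekyll configuration step above can be sketched as a short script that writes the file. Jekyll's include setting forces the listed files into the built site even when their names start with a dot. The exact contents of this repository's _config.yml are not shown here, so the file list below (the Zarr v2 metadata names) and the helper name write_jekyll_config are assumptions:

```python
from pathlib import Path

# Hidden metadata files used by Zarr v2 stores (assumed list; extend as needed)
ZARR_HIDDEN_FILES = [".zattrs", ".zgroup", ".zarray", ".zmetadata"]


def write_jekyll_config(repo_dir, hidden_files=ZARR_HIDDEN_FILES):
    """Write a _config.yml whose `include` key lists the hidden Zarr files."""
    lines = ["include:"] + [f'  - "{name}"' for name in hidden_files]
    config_path = Path(repo_dir) / "_config.yml"
    config_path.write_text("\n".join(lines) + "\n")
    return config_path
```

After the site deploys, it is worth requesting one of the hidden files directly (for example the store's .zmetadata) to confirm that the deployed Jekyll version really serves them from nested directories.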
