add new backend api documentation #4810

Merged
merged 58 commits into from
Mar 8, 2021
Commits
7e33010
documentation first draft
aurghs Jan 14, 2021
75544c9
documentation update
aurghs Jan 14, 2021
c6f64cc
documentation update
aurghs Jan 14, 2021
11fc283
update documentation
aurghs Jan 26, 2021
1b8ac13
update backend documentation
aurghs Feb 2, 2021
e471f3a
incompletre draft: update Backend Documentation
aurghs Feb 2, 2021
65c339d
fix
aurghs Feb 3, 2021
113197d
fix syle
aurghs Feb 3, 2021
f9ed1d4
Update doc/internals.rst
aurghs Feb 4, 2021
6b2ecd5
Update doc/internals.rst
aurghs Feb 4, 2021
1958163
Update doc/internals.rst
aurghs Feb 4, 2021
bba32e4
Update doc/internals.rst
aurghs Feb 4, 2021
cb8d716
Update doc/internals.rst
aurghs Feb 4, 2021
2adf355
Update doc/internals.rst
aurghs Feb 4, 2021
7ec3238
Update doc/internals.rst
aurghs Feb 4, 2021
fb03493
Update doc/internals.rst
aurghs Feb 4, 2021
22794d2
Update doc/internals.rst
aurghs Feb 4, 2021
67d2c1f
Update doc/internals.rst
aurghs Feb 4, 2021
f58f16b
Update doc/internals.rst
aurghs Feb 4, 2021
6a07a7c
Update doc/internals.rst
aurghs Feb 4, 2021
1285874
Update doc/internals.rst
aurghs Feb 4, 2021
112837d
Update doc/internals.rst
aurghs Feb 4, 2021
bdc46aa
Update doc/internals.rst
aurghs Feb 4, 2021
fe22048
Update doc/internals.rst
aurghs Feb 4, 2021
f492136
update section lazy laoding
aurghs Feb 3, 2021
fa7f212
Merge branch 'documentation-draft' of github.com:bopen/xarray into do…
aurghs Feb 4, 2021
b74c803
Merge branch 'master' into documentation-draft
aurghs Feb 4, 2021
de9432f
Merge remote-tracking branch 'origin/master' into documentation-draft
aurghs Feb 4, 2021
c50a95c
Update doc/internals.rst
aurghs Feb 4, 2021
ab62beb
Update doc/internals.rst
aurghs Feb 4, 2021
e470f36
Update doc/internals.rst
aurghs Feb 4, 2021
a777445
update internals.rst backend
aurghs Feb 4, 2021
87ed0fa
Merge branch 'documentation-draft' of github.com:bopen/xarray into do…
aurghs Feb 4, 2021
1f0870e
add lazy loading documentation
aurghs Feb 10, 2021
885a6bd
update example on indexing type
aurghs Feb 11, 2021
1381336
style
aurghs Feb 11, 2021
0ef410a
fix
aurghs Feb 11, 2021
dc36138
modify backend indexing doc
aurghs Feb 11, 2021
23e2423
fix
aurghs Feb 11, 2021
99ca49e
removed LazilyVectorizedIndexedArray from backend doc
aurghs Feb 11, 2021
b1eb077
small fix in doc
aurghs Feb 11, 2021
39bf16b
small fixes in backend doc
aurghs Feb 11, 2021
121c060
removed exmple vectorized indexing
aurghs Feb 12, 2021
e838d40
update documentation
aurghs Feb 25, 2021
8633e08
update documentation
aurghs Feb 25, 2021
3281345
Merge branch 'documentation-draft' of github.com:bopen/xarray into do…
aurghs Feb 25, 2021
992d47d
Merge remote-tracking branch 'origin/master' into documentation-draft
aurghs Feb 25, 2021
e3eb56d
isort
aurghs Feb 25, 2021
a456478
rename store_spec in filename_or_obj in guess_can_open
aurghs Feb 25, 2021
abf60e0
small update in backend documentation
aurghs Feb 25, 2021
e72ce9b
small update in backend documentation
aurghs Feb 25, 2021
7108f80
Update doc/internals.rst
aurghs Mar 3, 2021
e8499cd
Update doc/internals.rst
aurghs Mar 4, 2021
0955c16
fix backend documentation
aurghs Mar 4, 2021
54c202c
replace LazilyOuterIndexedArray with LazilyIndexedArray
aurghs Mar 4, 2021
3cc18d5
Update doc/internals.rst
aurghs Mar 5, 2021
9faf5e6
Update doc/internals.rst
alexamici Mar 8, 2021
06371df
Fix broken doc merge
alexamici Mar 8, 2021
98 changes: 98 additions & 0 deletions doc/internals.rst
@@ -231,3 +231,101 @@ re-open it directly with Zarr:
zgroup = zarr.open("rasm.zarr")
print(zgroup.tree())
dict(zgroup["Tair"].attrs)


How to add a new backend
------------------------

Adding a new backend for read support to Xarray is easy, and you don't need to integrate your code into Xarray itself.
All you need to do is:

- Implement a function that returns an instance of :py:class:`Dataset`.

- Create a ``BackendEntrypoint`` instance with your function as input.
Member


Suggested change
- Create a `BackendEntrypoint`` instance with your function as input.
- Create a ``BackendEntrypoint`` instance with your function as input.


- Declare this instance as an external plugin in your ``setup.py``.
keewis marked this conversation as resolved.

The ``BackendEntrypoint`` class is the main interface with Xarray;
it is a container for the attributes and functions to be implemented by the backend:

- ``open_dataset``
- [``open_dataset_parameters``]
- [``guess_can_open``]

While ``open_dataset`` is mandatory, ``open_dataset_parameters`` and ``guess_can_open`` are optional.
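
The shape of this container can be sketched without Xarray at all. The class below is a stand-in written for illustration only (it is not Xarray's ``BackendEntrypoint``), and ``my_open_dataset`` and ``my_entrypoint`` are hypothetical names. It only shows which attribute is required and which two may be left out:

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence


@dataclass
class BackendEntrypointSketch:
    """Stand-in for illustration only; not Xarray's ``BackendEntrypoint``."""

    open_dataset: Callable[..., Any]  # mandatory
    open_dataset_parameters: Optional[Sequence[str]] = None  # optional
    guess_can_open: Optional[Callable[..., bool]] = None  # optional (signature assumed)


def my_open_dataset(filename, drop_variables=None):
    """Hypothetical reader; a real backend would build and return a Dataset."""
    raise NotImplementedError


# Only ``open_dataset`` has to be supplied; the other two default to None.
my_entrypoint = BackendEntrypointSketch(open_dataset=my_open_dataset)
```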


BackendEntrypoint.open_dataset
Member


Consider documenting the interface as a class? That might make it easier to document (e.g., in the form of docstrings).

++++++++++++++++++++++++++++++

**Inputs**

The ``BackendEntrypoint.open_dataset`` function shall accept one argument, ``filename``, and one keyword argument, ``drop_variables``:

- ``filename`` may be a string containing a relative path, or an instance of ``pathlib.Path``.
- ``drop_variables`` may be ``None`` or an iterable containing the names of the variables to be dropped when reading the data.
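
A minimal sketch of a function with these two inputs follows. It assumes a toy text format invented for this example (one ``name: v1 v2`` line per variable), and a plain dict stands in for the :py:class:`Dataset` that a real backend would return:

```python
import tempfile
from pathlib import Path


def open_dataset(filename, drop_variables=None):
    """Hypothetical reader for a toy format with one ``name: v1 v2`` line per variable.

    A plain dict stands in for the Dataset a real backend would return.
    """
    path = Path(filename)  # accepts a str path or a pathlib.Path
    dropped = set(drop_variables) if drop_variables is not None else set()
    variables = {}
    for line in path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, values = line.partition(":")
        name = name.strip()
        if name in dropped:
            continue  # the caller asked not to read this variable
        variables[name] = [float(value) for value in values.split()]
    return variables


# Usage: write a small sample file, then read it with and without dropping.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("temperature: 280.5 281.0\npressure: 1013.0 1009.5\n")
    full = open_dataset(str(sample))
    subset = open_dataset(sample, drop_variables=["pressure"])
```

Skipping dropped variables while parsing, rather than removing them afterwards, means their data is never read.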

It may also accept a set of keyword arguments, which Xarray :py:func:`open_dataset` passes
directly to the backend ``BackendEntrypoint.open_dataset``.
Currently, Xarray :py:func:`open_dataset` passes two groups of arguments to the backend.
The first group is the **decoders**, explicitly defined in the Xarray :py:func:`open_dataset` signature:

- ``mask_and_scale=None``
aurghs marked this conversation as resolved.
- ``decode_times=None``
- ``decode_timedelta=None``
- ``use_cftime=None``
- ``concat_characters=None``
- ``decode_coords=None``

They are passed to the backend only if the user explicitly passes a value other than ``None``.
The user can enable or disable these parameters by setting the keyword ``decode_cf``, which is managed by Xarray.
The backend can implement these decoder keyword arguments,
and doing so is desirable where it makes sense for the specific backend. For more details, see the **Decoders** sub-section.
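
One plausible reading of these rules is sketched below. The helper name ``resolve_decoders`` is hypothetical, and the choice that ``decode_cf=False`` overrides an explicit ``True`` is an assumption of this sketch, not a statement about Xarray's implementation:

```python
def resolve_decoders(decode_cf, supported_parameters, **decoders):
    """Return the decoder keyword arguments to forward to the backend (sketch)."""
    resolved = {}
    for name, value in decoders.items():
        # Assumption: decode_cf=False switches off every decoder the backend supports.
        if decode_cf is False and name in supported_parameters:
            value = False
        # Decoders left at None are not forwarded at all.
        if value is not None:
            resolved[name] = value
    return resolved


# Parameters a hypothetical backend declares in its open_dataset signature.
SUPPORTED = ("filename", "drop_variables", "mask_and_scale", "decode_times")

# Only the decoder the user set explicitly is forwarded.
forwarded = resolve_decoders(
    None, SUPPORTED, mask_and_scale=None, decode_times=False, use_cftime=None
)

# decode_cf=False turns off the supported decoders; the rest stay unset.
disabled = resolve_decoders(
    False, SUPPORTED, mask_and_scale=None, decode_times=True, use_cftime=None
)
```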


The second group can be passed by the user in a dictionary inside ``backend_kwargs`` or explicitly as keyword arguments ``**kwargs``.
The two are merged and passed to the backend as keyword arguments.
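
The grouping step can be sketched as a simple dictionary merge. The helper name and the option names ``group`` and ``lock`` are hypothetical, and the precedence on a name clash is an assumption of this sketch:

```python
def merge_backend_kwargs(backend_kwargs=None, **kwargs):
    """Combine both ways of passing backend-specific options (sketch)."""
    merged = dict(kwargs)
    if backend_kwargs is not None:
        # Assumption: on a name clash the ``backend_kwargs`` entry wins.
        merged.update(backend_kwargs)
    return merged


# One option arrives through backend_kwargs, the other as a plain keyword.
merged = merge_backend_kwargs(backend_kwargs={"group": "/forecast"}, lock=False)
```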

**Output**

``BackendEntrypoint.open_dataset`` output shall be an instance of Xarray :py:class:`Dataset`
Member


Suggested change
```BackendEntrypoint`.open_dataset`` output shall be an instance of Xarray :py:class:`Dataset`
``BackendEntrypoint.open_dataset`` output shall be an instance of Xarray :py:class:`Dataset`

that implements an additional method ``close``, used by Xarray to ensure that the related files are closed.
If you don't want to support lazy loading, then the :py:class:`Dataset` shall contain NumPy arrays and your work is almost done.

BackendEntrypoint.open_dataset_parameters
+++++++++++++++++++++++++++++++++++++++++
``open_dataset_parameters`` is the list of ``BackendEntrypoint.open_dataset`` parameters.
It is needed to enable or disable the decoders supported by the backend when the user explicitly sets ``decode_cf``. For this
reason, all the decoders supported by the backend must be explicitly declared in the signature.
``open_dataset_parameters`` is not mandatory: if it is not provided, Xarray will inspect the signature of
``BackendEntrypoint.open_dataset`` and create ``open_dataset_parameters`` from it.
However, signature inspection does not support ``**kwargs`` or ``*args`` in the signature; in that case it will
raise an error.
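
The kind of signature inspection described here can be sketched with the standard ``inspect`` module. The function name ``detect_parameters``, the two example backends, and the choice of ``TypeError`` are assumptions of this sketch (the text above only says that an error is raised):

```python
import inspect


def detect_parameters(open_dataset):
    """Collect the parameter names of ``open_dataset`` (sketch)."""
    names = []
    for name, parameter in inspect.signature(open_dataset).parameters.items():
        if parameter.kind in (
            inspect.Parameter.VAR_POSITIONAL,  # *args
            inspect.Parameter.VAR_KEYWORD,  # **kwargs
        ):
            raise TypeError(
                f"cannot detect parameters of {open_dataset.__name__!r}: "
                "*args and **kwargs are not supported"
            )
        names.append(name)
    return tuple(names)


def explicit_backend(filename, drop_variables=None, mask_and_scale=None, decode_times=None):
    """Hypothetical backend that declares every supported decoder explicitly."""


def catch_all_backend(filename, **kwargs):
    """Hypothetical backend whose signature cannot be inspected usefully."""


detected = detect_parameters(explicit_backend)

try:
    detect_parameters(catch_all_backend)
except TypeError as error:
    rejected = str(error)
else:
    rejected = None
```

A backend that needs ``**kwargs`` in its signature would have to provide ``open_dataset_parameters`` itself.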

BackendEntrypoint.guess_can_open
+++++++++++++++++++++++++++++++++++++++++



How to support Lazy Loading
+++++++++++++++++++++++++++

Decoders
++++++++
- strings.CharacterArrayCoder()
aurghs marked this conversation as resolved.
- strings.EncodedStringCoder()
- variables.UnsignedIntegerCoder()
- variables.CFMaskCoder()
- variables.CFScaleOffsetCoder()
- times.CFTimedeltaCoder()
- times.CFDatetimeCoder(use_cftime=use_cftime)

How to register a backend
+++++++++++++++++++++++++
Define a new entrypoint in your ``setup.py`` (or ``setup.cfg``) with:

- group: ``xarray.backends``
- name: the name to be passed to :py:func:`open_dataset` as ``engine``.
- object reference: the reference to the instance of ``BackendEntrypoint``

See https://packaging.python.org/specifications/entry-points/#data-model for more information.
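
The three fields map onto the ``entry_points`` argument of ``setuptools.setup`` roughly as sketched below. The names ``my_engine``, ``my_package.my_module`` and ``my_entrypoint`` are hypothetical, and the group is written as ``xarray.backends`` (plural) on the assumption that this is the name Xarray's plugin loader looks up:

```python
# Assumed group name looked up by Xarray's plugin loader.
GROUP = "xarray.backends"

# Hypothetical engine name, as the user would pass it via ``engine=``.
ENGINE_NAME = "my_engine"

# Hypothetical object reference: ``module.path:attribute`` of the
# ``BackendEntrypoint`` instance.
OBJECT_REFERENCE = "my_package.my_module:my_entrypoint"

# Mapping in the form expected by the ``entry_points`` argument.
ENTRY_POINTS = {GROUP: [f"{ENGINE_NAME} = {OBJECT_REFERENCE}"]}

# In setup.py this mapping would be passed along with the usual metadata:
#     setuptools.setup(name="my_package", entry_points=ENTRY_POINTS)
```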