Add a geospatial extension #31

mcflugen · 2024-01-30T01:12:24Z

This pull request is meant to explore one possible implementation of BMI extensions (or maybe plugins, rather than extensions? Would that be a better word?).

The accompanying SIDL file that describes the interface for this example extension is something like the following (this extension is similar to what @mdpiper outlined in csdms/bmi#99),

package csdms version 0.0.1 {
  interface bmi_geo {

    int initialize(in string config_file);
    int get_grid_coordinate_names(in int grid, out array<string, 1> names);
    int get_grid_coordinate_units(in int grid, out array<string, 1> units);
    int get_grid_coordinate(in int grid, in string coordinate, in array<double, 1> values);
    int get_grid_crs(out string name);
  }
}

A couple notes on this,

This SIDL definition is separate from the core bmi SIDL.
I've added an initialize function that is intended to be used in the same way as core BMI initialize. I think every extension likely should have this (but maybe it shouldn't be a requirement, I'm not sure).

This requires that a new function, get_extensions, be added to the core BMI. This function returns a list of all the extensions a component implements. Each element of the list is a string of the form <extension-id>@<library-name>:<entry-point> (the precise form of this string doesn't matter; this is just something I came up with that seemed to work). <extension-id> is a unique identifier for the extension, <library-name> is the name of the library that contains the extension (for Python, this would be a module name; for C, it would be the name of a shared library), and <entry-point> is the name of the class that implements the extension. The library would be separate from, and sit alongside, the core BMI library.

heat/bmi_geo.py defines the abstract base class for the geospatial extension.
heat/bmi_heat_geo.py implements the extension for the heat model.
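
As a rough illustration, an abstract base class that mirrors the SIDL above might look like the sketch below. This is not the contents of heat/bmi_geo.py. The Python signatures, in particular the return types and the idea that `values` is a caller-supplied buffer filled in place, are assumptions modeled on how the core BMI is usually expressed in Python.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from collections.abc import MutableSequence


class BmiGeo(ABC):
    """Sketch of the geospatial extension interface (signatures are assumed)."""

    @abstractmethod
    def initialize(self, config_file: str) -> None:
        """Set up the extension, analogous to the core BMI initialize."""

    @abstractmethod
    def get_grid_coordinate_names(self, grid: int) -> tuple[str, ...]:
        """Return the names of the coordinates of a grid, e.g. ('y', 'x')."""

    @abstractmethod
    def get_grid_coordinate_units(self, grid: int) -> tuple[str, ...]:
        """Return the units of each coordinate of a grid."""

    @abstractmethod
    def get_grid_coordinate(
        self, grid: int, coordinate: str, values: MutableSequence[float]
    ) -> MutableSequence[float]:
        """Fill *values* with the named coordinate of a grid and return it."""

    @abstractmethod
    def get_grid_crs(self) -> str:
        """Return the name of the coordinate reference system."""
```

Because every method is abstract, `BmiGeo` itself cannot be instantiated; a model-specific class has to implement all five methods.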

Some example code that demonstrates how a framework might use the extension,

>>> import importlib
>>> import numpy as np
>>> from heat.bmi_heat import BmiHeat

>>> heat = BmiHeat()
>>> heat.initialize("heat.yaml")

>>> heat.get_extensions()
('bmi_geospatial@heat.bmi_heat_geo:BmiHeatGeo',)

>>> extensions = {id_: entry_point for id_, entry_point in [ext.split("@") for ext in heat.get_extensions()]}
>>> module_name, class_name = extensions["bmi_geospatial"].split(":")
>>> module = importlib.import_module(module_name)
>>> cls = getattr(module, class_name)

>>> heat_geo = cls(heat)
>>> heat_geo.get_grid_coordinate_names(0)
('y', 'x')
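
The lookup steps above could be collected into a pair of helpers, roughly as follows. This is a sketch that assumes the `<extension-id>@<library-name>:<entry-point>` string form described earlier; the function names `parse_extension` and `load_extension_class` are hypothetical and not part of this pull request.

```python
from __future__ import annotations

import importlib


def parse_extension(spec: str) -> tuple[str, str, str]:
    """Split '<extension-id>@<library-name>:<entry-point>' into its parts."""
    ext_id, at, target = spec.partition("@")
    library, colon, entry_point = target.partition(":")
    if not (ext_id and at and library and colon and entry_point):
        raise ValueError(f"malformed extension string: {spec!r}")
    return ext_id, library, entry_point


def load_extension_class(spec: str) -> type:
    """Import the library named in *spec* and return its entry-point class."""
    _, library, entry_point = parse_extension(spec)
    module = importlib.import_module(library)
    return getattr(module, entry_point)
```

With these, a framework would call `load_extension_class(spec)(heat)` for each `spec` in `heat.get_extensions()`, and a malformed string fails early with a `ValueError` rather than with an unpacking error.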

RolfHut · 2024-02-09T09:52:37Z

For eWaterCycle, we have adopted a "plugin" system for tying specific models into our main python package (work of @BSchilperoort). Maybe this can serve as inspiration?
https://github.com/eWaterCycle/ewatercycle-leakybucket/blob/main/plugin_guide.md

BSchilperoort · 2024-02-13T16:06:02Z

We used Python's entry points for adding these plugins. To me, however, the extension system here seems a bit complex (at least for the Python implementation).
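
For context, discovery through entry points can be sketched with the standard library alone. The group name "bmi.extensions" below is a hypothetical placeholder, not a name used by eWaterCycle or BMI, and `discover_plugins` is an illustrative helper.

```python
from importlib.metadata import entry_points


def discover_plugins(group: str = "bmi.extensions") -> dict:
    """Map entry-point names to entry points registered under *group*.

    The default group name is a made-up placeholder for illustration.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: a mapping from group name to entry points.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}
```

Nothing is imported until `ep.load()` is called on one of the returned entry points, so an installed but unused plugin costs nothing at discovery time.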

My main question is: should the extensions be optional? Which situation are we discussing here:

1. BMI extensions should be optional for the user. The base BMI should be usable without installing any extra bmi-plugin dependencies.
2. BMI extensions are shipped with the model's main BMI, which cannot be installed without the extensions.

The first case is a bit more difficult for developers, as you'd need to write the code in such a way that bmi_geospatial does not have to be imported for the main BMI to be used.
The second case is simpler for the model developers, as you'd add an extra inheritance to the code and implement the required methods.

Perhaps for Python a structure like this can be good for case 1:

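from bmipy import Bmi


class NotInitializedError(Exception):
    """Raised when an extension is accessed before it has been initialized."""


class BmiHeat(Bmi):
    _geo = None

    def __init__(self) -> None:
        pass

    def get_extensions(self) -> tuple[str, ...]:
        return ("geo",)  # trailing comma: a one-element tuple, not a string

    def initialize_extension(self, extension: str) -> None:
        if extension == "geo":
            # Imported lazily, so an ImportError is raised only when the
            # extension is requested and bmi_geospatial is not installed.
            from heat.bmi_heat_geo import BmiHeatGeo

            self._geo = BmiHeatGeo(self)
        else:
            raise ValueError(f"unknown extension: {extension!r}")

    @property
    def geo(self):  # give users a nice error when they forgot to initialize the extension
        if self._geo is None:
            raise NotInitializedError("call initialize_extension('geo') first")
        return self._geo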

which is used like:

>>> heat = BmiHeat()
>>> heat.initialize_extension("geo")
>>> heat.geo.get_grid_coordinate_names(0)

For case 2 multiple inheritance seems most straightforward to me:

class BmiHeat(Bmi, BmiGeo): ...

Where the BmiGeo methods have the prefix geo_.
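
A minimal sketch of this second case is given below. The `Bmi` and `BmiGeo` classes here are cut-down stand-ins written for illustration, not the real interfaces, and only two extension methods are shown. The prefix matters because the extension in this pull request also defines an initialize function, which would otherwise collide with the core one.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Bmi(ABC):
    """Stand-in for the core BMI (only one method shown)."""

    @abstractmethod
    def initialize(self, config_file: str) -> None: ...


class BmiGeo(ABC):
    """Stand-in for the extension; every method carries the geo_ prefix."""

    @abstractmethod
    def geo_initialize(self, config_file: str) -> None: ...

    @abstractmethod
    def geo_get_grid_coordinate_names(self, grid: int) -> tuple[str, ...]: ...


class BmiHeat(Bmi, BmiGeo):
    """One class implements both the core BMI and the extension."""

    def initialize(self, config_file: str) -> None:
        self.config_file = config_file

    def geo_initialize(self, config_file: str) -> None:
        self.geo_config_file = config_file

    def geo_get_grid_coordinate_names(self, grid: int) -> tuple[str, ...]:
        return ("y", "x")


heat = BmiHeat()
heat.initialize("heat.yaml")
heat.geo_initialize("heat.yaml")
names = heat.geo_get_grid_coordinate_names(0)
```

A framework can then test for the extension with `isinstance(heat, BmiGeo)`.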

I'm curious what you think of this @mcflugen!

RolfHut · 2024-02-14T05:35:57Z

Hmm, my 2 cents on this: I prefer option 2, because in eWaterCycle there is a larger separation between the user and the model developer. If a model can be used both with and without a certain extension, I'd argue that the developer should present the user with two versions of the model. In the case of @BSchilperoort's example:

class BmiHeat(Bmi): ...
class BmiHeatGeo(Bmi, BmiGeo): ...

mcflugen · 2024-02-22T19:32:24Z

@BSchilperoort, @RolfHut Thanks for your comments. This is exactly the sort of feedback I was looking for!

My feeling is that BMI extensions should be optional.

The primary rationale for making extensions optional is the potential for a large and dynamic set of extensions. Requiring implementers to support every function across all extensions, and to continually update their libraries as extensions are added and removed, would be overly demanding and error-prone, and could limit reusability.

I intentionally didn't use inheritance as I wanted to decouple the extension from the core BMI. The core BMI, through the get_extensions function, tells a framework what extensions are available and how to access them. The extension has a reference to the core BMI since it may need to access BMI functions or the underlying model. The extension can directly access the core BMI, but not the other way around.

I think we can leave it up to a framework or wrapper to add attributes to an instance of a BMI to provide a cleaner way to access extensions (e.g. heat.geo.get_grid_coordinate_names(0)). I was thinking of something similar but using a dictionary instead of attributes (e.g. heat.ext["geo"].get_grid_coordinate_names(0)), since extension names are not necessarily valid variable names. Either way, though, this can be functionality that's added by a wrapper.
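
A wrapper along these lines might look like the sketch below. The class name `BmiWithExtensions`, the `resolve` parameter, and the helper `import_entry_point` are all hypothetical; the only things taken from this pull request are get_extensions and the `<extension-id>@<library-name>:<entry-point>` string form.

```python
from __future__ import annotations

import importlib
from typing import Callable


def import_entry_point(target: str) -> type:
    """Resolve '<library-name>:<entry-point>' to a class via importlib."""
    library, _, entry_point = target.partition(":")
    return getattr(importlib.import_module(library), entry_point)


class BmiWithExtensions:
    """Hypothetical wrapper that exposes a BMI's extensions as ``ext[...]``."""

    def __init__(self, bmi, resolve: Callable[[str], type] = import_entry_point):
        self.bmi = bmi
        self.ext = {}
        for spec in bmi.get_extensions():
            ext_id, _, target = spec.partition("@")
            # Each extension receives the core BMI, never the other way around.
            self.ext[ext_id] = resolve(target)(bmi)
```

With the heat example, `BmiWithExtensions(heat).ext["bmi_geospatial"].get_grid_coordinate_names(0)` would reach the extension. The `resolve` argument is there only so that the import step can be swapped out, for instance in tests.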

I view user-friendliness as a secondary goal of the BMI. It can be left up to a framework to make a BMI more user-friendly. For example, sensible-bmi wraps a BMI to make it more Pythonic and user-friendly.

RolfHut · 2024-02-23T08:34:38Z

I think you touch on an important point in the design philosophy of BMI in general that may not have been written down explicitly, but that I do observe in your (and our own) work:

The reason for BMI to exist is to support reuse and coupling of models. This is achieved by optimizing for interoperability between models ("components" if we include data). Optimizing interoperability often means trading user-friendliness for more generalized, standardized ways of working.
There is no single user of BMI, and since every user, or group of users, has different demands on user-friendliness, BMI as a standard cannot and should not try to accommodate them all.
The reason most wrappers / frameworks exist is to create user-friendliness for a certain group of users, usually with a certain subset of models / science domains. This is great and should be encouraged.

Concluding: there should be a clear distinction between the "core BMI", which optimizes interoperability, and "extensions and platforms", which optimize user-friendliness.

mdpiper · 2024-02-23T15:24:32Z

@RolfHut A side note that @mcflugen listed a set of BMI design principles in csdms/bmi#105 (which we should merge and include in the documentation). We should add your observations on BMI design, as well.

mcflugen added 4 commits January 29, 2024 17:30

add get_extension method

ca35934

add abc for the geospatial extension

8b7dfa4

implement geospatial extension for bmi_heat

86db321

remove some lint

b6f9254

mdpiper mentioned this pull request Feb 7, 2024

Add a geospatial extension csdms/bmi-example-fortran#20

Draft
