Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a geospatial extension #31

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Conversation

mcflugen
Copy link
Member

This pull request is meant to explore one possible implementation of BMI extensions (or maybe plugins, rather than extensions? Would that be a better word?).

The accompanying SIDL file that describes the interface for this example extension is something like the following (this extension is similar to what @mdpiper outlined in csdms/bmi#99),

package csdms version 0.0.1 {
  interface bmi_geo {

    int initialize(in string config_file);
    int get_grid_coordinate_names(in int grid, out array<string, 1> names);
    int get_grid_coordinate_units(in int grid, out array<string, 1> names);
    int get_grid_coordinate(in int grid, int string coordinate, in array<double, 1> values);
    int get_grid_crs(out string name);
  }
}

A couple notes on this,

  1. This SIDL definition is separate from the core bmi SIDL.
  2. I've added an initialize function that is intended to be used in the same way as core BMI initialize. I think every extension likely should have this (but maybe it shouldn't be a requirement, I'm not sure).

This requires a new function, get_extensions be added to the core BMI. This function puts together a list of all the extensions a component implements. Each element of the list is a string of the form <extension-id>@<library-name>:<entry-point> (the precise form of this string doesn't matter, this is just something I came up with that seemed to work). <extension-id> is a unique identifier for the extension, <library-name> is the name of the library that contains the extension (for Python, this would be a module name, for C it would be the name of a shared library), and <entry-point> is the name of the class that implements the extension. The library would be separate from and sit alongside of the core bmi library.

  • heat/bmi_geo.py defines the abstract base class for the geospatial extension.
  • heat/bmi_heat_geo.py implements the extension for the heat model.

Some example code that demonstrates how a framework might use the extension,

>>> import importlib
>>> import numpy as np
>>> from heat.bmi_heat import BmiHeat

>>> heat = BmiHeat()
>>> heat.initialize("heat.yaml")

>>> heat.get_extensions()
('bmi_geospatial@heat.bmi_heat_geo:BmiHeatGeo',)

>>> extensions = {id_: entry_point for id_, entry_point in [ext.split("@") for ext in heat.get_extensions()]}
>>> module_name, class_name = extensions["bmi_geospatial"].split(":")
>>> module = importlib.import_module(module_name)
>>> cls = getattr(module, class_name)

>>> heat_geo = cls(heat)
>>> heat_geo.get_grid_coordinate_names(0)
('y', 'x')

@RolfHut
Copy link

RolfHut commented Feb 9, 2024

For eWaterCycle, we have adopted a "plugin" system for tying specific models into our main python package (work of @BSchilperoort). Maybe this can serve as inspiration?
https://github.com/eWaterCycle/ewatercycle-leakybucket/blob/main/plugin_guide.md

@BSchilperoort
Copy link

BSchilperoort commented Feb 13, 2024

We used python's entry points for adding these plugins. However, to me the extension system here seems a bit complex (at least for the Python implementation).

My main question is: should the extensions be optional? Which situation are we discussing here:

  1. BMI extensions should be optional for the user. The base BMI should be usable without installing any extra bmi-plugin dependencies.
  2. BMI extensions are shipped with the model's main BMI and it cannot be installed without the extensions.

The first case is a bit more difficult for developers, as you'd need to write the code in such a way you don't need to import bmi_geospatial for the main BMI to be used.
The second case is simpler for the model developers, as you'd add an extra inheritance to the code and implement the required methods.

Perhaps for Python a structure like this can be good for case 1:

class BmiHeat(Bmi):
    _geo = None

    def __init__(self) -> None:
        pass

    def get_extensions(self) -> tuple[str, ...]:
        return ("geo")

    def initialize_extension(self, extension: str) -> None:
        if extension == "geo":
            self._geo: = BmiHeatGeo(self)  # raises an ImportError if bmi_geospatial is not installed
        else:
            raise ValueError

    @property
    def geo(self):  # give users a nice error when they forgot to initialize the extension
        if self._geo is not None:
            return self._geo
        else:
            raise NotInitializedError

which is used like:

>>> heat = BmiHeat()
>>> heat.initialize_extension("geo")
>>> heat.geo.get_grid_coordinate_names(0)

For case 2 multiple inheritance seems most straightforward to me:

class BmiHeat(Bmi, BmiGeo):

Where the BmiGeo methods have the prefix geo_.

I'm curious what you think of this @mcflugen!

@RolfHut
Copy link

RolfHut commented Feb 14, 2024

hmm, my 2 cents on this is that I prefer option 2. (because in eWaterCycle, a larger separation between user and model developer exists. If a model supports both with and without a certain extension, I'd argue that the developer presents the user with two versions of the model, in the case of @BSchilperoort his example:

class BmiHeat(Bmi):
class BmiHeatGeo(Bmi, BmiGeo):

@mcflugen
Copy link
Member Author

@BSchilperoort, @RolfHut Thanks for your comments. This is exactly the sort of feedback I was looking for!

My feeling is that BMI extensions should be optional.

The primary rationale for making extensions optional is the potential for a large and dynamic set of extensions. Requiring implementers to support every function across all extensions, and to continually update their libraries with the addition and removal of extensions, would be overly demanding, error prone, and possibly limit reusability.

I intentionally didn't use inheritance as I wanted to decouple the extension from the core BMI. The core BMI, through the get_extensions function, tells a framework what extensions are available and how to access them. The extension has a reference to the core BMI since it may need to access BMI functions or the underlying model. The extension can directly access the core BMI, but not the other way around.

I think we can leave it up to a framework or wrappers to add attributes to an instance of a BMI to make a cleaner way to access extensions (i.e. heat.geo.get_grid_coordinate_names(0)). I was thinking of something similar but using a dictionary instead of attributes (i.e. heat.ext["geo"].get_grid_coordinate_names(0)) as extension names may not necessarily be valid variable names. Either way, though, this can be functionality that's added by a wrapper.

I view user-friendliness as a secondary goal of the BMI. It can be left up to a framework to make a BMI more user-friendly. For example, the sensible-bmi wraps a BMI to make it more Pythonic and user-friendly.

@RolfHut
Copy link

RolfHut commented Feb 23, 2024

I think you touch on an important point in the design philosophy of BMI in general that may not have been written down explicitly, but that I do observe in your (and our own) work:

  • the reason for BMI to exist is to support re-use and coupling of models. This is achieved by optimizing for interoperability between models ("components" if we include data). Optimizing interoperabillity often means trading user friendly-ness for more generalized, standerdized ways of working.
  • there is no single user of BMI and since every user, or group of users, have different demand on user friendliness, BMI as standard can not and should not try to accomodate.
  • the reasons most wrappers / frameworks exist is to create user friendlyness for a certain group of users, usually with a certain subset of models / science domains. This is great and should be encouraged.

Concluding: there should be a clear distinction what is "core BMI" that optimizes interoperability and "extensions and platforms" that optimize user friendliness.

@mdpiper
Copy link
Member

mdpiper commented Feb 23, 2024

@RolfHut A side note that @mcflugen listed a set of BMI design principles in csdms/bmi#105 (which we should merge and include in the documentation). We should add your observations on BMI design, as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants