Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support passing a function to combine_attrs #4896

Merged
merged 17 commits into from
Jun 8, 2021

Conversation

keewis
Copy link
Collaborator

@keewis keewis commented Feb 11, 2021

Allowing to pass a function to combine_attrs is important if a user wants to do something the builtin merge strategies don't support.

@dcherian
Copy link
Contributor

One thing to think about is .encoding.

IIUC we want to treat .encoding like .attrs eventually, so we need to design the signature of this callable to handle that especially since .encoding can contain "useful" attributes like "coordinates" and "bounds" after #2844.

One option would be

def my_combine_attrs(list_of_attrs_dicts, list_of_encoding_dicts):
    attrs = ...
    encoding = ...
    return attrs, encoding

@keewis
Copy link
Collaborator Author

keewis commented Feb 17, 2021

sounds reasonable, but that would require a bigger change than just extending merge_attrs (which currently ignores encoding)

@max-sixty
Copy link
Collaborator

Shall we merge?

@keewis
Copy link
Collaborator Author

keewis commented Apr 19, 2021

not sure. There are a few questions about the signature of the user functions (see #4896 (comment) and #3891 (comment)), which I would like to answer before including this in a release (I might be wrong, but I think changing the signature after releasing is pretty hard)

xarray/core/merge.py Outdated Show resolved Hide resolved
Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keewis

xarray/core/combine.py Outdated Show resolved Hide resolved
@dcherian dcherian added the plan to merge Final call for comments label May 26, 2021
@keewis
Copy link
Collaborator Author

keewis commented May 26, 2021

there's two remaining issues: should we use a namedtuple / a dataclass for Context instead of the custom class? We basically want a struct without methods (for now?).

Also, how do we best construct that object without a lot of overhead? We need to get at least the function name, but if we pass that manually it's one more place to update when renaming something (not that we do that very often). Using inspect is possible but might be too complicated:

In [5]: import inspect
   ...: 
   ...: def current_function_name():
   ...:     frame = inspect.currentframe()
   ...:     try:
   ...:         caller = frame.f_back
   ...:         name = caller.f_code.co_name
   ...:     finally:
   ...:         del frame
   ...:         del caller
   ...: 
   ...:     return name
   ...: 
   ...: def func():
   ...:     print(current_function_name())
   ...: 
   ...: def another_func():
   ...:     print(current_function_name())
   ...: 
   ...: class A:
   ...:     def method(self):
   ...:         print(current_function_name())
   ...: 
   ...: f = func
   ...: 
   ...: func()
   ...: another_func()
   ...: f()
   ...: 
   ...: a = A()
   ...: a.method()
func
another_func
func
method

With this we can only get the name of the definition so this might break for injected methods, but I guess for injected methods it would be difficult to manually pass the function name, anyways.

@shoyer
Copy link
Member

shoyer commented May 27, 2021

Also, how do we best construct that object without a lot of overhead? We need to get at least the function name, but if we pass that manually it's one more place to update when renaming something (not that we do that very often). Using inspect is possible but might be too complicated:

Rather than introspection, I think we should try to be fully explicit about the function being called. Trying to introspect it from stack-frames is madness :)

So in that case, we would need to pass down the context information from the top level functions in xarray, e.g., everything that takes a combine_attrs argument.

In terms of the overall interface, one other concern I have is about the information we make available to users of this API. I can imagine that they might not only want attributes but also the complete xarray objects on which the function is being called. If that's the case, then they would also need more information from the global context.

@dcherian dcherian removed the plan to merge Final call for comments label May 27, 2021
@keewis
Copy link
Collaborator Author

keewis commented Jun 8, 2021

let's merge this as-is (unless there are any comments on the current state?) and I'll add the construction of the context objects in a new PR.

@dcherian
Copy link
Contributor

dcherian commented Jun 8, 2021

let's merge this as-is

Works for me. Once merged perhaps @huard or @DamienIrving can help us iterate on context

@keewis keewis merged commit e87d65b into pydata:master Jun 8, 2021
@keewis keewis deleted the callable-combine_attrs branch June 8, 2021 21:40
@keewis
Copy link
Collaborator Author

keewis commented Jun 8, 2021

thanks for the reviews, @shoyer, @dcherian, @max-sixty

@github-actions
Copy link
Contributor

github-actions bot commented Jun 8, 2021

Unit Test Results

0 files  ±0  0 suites  ±0   0s ⏱️ ±0s
0 tests ±0  0 ✔️ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit e87d65b. ± Comparison against base commit e87d65b.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants