Create einsum operation #197
Conversation
I think implementing the composed-axis support could work by first rearranging each input:

```python
for (tensor, left_expression) in zip(tensors, left_expressions):
    axis_names = ...
    tensor = rearrange(tensor, left_expression + "->" + " ".join(axis_names))
```

Then, you could do a similar rearrange on the output expression. What do you think @arogozhnikov? (For a future PR, of course.)
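A rough, runnable version of that idea (a hypothetical helper, not part of this PR; composed-axis sizes are passed explicitly since, as noted below, some sizes can't be inferred from the inputs alone):

```python
import numpy as np
from einops import rearrange, einsum

def einsum_with_composition(pattern, *tensors, **axes_lengths):
    # Hypothetical sketch: flatten composed input axes such as
    # "(batch channel) feature" into plain names before delegating to einsum.
    left, right = pattern.split("->")
    left_expressions = [expr.strip() for expr in left.split(",")]

    flat_tensors, flat_lefts = [], []
    for tensor, left_expression in zip(tensors, left_expressions):
        axis_names = left_expression.replace("(", "").replace(")", "").split()
        # only pass sizes that this expression actually mentions
        sizes = {k: v for k, v in axes_lengths.items() if k in axis_names}
        flat_tensors.append(
            rearrange(tensor, left_expression + " -> " + " ".join(axis_names), **sizes)
        )
        flat_lefts.append(" ".join(axis_names))

    # (a composed output side would need a similar rearrange afterwards)
    return einsum(*flat_tensors, ", ".join(flat_lefts) + " -> " + right.strip())

x = np.ones((6, 4))  # a flattened (batch * channel, feature) tensor
f = np.ones((4,))
out = einsum_with_composition(
    "(batch channel) feature, feature -> batch channel", x, f, channel=3
)
assert out.shape == (2, 3)
```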
einsum("index index -> ", np.ones((5, 5)))
Edit: fixed with |
Cool, thanks Miles for a very clean PR and for collecting a test suite.

I've left some thoughts and a request for two kinds of tests:
- tests that fail on parsing of features that aren't yet supported (see the sketch below)
- tests for symbolic backends; the latter would probably only include tf, but still
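A minimal sketch of the first kind of test (hypothetical, assuming composed axes remain unsupported and raise at parse time):

```python
import numpy as np
import pytest
from einops import einsum

def test_unsupported_composition():
    # composed axes like "(batch channel)" are not supported by einsum yet,
    # so the pattern should fail to parse rather than silently mis-compute
    x = np.ones((6, 4))
    with pytest.raises(Exception):  # exact error type/message per the PR's validation
        einsum(x, "(batch channel) feature -> feature")
```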
Okay, all suggestions implemented. Let me know what you think.
Okay, everything is implemented for the new syntax: `y = einsum(x, x, "i j, i k -> j k")`. I also added new validation checks for the argument order, and corresponding unit tests. In the unit tests, I now also check the specific message of each error, rather than just the error type. Let me know what you think.
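A quick numpy check of the tensors-first call (my example, not from the PR's test suite):

```python
import numpy as np
from einops import einsum

x = np.random.randn(5, 4)
y = einsum(x, x, "i j, i k -> j k")  # contracts over the shared axis i

assert y.shape == (4, 4)
assert np.allclose(y, np.einsum("ij,ik->jk", x, x))  # equivalent NumPy spelling
```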
It's trickier, since you want some of the axes to be derived from the input shapes. Not relevant for this PR; just commenting since you asked.
```diff
@@ -560,6 +578,12 @@ def layers(self):
         from .layers import keras
         return keras

+    def einsum(self, pattern, *x):
+        return self.tf.vectorized_map(
```
I want to understand why it looks so strange in tf.keras.
I'm not sure if I was interpreting the symbolic (`layer=True`) backends correctly or not. Basically, this `einsum` assumes the `x` tensors have a leading batch axis, which is assumed not to be specified in the pattern. I assumed this because the `create_symbol` method specifies the shape as a batch shape, rather than an absolute shape. Is that correct, or should it assume the pattern also specifies the batch axis?
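For concreteness, a minimal sketch of that assumption (my reconstruction, not the PR's exact code): map a per-example einsum over the leading batch axis, so the pattern itself never mentions the batch dimension.

```python
import tensorflow as tf

def batched_einsum(pattern, *tensors):
    # tf.vectorized_map strips the leading axis of every input and applies
    # the function per example, so `pattern` describes only per-example axes
    return tf.vectorized_map(lambda xs: tf.einsum(pattern, *xs), tensors)

x = tf.random.normal((32, 5, 4))
y = tf.random.normal((32, 4, 3))
z = batched_einsum("ij,jk->ik", x, y)  # z.shape == (32, 5, 3)
```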
Actually, I think my implementation has a potential issue: if one symbol is batched and another is not (like a weight matrix), then mapping over the leading axis of every input is wrong.
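To illustrate the failure mode, and one possible workaround (my suggestion, not necessarily the PR's approach): close over unbatched tensors rather than mapping over them.

```python
import tensorflow as tf

x = tf.random.normal((32, 5, 4))  # batched activation
w = tf.random.normal((4, 3))      # unbatched weight matrix

# Passing w through `elems` would wrongly treat its first axis (size 4) as
# the batch axis. Closing over it avoids that:
y = tf.vectorized_map(lambda xi: tf.einsum("ij,jk->ik", xi, w), x)
assert y.shape == (32, 5, 3)
```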
What do you think the correct strategy is here? Should I avoid adding `einsum` for keras, since it is technically a `layer=True` backend?
`layer=True` just refers to providing layers; it should not be related to any batch variables, and patterns should include batch variables. Anyway, I forgot that keras is now just a redirection to TF layers, so I just excluded this part.
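For illustration (my example): with the batch axis named explicitly in the pattern, plain `einsum` covers the batched case and no `vectorized_map` is needed.

```python
import numpy as np
from einops import einsum

x = np.random.randn(32, 5, 4)
w = np.random.randn(4, 3)

# the batch axis appears in the pattern like any other axis
y = einsum(x, w, "batch i j, j k -> batch i k")
assert y.shape == (32, 5, 3)
```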
PR is merged; I made very minor changes.
Awesome!! Great to hear.
This is a great PR!
I'm interested in adding rearrange support. Any pointers?
This creates the functional `einsum` operation, as requested in #73. CC @arogozhnikov @cgarciae
The current implementation simply parses the string and converts it to standard einsum notation by mapping axis names to single characters (I use `string.ascii_letters`, starting from `a`, `b`, `c`, etc.). Currently, it supports multiple input tensors with named axes, and an output expression after `->`.

It does not currently support composed axes like `"(batch channel) feature, feature -> batch channel"`. These could be added later if desired; some backends do not support custom reductions in their einsum implementations, so it will be a bit more work.
I also added a docstring and some unit tests (in `tests/test_einsum.py`).

Here are some examples of use, with the numpy backend:
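(Illustrative examples, assuming the final tensors-first syntax from Edit 5:)

```python
import numpy as np
from einops import einsum

x = np.random.randn(8, 4)
w = np.random.randn(4, 3)

# a named-axis matrix multiply
y = einsum(x, w, "batch in_dim, in_dim out_dim -> batch out_dim")
assert y.shape == (8, 3)

# trace-style reduction with a repeated index
t = einsum(np.eye(5), "index index ->")
assert t == 5.0
```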
Note that the number of spaces next to the comma above is arbitrary; you could write either `"in_dim, ..."` or `"in_dim , ..."`, and both will work.

Eager to hear feedback on this!
Cheers,
Miles
Edit 1: Got this working for repeated indices on one side (as used in, e.g., trace).
Edit 2: Added support for chainer, oneflow, cupy, tensorflow.keras.
Edit 3: Added many more tests, some mirroring those used in the np.einsum tests.
Edit 4: More and more unit tests.
Edit 5: Tweaked the syntax to have tensors first, pattern second. Adapted tests, and added new validation for order of arguments.