-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add LL::quotient and remove uses of divideRight and sublayoutIsIdentity #4968
base: main
Are you sure you want to change the base?
Conversation
We add a new abstraction `LL::quotient` that abstracts the idea of "a linear layout does not permute certain dimensions". Doing so, allows us to remove `divideRight` and subsume them into this higher-level abstraction. We also fix a bug in `isCrossCTAConversion`. We alsoremove some code duplication from `transferWithinThreads` and `cvtReorderRegisters` in favour of a more generic approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I know what bug is there?
We also fix a bug in isCrossCTAConversion.
// Note that that this is stronger than the usual definition of quotienting | ||
// because it requires the linear map to be the identity, not just leave these | ||
// dimensions invariant. | ||
bool canQuotient(ArrayRef<StringAttr> dimNames) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the function is like a special implementation of divideRight
, but the function name itself sounds like a general implementation. It's not general enough though to get a quotient in all circumstances because it requires the linear map to be the identity.
Maybe you can consider changing the function name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentionally less general than divideRight
. My point here is that I don't think that we ever need anything as general as divideRight
. divideRight
tries to find a left-inverse (right-inverse?) to the tensor product. This is way too powerful and reasonably tricky to do in general (doesn't exist for an arbitrary tensor, it has to be surjective and preserve the tensor product structure).
quotient
is simpler (easier to check that its implementation is correct in all cases) and more difficult to use wrong, so I argue it's the right abstraction we want here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So my point here is that we should probably change the function name as it's not strictly speaking a quotient. IMO it's better to denote it in the function name rather than looking up the comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a quotient in the sense of vector spaces. It's only that's not as general as the quotient there, as a linear function descends to a function on the quotient if it maps the space that you are quotienting by into itself. Here we are a bit tighter, as we ask for it to also be the identity in this subspace, but I think it's alright?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case you are not familiar with the quotient of vector spaces, just note that that wiki page is the best resource to learn about them as it's a bit too terse. I can find a better explanation somewhere else tomorrow (closing this for today). (if you know about that concept then disregard this message :D)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only that's not as general as the quotient there, as a linear function descends to a function on the quotient if it maps the space that you are quotienting by into itself.
Right, that's what I meant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case you are not familiar with the quotient of vector spaces
Nope, I'm not. I'll take a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I miswrote above "it's the best resource" when I meant "it's not the best resource" lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a better name for this function would be isTrivialOver
, as it checks whether it acts trivially (as the identity map) on a given set of dimensions... I think that quotient
is a good name for the other function, where we quotient over a set of variables, even if we give up in some cases where you could actually quotient, so I think it's fine to leave it as a NYI comment. Thoughts?
// checks whether it is the identity map on these dimensions (i.e | ||
// LinearLayouts::canQuotient) and if so, returns the sublayout of the | ||
// remaining dimensions. | ||
std::optional<LinearLayout> quotient(ArrayRef<StringAttr> dimNames) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Because I think it's a special case for divideRight
.
// Is the sublayout defined from dimNames to dimNames the identity? | ||
// In particular, is the input and output size in these dimensions | ||
// the same, and are the bases the identity? | ||
bool squareSublayoutIsIdentity(ArrayRef<StringAttr> dimNames) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, are we ignoring the case where the input dimension names are not equal to the output dimension names, but the input and output are identity mappings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The identity in maths is just defined between a space and itself, so yes, this is the correct mathematical concept. Even if you have registers that map 1-to-1 to a vector of outputs, this would not be an identity map strictly speaking, as it's not mapping elements to themselves, but identifying registers with a matrix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, makes sense then
Previously, we were checking that:
This means that the At the moment I have drawn this rather rough distribution, but it can be optimised in the future. For example, if you just have those two initial conditions, you can still perform the conversion without using distributed shmem, but you'll need to write an indexing that depends on |
Sure, I think it makes sense |
We add a new abstraction
LL::quotient
that abstracts the idea of "alinear layout does not permute certain dimensions". Doing so, allows us
to remove
divideRight
and subsume them into this higher-level abstraction.We also fix a bug in
isCrossCTAConversion
.We also remove some code duplication from
transferWithinThreads
andcvtReorderRegisters
in favour of a more generic approach.We fix a bug in
sublayout
that meant thatsublayout
would reorderoutDims
at will by using a set instead of a vector.
I am missing adding tests for LL::quotient, will do in a minute.