Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LL::quotient and remove uses of divideRight and sublayoutIsIdentity #4968

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

lezcano
Copy link
Contributor

@lezcano lezcano commented Oct 22, 2024

We add a new abstraction LL::quotient that abstracts the idea of "a
linear layout does not permute certain dimensions". Doing so, allows us
to remove divideRight and subsume them into this higher-level abstraction.
We also fix a bug in isCrossCTAConversion.

We also remove some code duplication from transferWithinThreads and
cvtReorderRegisters in favour of a more generic approach.

We fix a bug in sublayout that meant that sublayout would reorder outDims
at will by using a set instead of a vector.

I am missing adding tests for LL::quotient, will do in a minute.

We add a new abstraction `LL::quotient` that abstracts the idea of "a
linear layout does not permute certain dimensions". Doing so, allows us
to remove `divideRight` and subsume them into this higher-level abstraction.
We also fix a bug in `isCrossCTAConversion`.

We alsoremove some code duplication from `transferWithinThreads` and
`cvtReorderRegisters` in favour of a more generic approach.
@Jokeren Jokeren requested a review from jlebar October 22, 2024 19:08
Copy link
Contributor

@Jokeren Jokeren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I know what bug is there?

We also fix a bug in isCrossCTAConversion.

// Note that that this is stronger than the usual definition of quotienting
// because it requires the linear map to be the identity, not just leave these
// dimensions invariant.
bool canQuotient(ArrayRef<StringAttr> dimNames) const;
Copy link
Contributor

@Jokeren Jokeren Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the function is like a special implementation of divideRight, but the function name itself sounds like a general implementation. It's not general enough though to get a quotient in all circumstances because it requires the linear map to be the identity.

Maybe you can consider changing the function name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentionally less general than divideRight. My point here is that I don't think that we ever need anything as general as divideRight. divideRight tries to find a left-inverse (right-inverse?) to the tensor product. This is way too powerful and reasonably tricky to do in general (doesn't exist for an arbitrary tensor, it has to be surjective and preserve the tensor product structure).

quotient is simpler (easier to check that its implementation is correct in all cases) and more difficult to use wrong, so I argue it's the right abstraction we want here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my point here is that we should probably change the function name as it's not strictly speaking a quotient. IMO it's better to denote it in the function name rather than looking up the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a quotient in the sense of vector spaces. It's only that's not as general as the quotient there, as a linear function descends to a function on the quotient if it maps the space that you are quotienting by into itself. Here we are a bit tighter, as we ask for it to also be the identity in this subspace, but I think it's alright?

Copy link
Contributor Author

@lezcano lezcano Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case you are not familiar with the quotient of vector spaces, just note that that wiki page is the best resource to learn about them as it's a bit too terse. I can find a better explanation somewhere else tomorrow (closing this for today). (if you know about that concept then disregard this message :D)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only that's not as general as the quotient there, as a linear function descends to a function on the quotient if it maps the space that you are quotienting by into itself.

Right, that's what I meant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case you are not familiar with the quotient of vector spaces

Nope, I'm not. I'll take a look

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I miswrote above "it's the best resource" when I meant "it's not the best resource" lol

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a better name for this function would be isTrivialOver, as it checks whether it acts trivially (as the identity map) on a given set of dimensions... I think that quotient is a good name for the other function, where we quotient over a set of variables, even if we give up in some cases where you could actually quotient, so I think it's fine to leave it as a NYI comment. Thoughts?

// checks whether it is the identity map on these dimensions (i.e
// LinearLayouts::canQuotient) and if so, returns the sublayout of the
// remaining dimensions.
std::optional<LinearLayout> quotient(ArrayRef<StringAttr> dimNames) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Because I think it's a special case for divideRight.

// Is the sublayout defined from dimNames to dimNames the identity?
// In particular, is the input and output size in these dimensions
// the same, and are the bases the identity?
bool squareSublayoutIsIdentity(ArrayRef<StringAttr> dimNames) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, are we ignoring the case where the input dimension names are not equal to the output dimension names, but the input and output are identity mappings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The identity in maths is just defined between a space and itself, so yes, this is the correct mathematical concept. Even if you have registers that map 1-to-1 to a vector of outputs, this would not be an identity map strictly speaking, as it's not mapping elements to themselves, but identifying registers with a matrix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, makes sense then

@lezcano
Copy link
Contributor Author

lezcano commented Oct 22, 2024

May I know what bug is there?

Previously, we were checking that:

return !layout.sublayoutIsZero(nonBlockInDims.getArrayRef(), {kBlock}) ||
         !layout.sublayoutIsIdentity({kBlock}, {kBlock});

This means that the kBlock of the output will be defined uninquely and as the identity by the kBlock of the input, which is fine. But we are missing a check that layout.sublayoutIsZero({kBlock}, nonBlockOutDims.getArrayRef()), as otherwise the layout will depend on the kBlock, but a few lines after that we are "removing" the kBlock dimensions, effectively assuming that that sublayout is zero.

At the moment I have drawn this rather rough distribution, but it can be optimised in the future. For example, if you just have those two initial conditions, you can still perform the conversion without using distributed shmem, but you'll need to write an indexing that depends on kBlock, which is a tad trickier, but of course, can be done.

@Jokeren
Copy link
Contributor

Jokeren commented Oct 22, 2024

This means that the kBlock of the output will be defined uninquely and as the identity by the kBlock of the input, which is fine. But we are missing a check that layout.sublayoutIsZero({kBlock}, nonBlockOutDims.getArrayRef()), as otherwise the layout will depend on the kBlock, but a few lines after that we are "removing" the kBlock dimensions, effectively assuming that that sublayout is zero.

Sure, I think it makes sense

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants