Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update mono to support Unsafe.BitCast #103915

Closed
wants to merge 1 commit into from

Conversation

tannergooding
Copy link
Member

No description provided.

src/mono/mono/mini/intrinsics.c Outdated Show resolved Hide resolved
src/mono/mono/mini/intrinsics.c Outdated Show resolved Hide resolved
@tannergooding
Copy link
Member Author

I was wanting to update Mono interpreter as well, which looks to be generally handled in mono/mini/interp/transform.c, but there doesn't appear to be existing handling for many other functions (including Subtract, ReadUnaligned, etc) and so it wasn't immediately obvious to me what handling should exist there.

@lambdageek
Copy link
Member

I was wanting to update Mono interpreter as well, which looks to be generally handled in mono/mini/interp/transform.c, but there doesn't appear to be existing handling for many other functions (including Subtract, ReadUnaligned, etc) and so it wasn't immediately obvious to me what handling should exist there.

In general with interp intrinsics the goals are:

  1. eliminate long chains of calls if they impact startup performance (the tier0 interpreted version doesn't do inlining so all managed-to-managed calls are expensive. once tiering kicks in generally if everything inlines, those will all go away)
  2. eliminate long IL sequences that can be done entirely in C. (so for example "Marvin Block")

So for something like ReadUnaligned which is a leaf function that will inline to a single load opcode (and some inlining-related moves) it's not really worth it to turn them into intrinsics.

On the other hand BitCast (if it's hot in startup profiling) which expands to a bunch of calls but could be replaced by a single move or load is probably more worthwhile.

@tannergooding
Copy link
Member Author

it's not really worth it to turn them into intrinsics.

I would expect the interpreter to see the same general benefits as RyuJIT here as well, which is that it improves the overall throughput and code quality as there is no need to execute the inliner for known hot methods that optimize down to functionally single opcodes.

That is, most of the APIs on S.R.CS.Unsafe and other key helper types (like BitConverter, RuntimeHelpers, the vector types, etc) are often "trivial", but they are used extensively to provide performance in the lowest foundations of the ecosystem. The cost of doing the inlining, the chance that inlining can introduce "boundaries" in the general IR that prevent optimizations, that they can block early optimizations around dead code elimination or constant folding, and similar considerations means that it is almost always worthwhile to ensure they are explicitly handled by the underlying runtime.

I don't think it's critical to handle them all immediately, per say, but I think that long term we should push to ensuring that Mono and RyuJIT try to consistently handle APIs intrinsically to ensure that we don't end up with cases where using an API improves RyuJIT but regresses Mono.

@tannergooding
Copy link
Member Author

tannergooding commented Jun 25, 2024

@lambdageek, could you help diagnose the crash here? I think it's caused by the EMIT_NEW_VARLOADA, but it isn't clear what the "right" way to handle that otherwise would be.

I had initially used NEW_VARLOADA because that's what NEW_ARGLOADA defers to (but it expected arg_array and param_types to exist as local types, so I just passed the info to VARLOADA directly).

Some other paths appear to use EMIT_NEW_VARLOADA_VREG, but that looks like it might be intended for locals instead and its not clear what scenario it should be used under otherwise if it is the right thing to use (is it only for non-primitive types, for example, or can it always be called).

Comment on lines 590 to 593
} else if (tto_type == MONO_TYPE_I4) {
opcode = OP_MOVE;
tto_stack = STACK_I4;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a failure around this path for small types (byte->byte, or ushort->short, etc).

Is there some Mono specific handling that's needed here @lambdageek?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe do I need to explicitly check for MONO_TYPE_I1/I2/U1/U2/U4/U8?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, I'm sorry. Though mini_type_to_stack_type seems to suggest that your latest idea is on the right track. From the failures it looks like maybe you need to set ins->klass too - it looks like there's some kind of sign extension happening when zero extension is expected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, maybe that's it. compare to how we translate dup:

		case MONO_CEE_DUP: {
			MonoInst *temp, *store;
			sp--;
			ins = *sp;
			klass = ins->klass;

			temp = mono_compile_create_var (cfg, type_from_stack_type (ins), OP_LOCAL);
			EMIT_NEW_TEMPSTORE (cfg, store, temp->inst_c0, ins);

			EMIT_NEW_TEMPLOAD (cfg, ins, temp->inst_c0);
			ins->klass = klass;
			*sp++ = ins;

			EMIT_NEW_TEMPLOAD (cfg, ins, temp->inst_c0);
			ins->klass = klass;
			*sp++ = ins;

			inline_costs += 2;
			break;
		}

@tannergooding tannergooding force-pushed the mono-bitcast branch 2 times, most recently from 558db9f to 880a6eb Compare June 27, 2024 19:50
@kasperk81
Copy link
Contributor

contributes to #101495 @fanyang-mono

@tannergooding
Copy link
Member Author

As much as I would like to finish this and get it in, even the simple primitive to primitive bitcasts that are mapping to existing Mono support are not working/seemingly causing CI crashes and I don't have the time/availability to continue debugging it.

Anyone should feel free to pick this up if they have more context or time.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants