Language feature: barriers #8
The |
Examples from @h3r2tic and others using group shared memory:
groupshared uint stack[GROUP_SIZE * LDS_STACK_SIZE];
bool bvhIntersectAny(Ray ray, uint stackBase : SV_GroupIndex) {
    uint bottom = stackBase * LDS_STACK_SIZE;
    uint stackPtr = bottom;
    stack[stackPtr++] = InvalidAddr;
    do {
        Node node = g_nodes[NonUniformResourceIndex(bvh)][addr];
        if(!isLeaf(node)) {
            // ...
            if(traverseLeft || traverseRight) {
                uint postpone = InvalidAddr;
                // ...
                if(traverseLeft && traverseRight) {
                    stack[stackPtr++] = postpone; // needs manual bounds checking
                }
                continue;
            }
        } else {
            if(isInstance(node)) {
                stack[stackPtr++] = InvalidAddr; // needs manual bounds checking
                continue;
            } else {
                //
            }
        }
        addr = stack[--stackPtr]; // needs manual bounds checking
        if(addr == InvalidAddr && stackPtr > 0 && instanceId != ~0) {
            addr = stack[--stackPtr]; // needs manual bounds checking
        }
    } while(addr != InvalidAddr);
}
|
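For comparison, here is a rough sketch of how the "needs manual bounds checking" spots in the HLSL above could look in plain Rust. It is a CPU-side illustration only, not rust-gpu code: `StackWindow` and `split_windows` are hypothetical names made up for this sketch, and `INVALID_ADDR` simply mirrors `InvalidAddr`. Each invocation gets its own non-overlapping window of the shared array, and push/pop report overflow and underflow instead of silently walking into a neighbour's slots.

```rust
/// Sentinel for "no node", mirroring `InvalidAddr` in the HLSL example.
const INVALID_ADDR: u32 = u32::MAX;

/// Fixed-capacity stack over a borrowed slice (hypothetical helper).
/// The slice stands in for one invocation's window of the shared array.
struct StackWindow<'a> {
    slots: &'a mut [u32],
    len: usize,
}

impl<'a> StackWindow<'a> {
    fn new(slots: &'a mut [u32]) -> Self {
        StackWindow { slots, len: 0 }
    }

    /// Pushes `value`; hands it back in `Err` if the window is full.
    fn push(&mut self, value: u32) -> Result<(), u32> {
        match self.slots.get_mut(self.len) {
            Some(slot) => {
                *slot = value;
                self.len += 1;
                Ok(())
            }
            None => Err(value),
        }
    }

    /// Pops the most recent value, or `None` if the window is empty.
    fn pop(&mut self) -> Option<u32> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        Some(self.slots[self.len])
    }
}

/// Splits the shared array into one window per invocation.
/// `chunks_exact_mut` yields non-overlapping slices (and panics if
/// `stack_size` is zero), so no two windows can alias.
fn split_windows(shared: &mut [u32], stack_size: usize) -> Vec<StackWindow<'_>> {
    shared
        .chunks_exact_mut(stack_size)
        .map(StackWindow::new)
        .collect()
}
```

With a window of size 2, a third `push` returns `Err` rather than overwriting the next invocation's region, which is the case the HLSL comments flag.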
We had the following discussion in the Discord about trying to bring some of Rust's "fearless concurrency" to rust-gpu, by tweaking the way that accesses to shared memory (Buffers/UAVs, Workgroup memory, etc.) are managed when translating to SPIR-V. Just so we don't lose the conversation, I've pasted it below. I can try to tease out more details later, but for now we just want to stow it somewhere safe...
|
One additional detail that got discussed after this little back and forth was that the aim here is to get borrow checking working across GPU threads, so that you can know explicitly that indexing into the buffer is safe. For instance, if you're mutably borrowing from a buffer via a unique index (say, indexing by the workgroup ID per workgroup), then a compile-time check should be able to tell you your access is safe. For debugging, it should also be possible to automatically insert (somewhat expensive) checks that invocations aren't racing. There's probably also a place in there somewhere for something akin to Rust's explicit lifetimes to allow more complex indexing to be compile-time checked. From the SPIR-V side, the aim is to make painless the things that developers typically get frustrated with or make mistakes on: acquire/release/barriers are a pain in the prominent shading languages, and experience shows developers very much tend to get them wrong. The proposal discussed above takes that decision making away from the developer and hard-codes the most useful case as the default behavior, unlike GLSL/HLSL. We may want to provide hooks for less common patterns further down the line, but for now this behaviour should cover the needs of any developer who isn't trying to experiment with synchronization patterns... |
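To make the "unique index" idea a bit more tangible, here is a minimal CPU-side sketch in ordinary Rust. It assumes threads as a stand-in for workgroups and is not how rust-gpu does (or would) implement any of this. `fill_by_workgroup` and `RaceChecker` are hypothetical names. The first function shows the compile-time half: each "workgroup" receives a disjoint mutable chunk, so the borrow checker already proves there is no race. The second shows the kind of (somewhat expensive) debug check that could be inserted when indexing can't be proven disjoint.

```rust
use std::thread;

/// Each "workgroup" (a CPU thread here) gets its own mutable chunk.
/// `chunks_mut` yields non-overlapping slices, so the borrow checker can
/// prove at compile time that no two workgroups write the same element.
fn fill_by_workgroup(buffer: &mut [u32], per_group: usize) {
    thread::scope(|s| {
        for (group_id, chunk) in buffer.chunks_mut(per_group).enumerate() {
            s.spawn(move || {
                for slot in chunk.iter_mut() {
                    *slot = group_id as u32;
                }
            });
        }
    });
}

/// Debug-style check (hypothetical): remembers which invocation first
/// wrote each index and reports a conflict if a different one writes later.
struct RaceChecker {
    owner: Vec<Option<u32>>,
}

impl RaceChecker {
    fn new(len: usize) -> Self {
        RaceChecker { owner: vec![None; len] }
    }

    /// Returns `Err(previous_owner)` if another invocation already wrote here.
    fn record_write(&mut self, index: usize, invocation: u32) -> Result<(), u32> {
        match self.owner[index] {
            Some(other) if other != invocation => Err(other),
            _ => {
                self.owner[index] = Some(invocation);
                Ok(())
            }
        }
    }
}
```

Trying to hand the same chunk to two spawned closures in `fill_by_workgroup` would be rejected by the compiler, which is the property the discussion hopes to carry over to buffers indexed by workgroup ID.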
Putting this on the correct issue this time....

Ok so turning this into something a bit more concrete, I think I'd propose three new attributes that affect struct definitions declared as storage buffers/images/texel buffers (UAVs), shared memory (LDS), or anything else we somehow end up being able to touch from multiple threads:

#[SubgroupVisible]
#[WorkgroupVisible]
#[DeviceVisible]

Each of these attributes modifies what happens during borrow and drop, and memory semantics used for OpLoad/OpStore when accessing references to these structs.

Note: If none of these attributes are present it should be a compiler error to borrow a mutable reference from the struct, (unless the code is declared unsafe?).

Scope
The word before "Visible" in the attribute determines the scope at which writes to the structure are made visible to other shader invocations.

Acquire on Borrow
When a reference is borrowed from the structure, an OpMemoryBarrier should be performed as follows:

Release on Drop
When a reference to the structure is dropped, an OpMemoryBarrier should be performed as follows:

Loads
When a reference to the structure is read from, an OpLoad should be generated with the following Memory Operands:
or similarly for images:

Stores
When a reference to the structure is written to, an OpStore should be generated with the following Memory Operands:
or similarly for images:
|
A couple issues:
|
@khyperia Thanks for looking at this!
My understanding is that it's possible to override borrow/drop in the language itself, rather than in the backend, so I was imagining this would be dealt with at least partially above the compiler, perhaps by emitting some sort of intrinsic that the backend would turn into the right memory barriers. I was experimenting with this on the Rust playground and made some decent progress in getting a struct in regular Rust to output something (println! with the relevant instruction) at borrow/drop, but like a fool I hadn't saved it locally and lost it all when my browser tab crashed 🤦♀️
Ok so again I'm not necessarily expecting this to be a wholly backend issue. My hope here is that we can coax the front-end into tagging the variables (and/or the variable accesses), such that when it gets to the backend all you'd need to do is check if a relevant tag is there or not. Unlike the first issue you pointed out, I don't have a handle on how this might be done with rust's attribute system yet - so this is kind of an open issue that I'm hoping we can have more discussion on. |
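A rough reconstruction of the kind of playground experiment described above might look like the sketch below. This is not the lost code, and nothing here is a rust-gpu API: `Visible`, `Guard`, `Scope`, and `Event` are all hypothetical names. Rust has no hook on `&mut` itself, so the sketch assumes an explicit `borrow_mut` method that returns a guard; the guard records an "acquire" when created and a "release" in its `Drop` impl. A real backend would presumably lower those two points to `OpMemoryBarrier` at the chosen scope; here they are only logged.

```rust
use std::ops::{Deref, DerefMut};

/// Visibility scope, mirroring the proposed attribute names.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Scope {
    Subgroup,
    Workgroup,
    Device,
}

/// What a backend would be asked to emit (logged instead of emitted here).
#[derive(Debug, PartialEq)]
enum Event {
    Acquire(Scope),
    Release(Scope),
}

/// Stand-in for a struct tagged with e.g. `#[WorkgroupVisible]`.
struct Visible<T> {
    value: T,
    scope: Scope,
    log: Vec<Event>,
}

/// Guard returned by `borrow_mut`; its lifetime brackets the access.
struct Guard<'a, T> {
    owner: &'a mut Visible<T>,
}

impl<T> Visible<T> {
    fn new(value: T, scope: Scope) -> Self {
        Visible { value, scope, log: Vec::new() }
    }

    /// "Acquire on borrow": record the barrier, then hand out the guard.
    fn borrow_mut(&mut self) -> Guard<'_, T> {
        self.log.push(Event::Acquire(self.scope));
        Guard { owner: self }
    }
}

impl<'a, T> Deref for Guard<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.owner.value
    }
}

impl<'a, T> DerefMut for Guard<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.owner.value
    }
}

impl<'a, T> Drop for Guard<'a, T> {
    /// "Release on drop": record the barrier when the borrow ends.
    fn drop(&mut self) {
        let scope = self.owner.scope;
        self.owner.log.push(Event::Release(scope));
    }
}
```

Borrowing a `Visible::new(0u32, Scope::Workgroup)`, writing through the guard, and letting it go out of scope leaves the log holding one `Acquire(Workgroup)` followed by one `Release(Workgroup)`. Whether the attribute system can generate this wrapping automatically is the open question raised in the comment.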
So I've been thinking about the implementation complexity, and given the newness of this project and that it's probably useful to bootstrap this in the short term, maybe there should be two phases to this? The initial phase will have dedicated acquire/release and load/store ops, and then the more complex/ergonomic thing here can be a future investigation? Unless anyone thinks that's a terrible idea I'll write this into the draft RFC I'm working on. |
We want to have the same kinds of safety on the GPU as we do on the CPU. One area where it is easy to introduce race conditions is memory barriers. HLSL and GLSL require the programmer to explicitly put them in the right locations without any formal verification.
This is an attempt at solving that within the Rust type system (the attempt still has racy bugs and is largely incomplete).