Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Property variable name missing from device exception message under linux release with seatbelts #743

Closed
ptheywood opened this issue Dec 6, 2021 · 3 comments · Fixed by #744
Labels

Comments

@ptheywood
Copy link
Member

DeviceExceptions thrown for environemnt properties that do not exist do not always print the message variable in the exception string.

This only appears to affect Release builds under linux with (SEATBELTS=ON).

E.g. the Test DeviceExceptionTest.DeviceEnvironmentGetUnknownProperty should throw an exception with the following error message, which it does under debug builds on linux:

First error:
/home/ptheywood/code/flamegpu/FLAMEGPU2/include/flamegpu/runtime/utility/DeviceEnvironment.cuh(143)[0,0,0][0,0,0]:
Environment property with name: nope was not found.

But for Release builds (with SEATBELTS) it throws:

/home/ptheywood/code/flamegpu/FLAMEGPU2/include/flamegpu/runtime/utility/DeviceEnvironment.cuh(143)[0,0,0][0,0,0]:
Environment property with name:  was not found.

To quickly enable the output in all build modes (but break the test suite) run in test_device_exception.cu can be modified to the following:

    void run(int steps = 2) {
        // CudaModel must be declared here
        // As the initial call to constructor fixes the agent population
        // This means if we haven't called model.newAgent(agent) first
        CUDASimulation cudaSimulation(model);
        cudaSimulation.SimulationConfig().steps = steps;
        // This fails as agentMap is empty
        AgentVector population(agent, 1);
        cudaSimulation.setPopulationData(population);
        try {
            cudaSimulation.simulate();
        } catch (const exception::DeviceError& e) {
            printf("%s\n", e.what());
        } 
    }

Then rebuild the test suite and run

./bin/Release/tests --gtest_filter="DeviceExceptionTest.DeviceEnvironmentGetUnknownProperty"

Ideally we should add a test which checks the error message string is correct, otherwise this could be re-introduces in the future.
It is likely this occurs for other types of device exception too.

This was highlighted by Kenneth.

@ptheywood ptheywood added the bug label Dec 6, 2021
@Robadob
Copy link
Member

Robadob commented Dec 6, 2021

Looked into this.

My best guess would be it stems from this code block.

image

There's already an exception for RTC, where char* is explicit templated.

image

My guess would be that (at one time) RTC was always built as release, and release builds treat string literal as char* rather than char[]. Trying to repro on Windows atm to test and fix.

@Robadob
Copy link
Member

Robadob commented Dec 6, 2021

So it's the same for release x SEATBELTS=ON, on Windows too. Will hopefully fix this by the end of the day in time for next release.

@Robadob
Copy link
Member

Robadob commented Dec 6, 2021

So I've now investigated this and solved it locally, PR coming soon.

In summary, the above functions are marked for compilation by RTC, or when included by FLAMEGPUDeviceException.cu. Otherwise, when the host compiler includes this header, it complains about CUDA intrinsics not existing.

My belief is that, this means when other files wish to compile a DeviceException include, they don't see the explicit template and just use the generic, leading to the bug.

Debug builds must report string literals as char[], hence unaffected by the bug.

The fix is instead to mark both functions as inline and wrap them with #ifdef __CUDACC__ instead.

Robadob added a commit that referenced this issue Dec 6, 2021
Also updates the relevant test suite to check for the bug in several cases.
Closes #743
ptheywood pushed a commit that referenced this issue Dec 6, 2021
Also updates the relevant test suite to check for the bug in several cases.
Closes #743
Robadob added a commit that referenced this issue Dec 7, 2021
Forgot to do these before it was merged.
ptheywood pushed a commit that referenced this issue Dec 7, 2021
Forgot to do these before it was merged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants