"SWIG director method error" from HostFn exception inside Ensemble. #1067

Closed
Robadob opened this issue May 31, 2023 · 3 comments · Fixed by #1068

Member

Robadob commented May 31, 2023

If an exception is thrown from a Python host function during a regular CUDASimulation's execution, it is propagated back to Python, and we receive the full error message and stack trace.

If the same happens during a CUDAEnsemble, the exception is caught and handled in C/C++, so we only get the following detail:

$ python boids_spatial3D.py
CUDAEnsemble completed 0 runs successfully!
There were a total of 4 errors.
Traceback (most recent call last):
  File "/home/rob/fgpu2/examples/python_rtc/boids_spatial3D_bounded/boids_spatial3D.py", line 426, in <module>
    cudaSimulation.simulate(rp);
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rob/fgpu2/build_py311/lib/Release/python/venv/lib/python3.11/site-packages/pyflamegpu/pyflamegpu.py", line 9147, in simulate
    return _pyflamegpu.CUDAEnsemble_simulate(self, plan)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyflamegpu.pyflamegpu.FLAMEGPURuntimeException: (EnsembleError) /home/rob/fgpu2/src/flamegpu/simulation/CUDAEnsemble.cu(264): Run 2 failed on device 0, thread 2 with exception:
SWIG director method error.

According to the SWIG documentation, this is currently a limitation:

This code will check the Python error state after each method call from a director into Python, and throw a C++ exception if an error occurred. This exception can be caught in C++ to implement an error handler. Currently no information about the Python error is stored in the Swig::DirectorMethodException object, but this will likely change in the future.

From digging around in the generated Python wrapper code, I found that the error handler which generates the C++ exception is passed a PyObject*, which presumably represents the exception.

void SwigDirector_HostFunction::run(flamegpu::HostAPI *arg0) {
  SWIG_PYTHON_THREAD_BEGIN_BLOCK;
  {
    swig::SwigVar_PyObject obj0;
    obj0 = SWIG_NewPointerObj(SWIG_as_voidptr(arg0), SWIGTYPE_p_flamegpu__HostAPI,  0 );
    if (!swig_get_self()) {
      Swig::DirectorException::raise("'self' uninitialized, maybe you forgot to call HostFunction.__init__.");
    }
#if defined(SWIG_PYTHON_DIRECTOR_VTABLE)
    const size_t swig_method_index = 0;
    const char *const swig_method_name = "run";
    PyObject *method = swig_get_method(swig_method_index, swig_method_name);
    swig::SwigVar_PyObject result = PyObject_CallFunctionObjArgs(method ,(PyObject *)obj0, NULL);
#else
    swig::SwigVar_PyObject swig_method_name = SWIG_Python_str_FromChar("run");
    swig::SwigVar_PyObject result = PyObject_CallMethodObjArgs(swig_get_self(), (PyObject *) swig_method_name ,(PyObject *)obj0, NULL);
#endif
    if (!result) {
      PyObject *error = PyErr_Occurred();
      {
        if (error != NULL) {
          throw Swig::DirectorMethodException();
        }
      }
    }
  }
  SWIG_PYTHON_THREAD_END_BLOCK;
}

I thought that by using PyErr_Fetch(), PyErr_GetExcInfo(), or PyException_GetTraceback()/PyException_GetContext()/PyException_GetCause() I might be able to get the detail (https://docs.python.org/3/c-api/exceptions.html), but no luck. PyErr_Fetch() returns all null strings, and the rest crash the program with no output (possibly an access violation, if they're returning null pointers).

Needs more investigation.

@Robadob Robadob added the SWIG label May 31, 2023
Member Author

Robadob commented Jun 1, 2023

Update:

It's trivially possible to catch runtime errors inside the host function and rethrow them:

import sys
import traceback

try:
    ...
except Exception as err:
    # Print the full traceback before the exception reaches C++
    traceback.print_exception(*sys.exc_info())
    raise
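
To avoid repeating that try/except in every host function, the same pattern could be wrapped in a decorator. This is only a sketch: `report_and_reraise` and `ExampleHostFn` are hypothetical names, and `ExampleHostFn` is a plain class standing in for a real pyflamegpu host function subclass.

```python
import functools
import sys
import traceback


def report_and_reraise(fn):
    """Print the full traceback of any exception raised by fn, then re-raise it."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            # Writes to sys.stderr by default, then lets the exception continue.
            traceback.print_exception(*sys.exc_info())
            raise
    return wrapper


class ExampleHostFn:  # hypothetical stand-in for a host function class
    @report_and_reraise
    def run(self, host_api):
        raise ValueError("something went wrong in the host function")
```

The traceback is printed from Python before the exception crosses back into C++, so it stays visible even if the C++ side later discards the detail.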

However, this is not just awkward; it also does not resolve errors caused by syntax issues within the host function. In this case the code below was at fault (I'm not familiar with when/where Python chooses to 'compile' syntax, so it's unclear whether this is specific to format strings being evaluated late or applies to the whole function).

# Note variable 'bar' is not defined/in-scope.
foo = f"bar:{bar:0.3f}"
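
A small check, independent of pyflamegpu, suggests how standard CPython treats this case: an undefined name inside an f-string compiles fine and only fails with a NameError when the line executes, whereas a genuinely malformed f-string is rejected at compile time. The function name `host_fn_body` is made up for illustration.

```python
def host_fn_body():
    # 'bar' is not defined anywhere, mirroring the snippet above.
    return f"bar:{bar:0.3f}"


# Defining the function succeeded; the failure only appears when it is called.
try:
    host_fn_body()
except NameError as err:
    error_type = type(err).__name__
    error_message = str(err)

# By contrast, a malformed f-string (missing closing brace) fails at compile time.
try:
    compile("foo = f'bar:{bar:0.3f'", "<example>", "exec")
    compiled_ok = True
except SyntaxError:
    compiled_ok = False

print(error_type, "|", error_message, "|", compiled_ok)
```

This only shows when the error is raised in plain Python; it says nothing about how the ensemble's C++ side reports it.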

Member Author

Robadob commented Jun 1, 2023

After a few more iterations of the 10-minute pyflamegpu build process, I've managed to get the error message to print.

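%feature("director:except") {
    if ($error != NULL) {
        PyObject *type, *value, *traceback;
        PyErr_Fetch(&type, &value, &traceback);
        // Message (PyObject_Str returns a new reference, or NULL on failure)
        PyObject *value_str = PyObject_Str(value);
        const char *pStrErrorMessage = value_str ? PyUnicode_AsUTF8(value_str) : NULL;
        printf("Exception Message: %s\n", pStrErrorMessage ? pStrErrorMessage : "<unknown>");
        // @todo type to string
        // @todo traceback to string
        // Cleanup: PyErr_Fetch hands over ownership, and any of the three may be NULL
        Py_XDECREF(value_str);
        Py_XDECREF(type);
        Py_XDECREF(value);
        Py_XDECREF(traceback);
        // Propagate exception
        // @todo store the data in here?
        throw Swig::DirectorMethodException();
    }
}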

Member Author

Robadob commented Jun 1, 2023

I've now got it as far as printing the message, exception type, and host function class name.

// swig director exceptions (handle python callback exceptions as C++ exceptions not Runtime errors)
%feature("director:except") {
    if ($error != NULL) {
        PyObject *type, *value, *traceback;
        PyErr_Fetch(&type, &value, &traceback);
        PyErr_NormalizeException(&type, &value, &traceback);
        PyObject *hostfn = swig_get_self();
        // Message
        PyObject *value_str = PyObject_Str(value);
        const char *pStrErrorMessage = PyUnicode_AsUTF8(value_str);
        printf("Exception Message: %s\n", pStrErrorMessage);
        // Type
        PyObject* type_obj_name = PyObject_GetAttrString(type, "__name__");
        PyObject *type_str = PyObject_Str(type_obj_name);
        const char *pTypeStr = PyUnicode_AsUTF8(type_str);
        printf("Type: %s\n", pTypeStr);
        // Director Obj Type        
        PyObject* hostfn_type = PyObject_Type(hostfn);
        PyObject* hostfn_type_name = PyObject_GetAttrString(hostfn_type, "__name__");
        PyObject *hostfn_str = PyObject_Str(hostfn_type_name);
        const char *pHostFnTypeStr = PyUnicode_AsUTF8(hostfn_str);
        printf("HostFnType: %s\n", pHostFnTypeStr);
        // @todo traceback to string (PyObject_Str only gives the traceback object's repr, not the formatted frames)
        PyObject *trace_str = PyObject_Str(traceback);
        const char *pTraceStr = PyUnicode_AsUTF8(trace_str);
        printf("Trace: %s\n", pTraceStr);
        // Cleanup (Py_XDECREF, since traceback and the lookups above may be NULL)
        Py_XDECREF(trace_str);
        Py_XDECREF(type_obj_name);
        Py_XDECREF(hostfn_type_name);
        Py_XDECREF(hostfn_type);
        Py_XDECREF(hostfn_str);
        Py_XDECREF(type_str);
        Py_XDECREF(value_str);
        Py_XDECREF(type);
        Py_XDECREF(value);
        Py_XDECREF(traceback);
        // Propagate exception
        // @todo store the data in here?
        throw Swig::DirectorMethodException();
    }
}

I've now tested this traceback implementation, which works after including Python's frameobject.h. May have a PR soon.
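
For reference, the information the handler is collecting (exception type, message, host function class name, and one entry per traceback frame) can be sketched in plain Python. This is an analogue written for illustration, not the code from the eventual fix: `describe_exception` and `ExampleHostFn` are hypothetical names, and walking `tb_next`/`tb_frame` is an assumption about what a frame-based traceback implementation would do.

```python
import sys


def describe_exception(host_fn):
    """Summarise the exception currently being handled (hypothetical helper)."""
    exc_type, exc_value, tb = sys.exc_info()
    frames = []
    # Walk the traceback from the outermost frame to the one that raised.
    while tb is not None:
        code = tb.tb_frame.f_code
        frames.append(f"{code.co_filename}({tb.tb_lineno}): {code.co_name}")
        tb = tb.tb_next
    return {
        "type": exc_type.__name__,
        "message": str(exc_value),
        "host_fn_type": type(host_fn).__name__,
        "frames": frames,
    }


class ExampleHostFn:  # hypothetical stand-in for a host function class
    def run(self, host_api):
        return 1 / 0


fn = ExampleHostFn()
try:
    fn.run(None)
except Exception:
    summary = describe_exception(fn)

print(summary["type"], summary["message"], summary["host_fn_type"])
for line in summary["frames"]:
    print("  ", line)
```

A summary like this could be stored as a string in the C++ exception so that the ensemble's error report carries more than "SWIG director method error."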

Robadob added a commit that referenced this issue Jun 2, 2023
mondus pushed a commit that referenced this issue Jul 7, 2023
mondus pushed a commit that referenced this issue Jul 7, 2023
* Python: Retain hostfn exception msg when thrown during ensemble.

Tested on Windows with Python 3.9

Closes #1067

* Python 3.11+ fix