Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: Potential leaks when using enum_ #3865

Open
2 of 3 tasks
couletj opened this issue Apr 12, 2022 · 9 comments
Open
2 of 3 tasks

[BUG]: Potential leaks when using enum_ #3865

couletj opened this issue Apr 12, 2022 · 9 comments

Comments

@couletj
Copy link

couletj commented Apr 12, 2022

Required prerequisites

Problem description

Dear team, Asan reports possible memory leaks when wrapping enum from C++ (using py::enum_, precisely when a call to the value method is done.

Wrapping struct (using py::class_) seems to raise no leaks.

Config : python3.8 / gcc8.3, pybind v2.9.2

May be related to #3228

Reproducible example code

#include <pybind11/pybind11.h>

enum Egg {
  Bar,
  Spam,  
};


namespace py = pybind11;

PYBIND11_MODULE(example, m) {
    py::enum_<Egg>(m, "Egg")
      .value("Bar", Egg::Bar)
      .value("Spam", Egg::Spam);
}
@couletj couletj added the triage New bug, unverified label Apr 12, 2022
@Skylion007
Copy link
Collaborator

Likely causes include our abbusive use of attr to add values to the dictionary. The fact that we are storing a C++ pair in the python dictionary, or finally some bug with dynamic_attrs that we haven't caught. It could also be a reference cycle between the attr and the enum, in which case Python won't properly GC when the class is deleted.

@ssooffiiaannee
Copy link

Could you explain how do you know there is a leak?

@couletj
Copy link
Author

couletj commented Apr 15, 2022

Memory leaks or errors seems to be reported by your favorite memory checker (Totalview, valgrind, asan ...) if you build the pybind module with debugging options :

g++ -O0 -g -fsanitize=address -shared -std=c++11 -fPIC -I pybind11/include/ test.pybind.cpp -o example$(python3-config --extension-suffix)

According to the reports, the faulty lines seems to be

  • return unique_function_record(new detail::function_record()); in pybind11.h::cpp_function::make_function_record
  • func->m_ml->ml_doc = signatures.empty() ? nullptr : PYBIND11_COMPAT_STRDUP(signatures.c_str()); in pybind11.h::cpp_function::initialize_generic
  • rec->def = new PyMethodDef(); in pybind11.h::cpp_function::initialize_generic
  • auto *t = PYBIND11_COMPAT_STRDUP(s); in pybind11.h::cpp_function::strdup_guard::operator ()

Context of call varies, for example here one of the stacks (sorry for readability !)

In test.pybind.cpp
      .value("Bar", Egg::Bar)

In pybind11::enum_<Egg>::not-in-charge enum_<NULL> l.2179
      attr("__setstate__") = cpp_function(
          [](detail::value_and_holder &v_h, Scalar arg) {
              detail::initimpl::setstate<Base>(
                  v_h, static_cast<Type>(arg), Py_TYPE(v_h.inst) != v_h.type->type);
          },
          detail::is_new_style_constructor(),
          pybind11::name("__setstate__"),
          is_method(*this),
          arg("state"));

In pybind11::ccp_function (some long template here) l.100 

    cpp_function(Func &&f, const Extra &...extra) {
        initialize(
            std::forward<Func>(f), (detail::function_signature_t<Func> *) nullptr, extra...);
    }

In pybind11::ccp_function::initialize (some long template here) l.177 
    auto *rec = unique_rec.get();

In pybind11::cpp_function::make_function_record l.162
    return unique_function_record(new detail::function_record());

@ssooffiiaannee
Copy link

How did you get that? I did as follows

>>> import example
==9434==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.

@couletj
Copy link
Author

couletj commented Apr 15, 2022

That's great, it means you compiled with gcc sanitizer sucessfully :) (that is the -fsanitize=address option, see https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html). Now you have to provide the asan library at runtime, I do it with

LD_PRELOAD=/opt/tools/gcc-8.3/lib64/libasan.so python test.py

Be careful :

  1. The path to asan lib may will probably be different, it depends on your installation
  2. I think it is not possible to use the interpreter in this configuration to write a simple test.py importing the pybind module
  3. Sanitizer output is huge and hard to read ! In particular you will see lot of warning comming from python library itself, you have to search the ones comming from your library
  4. If you prefer to use valgrind, do not compile with -fsanitize=address

@cjolivier01
Copy link

I also show leaks with asan only for enum (every enum) in make_function_record()

@Skylion007
Copy link
Collaborator

Skylion007 commented Sep 12, 2022

@cjolivier01 Could this be related to the free_data function not being set correctly? A relevant PR seems like #3229 which fixed a similar memory leak. Also could be related to the InitializingFunctionRecordDeleter in pybind11 perhaps.

@feltech
Copy link

feltech commented Nov 25, 2022

We're also seeing these leaks with enums. Also, if you add the py::arithmetic tag, then an additional memory leak is generated.

feltech added a commit to feltech/OpenAssetIO that referenced this issue Nov 25, 2022
Part of OpenAssetIO#739. We were combining the "bridge" tests of the
`openassetio-python-bridge` with the "module" tests of Python bindings
helpers in `openassetio-python-module` and, somewhat arbitrarily,
grouping them under an "internal" heading.

These tests all relied on running under `pytest` and using an
`_openassetio_test` Python extension module to provide C++ helpers. This
makes sense for the "module" tests (`PyRetainingSharedPtr`) but was a
quick-and-dirty solution for the "bridge" tests
(`createPythonPluginSystemManagerImplementationFactory`).

So move the "bridge" tests out into their own directory with a C++
CTest that embeds a Python interpreter. This is much more representative
of how we expect the `openassetio-python-bridge` library to be used.

The leak sanitizer (via ASan) flags some leaks from pybind11 when the
test program exits. This is apparently a known bug
(pybind/pybind11#3865). The leak arises when
defining a `py::enum_` and then calling `.value` on it to add an entry
to the enum.

It's mysterious that we've not seen it in previous sanitizer runs that
exercise the offending part of the code (i.e. `pytest` tests that import
the `_openassetio` bindings). It could have been hidden behind the
`python` executable, whereas for this test we have a custom
`openassetio-python-bridge-test-exe` executable that embeds Python. Or
we could be missing some cleanup in this executable, which `python`
handles properly.

Assuming this is a genuine bug, at least it's only a leak at program
exit, so is benign, and there's nothing we can do about it until
pybind11 fixes it upstream.

So just suppress any leaks from being reported if they have
`pybind11::enum_<` in their stack trace.

Signed-off-by: David Feltell <david.feltell@foundry.com>
feltech added a commit to feltech/OpenAssetIO that referenced this issue Nov 28, 2022
Part of OpenAssetIO#739. We were combining the "bridge" tests of the
`openassetio-python-bridge` with the "module" tests of Python bindings
helpers in `openassetio-python-module` and, somewhat arbitrarily,
grouping them under an "internal" heading.

These tests all relied on running under `pytest` and using an
`_openassetio_test` Python extension module to provide C++ helpers. This
makes sense for the "module" tests (`PyRetainingSharedPtr`) but was a
quick-and-dirty solution for the "bridge" tests
(`createPythonPluginSystemManagerImplementationFactory`).

So move the "bridge" tests out into their own directory with a C++
CTest that embeds a Python interpreter. This is much more representative
of how we expect the `openassetio-python-bridge` library to be used.

The leak sanitizer (via ASan) flags some leaks from pybind11 when the
test program exits. This is apparently a known bug
(pybind/pybind11#3865). The leak arises when
defining a `py::enum_` and then calling `.value` on it to add an entry
to the enum.

It's mysterious that we've not seen it in previous sanitizer runs that
exercise the offending part of the code (i.e. `pytest` tests that import
the `_openassetio` bindings). It could have been hidden behind the
`python` executable, whereas for this test we have a custom
`openassetio-python-bridge-test-exe` executable that embeds Python. Or
we could be missing some cleanup in this executable, which `python`
handles properly.

Assuming this is a genuine bug, at least it's only a leak at program
exit, so is benign, and there's nothing we can do about it until
pybind11 fixes it upstream.

So just suppress any leaks from being reported if they have
`pybind11::enum_<` in their stack trace.

Signed-off-by: David Feltell <david.feltell@foundry.com>
@Skylion007 Skylion007 added help wanted and removed triage New bug, unverified labels Feb 4, 2023
@Skylion007
Copy link
Collaborator

After doing some more investigation, I am fairly sure it's due to use calling .attr() on enum_base which is just a handle of the object store in a separate struct of enum in a weird way. We probably need to refactor to get rid of the weird base struct in a so that the attributes are properly cleared. Something is keeping the attrs we add during construction from being 'dec_ref' properly

JDuffeyBQ added a commit to BlueQuartzSoftware/simplnx that referenced this issue Aug 5, 2024
* Ignore leaks detected in:
  * "/usr/bin" e.g. bash and python
  * "site-packages/" e.g. numpy
  * "pybind11.h"
    * Currently having an issue with leaking enum bindings
    * pybind/pybind11#3865

Signed-off-by: Jared Duffey <jared.duffey@bluequartz.net>
imikejackson pushed a commit to BlueQuartzSoftware/simplnx that referenced this issue Aug 6, 2024
* Ignore leaks detected in:
  * "/usr/bin" e.g. bash and python
  * "site-packages/" e.g. numpy
  * "pybind11.h"
    * Currently having an issue with leaking enum bindings
    * pybind/pybind11#3865

Signed-off-by: Jared Duffey <jared.duffey@bluequartz.net>
JDuffeyBQ added a commit to BlueQuartzSoftware/simplnx that referenced this issue Aug 6, 2024
* Fixed issue with catch2 test discovery for asan CI

* Catch2 runs the test executable in order to discover the tests
but the executable couldn't find the clang asan library in the build
step

* Added leak suppressions file for LeakSanitizer

* Ignore leaks detected in:
  * "/usr/bin" e.g. bash and python
  * "site-packages/" e.g. numpy
  * "pybind11.h"
    * Currently having an issue with leaking enum bindings
    * pybind/pybind11#3865

Signed-off-by: Jared Duffey <jared.duffey@bluequartz.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants