-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add type_caster<PyObject>
#4601
Changes from 15 commits
ef13f04
c7c52b0
f220f0c
467b237
79f2b05
45f915b
00cc8b4
432de63
db47541
b393001
0ba8003
d976e66
af1886a
3391bbd
f6c7cee
9907943
5ccb893
39cb0d8
cf4c282
6afa757
2e5a699
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -1041,7 +1041,11 @@ make_caster<T> load_type(const handle &handle) { | |||||||||||||||
PYBIND11_NAMESPACE_END(detail) | ||||||||||||||||
|
||||||||||||||||
// pytype -> C++ type | ||||||||||||||||
template <typename T, detail::enable_if_t<!detail::is_pyobject<T>::value, int> = 0> | ||||||||||||||||
template <typename T, | ||||||||||||||||
detail::enable_if_t<!detail::is_pyobject<T>::value | ||||||||||||||||
&& !detail::is_same_ignoring_cvref<T, PyObject *>::value, | ||||||||||||||||
int> | ||||||||||||||||
= 0> | ||||||||||||||||
T cast(const handle &handle) { | ||||||||||||||||
using namespace detail; | ||||||||||||||||
static_assert(!cast_is_temporary_value_reference<T>::value, | ||||||||||||||||
|
@@ -1055,6 +1059,16 @@ T cast(const handle &handle) { | |||||||||||||||
return T(reinterpret_borrow<object>(handle)); | ||||||||||||||||
} | ||||||||||||||||
|
||||||||||||||||
// Note that `cast<PyObject *>(obj)` increments the reference count of `obj`. | ||||||||||||||||
// This is necessary for the case that `obj` is a temporary. | ||||||||||||||||
// It is the responsibility of the caller to ensure that the reference count | ||||||||||||||||
// is decremented. | ||||||||||||||||
template <typename T, | ||||||||||||||||
detail::enable_if_t<detail::is_same_ignoring_cvref<T, PyObject *>::value, int> = 0> | ||||||||||||||||
T cast(const handle &handle) { | ||||||||||||||||
return handle.inc_ref().ptr(); | ||||||||||||||||
} | ||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the any case where you want to steal the ref? Ie. shouldn't you also add this? I do not think it will be used currently by your code but if that cast path is used in the future, better to have it.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Not sure. I think the best way to answer this: can we come up with a meaningful test case? My gut feeling is that stealing a Python reference like this is much more likely to backfire than be helpful. When dropping down to the bare-metal level of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wouldn't the current test cases cover it? Semantically, it's equivalent to what is written above, the difference is that without this the Easy example/test case: auto *pyobject_ptr = py::cast<PyObject*>(std::move(obj)); If we include this explicit cast for callbacks, some end will be silly enough to call it so we should make the semantics as useful/efficient as possible. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, a slight modification to the current one of the current test cases would definitely cover it (see below). |
||||||||||||||||
|
||||||||||||||||
// C++ type -> py::object | ||||||||||||||||
template <typename T, detail::enable_if_t<!detail::is_pyobject<T>::value, int> = 0> | ||||||||||||||||
object cast(T &&value, | ||||||||||||||||
|
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,61 @@ | ||||||||||||
// Copyright (c) 2023 The pybind Community. | ||||||||||||
|
||||||||||||
#pragma once | ||||||||||||
|
||||||||||||
#include "detail/common.h" | ||||||||||||
#include "detail/descr.h" | ||||||||||||
#include "cast.h" | ||||||||||||
#include "pytypes.h" | ||||||||||||
|
||||||||||||
PYBIND11_NAMESPACE_BEGIN(PYBIND11_NAMESPACE) | ||||||||||||
PYBIND11_NAMESPACE_BEGIN(detail) | ||||||||||||
|
||||||||||||
template <> | ||||||||||||
class type_caster<PyObject> { | ||||||||||||
public: | ||||||||||||
static constexpr auto name = const_name("PyObject *"); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Won't this const name break mypy/ type parsing? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As unsatisfying as it, I think the name should be an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It will likely break any position args etc that come after it, right? If you really want to differentiate it (and create invalid typing) just have it be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Wow ... that's a tough one to decide. From a high-level perspective, using But it hides an important clue: beware, this is dealing with raw pointers. But then again, to whom is that clue useful? The author of the code with the raw pointers already needs to be super careful when writing the code, i.e. doesn't need any extra clues. For everybody else is probably a not very enlightening surprise. I'm leaning towards thinking using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Testing 5ccb893 now. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually one typing detail. Does this caster accept Py_None as meaning nullptr? That would be the one caveat and if did, it may better to list the type as There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean the That's essentially just doing an
|
||||||||||||
|
||||||||||||
// This overload is purely to guard against accidents. | ||||||||||||
template <typename T, | ||||||||||||
detail::enable_if_t<!is_same_ignoring_cvref<T, PyObject *>::value, int> = 0> | ||||||||||||
static handle cast(T &&, return_value_policy, handle /*parent*/) { | ||||||||||||
static_assert(is_same_ignoring_cvref<T, PyObject *>::value, | ||||||||||||
"Invalid C++ type T for to-Python conversion (type_caster<PyObject>)."); | ||||||||||||
return nullptr; // Unreachable. | ||||||||||||
} | ||||||||||||
|
||||||||||||
static handle cast(PyObject *src, return_value_policy policy, handle /*parent*/) { | ||||||||||||
if (src == nullptr) { | ||||||||||||
throw error_already_set(); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is an error guaranteed here? Python may raise a system exception, but I think that might be triggered before even entering the caster. Worried about the case where a nullptr is fed in somehow but no error is set. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Good question: No, but pybind11/include/pybind11/pytypes.h Lines 483 to 487 in 6de6191
(This was one of the things I made as clean/clear as possible while working on #1895.) |
||||||||||||
} | ||||||||||||
if (PyErr_Occurred()) { | ||||||||||||
raise_from(PyExc_SystemError, "src != nullptr but PyErr_Occurred()"); | ||||||||||||
throw error_already_set(); | ||||||||||||
} | ||||||||||||
if (policy == return_value_policy::take_ownership) { | ||||||||||||
return src; | ||||||||||||
} | ||||||||||||
if (policy == return_value_policy::reference | ||||||||||||
|| policy == return_value_policy::automatic_reference) { | ||||||||||||
return handle(src).inc_ref(); | ||||||||||||
} | ||||||||||||
pybind11_fail("type_caster<PyObject>::cast(): unsupported return_value_policy: " | ||||||||||||
+ std::to_string(static_cast<int>(policy))); | ||||||||||||
} | ||||||||||||
|
||||||||||||
bool load(handle src, bool) { | ||||||||||||
value = reinterpret_borrow<object>(src); | ||||||||||||
return true; | ||||||||||||
} | ||||||||||||
|
||||||||||||
template <typename T> | ||||||||||||
using cast_op_type = PyObject *; | ||||||||||||
|
||||||||||||
explicit operator PyObject *() { return value.ptr(); } | ||||||||||||
|
||||||||||||
private: | ||||||||||||
object value; | ||||||||||||
}; | ||||||||||||
|
||||||||||||
PYBIND11_NAMESPACE_END(detail) | ||||||||||||
PYBIND11_NAMESPACE_END(PYBIND11_NAMESPACE) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
#include <pybind11/functional.h> | ||
#include <pybind11/stl.h> | ||
#include <pybind11/type_caster_pyobject_ptr.h> | ||
|
||
#include "pybind11_tests.h" | ||
|
||
#include <cstddef> | ||
#include <vector> | ||
|
||
namespace { | ||
|
||
std::vector<PyObject *> make_vector_pyobject_ptr(const py::object &ValueHolder) { | ||
std::vector<PyObject *> vec_obj; | ||
for (int i = 1; i < 3; i++) { | ||
vec_obj.push_back(ValueHolder(i * 93).release().ptr()); | ||
} | ||
// This vector now owns the refcounts. | ||
return vec_obj; | ||
} | ||
|
||
} // namespace | ||
|
||
TEST_SUBMODULE(type_caster_pyobject_ptr, m) { | ||
m.def("cast_from_pyobject_ptr", []() { | ||
PyObject *ptr = PyLong_FromLongLong(6758L); | ||
return py::cast(ptr, py::return_value_policy::take_ownership); | ||
}); | ||
m.def("cast_to_pyobject_ptr", [](py::handle obj) { | ||
auto rc1 = obj.ref_count(); | ||
auto *ptr = py::cast<PyObject *>(obj); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This can test my custom caster by just making the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FYI I just pushed a commit that expands the source code comment, before seeing your latest comment here. I need to look/think more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The more I think it, the more I think it's a good feature / caster to have and I do not see any reason not to include the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I didn't find it easy to unlock that optimization: 2e5a699 Do you have ideas for achieving the same outcome with less code? The added code complexity is a bit at odds with the really minor runtime optimization, in any real-world application. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rwgk I am just bringing the complexity / runtime optimization inline with the other caster methods. If adding a single overload can signficantly improve performance and remove unnecessary refcount calls, I see no reason why not to include it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, thanks for the review! |
||
auto rc2 = obj.ref_count(); | ||
if (rc2 != rc1 + 1) { | ||
return -1; | ||
} | ||
return 100 - py::reinterpret_steal<py::object>(ptr).attr("value").cast<int>(); | ||
}); | ||
|
||
m.def( | ||
"return_pyobject_ptr", | ||
[]() { return PyLong_FromLongLong(2314L); }, | ||
py::return_value_policy::take_ownership); | ||
m.def("pass_pyobject_ptr", [](PyObject *ptr) { | ||
return 200 - py::reinterpret_borrow<py::object>(ptr).attr("value").cast<int>(); | ||
}); | ||
|
||
m.def("call_callback_with_object_return", | ||
[](const std::function<py::object(int)> &cb, int value) { return cb(value); }); | ||
m.def( | ||
"call_callback_with_pyobject_ptr_return", | ||
[](const std::function<PyObject *(int)> &cb, int value) { return cb(value); }, | ||
py::return_value_policy::take_ownership); | ||
m.def( | ||
"call_callback_with_pyobject_ptr_arg", | ||
[](const std::function<int(PyObject *)> &cb, py::handle obj) { return cb(obj.ptr()); }, | ||
py::arg("cb"), // This triggers return_value_policy::automatic_reference | ||
py::arg("obj")); | ||
|
||
m.def("cast_to_pyobject_ptr_nullptr", [](bool set_error) { | ||
if (set_error) { | ||
PyErr_SetString(PyExc_RuntimeError, "Reflective of healthy error handling."); | ||
} | ||
PyObject *ptr = nullptr; | ||
py::cast(ptr); | ||
}); | ||
|
||
m.def("cast_to_pyobject_ptr_non_nullptr_with_error_set", []() { | ||
PyErr_SetString(PyExc_RuntimeError, "Reflective of unhealthy error handling."); | ||
py::cast(Py_None); | ||
}); | ||
|
||
m.def("pass_list_pyobject_ptr", [](const std::vector<PyObject *> &vec_obj) { | ||
int acc = 0; | ||
for (const auto &ptr : vec_obj) { | ||
acc = acc * 1000 + py::reinterpret_borrow<py::object>(ptr).attr("value").cast<int>(); | ||
} | ||
return acc; | ||
}); | ||
|
||
m.def("return_list_pyobject_ptr_take_ownership", | ||
make_vector_pyobject_ptr, | ||
// Ownership is transferred one-by-one when the vector is converted to a Python list. | ||
py::return_value_policy::take_ownership); | ||
|
||
m.def("return_list_pyobject_ptr_reference", | ||
make_vector_pyobject_ptr, | ||
// Ownership is not transferred. | ||
py::return_value_policy::reference); | ||
|
||
m.def("dec_ref_each_pyobject_ptr", [](const std::vector<PyObject *> &vec_obj) { | ||
std::size_t i = 0; | ||
for (; i < vec_obj.size(); i++) { | ||
py::handle h(vec_obj[i]); | ||
if (static_cast<std::size_t>(h.ref_count()) < 2) { | ||
break; // Something is badly wrong. | ||
} | ||
h.dec_ref(); | ||
} | ||
return i; | ||
}); | ||
|
||
#ifdef PYBIND11_NO_COMPILE_SECTION // Change to ifndef for manual testing. | ||
{ | ||
PyObject *ptr = nullptr; | ||
(void) py::cast(*ptr); | ||
} | ||
#endif | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
import pytest | ||
|
||
from pybind11_tests import type_caster_pyobject_ptr as m | ||
|
||
|
||
# For use as a temporary user-defined object, to maximize sensitivity of the tests below. | ||
class ValueHolder: | ||
def __init__(self, value): | ||
self.value = value | ||
|
||
|
||
def test_cast_from_pyobject_ptr(): | ||
assert m.cast_from_pyobject_ptr() == 6758 | ||
|
||
|
||
def test_cast_to_pyobject_ptr(): | ||
assert m.cast_to_pyobject_ptr(ValueHolder(24)) == 76 | ||
|
||
|
||
def test_return_pyobject_ptr(): | ||
assert m.return_pyobject_ptr() == 2314 | ||
|
||
|
||
def test_pass_pyobject_ptr(): | ||
assert m.pass_pyobject_ptr(ValueHolder(82)) == 118 | ||
|
||
|
||
@pytest.mark.parametrize( | ||
"call_callback", | ||
[ | ||
m.call_callback_with_object_return, | ||
m.call_callback_with_pyobject_ptr_return, | ||
], | ||
) | ||
def test_call_callback_with_object_return(call_callback): | ||
def cb(value): | ||
if value < 0: | ||
raise ValueError("Raised from cb") | ||
return ValueHolder(1000 - value) | ||
|
||
assert call_callback(cb, 287).value == 713 | ||
|
||
with pytest.raises(ValueError, match="^Raised from cb$"): | ||
call_callback(cb, -1) | ||
|
||
|
||
def test_call_callback_with_pyobject_ptr_arg(): | ||
def cb(obj): | ||
return 300 - obj.value | ||
|
||
assert m.call_callback_with_pyobject_ptr_arg(cb, ValueHolder(39)) == 261 | ||
|
||
|
||
@pytest.mark.parametrize("set_error", [True, False]) | ||
def test_cast_to_python_nullptr(set_error): | ||
expected = { | ||
True: r"^Reflective of healthy error handling\.$", | ||
False: ( | ||
r"^Internal error: pybind11::error_already_set called " | ||
r"while Python error indicator not set\.$" | ||
), | ||
}[set_error] | ||
with pytest.raises(RuntimeError, match=expected): | ||
m.cast_to_pyobject_ptr_nullptr(set_error) | ||
|
||
|
||
def test_cast_to_python_non_nullptr_with_error_set(): | ||
with pytest.raises(SystemError) as excinfo: | ||
m.cast_to_pyobject_ptr_non_nullptr_with_error_set() | ||
assert str(excinfo.value) == "src != nullptr but PyErr_Occurred()" | ||
assert str(excinfo.value.__cause__) == "Reflective of unhealthy error handling." | ||
|
||
|
||
def test_pass_list_pyobject_ptr(): | ||
acc = m.pass_list_pyobject_ptr([ValueHolder(842), ValueHolder(452)]) | ||
assert acc == 842452 | ||
|
||
|
||
def test_return_list_pyobject_ptr_take_ownership(): | ||
vec_obj = m.return_list_pyobject_ptr_take_ownership(ValueHolder) | ||
assert [e.value for e in vec_obj] == [93, 186] | ||
|
||
|
||
def test_return_list_pyobject_ptr_reference(): | ||
vec_obj = m.return_list_pyobject_ptr_reference(ValueHolder) | ||
assert [e.value for e in vec_obj] == [93, 186] | ||
# Commenting out the next `assert` will leak the Python references. | ||
# An easy way to see evidence of the leaks: | ||
# Insert `while True:` as the first line of this function and monitor the | ||
# process RES (Resident Memory Size) with the Unix top command. | ||
assert m.dec_ref_each_pyobject_ptr(vec_obj) == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually necessary? Seems a bit unintuitive here. When are situations when obj would be a temporary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a callback (functional.h):
pybind11/include/pybind11/functional.h
Line 109 in 7ab88d2
That is using
object::cast<PyObject *>()
which winds up inpy::cast<PyObject *>(handle)
:pybind11/include/pybind11/cast.h
Lines 1136 to 1139 in 7ab88d2
This actually took me a while to discover and debug.
I don't see an easy way to make the code in functional.h behave differently.
But I'm also thinking:
with the temporary passed as
handle
is a really bad trap in general (especially if thecast
is actually written ascast<T>
buried in the guts of some library).By adopting the
inc_ref()
convention we don't have to do anything special for callbacks and we avoid the trap.Users have to learn once that the refcount is incremented, it always applies.
This is only for users that find it necessary to actually drop down to the level of dealing with raw
PyObject *
s. In that context, having to be super mindful about refcounts is a given, and careful leak checking of the tests is a standard practice for me at least (while True:
& top is the most reliable btw).Another way to look at it:
inc_ref()
).vs
What tipped the scale in my mind was that the
inc_ref()
convention means nothing else needs to change, which is usually a good sign for changes like this (fitting something slightly different into an existing system).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which is why I am advocating for the additional overload below. For C++ calls, we can determine if the object is a temporary (ie. it's an rvalue) and avoid having to increment it under those circumstances.