Sharing Resources Between Python and C++#
Resource management is key to a healthy program. No one wants to leak resources (or worse, crash the program by mismanaging them), but taking care of resource life cycles isn’t a cheerful task. One great thing about Python and other high-level languages is that they spare us the trouble and allow clean code like this:
def print_value(val):
    msg = "the value is %s" % val
    print(msg)
    # "msg" object gets automatically destroyed when the function returns
But the happy story ends when Python needs to talk to low-level code. When the data demand heavy calculations, Python is often too slow and low-level C/C++ is required. The data are usually large, so it’s inefficient and sometimes impractical to copy them between the languages. We need to manage the resource life cycles across the language barrier.
I will discuss how shared pointers are used to share resources to and from Python in two C++ wrapping libraries: boost.python and pybind11. The former is a powerful and stable wrapping tool that has been popular for a decade. The latter brings all the goodness of C++11 to the wrapping land.
Shared Pointers#
Shared pointers use reference counts to track the life cycle of an object. The tracked object is the resource to be managed. As long as a shared pointer exists, the reference count is positive and the resource is alive. When all shared pointers are destroyed and the reference count drops to zero, the resource object gets deleted.
C++11 provides a shared pointer library std::shared_ptr, but before the standardization, boost::shared_ptr was the most popular one. It probably still is. So far (version 1.62) boost.python only supports boost::shared_ptr. Both libraries allow us to get the raw pointer and the reference count:
// compile with -std=c++11 -stdlib=libc++ to get C++11 support
#include <stdio.h> // printf
#ifdef USE_BOOST_SHARED_PTR
#include <boost/shared_ptr.hpp>
#define SHARED_PTR boost::shared_ptr
#else
#include <memory>
#define SHARED_PTR std::shared_ptr
#endif
struct Resource {
    Resource() { printf("Resource %p constructed\n", this); }
    ~Resource() { printf("Resource %p destructed\n", this); }
};
int main(int argc, char ** argv) {
    auto obj = SHARED_PTR<Resource>(new Resource);
    printf("address %p, count %ld\n",
           obj.get(), // getting the raw pointer
           obj.use_count() // getting the reference count (a long)
    );
    // output: address 0x7fa970c03420, count 1
    printf("reset obj\n");
    obj = nullptr; // this resets the shared_ptr and thus releases the resource object.
    printf("program ends\n");
    return 0;
}
Note
I assume readers are familiar with shared pointers. If you are not, keep in mind that a shared pointer is not as innocent as it appears to be. It shouldn’t be used as a substitute for raw pointers. std::unique_ptr, on the other hand, is the suggested replacement for a raw pointer whenever possible.
For example, one common mistake with shared pointers is duplicated reference counters, and the program will crash because of double free:
auto * resource = new Resource();
auto ref1 = SHARED_PTR<Resource>(resource);
// oops, the duplicated counter will end up with a double free
auto ref2 = SHARED_PTR<Resource>(resource);
The right way to do it is:
auto * resource = new Resource();
auto ref1 = SHARED_PTR<Resource>(resource);
// copy the first shared pointer to use its counter
auto ref2 = ref1;
Pitfalls abound with shared pointers. Be careful.
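The duplicated-counter mistake can be observed without actually crashing. The following is a minimal sketch, assuming std::shared_ptr and a hypothetical CountingDeleter that only counts how often it is invoked instead of freeing memory. The function names are made up for illustration.

```cpp
#include <cassert>
#include <memory>

struct Resource {};

// Hypothetical deleter: counts invocations instead of freeing memory, so a
// would-be double free shows up as a number rather than a crash.
struct CountingDeleter {
    int * calls;
    void operator()(Resource *) const { ++*calls; }
};

// Two shared pointers built independently from the same raw pointer.
int deletes_with_duplicated_counters() {
    Resource resource; // lives on the stack; the deleter never frees it
    int calls = 0;
    {
        std::shared_ptr<Resource> ref1(&resource, CountingDeleter{&calls});
        std::shared_ptr<Resource> ref2(&resource, CountingDeleter{&calls});
        // each pointer has its own counter and believes it is the only owner
        assert(ref1.use_count() == 1);
        assert(ref2.use_count() == 1);
    }
    return calls; // both counters reached zero, so the deleter ran twice
}

// The second shared pointer is copied from the first one.
int deletes_with_shared_counter() {
    Resource resource;
    int calls = 0;
    {
        std::shared_ptr<Resource> ref1(&resource, CountingDeleter{&calls});
        std::shared_ptr<Resource> ref2 = ref1;
        // both pointers share a single counter
        assert(ref1.use_count() == 2);
    }
    return calls; // the shared counter reached zero once
}
```

With a real deleter, the first function would free the same object twice. The second one releases it exactly once.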
PyObject also uses reference counts in a similar way. So all we need to do is to let PyObject and the shared pointer know about each other’s reference count, and we can freely exchange resources between both worlds. This is where things become interesting. We have two wrapping libraries and two shared pointer libraries, and their combinations give three different treatments: boost.python treats boost::shared_ptr and std::shared_ptr differently, while pybind11 treats both shared pointers the same.
Boost.Python with boost::shared_ptr#
Boost.python supports boost::shared_ptr as a holder type of a C++ object. Boost.python creates a conversion when class_<Resource, boost::shared_ptr<Resource>> is constructed. It works well, but the conversion implementation may give us a bit of a surprise. To show it, I extend the shared pointer example with a boost.python wrapper:
#include <boost/python.hpp>
#include <stdio.h> // printf
#include <boost/utility.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/weak_ptr.hpp>
#define SHARED_PTR boost::shared_ptr
#define WEAK_PTR boost::weak_ptr
struct Resource {
private:
    // only code that can name ctor_passkey may construct a Resource
    struct ctor_passkey{};
    Resource(const ctor_passkey &) { printf(" Resource %p constructed\n", this); }
    static WEAK_PTR<Resource> g_weak;
public:
    Resource() = delete;
    Resource(Resource const & ) = delete;
    Resource(Resource &&) = delete;
    Resource & operator=(Resource const & ) = delete;
    Resource & operator=(Resource &&) = delete;
    ~Resource() { printf(" Resource %p destructed\n", this); }
    // return a strong reference to the singleton, creating it when needed
    static SHARED_PTR<Resource> make() {
        if (g_weak.use_count()) {
            return g_weak.lock();
        } else {
            auto strong = SHARED_PTR<Resource>(new Resource(ctor_passkey()));
            g_weak = strong;
            return strong;
        }
    }
    static size_t get_count() {
        printf(" Resource.g_weak.use_count() = %ld\n", g_weak.use_count());
        return g_weak.use_count();
    }
};
WEAK_PTR<Resource> Resource::g_weak = WEAK_PTR<Resource>();
BOOST_PYTHON_MODULE(ex_bpy_to_python)
{
    using namespace boost::python;
    class_<Resource, SHARED_PTR<Resource>, boost::noncopyable>
        ("Resource", no_init)
        .def("make", &Resource::make)
        .staticmethod("make")
        .def("get_count", &Resource::get_count)
        .staticmethod("get_count")
    ;
}
The Resource class is turned into a singleton, and a weak pointer tracks the resource. When we need a (strong) reference to the resource, Resource::make() returns a shared pointer. Boost.python converts each of the returned boost::shared_ptr<Resource> objects into a PyObject. The C++ reference count is the same as the number of Python objects requested from C++.
import ex_bpy_to_python as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
print("before construction")
pymod.Resource.get_count()
print("construct the resource and return the shared pointers")
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]
assert 10 == pymod.Resource.get_count()
assert 1 == pycount(objs[0]); print(" pycount(obj) =", pycount(objs[0]))
print("delete Python objects")
del objs
assert 0 == pymod.Resource.get_count()
print("program ends")
"""output:
before construction
Resource.g_weak.use_count() = 0
construct the resource and return the shared pointers
Resource 0x100506b00 constructed
Resource.g_weak.use_count() = 10
pycount(obj) = 1
delete Python objects
Resource 0x100506b00 destructed
Resource.g_weak.use_count() = 0
program ends
"""
We’ll see something strange when the Python object is passed back into C++. I add a method get_count_from_python() to reveal it (or see the full file):
static size_t get_count_from_python(const std::string & name, const SHARED_PTR<Resource> & obj) {
    printf(" %s.use_count() = %ld .get() = %p\n", name.c_str(), obj.use_count(), obj.get());
    return obj.use_count();
}
The C++ internal counter says there are 10 shared pointers to the resource, but the reference count of the boost::shared_ptr passed into get_count_from_python() is only 1!
import ex_bpy_from_python as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 10 == pymod.Resource.get_count()
assert 2 == pycount(obj); print(" pycount(obj) =", pycount(obj))
assert 1 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()
Is there any chance that boost.python screws up the boost::shared_ptr reference counts? By no means. What happens is that boost.python creates “another type” of shared pointer pointing to the resource. The boost.python-made shared pointer doesn’t use the original shared pointer’s reference counter. Instead, it creates a deleter (boost::python::converter::shared_ptr_deleter) that grabs a reference to the PyObject (which contains the original shared pointer) passed into C++:
struct shared_ptr_deleter {
    shared_ptr_deleter(handle<> owner);
    ~shared_ptr_deleter();
    void operator()(void const*);
    handle<> owner;
};
So the boost::shared_ptr passed into C++ has a different reference count from the original shared pointer. The life cycle of the resource is collaboratively managed by both boost::shared_ptr and PyObject.
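The effect of such a deleter can be imitated in plain C++ without Python. The following is a minimal sketch under stated assumptions: std::shared_ptr replaces boost::shared_ptr, a hypothetical Owner struct stands in for the PyObject that contains the original shared pointer, and KeepOwnerAlive plays the role of the deleter. None of these names come from boost.python.

```cpp
#include <cassert>
#include <memory>

struct Resource {};

// Hypothetical stand-in for the PyObject that contains the original holder.
struct Owner {
    std::shared_ptr<Resource> holder;
};

// Plays the role of the deleter: it keeps the owner alive and never frees
// the resource, because the holder inside the owner is responsible for that.
struct KeepOwnerAlive {
    std::shared_ptr<Owner> owner;
    void operator()(Resource *) const {}
};

// Builds a shared pointer with a brand-new counter around the same address.
std::shared_ptr<Resource> convert(const std::shared_ptr<Owner> & owner) {
    return std::shared_ptr<Resource>(owner->holder.get(), KeepOwnerAlive{owner});
}

struct Counts {
    long original;     // use_count() of the holder inside the owner
    long converted;    // use_count() of the converted pointer
    long owner;        // references to the owner while the converted pointer lives
    bool same_address; // both pointers refer to the same resource
};

Counts observe_counts() {
    auto owner = std::make_shared<Owner>();
    owner->holder = std::make_shared<Resource>();
    auto another = owner->holder; // the original counter is now 2
    auto converted = convert(owner);
    assert(converted.get() != nullptr);
    return Counts{owner->holder.use_count(), converted.use_count(),
                  owner.use_count(), converted.get() == owner->holder.get()};
}
```

In this sketch the converted pointer reports a count of 1 while the original counter stays at 2, and the owner gains one extra reference for as long as the converted pointer (and thus its deleter) exists.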
boost::weak_ptr doesn’t buy Boost.Python#
So far we are good, as long as the resource is held by boost::shared_ptr. The references are consistent, although boost.python makes the reference counts differ depending on whether the shared pointer is returned directly from C++ or converted from Python.
But boost.python will make us miserable when we want to track the resource using boost::weak_ptr and initialize it using the shared pointer returned from boost.python. I create such a holder class (or see the full file):
struct Holder {
    Holder(const SHARED_PTR<Resource> & resource) : m_weak(resource) {}
    size_t get_count() {
        printf(" Holder.m_weak.use_count() = %ld\n", m_weak.use_count());
        return m_weak.use_count();
    }
private:
    WEAK_PTR<Resource> m_weak;
};
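When we feed it the resource from Python, the weak pointer loses the reference, because it tracks the counter of the converted shared pointer, which is destroyed as soon as the call returns: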
import ex_bpy_hold_weak as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 10 == pymod.Resource.get_count()
assert 2 == pycount(obj); print(" pycount(obj) =", pycount(obj))
holder = pymod.Holder(obj)
assert 0 == holder.get_count()
assert 1 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()
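The lost weak reference can be reproduced in plain C++. This is a minimal sketch, assuming std::shared_ptr and std::weak_ptr in place of the boost ones, and a hypothetical NoopDeleter that imitates a converted pointer carrying its own counter. The Holder mirrors the one in this article, without the printing.

```cpp
#include <cassert>
#include <memory>

struct Resource {};

// Hypothetical deleter that does nothing: the resource is owned elsewhere.
struct NoopDeleter {
    void operator()(Resource *) const {}
};

struct Holder {
    explicit Holder(const std::shared_ptr<Resource> & resource) : m_weak(resource) {}
    long get_count() const { return m_weak.use_count(); }
private:
    std::weak_ptr<Resource> m_weak;
};

// The holder is fed a short-lived pointer that has its own counter.
long count_seen_through_converted() {
    auto original = std::make_shared<Resource>();
    std::unique_ptr<Holder> holder;
    {
        std::shared_ptr<Resource> converted(original.get(), NoopDeleter{});
        holder.reset(new Holder(converted));
    } // converted dies here and takes its counter down to zero
    assert(original.use_count() == 1); // the resource itself is still alive
    return holder->get_count();
}

// The holder is fed the original pointer directly.
long count_seen_through_original() {
    auto original = std::make_shared<Resource>();
    Holder holder(original);
    return holder.get_count();
}
```

The first function returns 0 even though the resource is alive, which is the same symptom as the failing weak holder. The second one returns 1.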
Workaround Using boost::enable_shared_from_this#
Sometimes we just need a weak reference, but shared_ptr_deleter gets in our way. The proper resolution is to get into the tedious business of writing our own from-python conversion. But if you have the liberty of changing the C++ code, boost::enable_shared_from_this provides a workaround (or see the full file):
struct Resource : public boost::enable_shared_from_this<Resource> {
private:
    struct ctor_passkey{};
    Resource(const ctor_passkey &) : boost::enable_shared_from_this<Resource>()
    { printf(" Resource %p constructed\n", this); }
    static WEAK_PTR<Resource> g_weak;
    // other members are omitted in this excerpt; see the full file
};
WEAK_PTR<Resource> Resource::g_weak = WEAK_PTR<Resource>();
struct Holder {
    // ask the resource for a pointer that uses the original counter
    Holder(const SHARED_PTR<Resource> & resource) : m_weak(resource->shared_from_this()) {}
    // other members are omitted in this excerpt; see the full file
};
By using boost::enable_shared_from_this, the resource object knows where to find the original reference counter, so we can recover it from the shared pointer cooked by boost.python:
import ex_bpy_shared_from_this as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 10 == pymod.Resource.get_count()
assert 2 == pycount(obj); print(" pycount(obj) =", pycount(obj))
holder = pymod.Holder(obj)
assert 10 == holder.get_count()
assert 1 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()
assert 0 == holder.get_count()
Since boost::enable_shared_from_this is an expected companion to boost::shared_ptr, the hack usually works. It can also be applied in a custom conversion to make it less hackish.
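The recovery of the original counter can be checked in isolation. Below is a minimal sketch, assuming std::enable_shared_from_this behaves like its boost counterpart here, and again using a hypothetical NoopDeleter to imitate a converted pointer with a separate counter. It relies on the second shared pointer not overwriting the internal weak reference while the original one is alive, which is what C++17 specifies and what the common standard library implementations do.

```cpp
#include <cassert>
#include <memory>

struct Resource : std::enable_shared_from_this<Resource> {};

// Hypothetical deleter that does nothing: the resource is owned elsewhere.
struct NoopDeleter {
    void operator()(Resource *) const {}
};

struct Holder {
    // shared_from_this() hands back a pointer that uses the original counter
    explicit Holder(const std::shared_ptr<Resource> & resource)
      : m_weak(resource->shared_from_this()) {}
    long get_count() const { return m_weak.use_count(); }
private:
    std::weak_ptr<Resource> m_weak;
};

long count_recovered_from_converted() {
    auto original = std::make_shared<Resource>();
    auto another = original; // the original counter is now 2
    std::unique_ptr<Holder> holder;
    {
        std::shared_ptr<Resource> converted(original.get(), NoopDeleter{});
        assert(converted.use_count() == 1); // separate counter
        holder.reset(new Holder(converted));
    } // converted is gone, but the holder tracks the original counter
    return holder->get_count();
}
```

Here the holder keeps reporting 2, the count of the original shared pointers, after the converted pointer has been destroyed.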
Boost.Python with std::shared_ptr?#
std::shared_ptr isn’t supported by boost.python, so generally we shouldn’t use it. But because boost.python doesn’t provide special treatment for it, it can serve as a specimen to see what would happen if boost.python didn’t use the deleter for the shared pointer. Let’s change the weak holder example to use std::shared_ptr (or see the full file):
#include <boost/python.hpp>
#include <stdio.h> // printf
#include <memory>
#define SHARED_PTR std::shared_ptr
#define WEAK_PTR std::weak_ptr
Because boost::python::converter::shared_ptr_deleter is not used for std::shared_ptr, boost.python transfers the correct reference count:
import ex_bpy_std as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 10 == pymod.Resource.get_count()
assert 2 == pycount(obj); print(" pycount(obj) =", pycount(obj))
holder = pymod.Holder(obj)
assert 10 == holder.get_count()
assert 10 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()
Note
Keep in mind that boost.python doesn’t support std::shared_ptr. Using std::shared_ptr as a boost.python holder type can lead to unexpected errors.
Pybind11#
Pybind11 doesn’t have the issue of the missing weak reference, no matter whether it’s std::shared_ptr or boost::shared_ptr. An equivalent pybind11 wrapper looks like this (full code is available for std::shared_ptr and boost::shared_ptr):
#include <pybind11/pybind11.h>
namespace py = pybind11;
PYBIND11_DECLARE_HOLDER_TYPE(T, SHARED_PTR<T>);
PYBIND11_PLUGIN(ex_pyb_std) {
    py::module mod("ex_pyb_std");
    py::class_<Holder>(mod, "Holder")
        .def(py::init<const SHARED_PTR<Resource> &>())
        .def("get_count", &Holder::get_count)
    ;
    py::class_<Resource, SHARED_PTR<Resource>>(mod, "Resource")
        .def_static("make", &Resource::make)
        .def_static("get_count", &Resource::get_count)
        .def_static("get_count_from_python", &Resource::get_count_from_python)
    ;
    return mod.ptr();
}
Pybind11 keeps different information through its casters. When pybind11::return_value_policy::take_ownership is used (which is what return_value_policy::automatic falls back to), it increases the PyObject reference count instead of creating a new PyObject for each shared pointer returned from C++.
import ex_pyb_std as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 1 == pymod.Resource.get_count()
assert 11 == pycount(obj); print(" pycount(obj) =", pycount(obj))
holder = pymod.Holder(obj)
assert 1 == holder.get_count()
assert 2 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()
assert 0 == holder.get_count()
Note
pybind11::return_value_policy::take_ownership is what pybind11::return_value_policy::automatic falls back to, so essentially it’s the default policy.
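The counts observed with pybind11 (one C++ reference, many Python references) are what we would expect if the wrapper object were looked up by the address of the resource and reused. The following is a rough model of that idea in plain C++, not pybind11’s actual implementation: the Wrapper struct, the Registry class and the function name are all hypothetical, and std::shared_ptr<Wrapper> merely stands in for a reference-counted PyObject.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

struct Resource {};

// Hypothetical stand-in for the Python-side object that holds the resource.
struct Wrapper {
    std::shared_ptr<Resource> holder;
};

// Hypothetical registry that maps a resource address to its live wrapper.
class Registry {
public:
    std::shared_ptr<Wrapper> wrap(const std::shared_ptr<Resource> & resource) {
        auto it = m_instances.find(resource.get());
        if (it != m_instances.end()) {
            // reuse the existing wrapper: only its own count goes up
            if (auto existing = it->second.lock()) { return existing; }
        }
        // first request for this address: create one wrapper with one holder
        auto created = std::make_shared<Wrapper>(Wrapper{resource});
        m_instances[resource.get()] = created;
        return created;
    }
private:
    std::unordered_map<const Resource *, std::weak_ptr<Wrapper>> m_instances;
};

struct Counts {
    long resource; // strong references to the resource
    long wrapper;  // references to the single wrapper
};

Counts wrap_ten_times() {
    auto resource = std::make_shared<Resource>();
    std::weak_ptr<Resource> tracker = resource;
    Registry registry;
    std::vector<std::shared_ptr<Wrapper>> objs;
    for (int i = 0; i < 10; ++i) { objs.push_back(registry.wrap(resource)); }
    assert(objs.front() == objs.back()); // the very same wrapper every time
    resource.reset(); // from now on only the wrapper holds the resource
    return Counts{tracker.use_count(), objs[0].use_count()};
}
```

In this model ten requests leave a single strong reference to the resource and ten references to the one wrapper, which parallels the numbers in the Python session above (there the extra obj name brings the Python count to 11).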
Shared pointers are a convenient way to transfer ownership of resources between Python and C++. Comprehensive wrapping tools like boost.python and pybind11 support bi-directional transfer. But things can get tricky when advanced operations are performed, depending on the implementation of the wrapper library and the shared pointer. Sometimes writing custom conversion or cast code makes things more straightforward.