Sharing Resources Between Python and C++#

Resource management is key to a healthy program. No one wants to leak resources (or worse, crash the program by mismanaging them), but taking care of resource life cycles isn’t a cheerful task. One great thing about Python and other high-level languages is that they let us write clean code like this:

def print_value(val):
  msg = "the value is %s" % val
  print(msg)
  # "msg" object gets automatically destroyed when the function returns

But the happy story ends when Python needs to talk to low-level code. When data demand complex calculations, Python is often too slow, and low-level C/C++ is required. The data are usually large, so it’s inefficient and sometimes impractical to copy them between the languages. We’ll need to manage the resource life cycles across the language barrier.

I will discuss how shared pointers are used to share resources to and from Python in two C++ wrapping libraries: boost.python and pybind11. The former is a powerful and stable wrapping tool and has been popular for a decade. The latter brings all the goodness of C++11 to the wrapping land.

Shared Pointers#

Shared pointers use reference counts to track the life cycle of an object. The tracked object is the resource to be managed. As long as at least one shared pointer exists, the reference count is positive and the resource stays alive. When the last shared pointer is destroyed and the reference count drops to zero, the resource object gets deleted.

C++11 provides a shared pointer library std::shared_ptr, but before the standardization, boost::shared_ptr was the most popular one. It probably still is. So far (version 1.62) boost.python only supports boost::shared_ptr. Both libraries let us get the raw pointer and the reference count:

// compile with -std=c++11 -stdlib=libc++ to get C++11 support
#include <stdio.h> // printf
#ifdef USE_BOOST_SHARED_PTR
  #include <boost/shared_ptr.hpp>
  #define SHARED_PTR boost::shared_ptr
#else
  #include <memory>
  #define SHARED_PTR std::shared_ptr
#endif
struct Resource {
  Resource() { printf("Resource %p constructed\n", this); }
  ~Resource() { printf("Resource %p destructed\n", this); }
};
int main(int argc, char ** argv) {
  auto obj = SHARED_PTR<Resource>(new Resource);
  printf("address %p, count %ld\n", // use_count() returns long
         obj.get(),      // getting the raw pointer
         obj.use_count() // getting the reference count
  );
  // output: address 0x7fa970c03420, count 1
  printf("reset obj\n");
  obj = nullptr; // this resets the shared_ptr and thus releases the resource object.
  printf("program ends\n");
  return 0;
}

Note

I assume readers are familiar with shared pointers. If you are not, keep in mind that they are not as innocent as they appear. They shouldn’t be used as a blanket substitute for raw pointers. std::unique_ptr, on the other hand, is the suggested replacement for an owning raw pointer whenever possible.

For example, one common mistake with shared pointers is creating duplicate reference counters for the same object, which makes the program crash with a double free:

auto * resource = new Resource();
auto ref1 = SHARED_PTR<Resource>(resource);
// ooops, the duplicated counter will end up with double free
auto ref2 = SHARED_PTR<Resource>(resource);

The right way to do it is:

auto * resource = new Resource();
auto ref1 = SHARED_PTR<Resource>(resource);
// copy the first shared pointer to use its counter
auto ref2 = ref1;

Shared pointers have plenty of pitfalls. Be careful.
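
One way to avoid duplicated counters is to never hold the raw pointer at all. The following is a minimal sketch using only the standard library; the function names are mine and just serve the illustration. It creates the object and its counter together with std::make_shared, and shows that a std::unique_ptr can be handed over to shared ownership later when sharing becomes necessary.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Resource {};

// Create the object and its counter in one step, then copy the pointer.
long count_with_make_shared() {
  auto ref1 = std::make_shared<Resource>();  // no raw pointer to misuse
  auto ref2 = ref1;                          // shares the same counter
  return ref1.use_count();                   // 2
}

// Start with unique ownership and promote it to shared ownership.
long count_after_promotion() {
  std::unique_ptr<Resource> unique(new Resource);
  std::shared_ptr<Resource> shared = std::move(unique);
  assert(!unique);                           // ownership has moved away
  return shared.use_count();                 // 1
}
```

Because the raw pointer never appears in user code, there is no chance to construct a second, unrelated counter from it.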

PyObject also uses reference counts in a similar way. So all we need to do is let PyObject and the shared pointer know each other’s reference count, and we can freely exchange resources between both worlds. This is where things become interesting. We have two wrapping libraries and two shared pointer libraries, and their combinations give three different treatments: boost.python treats boost::shared_ptr and std::shared_ptr differently, while pybind11 treats both the same.

Boost.Python with boost::shared_ptr#

Boost.python supports boost::shared_ptr as a holder type of a C++ object. Boost.python registers a conversion when class_<Resource, boost::shared_ptr<Resource>> is constructed. It works well, but the conversion implementation may surprise us a bit. To show it, I extend the shared pointer example with a boost.python wrapper:

#include <boost/python.hpp>
#include <stdio.h> // printf
#include <boost/utility.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/weak_ptr.hpp>
#define SHARED_PTR boost::shared_ptr
#define WEAK_PTR boost::weak_ptr
struct Resource {
private:
  struct ctor_passkey{};
  Resource(const ctor_passkey &) { printf("  Resource %p constructed\n", this); }
  static WEAK_PTR<Resource> g_weak;
public:
  Resource() = delete;
  Resource(Resource const & ) = delete;
  Resource(Resource       &&) = delete;
  Resource & operator=(Resource const & ) = delete;
  Resource & operator=(Resource       &&) = delete;
  ~Resource() { printf("  Resource %p destructed\n", this); }
  static SHARED_PTR<Resource> make() {
    if (g_weak.use_count()) {
      return g_weak.lock(); 
    } else {
      auto strong = SHARED_PTR<Resource>(new Resource(ctor_passkey()));
      g_weak = strong;
      return strong;
    }
  }
  static size_t get_count() {
    printf("  Resource.g_weak.use_count() = %ld\n", g_weak.use_count());
    return g_weak.use_count();
  }
};
WEAK_PTR<Resource> Resource::g_weak = WEAK_PTR<Resource>();
BOOST_PYTHON_MODULE(ex_bpy_to_python)
{
  using namespace boost::python;
  class_<Resource, SHARED_PTR<Resource>, boost::noncopyable>
  ("Resource", no_init)
    .def("make", &Resource::make)
    .staticmethod("make")
    .def("get_count", &Resource::get_count)
    .staticmethod("get_count")
  ;
}

The Resource class is turned into a singleton, and a weak pointer tracks the resource. When we need a (strong) reference to the resource, Resource::make() returns a shared pointer. Boost.python converts each of the returned boost::shared_ptr<Resource> objects into a PyObject. The C++ reference count is the same as the number of Python objects requested from C++.

import ex_bpy_to_python as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
print("before construction")
pymod.Resource.get_count()
print("construct the resource and return the shared pointers")
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]
assert 10 == pymod.Resource.get_count()
assert 1 == pycount(objs[0]); print("  pycount(obj) =", pycount(objs[0]))
print("delete Python objects")
del objs
assert 0 == pymod.Resource.get_count()
print("program ends")
"""output:
before construction
  Resource.g_weak.use_count() = 0
construct the resource and return the shared pointers
  Resource 0x100506b00 constructed
  Resource.g_weak.use_count() = 10
  pycount(obj) = 1
delete Python objects
  Resource 0x100506b00 destructed
  Resource.g_weak.use_count() = 0
program ends
"""

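The bookkeeping of the singleton can also be tried without Python. Below is a minimal sketch in plain C++ with std::shared_ptr and std::weak_ptr; it is my simplification, not the wrapper code above. The weak pointer lives in a function-local static (the g_weak() accessor is a name I made up), and make() uses lock() directly instead of checking use_count() first.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class Resource {
public:
  static std::shared_ptr<Resource> make() {
    // lock() yields an empty pointer when the resource is already gone.
    if (auto alive = g_weak().lock()) { return alive; }
    std::shared_ptr<Resource> fresh(new Resource);
    g_weak() = fresh;  // track the new resource without owning it
    return fresh;
  }
  static long get_count() { return g_weak().use_count(); }
private:
  Resource() = default;
  static std::weak_ptr<Resource>& g_weak() {
    static std::weak_ptr<Resource> weak;
    return weak;
  }
};

// Request the resource n times and report the count while all are held.
long count_while_held(int n) {
  std::vector<std::shared_ptr<Resource>> objs;
  for (int i = 0; i < n; ++i) { objs.push_back(Resource::make()); }
  return Resource::get_count();
}
```

Calling count_while_held(10) gives 10, and Resource::get_count() is back to 0 once the function returns, which matches what the Python session above prints.
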
We’ll see something strange when the Python object is passed back into C++. I add a method get_count_from_python() to reveal it (or see the full file):

  static size_t get_count_from_python(const std::string & name, const SHARED_PTR<Resource> & obj) {
    printf("  %s.use_count() = %ld .get() = %p\n", name.c_str(), obj.use_count(), obj.get());
    return obj.use_count();
  }

The C++ internal counter says there are 10 shared pointers to the resource, but the reference count of the boost::shared_ptr passed into get_count_from_python() is only 1!

import ex_bpy_from_python as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 10 == pymod.Resource.get_count()
assert 2 == pycount(obj); print("  pycount(obj) =", pycount(obj))
assert 1 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()

Is there any chance that boost.python screws up boost::shared_ptr reference counts? By no means. What happens is that boost.python creates “another type” of shared pointer pointing to the resource. The boost.python-made shared pointer doesn’t use the original shared pointer’s reference counter. Instead, it carries a deleter (boost::python::converter::shared_ptr_deleter) that holds a reference to the PyObject (which contains the original shared pointer) being passed into C++:

struct shared_ptr_deleter {
    shared_ptr_deleter(handle<> owner);
    ~shared_ptr_deleter();
    void operator()(void const*);
    handle<> owner;
};

So the boost::shared_ptr passed into C++ has a different reference count from the original shared pointer. The life cycle of the resource is collaboratively managed by both boost::shared_ptr and PyObject.
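
To see the trick in isolation, here is a minimal sketch in plain C++ with std::shared_ptr and no Python involved. KeepAlive, rewrap() and Counts are hypothetical names of mine, not Boost.Python API; they only imitate the idea of a second counter whose deleter keeps the original owner alive. One difference to keep in mind: in Boost.Python the deleter holds the PyObject, while in this sketch it holds a copy of the original shared pointer, so the original count goes up by one.

```cpp
#include <cassert>
#include <memory>

struct Resource {};

// Imitates the deleter: it frees nothing and only lets go of the owner.
struct KeepAlive {
  std::shared_ptr<Resource> owner;
  void operator()(void*) { owner.reset(); }
};

// Hypothetical stand-in for the from-Python conversion.
std::shared_ptr<Resource> rewrap(const std::shared_ptr<Resource>& owner) {
  // A brand-new counter that manages no object of its own.
  std::shared_ptr<void> counter(nullptr, KeepAlive{owner});
  // Aliasing constructor: new counter, but pointing at the original object.
  return std::shared_ptr<Resource>(counter, owner.get());
}

struct Counts { long original; long rewrapped; bool same_object; };

Counts demo_rewrap() {
  auto original = std::make_shared<Resource>();
  auto copy = original;              // shares the original counter
  auto passed = rewrap(original);    // has a counter of its own
  return {original.use_count(), passed.use_count(),
          passed.get() == original.get()};
}
```

demo_rewrap() reports 3 for the original counter (original, copy, and the owner kept by the deleter) but only 1 for the rewrapped pointer, although both point to the same object.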

boost::weak_ptr doesn’t buy Boost.Python#

So far we are good, as long as the resource is held by boost::shared_ptr. The references are consistent, although boost.python makes the reference counts different depending on whether the shared pointer is returned directly from C++ or converted from Python.

But boost.python will make us miserable when we want to track the resource using boost::weak_ptr and initialize it with the shared pointer converted by boost.python. I create such a holder class (or see the full file):

struct Holder {
  Holder(const SHARED_PTR<Resource> & resource) : m_weak(resource) {}
  size_t get_count() {
    printf("  Holder.m_weak.use_count() = %ld\n", m_weak.use_count());
    return m_weak.use_count();
  }
private:
  WEAK_PTR<Resource> m_weak;
};

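When we feed it the resource from Python, the weak pointer loses its reference. It tracks the counter of the shared pointer made by the converter, not the original one, and that shared pointer is gone once the constructor call returns: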

import ex_bpy_hold_weak as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 10 == pymod.Resource.get_count()
assert 2 == pycount(obj); print("  pycount(obj) =", pycount(obj))
holder = pymod.Holder(obj)
assert 0 == holder.get_count()
assert 1 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()
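
The same loss can be reproduced in plain C++. This is a minimal sketch under my own assumptions: rewrap() and KeepAlive are hypothetical helpers that imitate a converter-made shared pointer with its own counter, and the temporary returned by rewrap() plays the role of the pointer that only lives for the duration of the call.

```cpp
#include <cassert>
#include <memory>

struct Resource {};

// Imitates the deleter: it frees nothing and only lets go of the owner.
struct KeepAlive {
  std::shared_ptr<Resource> owner;
  void operator()(void*) { owner.reset(); }
};

// Hypothetical stand-in for the from-Python conversion.
std::shared_ptr<Resource> rewrap(const std::shared_ptr<Resource>& owner) {
  std::shared_ptr<void> counter(nullptr, KeepAlive{owner});
  return std::shared_ptr<Resource>(counter, owner.get());
}

struct Holder {
  explicit Holder(const std::shared_ptr<Resource>& r) : m_weak(r) {}
  long get_count() const { return m_weak.use_count(); }
private:
  std::weak_ptr<Resource> m_weak;
};

struct HolderCounts { long direct; long via_temporary; };

HolderCounts demo_holder() {
  auto original = std::make_shared<Resource>();
  Holder direct(original);                 // tracks the original counter
  Holder via_temporary(rewrap(original));  // tracks a counter that dies here
  return {direct.get_count(), via_temporary.get_count()};
}
```

The holder fed with the original pointer still sees 1, while the one fed with the temporary sees 0 even though the resource is alive.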

Workaround Using boost::enable_shared_from_this#

Sometimes we just need a weak reference, but shared_ptr_deleter gets in our way. The proper solution is to get into the tedious business of writing our own from-python conversion.

But if you have the liberty of changing the C++ code, boost::enable_shared_from_this provides a workaround (or see the full file):

struct Resource : public boost::enable_shared_from_this<Resource> {
private:
  struct ctor_passkey{};
  Resource(const ctor_passkey &) : boost::enable_shared_from_this<Resource>()
  { printf("  Resource %p constructed\n", this); }
  static WEAK_PTR<Resource> g_weak;
};
WEAK_PTR<Resource> Resource::g_weak = WEAK_PTR<Resource>();
struct Holder {
  Holder(const SHARED_PTR<Resource> & resource) : m_weak(resource->shared_from_this()) {}
};

By using boost::enable_shared_from_this, the resource object knows where to find the original reference counter, so we can recover it from the shared pointer cooked by boost.python:

import ex_bpy_shared_from_this as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 10 == pymod.Resource.get_count()
assert 2 == pycount(obj); print("  pycount(obj) =", pycount(obj))
holder = pymod.Holder(obj)
assert 10 == holder.get_count()
assert 1 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()
assert 0 == holder.get_count()

Since boost::enable_shared_from_this is an expected companion to boost::shared_ptr, the hack usually works. It can also be applied in a custom conversion to make it less hackish.
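
The recovery can be sketched in plain C++ as well, using std::enable_shared_from_this in place of the Boost version. As before, rewrap() and KeepAlive are hypothetical helpers of mine that imitate a converter-made pointer with a separate counter; they are not Boost.Python API. The sketch uses the aliasing constructor so that the separate counter does not interfere with the bookkeeping of enable_shared_from_this.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Resource : std::enable_shared_from_this<Resource> {};

// Imitates the deleter: it frees nothing and only lets go of the owner.
struct KeepAlive {
  std::shared_ptr<Resource> owner;
  void operator()(void*) { owner.reset(); }
};

// Hypothetical stand-in for the from-Python conversion.
std::shared_ptr<Resource> rewrap(const std::shared_ptr<Resource>& owner) {
  std::shared_ptr<void> counter(nullptr, KeepAlive{owner});
  return std::shared_ptr<Resource>(counter, owner.get());
}

struct Holder {
  // Ask the object itself for a pointer that uses the original counter.
  explicit Holder(const std::shared_ptr<Resource>& r)
    : m_weak(r->shared_from_this()) {}
  long get_count() const { return m_weak.use_count(); }
private:
  std::weak_ptr<Resource> m_weak;
};

long count_seen_by_holder() {
  // Ten shared pointers that all use the original counter.
  std::vector<std::shared_ptr<Resource>> objs(10, std::make_shared<Resource>());
  Holder holder(rewrap(objs[0]));  // fed with a separately counted pointer
  return holder.get_count();       // 10
}
```

Even though the holder receives a pointer with a foreign counter, shared_from_this() leads it back to the counter shared by the ten originals.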

Boost.Python with std::shared_ptr?#

std::shared_ptr isn’t supported by boost.python, so generally we shouldn’t use it. But because boost.python doesn’t provide special treatment for it, it can serve as a specimen to see what would happen if boost.python didn’t use the deleter for the shared pointer. Let’s change the weak holder example to use std::shared_ptr (or see the full file):

#include <boost/python.hpp>
#include <stdio.h> // printf
#include <memory>
#define SHARED_PTR std::shared_ptr
#define WEAK_PTR std::weak_ptr

Because boost::python::converter::shared_ptr_deleter is not used for std::shared_ptr, boost.python transfers the correct reference count:

import ex_bpy_std as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 10 == pymod.Resource.get_count()
assert 2 == pycount(obj); print("  pycount(obj) =", pycount(obj))
holder = pymod.Holder(obj)
assert 10 == holder.get_count()
assert 10 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()

Note

Keep in mind that boost.python doesn’t support std::shared_ptr. Using std::shared_ptr as a boost.python holder type can lead to unexpected errors.

Pybind11#

Pybind11 doesn’t have the issue of the missing weak reference, whether we use std::shared_ptr or boost::shared_ptr. An equivalent pybind11 wrapper looks like this (the full code is available for std::shared_ptr and boost::shared_ptr):

#include <pybind11/pybind11.h>
namespace py = pybind11;
PYBIND11_DECLARE_HOLDER_TYPE(T, SHARED_PTR<T>);
PYBIND11_PLUGIN(ex_pyb_std) {
  py::module mod("ex_pyb_std");
  py::class_<Holder>(mod, "Holder")
    .def(py::init<const SHARED_PTR<Resource> &>())
    .def("get_count", &Holder::get_count)
  ;
  py::class_<Resource, SHARED_PTR<Resource>>(mod, "Resource")
    .def_static("make", &Resource::make)
    .def_static("get_count", &Resource::get_count)
    .def_static("get_count_from_python", &Resource::get_count_from_python)
  ;
  return mod.ptr();
}

Pybind11 keeps different information through its casters. When pybind11::return_value_policy::take_ownership is used (which is what return_value_policy::automatic falls back to), it reuses the existing PyObject and increases its reference count instead of creating a new PyObject for each shared pointer returned from C++. That is why the C++ count below stays at 1 while the Python count climbs to 11.

import ex_pyb_std as pymod
def pycount(obj): import sys; return sys.getrefcount(obj) - 3
# get shared pointers
objs = [pymod.Resource.make() for it in range(10)]; obj = objs[0]
assert 1 == pymod.Resource.get_count()
assert 11 == pycount(obj); print("  pycount(obj) =", pycount(obj))
holder = pymod.Holder(obj)
assert 1 == holder.get_count()
assert 2 == pymod.Resource.get_count_from_python("obj", obj)
del objs, obj
assert 0 == pymod.Resource.get_count()
assert 0 == holder.get_count()

Note

pybind11::return_value_policy::take_ownership is what pybind11::return_value_policy::automatic falls back to, so essentially it’s the default policy.

Shared pointers are a convenient way to transfer ownership of resources between Python and C++. Comprehensive wrapping tools like boost.python and pybind11 support bi-directional transfer. But things can be tricky when advanced operations are performed, depending on the implementation of the wrapper library and the shared pointer. Sometimes, writing custom conversion or cast code makes things more straightforward.