C++ and C for Python#

Expectation from Python#

Python is the choice of driving scripts for numerical calculations. Before introducing how to connect the low-level C++ and C code to the high-level Python, I would like to introduce how it looks at the high level.

Linear Wave#

Here is the governing equation of propagating linear waves:

\[\frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} = 0\]

Assume a sinusoidal wave is given as the initial condition. Using the following code, we will see it propagating from left to right with the phase velocity of unity.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
import numpy as np
from matplotlib import pyplot as plt

# Import the extension module that is written in C++
import libst

# Build the one-dimensional uniform grid and the corresponding solver
grid = libst.Grid(0, 4*2*np.pi, 4*64)
cfl = 1
dx = (grid.xmax - grid.xmin) / grid.ncelm
dt = dx * cfl
svr = libst.LinearScalarSolver(grid=grid, time_increment=dt)

# Initialize the field using a sinusoidal
for e in svr.selms(odd_plane=False):
    if e.xctr < 2*np.pi or e.xctr > 2*2*np.pi:
        v = 0
        dv = 0
    else:
        v = np.sin(e.xctr)
        dv = np.cos(e.xctr)
    e.set_so0(0, v)
    e.set_so1(0, dv)

# Set up plotting
plt.figure(figsize=(15,10))
plt.xlim((0, 8))
plt.xlabel('$x$ $(\pi)$')
plt.ylabel('$u$')
plt.grid()

# Plot the initial condition
plt.plot(svr.xctr() / np.pi, svr.get_so0(0).ndarray, '-', label='begin')

# Time march
svr.setup_march()
svr.march_alpha2(50)

# Plot the time marched solution
plt.plot(svr.xctr() / np.pi, svr.get_so0(0).ndarray, '-', label='end')

plt.legend()

The full example code is in 01_linear.py. The plotted results are:

../../_images/01_linear.png

Inviscid Burgers Equation#

The second example is a non-linear equation (the inviscid Burgers equation):

\[\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0\]

The initial condition is still a sinusoidal wave. But unlike the linear equation, with the inviscid Burgers equation, the non-linear wave propagates in a very different way.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
import numpy as np
from matplotlib import pyplot as plt

# Import the extension module that is written in C++
import libst

# Build the one-dimensional uniform grid and the corresponding solver
res = 32
xcrd = np.arange(res+1) / res * 2 * np.pi

time_stop = 2*np.pi

grid = libst.Grid(xcrd)
cfl_max = 1.0
dx = (grid.xmax - grid.xmin) / grid.ncelm
dt_max = dx * cfl_max
nstep = int(np.ceil(time_stop / dt_max))
dt = time_stop / nstep
cfl = dt / dx

svr = libst.InviscidBurgersSolver(grid=grid, time_increment=dt)

# Initialize the field using a sinusoidal
svr.set_so0(0, np.sin(svr.xctr()))
svr.set_so1(0, np.cos(svr.xctr()))

# Set up plotting
plt.figure(figsize=(15,10))
plt.xlim((0, 2))
plt.xlabel('$x$ $(\pi)$')
plt.ylabel('$u$')
plt.grid()

# Plot the initial condition
plt.plot(svr.xctr() / np.pi, svr.get_so0(0).ndarray, '-', label='begin')

# Time march
svr.setup_march()
svr.march_alpha2(50)

# Plot the time marched solution
plt.plot(svr.xctr() / np.pi, svr.get_so0(0).ndarray, '-', label='end')

plt.legend()

The full example code is in 01_burgers.py. The plotted results are:

../../_images/01_burgers.png

pybind11 Build System#

pybind11 is a header-only C++ template library, that allows calling CPython API and provides C++ friendly semantics to allow Python to call C++ constructs and vise versa.

A header-only library doesn’t have anything to be built. When we say “building” pybind11, we mean to build the project that uses pybind11.

To build pybind11, we need CPython. It optionally depends on numpy and eigen. There are several suggested ways to build. Here list those I think important:

setuptools#

setuptools is an enhancement to Python built-in distutils. Because pybind11 is released on PyPI as a Python package (https://pypi.org/project/pybind11/), both setuptools and distutils can get the header files and use them to build C++ file that use pybind11.

There is an example for using setuptools to build pybind11:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
from setuptools import setup

# Available at setup time due to pyproject.toml
from pybind11.setup_helpers import Pybind11Extension, build_ext
from pybind11 import get_cmake_dir

import sys

__version__ = "0.0.1"

# The main interface is through Pybind11Extension.
# * You can add cxx_std=11/14/17, and then build_ext can be removed.
# * You can set include_pybind11=false to add the include directory yourself,
#   say from a submodule.
#
# Note:
#   Sort input source files if you glob sources to ensure bit-for-bit
#   reproducible builds (https://github.com/pybind/python_example/pull/53)

ext_modules = [
    Pybind11Extension("python_example",
        ["src/main.cpp"],
        # Example: passing in the version to the compiled code
        define_macros = [('VERSION_INFO', __version__)],
    ),
]

setup(
    name="python_example",
    version=__version__,
    author="Sylvain Corlay",
    author_email="sylvain.corlay@gmail.com",
    url="https://github.com/pybind/python_example",
    description="A test project using pybind11",
    long_description="",
    ext_modules=ext_modules,
    extras_require={"test": "pytest"},
    # Currently, build_ext only provides an optional "highest supported C++
    # level" feature, but in the future it may provide more features.
    cmdclass={"build_ext": build_ext},
    zip_safe=False,
)

The full example code is available in setup.py and main.cpp.

cmake with pybind11 in a Sub-Directory#

When the source tree is put in a sub-directory in your project, as mentioned in the document. you can use cmake add_subdirectory to include the pybind11 package:

cmake_minimum_required(VERSION 3.9)
project(example)

add_subdirectory(pybind11)
pybind11_add_module(example example.cpp)

pybind11 provides a cmake command pybind11_add_module. It sets various flags to build your C++ code as an extension module.

cmake with Installed pybind11#

If pybind11 is installed using cmake itself, the *.cmake files that pybind11 supplies are installed to the specified location. It is not necessary to write add_subdirectory in the CMakeLists.txt in your project.

Custom Wrapping Layer#

Wrapper needs to take care of the differences between the dynamic behaviors in Python and the staticity in C++. You can directly call pybind11 API, but a better way is to create an additional wrapping layer between the code that uses pybind11 and your library code. It allows to insert additional code in a systematic way. Since it is not easy to see the point in a small example, I pull in the code for a slightly bigger project (modmesh) for demonstration.

Intermediate Class Template#

Here is one way to implement the additional wrapping layer:

The base class template for custom wrappers.#
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
/**
 * Helper template for pybind11 class wrappers.
 */
template
<
    class Wrapper
  , class Wrapped
    /* The default holder type is a unique pointer. */
  , class Holder = std::unique_ptr<Wrapped>
  , class WrappedBase = Wrapped
>
class
SPACETIME_PYTHON_WRAPPER_VISIBILITY
WrapBase
{

public:

    // These aliases will be used in derived classes.
    using wrapper_type = Wrapper;
    using wrapped_type = Wrapped;
    using wrapped_base_type = WrappedBase;
    using holder_type = Holder;
    using base_type = WrapBase
    <
        wrapper_type
      , wrapped_type
      , holder_type
      , wrapped_base_type
    >;

    // This alias is to help the pybind11 code in this template.
    using class_ = typename std::conditional
    <
        std::is_same< Wrapped, WrappedBase >::value
      , pybind11::class_< wrapped_type, holder_type >
      , pybind11::class_< wrapped_type, wrapped_base_type, holder_type >
    >::type;

    static wrapper_type & commit
    (
        pybind11::module * mod
      , const char * pyname
      , const char * clsdoc
    )
    {
        static wrapper_type derived(mod, pyname, clsdoc);
        return derived;
    }

    WrapBase() = delete;
    WrapBase(WrapBase const & ) = default;
    WrapBase(WrapBase       &&) = delete;
    WrapBase & operator=(WrapBase const & ) = default;
    WrapBase & operator=(WrapBase       &&) = delete;
    ~WrapBase() = default;

// Make a macro for the wrapping functions by using perfect forwarding.
#define DECL_ST_PYBIND_CLASS_METHOD(METHOD) \
    template< class... Args > \
    wrapper_type & METHOD(Args&&... args) { \
        m_cls.METHOD(std::forward<Args>(args)...); \
        return *static_cast<wrapper_type*>(this); \
    }

    DECL_ST_PYBIND_CLASS_METHOD(def)
    DECL_ST_PYBIND_CLASS_METHOD(def_readwrite)
    DECL_ST_PYBIND_CLASS_METHOD(def_property)
    DECL_ST_PYBIND_CLASS_METHOD(def_property_readonly)
    DECL_ST_PYBIND_CLASS_METHOD(def_property_readonly_static)

// Delete the macro after it is not used anymore.
#undef DECL_ST_PYBIND_CLASS_METHOD

protected:

    WrapBase(pybind11::module * mod, const char * pyname, const char * clsdoc)
      : m_cls(*mod, pyname, clsdoc)
    {}

private:

    // This "class_" is the alias we made above, not directly the pybind11::class_.
    class_ m_cls;

}; /* end class WrapBase */

Wrappers for Data Classes#

The “turgon” code is built upon several data classes. The most basic one is the grid definition class Grid. Its wrapper is the simplest:

The custom wrapper class for the class Grid.#
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
class
SPACETIME_PYTHON_WRAPPER_VISIBILITY
WrapGrid
  : public WrapBase< WrapGrid, Grid, std::shared_ptr<Grid> >
{

    // Need this friendship to access the protected constructor in the base class.
    friend base_type;

    WrapGrid(pybind11::module * mod, const char * pyname, const char * clsdoc)
      : base_type(mod, pyname, clsdoc)
    {
        namespace py = pybind11;
        (*this)
            .def
            (
                py::init
                (
                    [](real_type xmin, real_type xmax, size_t nelm)
                    {
                        return Grid::construct(xmin, xmax, nelm);
                    }
                ),
                py::arg("xmin"), py::arg("xmax"), py::arg("nelm")
            )
            .def
            (
                py::init
                (
                    [](xt::pyarray<wrapped_type::value_type> & xloc)
                    {
                        return Grid::construct(xloc);
                    }
                ),
                py::arg("xloc")
            )
            .def("__str__", &detail::to_str<wrapped_type>)
            .def_property_readonly("xmin", &wrapped_type::xmin)
            .def_property_readonly("xmax", &wrapped_type::xmax)
            .def_property_readonly("ncelm", &wrapped_type::ncelm)
            .def_property_readonly("nselm", &wrapped_type::nselm)
            .def_property_readonly
            (
                "xcoord",
                static_cast<wrapped_type::array_type & (wrapped_type::*)()>
                (&wrapped_type::xcoord)
            )
            .def_property_readonly_static
            (
                "BOUND_COUNT"
              , [](py::object const &){ return Grid::BOUND_COUNT; }
            )
        ;
    }

}; /* end class WrapGrid */

When there are overloads in the C++ code, sometimes we may need to specify the function signature using static_cast like that in (highlighted) line 45. An alternate way is to use a lambda expression.

A slightly more complex wrapper is for the class Field. In (highlighted) line 19, a Grid is returned from the wrapper of Field.

The custom wrapper class for the class Field.#
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
class
SPACETIME_PYTHON_WRAPPER_VISIBILITY
WrapField
  : public WrapBase< WrapField, Field, std::shared_ptr<Field> >
{

    // Need this friendship to access the protected constructor in the base class.
    friend base_type;

    WrapField(pybind11::module * mod, const char * pyname, const char * clsdoc)
      : base_type(mod, pyname, clsdoc)
    {
        namespace py = pybind11;
        (*this)
            .def("__str__", &detail::to_str<wrapped_type>)
            .def_property_readonly
            (
                "grid"
              , [](wrapped_type & self){ return self.grid().shared_from_this(); }
            )
            .def_property_readonly("nvar", &wrapped_type::nvar)
            .def_property
            (
                "time_increment"
              , &wrapped_type::time_increment
              , &wrapped_type::set_time_increment
             )
            .def_property_readonly("dt", &wrapped_type::dt)
            .def_property_readonly("hdt", &wrapped_type::hdt)
            .def_property_readonly("qdt", &wrapped_type::qdt)
            .def
            (
                "celm",
                static_cast<Celm (wrapped_type::*)(sindex_type, bool)>
                (&wrapped_type::celm_at<Celm>)
              , py::arg("ielm"), py::arg("odd_plane")=false
            )
            .def
            (
                "selm",
                static_cast<Selm (wrapped_type::*)(sindex_type, bool)>
                (&wrapped_type::selm_at<Selm>)
              , py::arg("ielm"), py::arg("odd_plane")=false
            )
        ;
    }

}; /* end class WrapField */

Hierarchical Wrapper#

The “turgon” code defines a hierarchy of classes and wrapping them does not only require WrapBase, but also other class templates between WrapBase and the concrete wrappers.

For example, the following Solver and uses WrapSolverBase (which is not shown in the notes). Because WrapSolver does not directly inherit from WrapBase, it needs more aliases than the previous use cases.

The custom wrapper class for the class Solver.#
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
class
SPACETIME_PYTHON_WRAPPER_VISIBILITY
WrapSolver
  : public WrapSolverBase< WrapSolver, Solver >
{

    // The base class becomes more complex.
    using base_type = WrapSolverBase< WrapSolver, Solver >;
    using wrapper_type = typename base_type::wrapper_type;
    using wrapped_type = typename base_type::wrapped_type;

    // Need these friendships to access the protected constructor in the base class.
    friend base_type;
    friend base_type::base_type;

    WrapSolver(pybind11::module * mod, const char * pyname, const char * clsdoc)
      : base_type(mod, pyname, clsdoc)
    {
        namespace py = pybind11;
        (*this)
            .def
            (
                py::init
                (
                    static_cast
                    <
                        std::shared_ptr<wrapped_type> (*)
                        (
                            std::shared_ptr<Grid> const &
                          , typename wrapped_type::value_type
                          , size_t
                        )
                    >
                    (&wrapped_type::construct)
                )
              , py::arg("grid"), py::arg("time_increment"), py::arg("nvar")
            )
        ;
    }

}; /* end class WrapSolver */

Wrappers for Element Classes#

The following WrapCelm and WrapSelm are wrapper classes for elements in the “turgon” code. They use WrapCelmBase and WrapSelmBase (which are not shown in the notes), respectively.

The element wrappers are not very different from data wrappers, but we should keep in mind that there may be many more element objects than data objects in the system. The element objects are implemented as handles and their data are stored in the data objects.

The custom wrapper class for the class Celm.#
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
class
SPACETIME_PYTHON_WRAPPER_VISIBILITY
WrapCelm
  : public WrapCelmBase< WrapCelm, Celm >
{

    // The base class becomes more complex.
    using base_type = WrapCelmBase< WrapCelm, Celm >;
    // Need this friendship to access the protected constructor in the base class.
    friend base_type::base_type::base_type;

    WrapCelm(pybind11::module * mod, const char * pyname, const char * clsdoc)
      : base_type(mod, pyname, clsdoc)
    {
        namespace py = pybind11;
        (*this)
            ... wrapper code ...
        ;
    }

}; /* end class WrapCelm */
The custom wrapper class for the class and Selm.#
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
class
SPACETIME_PYTHON_WRAPPER_VISIBILITY
WrapSelm
  : public WrapSelmBase< WrapSelm, Selm >
{

    // The base class becomes more complex.
    using base_type = WrapSelmBase< WrapSelm, Selm >;
    // Need this friendship to access the protected constructor in the base class.
    friend base_type::base_type::base_type;

    WrapSelm(pybind11::module * mod, const char * pyname, const char * clsdoc)
      : base_type(mod, pyname, clsdoc)
    {
        namespace py = pybind11;
        (*this)
            ... wrapper code ...
        ;
    }

}; /* end class WrapCelm */

Define the Extension Module#

So far, we have used WrapperBase to save some duplicated code, but more can be saved. Another important use of it is to reduce the .cpp file used for the Python extension. The function template add_solver() (which is not shown in the notes) takes advantage of the commonality of the wrapper classes and significantly shortens the code.

C++ code to define extension module.#
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#include "spacetime/python.hpp" // must be first
#include "spacetime.hpp"

PYBIND11_MODULE(_libst, mod)
{
    namespace spy = spacetime::python;
    spy::ModuleInitializer::get_instance()
        .add_solver
        <
            spy::WrapSolver, spy::WrapCelm, spy::WrapSelm
        >
        (&mod, "", "no equation")
        .add_solver
        <
            spy::WrapLinearScalarSolver
          , spy::WrapLinearScalarCelm
          , spy::WrapLinearScalarSelm
        >
        (&mod, "LinearScalar", "a linear scalar equation")
        .add_solver
        <
            spy::WrapInviscidBurgersSolver
          , spy::WrapInviscidBurgersCelm
          , spy::WrapInviscidBurgersSelm
        >
        (&mod, "InviscidBurgers", "the inviscid Burgers equation")
        .initialize(&mod)
    ;
}

pybind11 Wrapping API#

pybind11 provides API to wrap between C++ and Python.

Function and Property#

Let’s use the Grid class as an example to demonstrate how to expose functions and properties. We have a constructor:

.def
(
    py::init
    (
        [](real_type xmin, real_type xmax, size_t nelm)
        {
            return Grid::construct(xmin, xmax, nelm);
        }
    )
  , py::arg("xmin"), py::arg("xmax"), py::arg("nelm")
)

It allows creating the Grid object from Python:

>>> grid = libst.Grid(0, 8, 4*64)

By wrapping for the special function __str__():

.def("__str__", &detail::to_str<wrapped_type>)

It allows to support str for Grid:

>>> print('call str(Grid):', str(grid))
call str(Grid): Grid(xmin=0, xmax=8, ncelm=256)
>>> print('directly call Grid.__str__():', grid.__str__())
directly call Grid.__str__(): Grid(xmin=0, xmax=8, ncelm=256)

Define properties. pybind11 supports both instance properties and static properties:

.def_property_readonly("xmin", &wrapped_type::xmin)
.def_property_readonly("xmax", &wrapped_type::xmax)
.def_property_readonly_static
(
    "BOUND_COUNT"
  , [](py::object const &){ return Grid::BOUND_COUNT; }
)

Check the properties from the instance:

>>> print(grid.BOUND_COUNT)
2
>>> print(grid.xmin)
0.0
>>> print(grid.xmax)
8.0

Check the properties from the class:

>>> print(libst.Grid.BOUND_COUNT)
2
>>> print(libst.Grid.xmin)
<property object at 0x110e9ffb0>
>>> print(libst.Grid.xmax)
<property object at 0x110ea60b0>

Define a pure Python class that can be compared with the pybind11 wrapped class:

class PythonGrid:
    BOUND_COUNT = 2
    @property
    def xmin(self):
        return 0
    @property
    def xmax(self):
        return 8

Compare the execution results with that of the C++ Grid. They are identical:

>>> print(PythonGrid.BOUND_COUNT)
2
>>> print(PythonGrid.xmin)
<property object at 0x1112daad0>
>>> print(PythonGrid.xmax)
<property object at 0x1112dab30>

Here is a list of property-related API:

  • def_property_readonly and def_property_readonly_static for read-only properties with C++ accessors.

  • def_property and def_property_static for read/write properties with C++ accessors.

  • def_readonly and def_readonly_static for read-only access to C++ data members.

  • def_readwrite and def_readwrite_static for read/write access to C++ data members.

See the pybind11 document of Instance and static fields for more information.

Named and Keyword Arguments#

pybind11 allows named arguments. In the above example, we already take the advantage for wrapping the constructor of Grid`:

.def
(
    py::init
    (
        [](real_type xmin, real_type xmax, size_t nelm)
        {
            return Grid::construct(xmin, xmax, nelm);
        }
    )
  , py::arg("xmin"), py::arg("xmax"), py::arg("nelm")
)

It has been shown how the named arguments are used in Python:

>>> grid = libst.Grid(xmin=0, xmax=8, nelm=4*64)

See the pybind11 document of Keyword arguments for more information.

pybind11::arg also allows default value to the arguments (keyword arguments). The wrapper code of the class Solver has an example:

.def
(
    "selms"
  , [](wrapped_type & self, bool odd_plane)
    { return elm_iter_type(self.shared_from_this(), odd_plane, 0, true); }
  , py::arg("odd_plane")=false
)

Before seeing how it is used, we run some setup code:

grid = libst.Grid(0, 4*2*np.pi, 4*64)
cfl = 1
dx = (grid.xmax - grid.xmin) / grid.ncelm
dt = dx * cfl
svr = libst.LinearScalarSolver(grid=grid, time_increment=dt)

The argument odd_plane can be accepted in multiple forms. This uses the default value:

>>> print(svr.selms())
SolverElementIterator(selm, on_even_plane, current=0, nelem=257)

Pass the argument as positional:

>>> print(svr.selms(False))
SolverElementIterator(selm, on_even_plane, current=0, nelem=257)
>>> print(svr.selms(True))
SolverElementIterator(selm, on_odd_plane, current=0, nelem=256)

Pass the argument as keyword:

>>> print(svr.selms(odd_plane=False))
SolverElementIterator(selm, on_even_plane, current=0, nelem=257)
>>> print(svr.selms(odd_plane=True))
SolverElementIterator(selm, on_odd_plane, current=0, nelem=256)

See the pybind11 document of Default arguments for more information.

What Happens in Python Stays in Python (or pybind11)#

When wrapping from C++ to Python, there are constructs only available in the scripting language but not the low-level implementation. When it happens, write the adapting code in the pybind11 layer and do not pollute the low-level implementation.

One example is the Python iterator protocol. To adapt the C++ iterator to Python, an adapting class is created in the Python wrapping layer, along with other code that calls pybind11 API, and above the low-level C++ library in “turgon”.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
// The whole class is defined along with other code that calls pybind11 API
// and includes Python.h.
template< typename ST >
class SolverElementIterator
{

public:

    using solver_type = ST;

    SolverElementIterator() = delete;
    SolverElementIterator
    (
        std::shared_ptr<ST> sol
      , bool odd_plane
      , size_t starting
      , bool selm
    )
      : m_solver(std::move(sol))
      , m_odd_plane(odd_plane)
      , m_current(starting)
      , m_selm(selm)
    {}

    typename ST::celm_type next_celm()
    {
        size_t ncelm = m_solver->grid().ncelm();
        if (m_odd_plane) { --ncelm; }
        // Use pybind11 API:
        if (m_current >= ncelm) { throw pybind11::stop_iteration(); }
        typename ST::celm_type ret = m_solver->celm(m_current, m_odd_plane);
        ++m_current;
        return ret;
    }

    typename ST::selm_type next_selm()
    {
        size_t nselm = m_solver->grid().nselm();
        if (m_odd_plane) { --nselm; }
        // Use pybind11 API:
        if (m_current >= nselm) { throw pybind11::stop_iteration(); }
        typename ST::selm_type ret = m_solver->selm(m_current, m_odd_plane);
        ++m_current;
        return ret;
    }

    bool is_selm() const { return m_selm; }
    bool on_odd_plane() const { return m_odd_plane; }
    size_t current() const { return m_current; }
    size_t nelem() const
    {
        size_t ret = is_selm() ? m_solver->grid().nselm() : m_solver->grid().ncelm();
        if (m_odd_plane) { --ret; }
        return ret;
    }

private:

    std::shared_ptr<solver_type> m_solver;
    bool m_odd_plane;
    size_t m_current = 0;
    bool m_selm = false;

}; /* end class SolverElementIterator */

The wrapping code is:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
using elm_iter_type = SolverElementIterator<wrapped_type>;
std::string elm_pyname = std::string(pyname) + "ElementIterator";
pybind11::class_< elm_iter_type >(*mod, elm_pyname.c_str())
    .def("__str__", &detail::to_str<elm_iter_type>)
    .def("__iter__", [](elm_iter_type & self){ return self; })
    .def(
        "__next__"
      , [](elm_iter_type & self)
        {
            py::object ret;
            if (self.is_selm()) { ret = py::cast(self.next_selm()); }
            else                { ret = py::cast(self.next_celm()); }
            return ret;
        }
    )
;

Here we use a concrete solver of linear wave (governing equation is \(u_t + u_x = 0\)) to demonstrate how it works in Python:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
import numpy as np
from matplotlib import pyplot as plt

# Import the extension module that is written in C++
import libst

# Build the one-dimensional uniform grid and the corresponding solver
grid = libst.Grid(0, 4*2*np.pi, 4*64)
cfl = 1
dx = (grid.xmax - grid.xmin) / grid.ncelm
dt = dx * cfl
svr = libst.LinearScalarSolver(grid=grid, time_increment=dt)

# Initialize
# NOTE: 'selms' returns a template instance of SolverElementIterator
for e in svr.selms(odd_plane=False):
    if e.xctr < 2*np.pi or e.xctr > 2*2*np.pi:
        v = 0
        dv = 0
    else:
        v = np.sin(e.xctr)
        dv = np.cos(e.xctr)
    e.set_so0(0, v)
    e.set_so1(0, dv)

# Set up plotting
plt.figure(figsize=(15,10))
plt.xlim((0, 8))
plt.xlabel('$x$ $(\pi)$')
plt.ylabel('$u$')
plt.grid()

# Plot the initial condition
plt.plot(svr.xctr() / np.pi, svr.get_so0(0).ndarray, '-')

The code shows the initial condition of the linear wave:

../../_images/04_iter.png

The full example code is in 04_iter.py (which is the part of 01_linear.py that skips the final time marching).

pybind11 Operating API#

pybind11 does not only provide API to wrap between C++ and Python, but also C++ API for operating the Python interpreter and the some Python containers: tuple, list, and dict. See the document of Python types and the unit tests for more information.

Python Objects in C++#

pybind11 provides C++ API for manipulating Python object (the C struct PyObject) using the generic object protocol, so that we don’t need to dig into the Python C API and worry about the reference counting by hand.

The first example is to create a None object from C++:

#include "pybind11/pybind11.h"

PYBIND11_MODULE(code_none, m)
{
    namespace py = pybind11;

    m
        .def
        (
            "create_none", []() { return py::none(); }
        )
    ;
}

The results:

>>> print(type(create_none()))
<class 'NoneType'>
>>> assert None is create_none()
>>> print(create_none())
None

pybind11::object is the C++ counterpart of the C struct PyObject, and it does reference counting for us. The following example shows how to use pybind11::object to hold a None object:

#include "pybind11/pybind11.h"

PYBIND11_MODULE(code_object, m)
{
    namespace py = pybind11;

    m
        .def
        (
            "return_none"
          , []()
            {
                py::object ret = py::none();
                return ret;
            }
        )
    ;
}

The result:

>>> print(return_none, return_none())
<built-in method return_none of PyCapsule object at 0x1111b4300> None

pybind11 allows to use pybind11::object::attr() to assign attribute to a Python object:

#include "pybind11/pybind11.h"

PYBIND11_MODULE(code_attr, m)
{
    namespace py = pybind11;

    m.attr("string_name") = "string_content";
}

The result:

>>> print(type(string_name), string_name)
<class 'str'> string_content

Import Module Using pybind11#

pybind11 provides a helper, pybind::module::import(), to import Python module and access attributes of every Python object, including a Python module:

#include "pybind11/pybind11.h"

PYBIND11_MODULE(code1, m)
{
    namespace py = pybind11;

    m
        .def
        (
            "return_numpy_version"
          , []()
            {
                py::object numpy = py::module::import("numpy");
                return numpy.attr("__version__");
            }
        )
    ;

    m.attr("alias_to_return_numpy_version") = m.attr("return_numpy_version");
}

The results in the Python side are:

>>> print(return_numpy_version())
1.17.0.dev0+3c3ba10
>>> import numpy as np
>>> print(np.__version__)
1.17.0.dev0+3c3ba10
>>> assert np.__version__ is return_numpy_version()
>>> print(return_numpy_version)
<built-in method return_numpy_version of PyCapsule object at 0x1111b4060>
>>> print(alias_to_return_numpy_version)
<built-in method return_numpy_version of PyCapsule object at 0x1111b4060>

pybind11 for tuple#

To support Python tuple, pybind11 provides the C++ class pybind11::tuple. Since tuple is immutable, its creation should use pybind11::make_tuple():

#include "pybind11/pybind11.h"

PYBIND11_MODULE(code_tuple, m)
{
    namespace py = pybind11;

    py::tuple my_tuple = py::make_tuple("string_data_in_tuple", 10, 3.1415926);

    m.attr("my_tuple") = my_tuple;
}

The results in the Python side are:

>>> print(type(my_tuple), my_tuple)
<class 'tuple'> ('string_data_in_tuple', 10, 3.1415926)

pybind11 for list#

To support Python list, pybind11 provides the C++ class pybind11::list. It is mutable and the function pybind11::list::append() can be used for populating the container in the C++ side:

#include "pybind11/pybind11.h"

PYBIND11_MODULE(code2, m)
{
    namespace py = pybind11;

    py::list my_list = py::list();

    my_list.append("string_data_in_list");
    my_list.append(11);
    my_list.append(2.71828);

    py::list my_list2 = py::make_tuple("string_data_in_list2", 12);

    m.attr("my_list") = my_list;
    m.attr("my_list2") = my_list2;
}

The results in the Python side are:

>>> print(type(my_list), my_list)
<class 'list'> ['string_data_in_list', 11, 2.71828]

>>> print(type(my_list2), my_list2)
<class 'list'> ['string_data_in_list2', 12]

pybind11 for dict#

To support Python dict, pybind11 provides the C++ class pybind11::dict. The example in the C++ side:

#include "pybind11/pybind11.h"

PYBIND11_MODULE(code2, m)
{
    namespace py = pybind11;

    py::dict my_dict;

    my_dict["key_string"] = "string_data_in_dict";
    my_dict["key_int"] = 13;
    my_dict["key_real"] = 1.414;

    m.attr("my_dict") = my_dict;
}

The results in the Python side are:

>>> print(type(my_dict), my_dict)
<class 'dict'> {'key_string': 'string_data_in_dict', 'key_int': 13, 'key_real': 1.414}

CPython API with pybind11#

It is possible to use Python C API along with pybind11 and we will see how to do it. Please keep in mind that the examples here omit a lot of error checking code that is necessary for a system to run correctly. Consult the manual (Python/C API Reference Manual) when you need to use the C API.

When importing pybind11/pybind11.h, we don’t need to import Python.h, because the former does it for us. But please note that pybind11/pybind11.h or Python.h should be included before every other inclusion. The example code in the C++ side is:

#include "pybind11/pybind11.h"
#include "Python.h" // Unnecessary

using namespace pybind11;

PYBIND11_MODULE(ex_long, m)
{
    PyObject * v = PyLong_FromLong(2000000);
    m.attr("integer_value") = v;
    Py_DECREF(v);
}

The results in the Python side are:

>>> print(type(integer_value), integer_value)
<class 'int'> 2000000

Reference Counting#

The Python C API is more convenient for inspecting or debugging the PyObject reference counting than the pybind11 class pybind11::object that handles the reference count automatically:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
#include "pybind11/pybind11.h"

using namespace pybind11;

static PyObject * s;

PYBIND11_MODULE(ex_str, m)
{
    s = PyUnicode_FromString("string_from_c");
    m.attr("string_value") = s;
    Py_DECREF(s);
    m
        .def
        (
            "show_ref_count_with_handle"
          , [](handle const & h)
            {
                return Py_REFCNT(h.ptr());
            }
        )
        .def
        (
            "show_ref_count_with_object"
          , [](object const & o)
            {
                return Py_REFCNT(o.ptr());
            }
        )
        .def
        (
            "show_string_value_ref_count"
          , [&]()
            {
                return Py_REFCNT(s);
            }
        )
    ;
}

The test code in the Python side is:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
def check_string_value():
    print(type(string_value), string_value)
    print('before aliasing')
    print(show_ref_count_with_object(string_value), 'refcnt by object')
    print(show_ref_count_with_handle(string_value), 'refcnt by handle')
    print(sys.getrefcount(string_value), 'refcnt by sys')
    print(show_string_value_ref_count(), 'refcnt from c++')
    string_value_aliasing = string_value
    print('after aliasing')
    print(show_ref_count_with_object(string_value), 'refcnt by object')
    print(show_ref_count_with_handle(string_value), 'refcnt by handle')
    print(sys.getrefcount(string_value), 'refcnt by sys')
    print(show_string_value_ref_count(), 'refcnt from c++')

Check the results:

>>> check_string_value()
<class 'str'> string_from_c
before aliasing
7 refcnt by object
6 refcnt by handle
5 refcnt by sys
4 refcnt from c++
after aliasing
8 refcnt by object
7 refcnt by handle
6 refcnt by sys
5 refcnt from c++

pybind11 also offers two low-level short-hands for reference counting: handle::inc_ref() and handle::dec_ref(). If we don’t want to go so low-level, it provides function templates reinterpret_borrow() and reinterpret_steal().

Cached Value#

Python caches small (-5 to 256) integers (see the code). Don’t get surprised when you see a large reference count for some of them integers:

>>> print('ref counts of 0:', sys.getrefcount(0))
ref counts of 0: 10198
>>> print('ref counts of 257:', sys.getrefcount(257))
ref counts of 257: 3

Real number doesn’t have that cache:

>>> print(sys.getrefcount(0.0))
3

Python interns strings consisting of alphanumerical and underscore characters.

>>> print('' is '')
True
>>> print(sys.getrefcount(''))
5552

More examples for the string interning:

def check_string_intern():
    s1 = 'numerical'
    print(sys.getrefcount('numerical'))
    print(s1 is 'numerical')
    s2 = 'num' + 'erical'
    print(s1 is s2)
    print(sys.getrefcount('numerical'))

The results are:

>>> check_string_intern()
4
True
True
5

Attribute Access#

The Python object protocol defines a set of API for accessing object attributes. Here is a simple example that sets and gets an attribute of an object using the PyObject_SetAttr() and PyObject_GetAttr() API:

int PyObject_SetAttr(PyObject *o, PyObject *attr_name, PyObject *v);
PyObject* PyObject_GetAttr(PyObject *o, PyObject *attr_name);

Use pybind11 to write test code for the two API:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#include "pybind11/pybind11.h"

using namespace pybind11;

void attach_attr(PyObject * o, PyObject * name, PyObject * attr)
{
    /*int ret =*/
    PyObject_SetAttr(o, name, attr);
}

PyObject * retrieve_attr(PyObject * o, PyObject * name)
{
    PyObject * ret = PyObject_GetAttr(o, name);
    return ret;
}

PYBIND11_MODULE(ex_attr, m)
{
    m
        .def
        (
            "attach_attr"
          , [](object & o, object & name, object & attr)
            {
                attach_attr(o.ptr(), name.ptr(), attr.ptr());
            }
        )
        .def
        (
            "retrieve_attr"
          , [](object & o, object & name)
            {
                handle(retrieve_attr(o.ptr(), name.ptr()));
            }
        )
    ;
}

Use a Python sample class:

class Cls():
    pass

First, build the test objects and show the reference count:

>>> obj = Cls()
>>> val = 'attached value'
>>> print(sys.getrefcount(val))
3

Second, attach val to obj and print the reference count:

>>> attach_attr(obj, 'name', val)
>>> print(sys.getrefcount(val))
4

Check the identity of the attached object (as name):

>>> print(obj.name is val)
True
>>> print(sys.getrefcount(val))
4

Test the C++ retrieval code:

>>> val2 = retrieve_attr(obj, 'name')
>>> print(sys.getrefcount(val))
5

There are shorthand versions of the API that takes C string for the attribute name: PyObject_SetAttrString() and PyObject_GetAttrString(). The example code is:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include "pybind11/pybind11.h"

#include <string>

using namespace pybind11;

void attach_attr(PyObject * o, char const * name, PyObject * attr)
{
    /*int ret =*/
    PyObject_SetAttrString(o, name, attr);
}

PyObject * retrieve_attr(PyObject * o, char const * name)
{
    PyObject * ret = PyObject_GetAttrString(o, name);
    return ret;
}

PYBIND11_MODULE(ex_attr_by_string, m)
{
    m
        .def
        (
            "attach_attr_by_string"
          , [](object & o, object & name, object & attr)
            {
                std::string name_str = cast<std::string>(name);
                attach_attr(o.ptr(), name_str.c_str(), attr.ptr());
            }
        )
        .def
        (
            "retrieve_attr_by_string"
          , [](object & o, object & name)
            {
                std::string name_str = cast<std::string>(name);
                handle(retrieve_attr(o.ptr(), name_str.c_str()));
            }
        )
    ;
}

Test again and the results are the same. First, build the test objects and show the reference count:

>>> obj = Cls()
>>> val = 'attached value'
>>> print(sys.getrefcount(val))
3

Second, attach val to obj and print the reference count:

>>> attach_attr_by_string(obj, 'name', val)
>>> print(sys.getrefcount(val))
4

Check the identity of the attached object (as name):

>>> print(obj.name is val)
True
>>> print(sys.getrefcount(val))
4

Test the C++ retrieval code:

>>> val2 = retrieve_attr_by_string(obj, 'name')
>>> print(sys.getrefcount(val))
5

Function Call#

Python C API allows to make Python function call from C. The follow C++ code takes a Python callable and use PyObject_Call():

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include "pybind11/pybind11.h"

#include <string>

using namespace pybind11;

PyObject * function_call(PyObject * callable, PyObject * args, PyObject * kw)
{
    PyObject * ret = PyObject_Call(callable, args, kw);
    return ret;
}

PYBIND11_MODULE(ex_call, m)
{
    m
        .def
        (
            "function_call"
          , [](object & o, tuple & t, dict & kw)
            {
                return handle(function_call(o.ptr(), t.ptr(), kw.ptr()));
            }
        )
    ;
}

Use the example Python function:

def my_func(arg1, kw1='default'):
    return 'results: {}, {}'.format(arg1, kw1)

See the results by calling using only positional arguments:

>>> print('(direct call)  ', my_func('first argument'))
(direct call)   results: first argument, default
>>> print('(function_call)', function_call(my_func, ('first argument',), {}))
(function_call) results: first argument, default

See the results by calling using both positional and keyword arguments:

>>> print('(direct call)  ', my_func('first argument', kw1='non default'))
(direct call)   results: first argument, non default
>>> print('(function_call)', function_call(my_func, ('first argument',),
...       {'kw1': 'non default'}))
(function_call) results: first argument, non default

Import#

The Python C API for import a Python module is PyImport_ImportModule(). The C++ test code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
#include "pybind11/pybind11.h"

#include <string>

using namespace pybind11;

PyObject * get_modules()
{
    PyObject * sysmod = PyImport_ImportModule("sys");
    PyObject * modules = PyObject_GetAttrString(sysmod, "modules");
    Py_DECREF(sysmod);
    return modules;
}

PYBIND11_MODULE(ex_import, m)
{
    m
        .def
        (
            "get_modules"
          , []()
            {
                PyObject * ret = get_modules();
                return handle(ret);
            }
        )
    ;
}

The results in the Python side are:

>>> modules = get_modules();
>>> print(type(modules), len(modules))
<class 'dict'> 1146

Python C API for tuple#

Here we use a simple C++ example to show how to create and operate tuple using the following Python C API in the tuple protocol:

The example code returns a new tuple that has the order reversed:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
#include "pybind11/pybind11.h"

#include <string>

using namespace pybind11;

PyObject * reverse_tuple(PyObject * tup)
{
    PyObject * ret = PyTuple_New(PyTuple_Size(tup));

    for (Py_ssize_t i = 0 ; i < PyTuple_Size(tup) ; ++i)
    {
        PyObject * item = PyTuple_GetItem(tup, i);
        Py_INCREF(item);
        PyTuple_SetItem(ret, i, item); // This only works when 1 == Py_REFCNT(ret)
    }

    return ret;
}

PYBIND11_MODULE(ex_tuple, m)
{
    m
        .def
        (
            "reverse_tuple"
          , [](tuple & t)
            {
                return handle(reverse_tuple(t.ptr()));
            }
        )
    ;
}

The results in the Python side are:

>>> tv0 = "value0"
>>> tv1 = object()
>>> tup = (tv0, tv1)
>>> print(sys.getrefcount(tv1))
3
>>> rtup = reverse_tuple(tup)
>>> print(sys.getrefcount(tv1))
4

Note

It is interesting to read the code implementing PyTuple_SetItem() for tuple that is immutable.

Python C API for list#

Here we use a simple C++ example to show how to create and operate list using the following Python C API in the list protocol:

It also shows the some C API of the iterator protocol:

  • PyObject_GetIter() obtains a Python iterator

  • python:PyObject_Next() obtains the next element from a Python iterator

The following C++ example code iterates through each element of the input list and return a shallow copy of that list:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
#include "pybind11/pybind11.h"

#include <string>

using namespace pybind11;

PyObject * make_list_from_iterator(PyObject * o)
{
    PyObject * iter = PyObject_GetIter(o);
    PyObject * ret = PyList_New(0);
    PyObject * item = nullptr;
    while (nullptr != (item = PyIter_Next(iter)))
    {
        PyList_Append(ret, item);
        Py_DECREF(item);
    }
    Py_DECREF(iter);
    return ret;
}

PYBIND11_MODULE(ex_list, m)
{
    m
        .def
        (
            "make_list_from_iterator"
          , [](object & o)
            {
                PyObject * ret = make_list_from_iterator(o.ptr());
                return handle(ret);
            }
        )
    ;
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
>>> v0 = 'first value'
>>> v1 = 'second value'
>>> tup = (v0, v1)
>>> print(sys.getrefcount(v1))
4
>>> lst = make_list_from_iterator(tup)
>>> print(sys.getrefcount(v1))
5
>>> print(lst)
['first value', 'second value']

Python C API for dict#

Here we use a simple C++ example to show how to create and operate dict using the following Python C API in the dict protocol:

The C++ example code create a dict and provides an alternate function for adding a key-value pair in it:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
#include "pybind11/pybind11.h"

#include <string>

using namespace pybind11;

PyObject * make_dict()
{
    PyObject * ret = PyDict_New();
    return ret;
}

void add_dict_item(PyObject * d, PyObject * k, PyObject * v)
{
    /*int ret =*/
    PyDict_SetItem(d, k, v);
}

PYBIND11_MODULE(ex_dict, m)
{
    m
        .def
        (
            "make_dict"
          , []()
            {
                return handle(make_dict());
            }
        )
        .def
        (
            "add_dict_item"
          , [](dict & d, object & k, object & v)
            {
                add_dict_item(d.ptr(), k.ptr(), v.ptr());
            }
        )
    ;
}

The results in the Python side are:

>>> d0 = {}
>>> d1 = make_dict()
>>> print(d0 is d1)
False
>>> print(d0 == d1)
True
>>> d0['k1'] = 'value1'
>>> print(d0)
{'k1': 'value1'}
>>> add_dict_item(d1, 'k1', 'value1')
>>> print(d1)
{'k1': 'value1'}
>>> print(d0 == d1)
True

Exception#

Here is a simple example for using Python exceptions from C++ (see also Exceptions and Exception Handling):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
#include "pybind11/pybind11.h"

#include <string>

using namespace pybind11;

PyObject * function_with_exception(PyObject * o)
{
    // Normal processing code.
    PyObject * ret = nullptr;
    PyObject * item = nullptr;
    PyObject * iter = PyObject_GetIter(o);
    if (nullptr == iter) { goto error; }
    ret = PyList_New(0);
    if (nullptr == ret) { goto error; }
    while (nullptr != (item = PyIter_Next(iter)))
    {
        int status = PyList_Append(ret, item);
        Py_DECREF(item);
        if (0 != status) { goto error; }
    }
    Py_DECREF(iter);

    // Exception.
    PyErr_SetString(PyExc_RuntimeError, "intentional exception");

error: // A good use of goto: clean up.
    Py_XDECREF(iter);
    Py_XDECREF(ret);
    return nullptr;
}

PYBIND11_MODULE(ex_except, m)
{
    m
        .def
        (
            "function_with_exception"
          , [](object & o)
            {
                PyObject * ret = function_with_exception(o.ptr());
                if (nullptr == ret) { throw error_already_set(); }
                return handle(ret);
            }
        )
    ;
}

The exception-free results in the Python side are:

>>> try:
>>>     function_with_exception(1)
>>> except TypeError as e:
>>>     print(e)
>>> else:
>>>     print('error not raised')
'int' object is not iterable

The exception results in the Python side are:

>>> tup = ('first value', 'second value')
>>> try:
>>>     function_with_exception(('first value', 'second value'))
>>> except RuntimeError as e:
>>>     print(e)
>>> else:
>>>     print('error not raised')
intentional exception

Python Memory Management#

Python has its own memory manager. When writing Python extension, they should be used for PyObject. The memory managing system has three levels:

  1. Raw memory interface: wrapper to the C standard memory managers. It allows distinct addressed returned when requesting 0 byte. GIL is not involved.

  2. Normal memory interface: ‘pymalloc’ with small memory optimization. GIL is required when calling.

  3. Object memory interface: allocate for PyObject. GIL is required when calling.

The public API are:

void * PyMem_RawMalloc(size_t size);
void * PyMem_RawCalloc(size_t nelem, size_t elsize);
void * PyMem_RawRealloc(void *ptr, size_t new_size);
void   PyMem_RawFree(void *ptr);

void * PyMem_Malloc(size_t size);
void * PyMem_Calloc(size_t nelem, size_t elsize);
void * PyMem_Realloc(void *ptr, size_t new_size);
void   PyMem_Free(void *ptr);

void * PyObject_Malloc(size_t size);
void * PyObject_Calloc(size_t nelem, size_t elsize);
void * PyObject_Realloc(void *ptr, size_t new_size);
void   PyObject_Free(void *ptr);

In [Include/cpython/pymem.h](https://github.com/python/cpython/blob/v3.8.0/Include/cpython/pymem.h#L53), Python provides a struct and a set of API to switch to custom memory managers:

typedef struct {
    /* user context passed as the first argument to the 4 functions */
    void *ctx;

    /* allocate a memory block */
    void* (*malloc) (void *ctx, size_t size);

    /* allocate a memory block initialized by zeros */
    void* (*calloc) (void *ctx, size_t nelem, size_t elsize);

    /* allocate or resize a memory block */
    void* (*realloc) (void *ctx, void *ptr, size_t new_size);

    /* release a memory block */
    void (*free) (void *ctx, void *ptr);
} PyMemAllocatorEx;

/* Get the memory block allocator of the specified domain. */
PyAPI_FUNC(void) PyMem_GetAllocator(PyMemAllocatorDomain domain,
                                    PyMemAllocatorEx *allocator);

/* Set the memory block allocator of the specified domain.

   The new allocator must return a distinct non-NULL pointer when requesting
   zero bytes.

   For the PYMEM_DOMAIN_RAW domain, the allocator must be thread-safe: the GIL
   is not held when the allocator is called.

   If the new allocator is not a hook (don't call the previous allocator), the
   PyMem_SetupDebugHooks() function must be called to reinstall the debug hooks
   on top on the new allocator. */
PyAPI_FUNC(void) PyMem_SetAllocator(PyMemAllocatorDomain domain,
                                    PyMemAllocatorEx *allocator);

See the official documentation Customize Memory Allocators. The public API is wrappers to the functions populated in the struct, e.g.:

void *
PyMem_RawMalloc(size_t size)
{
    /*
     * Limit ourselves to PY_SSIZE_T_MAX bytes to prevent security holes.
     * Most python internals blindly use a signed Py_ssize_t to track
     * things without checking for overflows or negatives.
     * As size_t is unsigned, checking for size < 0 is not required.
     */
    if (size > (size_t)PY_SSIZE_T_MAX)
        return NULL;
    return _PyMem_Raw.malloc(_PyMem_Raw.ctx, size);
}

Also see the code.

Small Memory Optimization#

Take a look at the documentation in the code. This is the ‘pymalloc’, and it uses 256 KB for allocation not greater than 512 bytes.

Tracemalloc#

Tracemalloc is a standard library that uses the custom memory manager to profile and debug Python memory use: tracemalloc — Trace memory allocations. We can follow the implementation to create more specific analysis.

Exercises#

  1. Use pybind11 to expose a memory buffer to Python as a numpy ndarray.

References#