numpy.rst revision 11986:c12e4625ab56
1.. _numpy: 2 3NumPy 4##### 5 6Buffer protocol 7=============== 8 9Python supports an extremely general and convenient approach for exchanging 10data between plugin libraries. Types can expose a buffer view [#f2]_, which 11provides fast direct access to the raw internal data representation. Suppose we 12want to bind the following simplistic Matrix class: 13 14.. code-block:: cpp 15 16 class Matrix { 17 public: 18 Matrix(size_t rows, size_t cols) : m_rows(rows), m_cols(cols) { 19 m_data = new float[rows*cols]; 20 } 21 float *data() { return m_data; } 22 size_t rows() const { return m_rows; } 23 size_t cols() const { return m_cols; } 24 private: 25 size_t m_rows, m_cols; 26 float *m_data; 27 }; 28 29The following binding code exposes the ``Matrix`` contents as a buffer object, 30making it possible to cast Matrices into NumPy arrays. It is even possible to 31completely avoid copy operations with Python expressions like 32``np.array(matrix_instance, copy = False)``. 33 34.. code-block:: cpp 35 36 py::class_<Matrix>(m, "Matrix") 37 .def_buffer([](Matrix &m) -> py::buffer_info { 38 return py::buffer_info( 39 m.data(), /* Pointer to buffer */ 40 sizeof(float), /* Size of one scalar */ 41 py::format_descriptor<float>::format(), /* Python struct-style format descriptor */ 42 2, /* Number of dimensions */ 43 { m.rows(), m.cols() }, /* Buffer dimensions */ 44 { sizeof(float) * m.rows(), /* Strides (in bytes) for each index */ 45 sizeof(float) } 46 ); 47 }); 48 49The snippet above binds a lambda function, which can create ``py::buffer_info`` 50description records on demand describing a given matrix. The contents of 51``py::buffer_info`` mirror the Python buffer protocol specification. 52 53.. code-block:: cpp 54 55 struct buffer_info { 56 void *ptr; 57 size_t itemsize; 58 std::string format; 59 int ndim; 60 std::vector<size_t> shape; 61 std::vector<size_t> strides; 62 }; 63 64To create a C++ function that can take a Python buffer object as an argument, 65simply use the type ``py::buffer`` as one of its arguments. Buffers can exist 66in a great variety of configurations, hence some safety checks are usually 67necessary in the function body. Below, you can see an basic example on how to 68define a custom constructor for the Eigen double precision matrix 69(``Eigen::MatrixXd``) type, which supports initialization from compatible 70buffer objects (e.g. a NumPy matrix). 71 72.. code-block:: cpp 73 74 /* Bind MatrixXd (or some other Eigen type) to Python */ 75 typedef Eigen::MatrixXd Matrix; 76 77 typedef Matrix::Scalar Scalar; 78 constexpr bool rowMajor = Matrix::Flags & Eigen::RowMajorBit; 79 80 py::class_<Matrix>(m, "Matrix") 81 .def("__init__", [](Matrix &m, py::buffer b) { 82 typedef Eigen::Stride<Eigen::Dynamic, Eigen::Dynamic> Strides; 83 84 /* Request a buffer descriptor from Python */ 85 py::buffer_info info = b.request(); 86 87 /* Some sanity checks ... */ 88 if (info.format != py::format_descriptor<Scalar>::format()) 89 throw std::runtime_error("Incompatible format: expected a double array!"); 90 91 if (info.ndim != 2) 92 throw std::runtime_error("Incompatible buffer dimension!"); 93 94 auto strides = Strides( 95 info.strides[rowMajor ? 0 : 1] / sizeof(Scalar), 96 info.strides[rowMajor ? 1 : 0] / sizeof(Scalar)); 97 98 auto map = Eigen::Map<Matrix, 0, Strides>( 99 static_cat<Scalar *>(info.ptr), info.shape[0], info.shape[1], strides); 100 101 new (&m) Matrix(map); 102 }); 103 104For reference, the ``def_buffer()`` call for this Eigen data type should look 105as follows: 106 107.. code-block:: cpp 108 109 .def_buffer([](Matrix &m) -> py::buffer_info { 110 return py::buffer_info( 111 m.data(), /* Pointer to buffer */ 112 sizeof(Scalar), /* Size of one scalar */ 113 /* Python struct-style format descriptor */ 114 py::format_descriptor<Scalar>::format(), 115 /* Number of dimensions */ 116 2, 117 /* Buffer dimensions */ 118 { (size_t) m.rows(), 119 (size_t) m.cols() }, 120 /* Strides (in bytes) for each index */ 121 { sizeof(Scalar) * (rowMajor ? m.cols() : 1), 122 sizeof(Scalar) * (rowMajor ? 1 : m.rows()) } 123 ); 124 }) 125 126For a much easier approach of binding Eigen types (although with some 127limitations), refer to the section on :doc:`/advanced/cast/eigen`. 128 129.. seealso:: 130 131 The file :file:`tests/test_buffers.cpp` contains a complete example 132 that demonstrates using the buffer protocol with pybind11 in more detail. 133 134.. [#f2] http://docs.python.org/3/c-api/buffer.html 135 136Arrays 137====== 138 139By exchanging ``py::buffer`` with ``py::array`` in the above snippet, we can 140restrict the function so that it only accepts NumPy arrays (rather than any 141type of Python object satisfying the buffer protocol). 142 143In many situations, we want to define a function which only accepts a NumPy 144array of a certain data type. This is possible via the ``py::array_t<T>`` 145template. For instance, the following function requires the argument to be a 146NumPy array containing double precision values. 147 148.. code-block:: cpp 149 150 void f(py::array_t<double> array); 151 152When it is invoked with a different type (e.g. an integer or a list of 153integers), the binding code will attempt to cast the input into a NumPy array 154of the requested type. Note that this feature requires the 155:file:``pybind11/numpy.h`` header to be included. 156 157Data in NumPy arrays is not guaranteed to packed in a dense manner; 158furthermore, entries can be separated by arbitrary column and row strides. 159Sometimes, it can be useful to require a function to only accept dense arrays 160using either the C (row-major) or Fortran (column-major) ordering. This can be 161accomplished via a second template argument with values ``py::array::c_style`` 162or ``py::array::f_style``. 163 164.. code-block:: cpp 165 166 void f(py::array_t<double, py::array::c_style | py::array::forcecast> array); 167 168The ``py::array::forcecast`` argument is the default value of the second 169template parameter, and it ensures that non-conforming arguments are converted 170into an array satisfying the specified requirements instead of trying the next 171function overload. 172 173Structured types 174================ 175 176In order for ``py::array_t`` to work with structured (record) types, we first need 177to register the memory layout of the type. This can be done via ``PYBIND11_NUMPY_DTYPE`` 178macro which expects the type followed by field names: 179 180.. code-block:: cpp 181 182 struct A { 183 int x; 184 double y; 185 }; 186 187 struct B { 188 int z; 189 A a; 190 }; 191 192 PYBIND11_NUMPY_DTYPE(A, x, y); 193 PYBIND11_NUMPY_DTYPE(B, z, a); 194 195 /* now both A and B can be used as template arguments to py::array_t */ 196 197Vectorizing functions 198===================== 199 200Suppose we want to bind a function with the following signature to Python so 201that it can process arbitrary NumPy array arguments (vectors, matrices, general 202N-D arrays) in addition to its normal arguments: 203 204.. code-block:: cpp 205 206 double my_func(int x, float y, double z); 207 208After including the ``pybind11/numpy.h`` header, this is extremely simple: 209 210.. code-block:: cpp 211 212 m.def("vectorized_func", py::vectorize(my_func)); 213 214Invoking the function like below causes 4 calls to be made to ``my_func`` with 215each of the array elements. The significant advantage of this compared to 216solutions like ``numpy.vectorize()`` is that the loop over the elements runs 217entirely on the C++ side and can be crunched down into a tight, optimized loop 218by the compiler. The result is returned as a NumPy array of type 219``numpy.dtype.float64``. 220 221.. code-block:: pycon 222 223 >>> x = np.array([[1, 3],[5, 7]]) 224 >>> y = np.array([[2, 4],[6, 8]]) 225 >>> z = 3 226 >>> result = vectorized_func(x, y, z) 227 228The scalar argument ``z`` is transparently replicated 4 times. The input 229arrays ``x`` and ``y`` are automatically converted into the right types (they 230are of type ``numpy.dtype.int64`` but need to be ``numpy.dtype.int32`` and 231``numpy.dtype.float32``, respectively) 232 233Sometimes we might want to explicitly exclude an argument from the vectorization 234because it makes little sense to wrap it in a NumPy array. For instance, 235suppose the function signature was 236 237.. code-block:: cpp 238 239 double my_func(int x, float y, my_custom_type *z); 240 241This can be done with a stateful Lambda closure: 242 243.. code-block:: cpp 244 245 // Vectorize a lambda function with a capture object (e.g. to exclude some arguments from the vectorization) 246 m.def("vectorized_func", 247 [](py::array_t<int> x, py::array_t<float> y, my_custom_type *z) { 248 auto stateful_closure = [z](int x, float y) { return my_func(x, y, z); }; 249 return py::vectorize(stateful_closure)(x, y); 250 } 251 ); 252 253In cases where the computation is too complicated to be reduced to 254``vectorize``, it will be necessary to create and access the buffer contents 255manually. The following snippet contains a complete example that shows how this 256works (the code is somewhat contrived, since it could have been done more 257simply using ``vectorize``). 258 259.. code-block:: cpp 260 261 #include <pybind11/pybind11.h> 262 #include <pybind11/numpy.h> 263 264 namespace py = pybind11; 265 266 py::array_t<double> add_arrays(py::array_t<double> input1, py::array_t<double> input2) { 267 auto buf1 = input1.request(), buf2 = input2.request(); 268 269 if (buf1.ndim != 1 || buf2.ndim != 1) 270 throw std::runtime_error("Number of dimensions must be one"); 271 272 if (buf1.size != buf2.size) 273 throw std::runtime_error("Input shapes must match"); 274 275 /* No pointer is passed, so NumPy will allocate the buffer */ 276 auto result = py::array_t<double>(buf1.size); 277 278 auto buf3 = result.request(); 279 280 double *ptr1 = (double *) buf1.ptr, 281 *ptr2 = (double *) buf2.ptr, 282 *ptr3 = (double *) buf3.ptr; 283 284 for (size_t idx = 0; idx < buf1.shape[0]; idx++) 285 ptr3[idx] = ptr1[idx] + ptr2[idx]; 286 287 return result; 288 } 289 290 PYBIND11_PLUGIN(test) { 291 py::module m("test"); 292 m.def("add_arrays", &add_arrays, "Add two NumPy arrays"); 293 return m.ptr(); 294 } 295 296.. seealso:: 297 298 The file :file:`tests/test_numpy_vectorize.cpp` contains a complete 299 example that demonstrates using :func:`vectorize` in more detail. 300